* [PATCH 0/24] make atomic_read() behave consistently across all architectures
@ 2007-08-09 13:14 Chris Snook
2007-08-09 12:41 ` Arnd Bergmann
2007-08-14 22:31 ` Christoph Lameter
0 siblings, 2 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-09 13:14 UTC (permalink / raw)
To: linux-kernel, linux-arch, torvalds
Cc: netdev, akpm, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl
As recent discussions[1] and bugs[2] have shown, there is a great deal of
confusion about the expected behavior of atomic_read(), compounded by the
fact that it is not the same on all architectures. Since users expect calls
to atomic_read() to actually perform a read, it is not desirable to allow
the compiler to optimize this away. Requiring the use of barrier() in this
case is inefficient, since we only want to re-load the atomic_t variable,
not everything else in scope.
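The failure mode can be sketched like this (a hypothetical illustration, not code from the patch set):

```c
/* Hypothetical illustration of the problem. With a plain (non-volatile)
 * counter, the compiler may cache the value in a register: */
static atomic_t flag;

/* What the author writes: */
while (!atomic_read(&flag))
	;		/* wait for another CPU to set the flag */

/* What the optimizer may legally produce from a plain load: */
if (!atomic_read(&flag))
	for (;;)
		;	/* value cached in a register: spins forever */
```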
This patchset makes the behavior of atomic_read uniform by removing the
volatile keyword from all atomic_t and atomic64_t definitions that currently
have it, and instead explicitly casts the variable as volatile in
atomic_read(). This leaves little room for creative optimization by the
compiler, and is in keeping with the principles behind "volatile considered
harmful".
Busy-waiters should still use cpu_relax(), but fast paths may be able to
reduce their use of barrier() between some atomic_read() calls.
-- Chris
1) http://lkml.org/lkml/2007/7/1/52
2) http://lkml.org/lkml/2007/8/8/122
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-09 13:14 [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
@ 2007-08-09 12:41 ` Arnd Bergmann
2007-08-09 14:29 ` Chris Snook
2007-08-14 22:31 ` Christoph Lameter
1 sibling, 1 reply; 1651+ messages in thread
From: Arnd Bergmann @ 2007-08-09 12:41 UTC (permalink / raw)
To: Chris Snook
Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl
On Thursday 09 August 2007, Chris Snook wrote:
> This patchset makes the behavior of atomic_read uniform by removing the
> volatile keyword from all atomic_t and atomic64_t definitions that currently
> have it, and instead explicitly casts the variable as volatile in
> atomic_read(). This leaves little room for creative optimization by the
> compiler, and is in keeping with the principles behind "volatile considered
> harmful".
>
Just an idea: since all architectures already include asm-generic/atomic.h,
why not move the definitions of atomic_t and atomic64_t, as well as anything
that does not involve architecture specific inline assembly into the generic
header?
Arnd <><
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-09 12:41 ` Arnd Bergmann
@ 2007-08-09 14:29 ` Chris Snook
2007-08-09 15:30 ` Arnd Bergmann
0 siblings, 1 reply; 1651+ messages in thread
From: Chris Snook @ 2007-08-09 14:29 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl
Arnd Bergmann wrote:
> On Thursday 09 August 2007, Chris Snook wrote:
>> This patchset makes the behavior of atomic_read uniform by removing the
>> volatile keyword from all atomic_t and atomic64_t definitions that currently
>> have it, and instead explicitly casts the variable as volatile in
>> atomic_read(). This leaves little room for creative optimization by the
>> compiler, and is in keeping with the principles behind "volatile considered
>> harmful".
>>
>
> Just an idea: since all architectures already include asm-generic/atomic.h,
> why not move the definitions of atomic_t and atomic64_t, as well as anything
> that does not involve architecture specific inline assembly into the generic
> header?
>
> Arnd <><
a) chicken and egg: asm-generic/atomic.h depends on definitions in asm/atomic.h
If you can find a way to reshuffle the code and make it simpler, I personally am
all for it. I'm skeptical that you'll get much to show for the effort.
b) The definitions aren't precisely identical between all architectures, so it
would be a mess of special cases, which gets us right back to where we are now.
-- Chris
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-09 14:29 ` Chris Snook
@ 2007-08-09 15:30 ` Arnd Bergmann
0 siblings, 0 replies; 1651+ messages in thread
From: Arnd Bergmann @ 2007-08-09 15:30 UTC (permalink / raw)
To: Chris Snook
Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl
On Thursday 09 August 2007, Chris Snook wrote:
> a) chicken and egg: asm-generic/atomic.h depends on definitions in asm/atomic.h
Ok, I see.
> If you can find a way to reshuffle the code and make it simpler, I personally am
> all for it. I'm skeptical that you'll get much to show for the effort.
I guess it could be done using more macros or new headers, but I don't see
a way that would actually improve the situation.
> b) The definitions aren't precisely identical between all architectures, so it
> would be a mess of special cases, which gets us right back to where we are now.
Why are they not identical? Anything beyond the 32/64 bit difference should
be the same afaics, or it might cause more bugs.
Arnd <><
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-09 13:14 [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
2007-08-09 12:41 ` Arnd Bergmann
@ 2007-08-14 22:31 ` Christoph Lameter
2007-08-14 22:45 ` Chris Snook
2007-08-14 23:08 ` Satyam Sharma
1 sibling, 2 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-14 22:31 UTC (permalink / raw)
To: Chris Snook
Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl
On Thu, 9 Aug 2007, Chris Snook wrote:
> This patchset makes the behavior of atomic_read uniform by removing the
> volatile keyword from all atomic_t and atomic64_t definitions that currently
> have it, and instead explicitly casts the variable as volatile in
> atomic_read(). This leaves little room for creative optimization by the
> compiler, and is in keeping with the principles behind "volatile considered
> harmful".
volatile is generally harmful even in atomic_read(). Barriers control
visibility and AFAICT things are fine.
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-14 22:31 ` Christoph Lameter
@ 2007-08-14 22:45 ` Chris Snook
2007-08-14 22:51 ` Christoph Lameter
2007-08-14 23:08 ` Satyam Sharma
1 sibling, 1 reply; 1651+ messages in thread
From: Chris Snook @ 2007-08-14 22:45 UTC (permalink / raw)
To: Christoph Lameter
Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl
Christoph Lameter wrote:
> On Thu, 9 Aug 2007, Chris Snook wrote:
>
>> This patchset makes the behavior of atomic_read uniform by removing the
>> volatile keyword from all atomic_t and atomic64_t definitions that currently
>> have it, and instead explicitly casts the variable as volatile in
>> atomic_read(). This leaves little room for creative optimization by the
>> compiler, and is in keeping with the principles behind "volatile considered
>> harmful".
>
> volatile is generally harmful even in atomic_read(). Barriers control
> visibility and AFAICT things are fine.
But barriers force a flush of *everything* in scope, which we generally don't
want. On the other hand, we pretty much always want to flush atomic_*
operations. One way or another, we should be restricting the volatile behavior
to the thing that needs it. On most architectures, this patch set just moves
that from the declaration, where it is considered harmful, to the use, where it
is considered an occasional necessary evil.
See the resubmitted patchset, which also puts a cast in the atomic[64]_set
operations.
-- Chris
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-14 22:45 ` Chris Snook
@ 2007-08-14 22:51 ` Christoph Lameter
0 siblings, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-14 22:51 UTC (permalink / raw)
To: Chris Snook
Cc: linux-kernel, linux-arch, torvalds, netdev, akpm, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl
On Tue, 14 Aug 2007, Chris Snook wrote:
> But barriers force a flush of *everything* in scope, which we generally don't
> want. On the other hand, we pretty much always want to flush atomic_*
> operations. One way or another, we should be restricting the volatile
> behavior to the thing that needs it. On most architectures, this patch set
> just moves that from the declaration, where it is considered harmful, to the
> use, where it is considered an occasional necessary evil.
Then we would need
atomic_read()
and
atomic_read_volatile()
atomic_read_volatile() would imply an object sized memory barrier before
and after?
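One hypothetical shape for that split (nothing like this exists in the tree under these names):

```c
/* Hypothetical sketch of the split being discussed. */

/* Plain read: an ordinary load the compiler may cache or reorder. */
#define atomic_read(v)		((v)->counter)

/* Volatile read: forces a real load of just this object -- narrower
 * than barrier(), which makes the compiler forget everything. */
#define atomic_read_volatile(v)	(*(volatile int *)&(v)->counter)
```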
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-14 22:31 ` Christoph Lameter
2007-08-14 22:45 ` Chris Snook
@ 2007-08-14 23:08 ` Satyam Sharma
2007-08-14 23:04 ` Chris Snook
` (2 more replies)
1 sibling, 3 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-14 23:08 UTC (permalink / raw)
To: Christoph Lameter
Cc: Chris Snook, Linux Kernel Mailing List, linux-arch, torvalds,
netdev, Andrew Morton, ak, heiko.carstens, davem, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On Tue, 14 Aug 2007, Christoph Lameter wrote:
> On Thu, 9 Aug 2007, Chris Snook wrote:
>
> > This patchset makes the behavior of atomic_read uniform by removing the
> > volatile keyword from all atomic_t and atomic64_t definitions that currently
> > have it, and instead explicitly casts the variable as volatile in
> > atomic_read(). This leaves little room for creative optimization by the
> > compiler, and is in keeping with the principles behind "volatile considered
> > harmful".
>
> volatile is generally harmful even in atomic_read(). Barriers control
> visibility and AFAICT things are fine.
Frankly, I don't see the need for this series myself either. Personal
opinion (others may differ), but I consider "volatile" to be a sad /
unfortunate wart in C (numerous threads on this list and on the gcc
lists/bugzilla over the years stand testimony to this) and if we _can_
steer clear of it, then why not -- why use this ill-defined primitive
whose implementation has often differed over compiler versions and
platforms? Granted, barrier() _is_ heavy-handed in that it makes the
optimizer forget _everything_, but then somebody did post a forget()
macro on this thread itself ...
[ BTW, why do we want the compiler to not optimize atomic_read()'s in
the first place? Atomic ops guarantee atomicity, which has nothing
to do with "volatility" -- users that expect "volatility" from
atomic ops are the ones who must be fixed instead, IMHO. ]
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-14 23:08 ` Satyam Sharma
@ 2007-08-14 23:04 ` Chris Snook
2007-08-14 23:14 ` Christoph Lameter
2007-08-15 6:49 ` Herbert Xu
2007-08-14 23:26 ` Paul E. McKenney
2007-08-15 10:35 ` Stefan Richter
2 siblings, 2 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-14 23:04 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Linux Kernel Mailing List, linux-arch,
torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
Satyam Sharma wrote:
>
> On Tue, 14 Aug 2007, Christoph Lameter wrote:
>
>> On Thu, 9 Aug 2007, Chris Snook wrote:
>>
>>> This patchset makes the behavior of atomic_read uniform by removing the
>>> volatile keyword from all atomic_t and atomic64_t definitions that currently
>>> have it, and instead explicitly casts the variable as volatile in
>>> atomic_read(). This leaves little room for creative optimization by the
>>> compiler, and is in keeping with the principles behind "volatile considered
>>> harmful".
>> volatile is generally harmful even in atomic_read(). Barriers control
>> visibility and AFAICT things are fine.
>
> Frankly, I don't see the need for this series myself either. Personal
> opinion (others may differ), but I consider "volatile" to be a sad /
> unfortunate wart in C (numerous threads on this list and on the gcc
> lists/bugzilla over the years stand testimony to this) and if we _can_
> steer clear of it, then why not -- why use this ill-defined primitive
> whose implementation has often differed over compiler versions and
> platforms? Granted, barrier() _is_ heavy-handed in that it makes the
> optimizer forget _everything_, but then somebody did post a forget()
> macro on this thread itself ...
>
> [ BTW, why do we want the compiler to not optimize atomic_read()'s in
> the first place? Atomic ops guarantee atomicity, which has nothing
> to do with "volatility" -- users that expect "volatility" from
> atomic ops are the ones who must be fixed instead, IMHO. ]
Because atomic operations are generally used for synchronization, which requires
volatile behavior. Most such codepaths currently use an inefficient barrier().
Some forget to and we get bugs, because people assume that atomic_read()
actually reads something, and atomic_write() actually writes something. Worse,
these are architecture-specific, even compiler version-specific bugs that are
often difficult to track down.
-- Chris
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-14 23:04 ` Chris Snook
@ 2007-08-14 23:14 ` Christoph Lameter
2007-08-15 6:49 ` Herbert Xu
1 sibling, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-14 23:14 UTC (permalink / raw)
To: Chris Snook
Cc: Satyam Sharma, Linux Kernel Mailing List, linux-arch, torvalds,
netdev, Andrew Morton, ak, heiko.carstens, davem, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On Tue, 14 Aug 2007, Chris Snook wrote:
> Because atomic operations are generally used for synchronization, which
> requires volatile behavior. Most such codepaths currently use an inefficient
> barrier(). Some forget to and we get bugs, because people assume that
> atomic_read() actually reads something, and atomic_write() actually writes
> something. Worse, these are architecture-specific, even compiler
> version-specific bugs that are often difficult to track down.
Looks like we need to have lock and unlock semantics?
atomic_read()
which has no barrier or volatile implications.
atomic_read_for_lock
Acquire semantics?
atomic_read_for_unlock
Release semantics?
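A hypothetical sketch of the variants floated above (none exist in the tree; a full smp_mb() is used for simplicity, whereas real acquire/release primitives could be weaker on many architectures):

```c
/* Sketch only: names and barrier choices are illustrative. */
static inline int atomic_read_for_lock(const atomic_t *v)
{
	int ret = *(volatile int *)&v->counter;

	smp_mb();	/* acquire: nothing after this moves before the load */
	return ret;
}

static inline int atomic_read_for_unlock(const atomic_t *v)
{
	smp_mb();	/* release: everything before this completes first */
	return *(volatile int *)&v->counter;
}
```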
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-14 23:04 ` Chris Snook
@ 2007-08-15 6:49 ` Herbert Xu
2007-08-15 6:49 ` Herbert Xu
1 sibling, 0 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-15 6:49 UTC (permalink / raw)
To: Chris Snook
Cc: satyam, clameter, linux-kernel, linux-arch, torvalds, netdev,
akpm, ak, heiko.carstens, davem, schwidefsky, wensong, horms,
wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
Chris Snook <csnook@redhat.com> wrote:
>
> Because atomic operations are generally used for synchronization, which requires
> volatile behavior. Most such codepaths currently use an inefficient barrier().
> Some forget to and we get bugs, because people assume that atomic_read()
> actually reads something, and atomic_write() actually writes something. Worse,
> these are architecture-specific, even compiler version-specific bugs that are
> often difficult to track down.
I'm yet to see a single example from the current tree where
this patch series is the correct solution. So far the only
example has been a buggy piece of code which has since been
fixed with a cpu_relax.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 6:49 ` Herbert Xu
(?)
@ 2007-08-15 8:18 ` Heiko Carstens
2007-08-15 13:53 ` Stefan Richter
2007-08-16 0:39 ` [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert() Satyam Sharma
-1 siblings, 2 replies; 1651+ messages in thread
From: Heiko Carstens @ 2007-08-15 8:18 UTC (permalink / raw)
To: Herbert Xu
Cc: Chris Snook, satyam, clameter, linux-kernel, linux-arch, torvalds,
netdev, akpm, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
> Chris Snook <csnook@redhat.com> wrote:
> >
> > Because atomic operations are generally used for synchronization, which requires
> > volatile behavior. Most such codepaths currently use an inefficient barrier().
> > Some forget to and we get bugs, because people assume that atomic_read()
> > actually reads something, and atomic_write() actually writes something. Worse,
> > these are architecture-specific, even compiler version-specific bugs that are
> > often difficult to track down.
>
> I'm yet to see a single example from the current tree where
> this patch series is the correct solution. So far the only
> example has been a buggy piece of code which has since been
> fixed with a cpu_relax.
Btw.: we still have
include/asm-i386/mach-es7000/mach_wakecpu.h: while (!atomic_read(deassert));
include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
Looks like they need to be fixed as well.
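A minimal fix for loops like these, along the cpu_relax() lines mentioned above (a sketch; the real patch would touch both headers):

```c
/* cpu_relax() acts as a compiler barrier (and a pause hint on some
 * CPUs), so 'deassert' is genuinely re-read on every iteration. */
while (!atomic_read(deassert))
	cpu_relax();
```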
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 8:18 ` Heiko Carstens
@ 2007-08-15 13:53 ` Stefan Richter
2007-08-15 14:35 ` Satyam Sharma
2007-08-16 0:39 ` [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert() Satyam Sharma
1 sibling, 1 reply; 1651+ messages in thread
From: Stefan Richter @ 2007-08-15 13:53 UTC (permalink / raw)
To: Heiko Carstens
Cc: Herbert Xu, Chris Snook, satyam, clameter, linux-kernel,
linux-arch, torvalds, netdev, akpm, ak, davem, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On 8/15/2007 10:18 AM, Heiko Carstens wrote:
> On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
>> Chris Snook <csnook@redhat.com> wrote:
>> >
>> > Because atomic operations are generally used for synchronization, which requires
>> > volatile behavior. Most such codepaths currently use an inefficient barrier().
>> > Some forget to and we get bugs, because people assume that atomic_read()
>> > actually reads something, and atomic_write() actually writes something. Worse,
>> > these are architecture-specific, even compiler version-specific bugs that are
>> > often difficult to track down.
>>
>> I'm yet to see a single example from the current tree where
>> this patch series is the correct solution. So far the only
>> example has been a buggy piece of code which has since been
>> fixed with a cpu_relax.
>
> Btw.: we still have
>
> include/asm-i386/mach-es7000/mach_wakecpu.h: while (!atomic_read(deassert));
> include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
>
> Looks like they need to be fixed as well.
I don't know if this here is affected:
/* drivers/ieee1394/ieee1394_core.h */
static inline unsigned int get_hpsb_generation(struct hpsb_host *host)
{
return atomic_read(&host->generation);
}
/* drivers/ieee1394/nodemgr.c */
static int nodemgr_host_thread(void *__hi)
{
[...]
for (;;) {
[... sleep until bus reset event ...]
/* Pause for 1/4 second in 1/16 second intervals,
* to make sure things settle down. */
g = get_hpsb_generation(host);
for (i = 0; i < 4 ; i++) {
if (msleep_interruptible(63) ||
kthread_should_stop())
goto exit;
/* Now get the generation in which the node ID's we collect
* are valid. During the bus scan we will use this generation
* for the read transactions, so that if another reset occurs
* during the scan the transactions will fail instead of
* returning bogus data. */
generation = get_hpsb_generation(host);
/* If we get a reset before we are done waiting, then
* start the waiting over again */
if (generation != g)
g = generation, i = 0;
}
[... scan bus, using generation ...]
}
exit:
[...]
}
--
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 13:53 ` Stefan Richter
@ 2007-08-15 14:35 ` Satyam Sharma
2007-08-15 14:52 ` Herbert Xu
2007-08-15 19:58 ` Stefan Richter
0 siblings, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 14:35 UTC (permalink / raw)
To: Stefan Richter
Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Hi Stefan,
On Wed, 15 Aug 2007, Stefan Richter wrote:
> On 8/15/2007 10:18 AM, Heiko Carstens wrote:
> > On Wed, Aug 15, 2007 at 02:49:03PM +0800, Herbert Xu wrote:
> >> Chris Snook <csnook@redhat.com> wrote:
> >> >
> >> > Because atomic operations are generally used for synchronization, which requires
> >> > volatile behavior. Most such codepaths currently use an inefficient barrier().
> >> > Some forget to and we get bugs, because people assume that atomic_read()
> >> > actually reads something, and atomic_write() actually writes something. Worse,
> >> > these are architecture-specific, even compiler version-specific bugs that are
> >> > often difficult to track down.
> >>
> >> I'm yet to see a single example from the current tree where
> >> this patch series is the correct solution. So far the only
> >> example has been a buggy piece of code which has since been
> >> fixed with a cpu_relax.
> >
> > Btw.: we still have
> >
> > include/asm-i386/mach-es7000/mach_wakecpu.h: while (!atomic_read(deassert));
> > include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
> >
> > Looks like they need to be fixed as well.
>
>
> I don't know if this here is affected:
Yes, I think it is. You're clearly expecting the read to actually happen
when you call get_hpsb_generation(). It's clearly not a busy-loop, so
cpu_relax() sounds pointless / wrong solution for this case, so I'm now
somewhat beginning to appreciate the motivation behind this series :-)
But as I said, there are ways to achieve the same goals of this series
without using "volatile".
I think I'll submit a RFC/patch or two on this myself (will also fix
the code pieces listed here).
> /* drivers/ieee1394/ieee1394_core.h */
> static inline unsigned int get_hpsb_generation(struct hpsb_host *host)
> {
> return atomic_read(&host->generation);
> }
>
> /* drivers/ieee1394/nodemgr.c */
> static int nodemgr_host_thread(void *__hi)
> {
> [...]
>
> for (;;) {
> [... sleep until bus reset event ...]
>
> /* Pause for 1/4 second in 1/16 second intervals,
> * to make sure things settle down. */
> g = get_hpsb_generation(host);
> for (i = 0; i < 4 ; i++) {
> if (msleep_interruptible(63) ||
> kthread_should_stop())
> goto exit;
Totally unrelated, but this looks weird. IMHO you actually wanted:
msleep_interruptible(63);
if (kthread_should_stop())
goto exit;
here, didn't you? Otherwise the thread will exit even when
kthread_should_stop() != TRUE (just because it received a signal),
and it is not good for a kthread to exit on its own if it uses
kthread_should_stop() or if some other piece of kernel code could
eventually call kthread_stop(tsk) on it.
Ok, probably the thread will never receive a signal in the first
place because it's spawned off kthreadd which ignores all signals
beforehand, but still ...
[PATCH] ieee1394: Fix kthread stopping in nodemgr_host_thread
The nodemgr host thread can exit on its own even when kthread_should_stop
is not true, on receiving a signal (might never happen in practice, as
it ignores signals). But considering kthread_stop() must not be mixed with
kthreads that can exit on their own, I think changing the code like this
is clearer. This change means the thread can cut its sleep short when it
receives a signal, but looking at the code around, that sounds okay (and
again, it might never actually receive a signal in practice).
Signed-off-by: Satyam Sharma <satyam@infradead.org>
---
drivers/ieee1394/nodemgr.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/drivers/ieee1394/nodemgr.c b/drivers/ieee1394/nodemgr.c
index 2ffd534..981a7da 100644
--- a/drivers/ieee1394/nodemgr.c
+++ b/drivers/ieee1394/nodemgr.c
@@ -1721,7 +1721,8 @@ static int nodemgr_host_thread(void *__hi)
* to make sure things settle down. */
g = get_hpsb_generation(host);
for (i = 0; i < 4 ; i++) {
- if (msleep_interruptible(63) || kthread_should_stop())
+ msleep_interruptible(63);
+ if (kthread_should_stop())
goto exit;
/* Now get the generation in which the node ID's we collect
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 14:35 ` Satyam Sharma
@ 2007-08-15 14:52 ` Herbert Xu
2007-08-15 16:09 ` Stefan Richter
2007-08-15 19:58 ` Stefan Richter
1 sibling, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-15 14:52 UTC (permalink / raw)
To: Satyam Sharma
Cc: Stefan Richter, Heiko Carstens, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
>
> > I don't know if this here is affected:
>
> Yes, I think it is. You're clearly expecting the read to actually happen
> when you call get_hpsb_generation(). It's clearly not a busy-loop, so
> cpu_relax() sounds pointless / wrong solution for this case, so I'm now
> somewhat beginning to appreciate the motivation behind this series :-)
Nope, we're calling schedule which is a rather heavy-weight
barrier.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 14:52 ` Herbert Xu
@ 2007-08-15 16:09 ` Stefan Richter
2007-08-15 16:27 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Stefan Richter @ 2007-08-15 16:09 UTC (permalink / raw)
To: Herbert Xu
Cc: Satyam Sharma, Heiko Carstens, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
>>> I don't know if this here is affected:
[...something like]
b = atomic_read(a);
for (i = 0; i < 4; i++) {
msleep_interruptible(63);
c = atomic_read(a);
if (c != b) {
b = c;
i = 0;
}
}
> Nope, we're calling schedule which is a rather heavy-weight
> barrier.
How does the compiler know that msleep() has got barrier()s?
--
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 16:09 ` Stefan Richter
@ 2007-08-15 16:27 ` Paul E. McKenney
2007-08-15 17:13 ` Satyam Sharma
2007-08-15 18:31 ` Segher Boessenkool
0 siblings, 2 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 16:27 UTC (permalink / raw)
To: Stefan Richter
Cc: Herbert Xu, Satyam Sharma, Heiko Carstens, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote:
> Herbert Xu wrote:
> > On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
> >>> I don't know if this here is affected:
>
> [...something like]
> b = atomic_read(a);
> for (i = 0; i < 4; i++) {
> msleep_interruptible(63);
> c = atomic_read(a);
> if (c != b) {
> b = c;
> i = 0;
> }
> }
>
> > Nope, we're calling schedule which is a rather heavy-weight
> > barrier.
>
> How does the compiler know that msleep() has got barrier()s?
Because msleep_interruptible() is in a separate compilation unit,
the compiler has to assume that it might modify any arbitrary global.
In many cases, the compiler also has to assume that msleep_interruptible()
might call back into a function in the current compilation unit, thus
possibly modifying global static variables.
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 16:27 ` Paul E. McKenney
@ 2007-08-15 17:13 ` Satyam Sharma
2007-08-15 18:31 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 17:13 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Stefan Richter, Herbert Xu, Heiko Carstens, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> On Wed, Aug 15, 2007 at 06:09:35PM +0200, Stefan Richter wrote:
> > Herbert Xu wrote:
> > > On Wed, Aug 15, 2007 at 08:05:38PM +0530, Satyam Sharma wrote:
> > >>> I don't know if this here is affected:
> >
> > [...something like]
> > b = atomic_read(a);
> > for (i = 0; i < 4; i++) {
> >         msleep_interruptible(63);
> >         c = atomic_read(a);
> >         if (c != b) {
> >                 b = c;
> >                 i = 0;
> >         }
> > }
> >
> > > Nope, we're calling schedule which is a rather heavy-weight
> > > barrier.
> >
> > How does the compiler know that msleep() has got barrier()s?
>
> Because msleep_interruptible() is in a separate compilation unit,
> the compiler has to assume that it might modify any arbitrary global.
> In many cases, the compiler also has to assume that msleep_interruptible()
> might call back into a function in the current compilation unit, thus
> possibly modifying global static variables.
Yup, I've just verified this with a testcase. So a call to any function
outside of the current compilation unit acts as a compiler barrier. Cool.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 16:27 ` Paul E. McKenney
2007-08-15 17:13 ` Satyam Sharma
@ 2007-08-15 18:31 ` Segher Boessenkool
2007-08-15 18:57 ` Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-15 18:31 UTC (permalink / raw)
To: paulmck
Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
>> How does the compiler know that msleep() has got barrier()s?
>
> Because msleep_interruptible() is in a separate compilation unit,
> the compiler has to assume that it might modify any arbitrary global.
No; compilation units have nothing to do with it, GCC can optimise
across compilation unit boundaries just fine, if you tell it to
compile more than one compilation unit at once.
What you probably mean is that the compiler has to assume any code
it cannot currently see can do anything (insofar as allowed by the
relevant standards etc.)
> In many cases, the compiler also has to assume that
> msleep_interruptible()
> might call back into a function in the current compilation unit, thus
> possibly modifying global static variables.
It most often is smart enough to see what compilation-unit-local
variables might be modified that way, though :-)
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 18:31 ` Segher Boessenkool
@ 2007-08-15 18:57 ` Paul E. McKenney
2007-08-15 19:54 ` Satyam Sharma
2007-08-15 21:05 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
0 siblings, 2 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 18:57 UTC (permalink / raw)
To: Segher Boessenkool
Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
> >>How does the compiler know that msleep() has got barrier()s?
> >
> >Because msleep_interruptible() is in a separate compilation unit,
> >the compiler has to assume that it might modify any arbitrary global.
>
> No; compilation units have nothing to do with it, GCC can optimise
> across compilation unit boundaries just fine, if you tell it to
> compile more than one compilation unit at once.
Last I checked, the Linux kernel build system did compile each .c file
as a separate compilation unit.
> What you probably mean is that the compiler has to assume any code
> it cannot currently see can do anything (insofar as allowed by the
> relevant standards etc.)
Indeed.
> >In many cases, the compiler also has to assume that
> >msleep_interruptible()
> >might call back into a function in the current compilation unit, thus
> >possibly modifying global static variables.
>
> It most often is smart enough to see what compilation-unit-local
> variables might be modified that way, though :-)
Yep. For example, if it knows the current value of a given such local
variable, and if all code paths that would change some other variable
cannot be reached given that current value of the first variable.
At least given that gcc doesn't know about multiple threads of execution!
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 18:57 ` Paul E. McKenney
@ 2007-08-15 19:54 ` Satyam Sharma
2007-08-15 20:17 ` Paul E. McKenney
2007-08-15 20:47 ` Segher Boessenkool
2007-08-15 21:05 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
1 sibling, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 19:54 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Segher Boessenkool, horms, Stefan Richter,
Linux Kernel Mailing List, rpjday, netdev, ak, cfriesen,
Heiko Carstens, jesper.juhl, linux-arch, Andrew Morton, zlynx,
clameter, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
[ The Cc: list scares me. Should probably trim it. ]
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
> > >>How does the compiler know that msleep() has got barrier()s?
> > >
> > >Because msleep_interruptible() is in a separate compilation unit,
> > >the compiler has to assume that it might modify any arbitrary global.
> >
> > No; compilation units have nothing to do with it, GCC can optimise
> > across compilation unit boundaries just fine, if you tell it to
> > compile more than one compilation unit at once.
>
> Last I checked, the Linux kernel build system did compile each .c file
> as a separate compilation unit.
>
> > What you probably mean is that the compiler has to assume any code
> > it cannot currently see can do anything (insofar as allowed by the
> > relevant standards etc.)
I think this was just terminology confusion here again. Isn't "any code
that it cannot currently see" the same as "another compilation unit",
and wouldn't the "compilation unit" itself expand if we ask gcc to
compile more than one unit at once? Or is there some more specific
"definition" for "compilation unit" (in gcc lingo, possibly?)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 19:54 ` Satyam Sharma
@ 2007-08-15 20:17 ` Paul E. McKenney
2007-08-15 20:52 ` Segher Boessenkool
2007-08-15 20:47 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 20:17 UTC (permalink / raw)
To: Satyam Sharma
Cc: Segher Boessenkool, horms, Stefan Richter,
Linux Kernel Mailing List, rpjday, netdev, ak, cfriesen,
Heiko Carstens, jesper.juhl, linux-arch, Andrew Morton, zlynx,
clameter, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
On Thu, Aug 16, 2007 at 01:24:42AM +0530, Satyam Sharma wrote:
> [ The Cc: list scares me. Should probably trim it. ]
Trim away! ;-)
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
>
> > On Wed, Aug 15, 2007 at 08:31:25PM +0200, Segher Boessenkool wrote:
> > > >>How does the compiler know that msleep() has got barrier()s?
> > > >
> > > >Because msleep_interruptible() is in a separate compilation unit,
> > > >the compiler has to assume that it might modify any arbitrary global.
> > >
> > > No; compilation units have nothing to do with it, GCC can optimise
> > > across compilation unit boundaries just fine, if you tell it to
> > > compile more than one compilation unit at once.
> >
> > Last I checked, the Linux kernel build system did compile each .c file
> > as a separate compilation unit.
> >
> > > What you probably mean is that the compiler has to assume any code
> > > it cannot currently see can do anything (insofar as allowed by the
> > > relevant standards etc.)
>
> I think this was just terminology confusion here again. Isn't "any code
> that it cannot currently see" the same as "another compilation unit",
> and wouldn't the "compilation unit" itself expand if we ask gcc to
> compile more than one unit at once? Or is there some more specific
> "definition" for "compilation unit" (in gcc lingo, possibly?)
This is indeed my understanding -- "compilation unit" is whatever the
compiler looks at in one go. I have heard the word "module" used for
the minimal compilation unit covering a single .c file and everything
that it #includes, but there might be a better name for this.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 20:17 ` Paul E. McKenney
@ 2007-08-15 20:52 ` Segher Boessenkool
2007-08-15 22:42 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-15 20:52 UTC (permalink / raw)
To: paulmck
Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
>> I think this was just terminology confusion here again. Isn't "any
>> code
>> that it cannot currently see" the same as "another compilation unit",
>> and wouldn't the "compilation unit" itself expand if we ask gcc to
>> compile more than one unit at once? Or is there some more specific
>> "definition" for "compilation unit" (in gcc lingo, possibly?)
>
> This is indeed my understanding -- "compilation unit" is whatever the
> compiler looks at in one go. I have heard the word "module" used for
> the minimal compilation unit covering a single .c file and everything
> that it #includes, but there might be a better name for this.
Yes, that's what's called "compilation unit" :-)
[/me double checks]
Erm, the C standard actually calls it "translation unit".
To be exact, to avoid any more confusion:
5.1.1.1/1:
A C program need not all be translated at the same time. The
text of the program is kept in units called source files, (or
preprocessing files) in this International Standard. A source
file together with all the headers and source files included
via the preprocessing directive #include is known as a
preprocessing translation unit. After preprocessing, a
preprocessing translation unit is called a translation unit.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 20:52 ` Segher Boessenkool
@ 2007-08-15 22:42 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 22:42 UTC (permalink / raw)
To: Segher Boessenkool
Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
On Wed, Aug 15, 2007 at 10:52:53PM +0200, Segher Boessenkool wrote:
> >>I think this was just terminology confusion here again. Isn't "any
> >>code
> >>that it cannot currently see" the same as "another compilation unit",
> >>and wouldn't the "compilation unit" itself expand if we ask gcc to
> >>compile more than one unit at once? Or is there some more specific
> >>"definition" for "compilation unit" (in gcc lingo, possibly?)
> >
> >This is indeed my understanding -- "compilation unit" is whatever the
> >compiler looks at in one go. I have heard the word "module" used for
> >the minimal compilation unit covering a single .c file and everything
> >that it #includes, but there might be a better name for this.
>
> Yes, that's what's called "compilation unit" :-)
>
> [/me double checks]
>
> Erm, the C standard actually calls it "translation unit".
>
> To be exact, to avoid any more confusion:
>
> 5.1.1.1/1:
> A C program need not all be translated at the same time. The
> text of the program is kept in units called source files, (or
> preprocessing files) in this International Standard. A source
> file together with all the headers and source files included
> via the preprocessing directive #include is known as a
> preprocessing translation unit. After preprocessing, a
> preprocessing translation unit is called a translation unit.
I am OK with "translation" and "compilation" being near-synonyms. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 19:54 ` Satyam Sharma
2007-08-15 20:17 ` Paul E. McKenney
@ 2007-08-15 20:47 ` Segher Boessenkool
2007-08-16 0:36 ` Satyam Sharma
1 sibling, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-15 20:47 UTC (permalink / raw)
To: Satyam Sharma
Cc: horms, Stefan Richter, Linux Kernel Mailing List,
Paul E. McKenney, ak, netdev, cfriesen, Heiko Carstens, rpjday,
jesper.juhl, linux-arch, Andrew Morton, zlynx, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
>>> What you probably mean is that the compiler has to assume any code
>>> it cannot currently see can do anything (insofar as allowed by the
>>> relevant standards etc.)
>
> I think this was just terminology confusion here again. Isn't "any code
> that it cannot currently see" the same as "another compilation unit",
It is not; try gcc -combine or the upcoming link-time optimisation
stuff, for example.
> and wouldn't the "compilation unit" itself expand if we ask gcc to
> compile more than one unit at once? Or is there some more specific
> "definition" for "compilation unit" (in gcc lingo, possibly?)
"compilation unit" is a C standard term. It typically boils down
to "single .c file".
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2007-08-15 20:47 ` Segher Boessenkool
@ 2007-08-16 0:36 ` Satyam Sharma
2007-08-16 0:32 ` your mail Herbert Xu
2007-08-16 1:38 ` Segher Boessenkool
0 siblings, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 0:36 UTC (permalink / raw)
To: Segher Boessenkool
Cc: horms, Stefan Richter, Linux Kernel Mailing List,
Paul E. McKenney, ak, netdev, cfriesen, Heiko Carstens, rpjday,
jesper.juhl, linux-arch, Andrew Morton, zlynx, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
On Wed, 15 Aug 2007, Segher Boessenkool wrote:
> > > > What you probably mean is that the compiler has to assume any code
> > > > it cannot currently see can do anything (insofar as allowed by the
> > > > relevant standards etc.)
> >
> > I think this was just terminology confusion here again. Isn't "any code
> > that it cannot currently see" the same as "another compilation unit",
>
> It is not; try gcc -combine or the upcoming link-time optimisation
> stuff, for example.
>
> > and wouldn't the "compilation unit" itself expand if we ask gcc to
> > compile more than one unit at once? Or is there some more specific
> > "definition" for "compilation unit" (in gcc lingo, possibly?)
>
> "compilation unit" is a C standard term. It typically boils down
> to "single .c file".
As you mentioned later, "single .c file with all the other files (headers
or other .c files) that it pulls in via #include" is actually "translation
unit", both in the C standard as well as gcc docs. "Compilation unit"
doesn't seem to be nearly as standard a term, though in most places it
is indeed meant to be same as "translation unit", but with the new gcc
inter-module-analysis stuff that you referred to above, I suspect one may
reasonably want to call all that the compiler sees at a given instant a
"compilation unit".
BTW I did some auditing (only inside include/asm-{i386,x86_64}/ and
arch/{i386,x86_64}/) and found a couple more callsites that don't use
cpu_relax():
arch/i386/kernel/crash.c:101
arch/x86_64/kernel/crash.c:97
that are:
while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
        mdelay(1);
        msecs--;
}
where mdelay() becomes __const_udelay() which happens to be in another
translation unit (arch/i386/lib/delay.c) and hence saves this callsite
from being a bug :-)
Curiously, __const_udelay() is still marked "inline" where it is
implemented in lib/delay.c, which is weird considering it won't ever
be inlined, would it? With the kernel presently being compiled one
translation unit at a time, I don't see how the implementation would
be visible to any callsite out there to be able to inline it.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: your mail
2007-08-16 0:36 ` Satyam Sharma
@ 2007-08-16 0:32 ` Herbert Xu
2007-08-16 0:58 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Satyam Sharma
2007-08-16 1:38 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 0:32 UTC (permalink / raw)
To: Satyam Sharma
Cc: Segher Boessenkool, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, ak, netdev, cfriesen,
Heiko Carstens, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, clameter, schwidefsky, Chris Snook, davem, Linus Torvalds,
wensong, wjiang
On Thu, Aug 16, 2007 at 06:06:00AM +0530, Satyam Sharma wrote:
>
> that are:
>
> while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
>         mdelay(1);
>         msecs--;
> }
>
> where mdelay() becomes __const_udelay() which happens to be in another
> translation unit (arch/i386/lib/delay.c) and hence saves this callsite
> from being a bug :-)
The udelay itself certainly should have some form of cpu_relax in it.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:32 ` your mail Herbert Xu
@ 2007-08-16 0:58 ` Satyam Sharma
2007-08-16 0:51 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 0:58 UTC (permalink / raw)
To: Herbert Xu
Cc: Segher Boessenkool, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, ak, netdev, cfriesen,
Heiko Carstens, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, clameter, schwidefsky, Chris Snook, davem, Linus Torvalds,
wensong, wjiang
[ Sorry for the empty subject line in the previous mail. I intended to
make a patch, so I cleared it in order to change it, but ultimately
neither made the patch nor restored the subject line. Done that now. ]
On Thu, 16 Aug 2007, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 06:06:00AM +0530, Satyam Sharma wrote:
> >
> > that are:
> >
> > while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
> >         mdelay(1);
> >         msecs--;
> > }
> >
> > where mdelay() becomes __const_udelay() which happens to be in another
> > translation unit (arch/i386/lib/delay.c) and hence saves this callsite
> > from being a bug :-)
>
> The udelay itself certainly should have some form of cpu_relax in it.
Yes, a form of barrier() must be present in mdelay() or udelay() itself
as you say; having it in __const_udelay() is *not* enough (superfluous,
actually, considering it is already a separate translation unit and
invisible to the compiler).
However, there are no compiler barriers on the macro-definition-path
between mdelay(1) and __const_udelay(), so the only thing that saves us
from being a bug here is indeed the different-translation-unit concept.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:58 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Satyam Sharma
@ 2007-08-16 0:51 ` Herbert Xu
2007-08-16 1:18 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 0:51 UTC (permalink / raw)
To: Satyam Sharma
Cc: Segher Boessenkool, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, ak, netdev, cfriesen,
Heiko Carstens, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, clameter, schwidefsky, Chris Snook, davem, Linus Torvalds,
wensong, wjiang
On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote:
>
> > The udelay itself certainly should have some form of cpu_relax in it.
>
> Yes, a form of barrier() must be present in mdelay() or udelay() itself
> as you say; having it in __const_udelay() is *not* enough (superfluous,
> actually, considering it is already a separate translation unit and
> invisible to the compiler).
As long as __const_udelay does something which has the same
effect as barrier it is enough even if it's in the same unit.
As a matter of fact it does on i386 where __delay either uses
rep_nop or asm volatile.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:51 ` Herbert Xu
@ 2007-08-16 1:18 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 1:18 UTC (permalink / raw)
To: Herbert Xu
Cc: Segher Boessenkool, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, ak, netdev, cfriesen,
Heiko Carstens, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, clameter, schwidefsky, Chris Snook, davem, Linus Torvalds,
wensong, wjiang
Hi Herbert,
On Thu, 16 Aug 2007, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 06:28:42AM +0530, Satyam Sharma wrote:
> >
> > > The udelay itself certainly should have some form of cpu_relax in it.
> >
> > Yes, a form of barrier() must be present in mdelay() or udelay() itself
> > as you say; having it in __const_udelay() is *not* enough (superfluous,
> > actually, considering it is already a separate translation unit and
> > invisible to the compiler).
>
> As long as __const_udelay does something which has the same
> effect as barrier it is enough even if it's in the same unit.
Only if __const_udelay() is inlined. But as I said, __const_udelay()
-- although marked "inline" -- will never be inlined anywhere in the
kernel in reality. It's an exported symbol, and never inlined from
modules. Even from built-in targets, the definition of __const_udelay
is invisible when gcc is compiling the compilation units of those
callsites. The compiler has no idea that that function has barriers
or not, so we're saved here _only_ by the lucky fact that
__const_udelay() is in a different compilation unit.
> As a matter of fact it does on i386 where __delay either uses
> rep_nop or asm/volatile.
__delay() can be either delay_tsc() or delay_loop() on i386.
delay_tsc() uses the rep_nop() there for its own little busy
loop, actually. But for a call site that inlines __const_udelay()
-- if it were ever moved to a .h file and marked inline -- the
call to __delay() will _still_ be across compilation units. So,
again for this case, it does not matter if the callee function
has compiler barriers or not (it would've been a different story
if we were discussing real/CPU barriers, I think), what saves us
here is just the fact that a call is made to a function from a
different compilation unit, which is invisible to the compiler
when compiling the callsite, and hence acting as the compiler
barrier.
Regarding delay_loop(), it uses "volatile" for the "asm" which
has quite different semantics from the C language "volatile"
type-qualifier keyword and does not imply any compiler barrier
at all.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2007-08-16 0:36 ` Satyam Sharma
2007-08-16 0:32 ` your mail Herbert Xu
@ 2007-08-16 1:38 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 1:38 UTC (permalink / raw)
To: Satyam Sharma
Cc: horms, Stefan Richter, Linux Kernel Mailing List,
Paul E. McKenney, ak, netdev, cfriesen, Heiko Carstens, rpjday,
jesper.juhl, linux-arch, Andrew Morton, zlynx, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
>> "compilation unit" is a C standard term. It typically boils down
>> to "single .c file".
>
> As you mentioned later, "single .c file with all the other files
> (headers
> or other .c files) that it pulls in via #include" is actually
> "translation
> unit", both in the C standard as well as gcc docs.
Yeah. "single .c file after preprocessing". Same thing :-)
> "Compilation unit"
> doesn't seem to be nearly as standard a term, though in most places it
> is indeed meant to be same as "translation unit", but with the new gcc
> inter-module-analysis stuff that you referred to above, I suspect one
> may
> reasonably want to call a "compilation unit" as all that the compiler
> sees
> at a given instant.
That would be a bit confusing, would it not? They'd better find
some better name for that if they want to name it at all (remember,
none of these optimisations should have any effect on the semantics
of the program, you just get fewer .o files etc.).
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 18:57 ` Paul E. McKenney
2007-08-15 19:54 ` Satyam Sharma
@ 2007-08-15 21:05 ` Segher Boessenkool
2007-08-15 22:44 ` Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-15 21:05 UTC (permalink / raw)
To: paulmck
Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
>> No; compilation units have nothing to do with it, GCC can optimise
>> across compilation unit boundaries just fine, if you tell it to
>> compile more than one compilation unit at once.
>
> Last I checked, the Linux kernel build system did compile each .c file
> as a separate compilation unit.
I have some patches to use -combine -fwhole-program for Linux.
Highly experimental, you need a patched bleeding edge toolchain.
If there's interest I'll clean it up and put it online.
David Woodhouse had some similar patches about a year ago.
>>> In many cases, the compiler also has to assume that
>>> msleep_interruptible()
>>> might call back into a function in the current compilation unit, thus
>>> possibly modifying global static variables.
>>
>> It most often is smart enough to see what compilation-unit-local
>> variables might be modified that way, though :-)
>
> Yep. For example, if it knows the current value of a given such local
> variable, and if all code paths that would change some other variable
> cannot be reached given that current value of the first variable.
Or the most common thing: if neither the address of the translation-
unit local variable nor the address of any function writing to that
variable can "escape" from that translation unit, nothing outside
the translation unit can write to the variable.
> At least given that gcc doesn't know about multiple threads of
> execution!
Heh, only about the threads it creates itself (not relevant to
the kernel, for sure :-) )
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 21:05 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
@ 2007-08-15 22:44 ` Paul E. McKenney
2007-08-16 1:23 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 22:44 UTC (permalink / raw)
To: Segher Boessenkool
Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
On Wed, Aug 15, 2007 at 11:05:35PM +0200, Segher Boessenkool wrote:
> >>No; compilation units have nothing to do with it, GCC can optimise
> >>across compilation unit boundaries just fine, if you tell it to
> >>compile more than one compilation unit at once.
> >
> >Last I checked, the Linux kernel build system did compile each .c file
> >as a separate compilation unit.
>
> I have some patches to use -combine -fwhole-program for Linux.
> Highly experimental, you need a patched bleeding edge toolchain.
> If there's interest I'll clean it up and put it online.
>
> David Woodhouse had some similar patches about a year ago.
Sounds exciting... ;-)
> >>>In many cases, the compiler also has to assume that
> >>>msleep_interruptible()
> >>>might call back into a function in the current compilation unit, thus
> >>>possibly modifying global static variables.
> >>
> >>It most often is smart enough to see what compilation-unit-local
> >>variables might be modified that way, though :-)
> >
> >Yep. For example, if it knows the current value of a given such local
> >variable, and if all code paths that would change some other variable
> >cannot be reached given that current value of the first variable.
>
> Or the most common thing: if neither the address of the translation-
> unit local variable nor the address of any function writing to that
> variable can "escape" from that translation unit, nothing outside
> the translation unit can write to the variable.
But there is usually at least one externally callable function in
a .c file.
> >At least given that gcc doesn't know about multiple threads of
> >execution!
>
> Heh, only about the threads it creates itself (not relevant to
> the kernel, for sure :-) )
;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 22:44 ` Paul E. McKenney
@ 2007-08-16 1:23 ` Segher Boessenkool
2007-08-16 2:22 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 1:23 UTC (permalink / raw)
To: paulmck
Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
>>>> No; compilation units have nothing to do with it, GCC can optimise
>>>> across compilation unit boundaries just fine, if you tell it to
>>>> compile more than one compilation unit at once.
>>>
>>> Last I checked, the Linux kernel build system did compile each .c
>>> file
>>> as a separate compilation unit.
>>
>> I have some patches to use -combine -fwhole-program for Linux.
>> Highly experimental, you need a patched bleeding edge toolchain.
>> If there's interest I'll clean it up and put it online.
>>
>> David Woodhouse had some similar patches about a year ago.
>
> Sounds exciting... ;-)
Yeah, the breakage is *quite* spectacular :-)
>>>>> In many cases, the compiler also has to assume that
>>>>> msleep_interruptible()
>>>>> might call back into a function in the current compilation unit,
>>>>> thus
>>>>> possibly modifying global static variables.
>>>>
>>>> It most often is smart enough to see what compilation-unit-local
>>>> variables might be modified that way, though :-)
>>>
>>> Yep. For example, if it knows the current value of a given such
>>> local
>>> variable, and if all code paths that would change some other variable
>>> cannot be reached given that current value of the first variable.
>>
>> Or the most common thing: if neither the address of the translation-
>> unit local variable nor the address of any function writing to that
>> variable can "escape" from that translation unit, nothing outside
>> the translation unit can write to the variable.
>
> But there is usually at least one externally callable function in
> a .c file.
Of course, but often none of those will (indirectly) write a certain
static variable.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 1:23 ` Segher Boessenkool
@ 2007-08-16 2:22 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 2:22 UTC (permalink / raw)
To: Segher Boessenkool
Cc: horms, Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, Heiko Carstens, jesper.juhl,
linux-arch, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
On Thu, Aug 16, 2007 at 03:23:28AM +0200, Segher Boessenkool wrote:
> >>>>No; compilation units have nothing to do with it, GCC can optimise
> >>>>across compilation unit boundaries just fine, if you tell it to
> >>>>compile more than one compilation unit at once.
> >>>
> >>>Last I checked, the Linux kernel build system did compile each .c
> >>>file
> >>>as a separate compilation unit.
> >>
> >>I have some patches to use -combine -fwhole-program for Linux.
> >>Highly experimental, you need a patched bleeding edge toolchain.
> >>If there's interest I'll clean it up and put it online.
> >>
> >>David Woodhouse had some similar patches about a year ago.
> >
> >Sounds exciting... ;-)
>
> Yeah, the breakage is *quite* spectacular :-)
I bet!!! ;-)
> >>>>>In many cases, the compiler also has to assume that
> >>>>>msleep_interruptible()
> >>>>>might call back into a function in the current compilation unit,
> >>>>>thus
> >>>>>possibly modifying global static variables.
> >>>>
> >>>>It most often is smart enough to see what compilation-unit-local
> >>>>variables might be modified that way, though :-)
> >>>
> >>>Yep. For example, if it knows the current value of a given such
> >>>local
> >>>variable, and if all code paths that would change some other variable
> >>>cannot be reached given that current value of the first variable.
> >>
> >>Or the most common thing: if neither the address of the translation-
> >>unit local variable nor the address of any function writing to that
> >>variable can "escape" from that translation unit, nothing outside
> >>the translation unit can write to the variable.
> >
> >But there is usually at least one externally callable function in
> >a .c file.
>
> Of course, but often none of those will (indirectly) write a certain
> static variable.
But there has to be some path to the static functions, assuming that
they are not dead code. Yes, there can be cases where the compiler
knows enough about the state of the variables to rule out some of the
code paths to them, but I have no idea how often this happens in kernel
code.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 14:35 ` Satyam Sharma
2007-08-15 14:52 ` Herbert Xu
@ 2007-08-15 19:58 ` Stefan Richter
1 sibling, 0 replies; 1651+ messages in thread
From: Stefan Richter @ 2007-08-15 19:58 UTC (permalink / raw)
To: Satyam Sharma; +Cc: Linux Kernel Mailing List, Linus Torvalds
(trimmed Cc)
Satyam Sharma wrote:
> [PATCH] ieee1394: Fix kthread stopping in nodemgr_host_thread
>
> The nodemgr host thread can exit on its own even when kthread_should_stop
> is not true, on receiving a signal (might never happen in practice, as
> it ignores signals). But considering kthread_stop() must not be mixed with
> kthreads that can exit on their own, I think changing the code like this
> is clearer. This change means the thread can cut its sleep short when it
> receives a signal, but looking at the code around, that sounds okay (and
> again, it might never actually receive a signal in practice).
Thanks, committed to linux1394-2.6.git.
--
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-15 8:18 ` Heiko Carstens
2007-08-15 13:53 ` Stefan Richter
@ 2007-08-16 0:39 ` Satyam Sharma
2007-08-24 11:59 ` Denys Vlasenko
1 sibling, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 0:39 UTC (permalink / raw)
To: Heiko Carstens
Cc: Herbert Xu, Chris Snook, clameter, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Wed, 15 Aug 2007, Heiko Carstens wrote:
> [...]
> Btw.: we still have
>
> include/asm-i386/mach-es7000/mach_wakecpu.h: while (!atomic_read(deassert));
> include/asm-i386/mach-default/mach_wakecpu.h: while (!atomic_read(deassert));
>
> Looks like they need to be fixed as well.
[PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
Use cpu_relax() in the busy loops, as atomic_read() doesn't automatically
imply volatility for i386 and x86_64. x86_64 doesn't have this issue because
it open-codes the while loop in smpboot.c:smp_callin() itself, which already
uses cpu_relax().
For i386, however, smpboot.c:smp_callin() calls wait_for_init_deassert()
which is buggy for mach-default and mach-es7000 cases.
[ I test-built a kernel -- smp_callin() itself got inlined in its only
callsite, smpboot.c:start_secondary() -- and the relevant piece of
code disassembles to the following:
0xc1019704 <start_secondary+12>: mov 0xc144c4c8,%eax
0xc1019709 <start_secondary+17>: test %eax,%eax
0xc101970b <start_secondary+19>: je 0xc1019709 <start_secondary+17>
init_deasserted (at 0xc144c4c8) gets fetched into %eax only once and
then we loop over the test of the stale value in the register only,
so these look like real bugs to me. With the fix below, this becomes:
0xc1019706 <start_secondary+14>: pause
0xc1019708 <start_secondary+16>: cmpl $0x0,0xc144c4c8
0xc101970f <start_secondary+23>: je 0xc1019706 <start_secondary+14>
which looks nice and healthy. ]
Thanks to Heiko Carstens for noticing this.
Signed-off-by: Satyam Sharma <satyam@infradead.org>
---
include/asm-i386/mach-default/mach_wakecpu.h | 3 ++-
include/asm-i386/mach-es7000/mach_wakecpu.h | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/asm-i386/mach-default/mach_wakecpu.h b/include/asm-i386/mach-default/mach_wakecpu.h
index 673b85c..3ebb178 100644
--- a/include/asm-i386/mach-default/mach_wakecpu.h
+++ b/include/asm-i386/mach-default/mach_wakecpu.h
@@ -15,7 +15,8 @@
static inline void wait_for_init_deassert(atomic_t *deassert)
{
- while (!atomic_read(deassert));
+ while (!atomic_read(deassert))
+ cpu_relax();
return;
}
diff --git a/include/asm-i386/mach-es7000/mach_wakecpu.h b/include/asm-i386/mach-es7000/mach_wakecpu.h
index efc903b..84ff583 100644
--- a/include/asm-i386/mach-es7000/mach_wakecpu.h
+++ b/include/asm-i386/mach-es7000/mach_wakecpu.h
@@ -31,7 +31,8 @@ wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
static inline void wait_for_init_deassert(atomic_t *deassert)
{
#ifdef WAKE_SECONDARY_VIA_INIT
- while (!atomic_read(deassert));
+ while (!atomic_read(deassert))
+ cpu_relax();
#endif
return;
}
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-16 0:39 ` [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert() Satyam Sharma
@ 2007-08-24 11:59 ` Denys Vlasenko
2007-08-24 12:07 ` Andi Kleen
` (3 more replies)
0 siblings, 4 replies; 1651+ messages in thread
From: Denys Vlasenko @ 2007-08-24 11:59 UTC (permalink / raw)
To: Satyam Sharma
Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
>
> static inline void wait_for_init_deassert(atomic_t *deassert)
> {
> - while (!atomic_read(deassert));
> + while (!atomic_read(deassert))
> + cpu_relax();
> return;
> }
For less-than-brilliant people like me, it's totally non-obvious that
cpu_relax() is needed for correctness here, not just to make P4 happy.
IOW: the name "atomic_read" quite unambiguously means "I will read
this variable from main memory". Which is not true, and creates
potential for confusion and bugs.
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 11:59 ` Denys Vlasenko
@ 2007-08-24 12:07 ` Andi Kleen
2007-08-24 12:12 ` Kenn Humborg
` (2 subsequent siblings)
3 siblings, 0 replies; 1651+ messages in thread
From: Andi Kleen @ 2007-08-24 12:07 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Satyam Sharma, Heiko Carstens, Herbert Xu, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Friday 24 August 2007 13:59:32 Denys Vlasenko wrote:
> On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> >
> > static inline void wait_for_init_deassert(atomic_t *deassert)
> > {
> > - while (!atomic_read(deassert));
> > + while (!atomic_read(deassert))
> > + cpu_relax();
> > return;
> > }
>
> For less-than-briliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.
I also find it non-obvious. It would really be better to have a barrier
or equivalent (volatile or a variable clobber) in atomic_read().
> IOW: "atomic_read" name quite unambiguously means "I will read
> this variable from main memory". Which is not true and creates
> potential for confusion and bugs.
Agreed.
-Andi
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 11:59 ` Denys Vlasenko
@ 2007-08-24 12:12 ` Kenn Humborg
2007-08-24 12:12 ` Kenn Humborg
` (2 subsequent siblings)
3 siblings, 0 replies; 1651+ messages in thread
From: Kenn Humborg @ 2007-08-24 12:12 UTC (permalink / raw)
To: Denys Vlasenko, Satyam Sharma
Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
> On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> >
> > static inline void wait_for_init_deassert(atomic_t *deassert)
> > {
> > - while (!atomic_read(deassert));
> > + while (!atomic_read(deassert))
> > + cpu_relax();
> > return;
> > }
>
> For less-than-briliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.
>
> IOW: "atomic_read" name quite unambiguously means "I will read
> this variable from main memory". Which is not true and creates
> potential for confusion and bugs.
To me, "atomic_read" means a read which is synchronized with other
changes to the variable (using the atomic_XXX functions) in such
a way that I will always only see the "before" or "after"
state of the variable - never an intermediate state while a
modification is happening. It doesn't imply that I have to
see the "after" state immediately after another thread modifies
it.
Perhaps the Linux atomic_XXX functions work like that, or used
to work like that, but it's counter-intuitive to me that "atomic"
should imply a memory read.
Later,
Kenn
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 12:12 ` Kenn Humborg
(?)
@ 2007-08-24 14:25 ` Denys Vlasenko
2007-08-24 17:34 ` Linus Torvalds
-1 siblings, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-08-24 14:25 UTC (permalink / raw)
To: Kenn Humborg
Cc: Satyam Sharma, Heiko Carstens, Herbert Xu, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Friday 24 August 2007 13:12, Kenn Humborg wrote:
> > On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> > > static inline void wait_for_init_deassert(atomic_t *deassert)
> > > {
> > > - while (!atomic_read(deassert));
> > > + while (!atomic_read(deassert))
> > > + cpu_relax();
> > > return;
> > > }
> >
> > For less-than-briliant people like me, it's totally non-obvious that
> > cpu_relax() is needed for correctness here, not just to make P4 happy.
> >
> > IOW: "atomic_read" name quite unambiguously means "I will read
> > this variable from main memory". Which is not true and creates
> > potential for confusion and bugs.
>
> To me, "atomic_read" means a read which is synchronized with other
> changes to the variable (using the atomic_XXX functions) in such
> a way that I will always only see the "before" or "after"
> state of the variable - never an intermediate state while a
> modification is happening. It doesn't imply that I have to
> see the "after" state immediately after another thread modifies
> it.
So you are ok with compiler propagating n1 to n2 here:
n1 += atomic_read(x);
other_variable++;
n2 += atomic_read(x);
without accessing x a second time. What's the point? Any sane coder
will say that explicitly anyway:
tmp = atomic_read(x);
n1 += tmp;
other_variable++;
n2 += tmp;
if only for the sake of code readability. Because the first version
definitely hints that it reads RAM twice, and it's actively *bad*
for code readability when in fact that's not the case!
Locking, compiler and CPU barriers are complicated enough already,
please don't make them even harder to understand.
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 14:25 ` Denys Vlasenko
@ 2007-08-24 17:34 ` Linus Torvalds
0 siblings, 0 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-24 17:34 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Kenn Humborg, Satyam Sharma, Heiko Carstens, Herbert Xu,
Chris Snook, clameter, Linux Kernel Mailing List, linux-arch,
netdev, Andrew Morton, ak, davem, schwidefsky, wensong, horms,
wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 24 Aug 2007, Denys Vlasenko wrote:
>
> So you are ok with compiler propagating n1 to n2 here:
>
> n1 += atomic_read(x);
> other_variable++;
> n2 += atomic_read(x);
>
> without accessing x second time. What's the point? Any sane coder
> will say that explicitly anyway:
No.
This is a common mistake, and it's total crap.
Any "sane coder" will often use inline functions, macros, etc helpers to
do certain abstract things. Those things may contain "atomic_read()"
calls.
The biggest reason for compilers doing CSE is exactly the fact that many
opportunities for CSE simply *are*not*visible* at the source code level.
That is true of things like atomic_read() equally as to things like shared
offsets inside structure member accesses. No difference what-so-ever.
Yes, we have, traditionally, tried to make it *easy* for the compiler to
generate good code. So when we can, and when we look at performance for
some really hot path, we *will* write the source code so that the compiler
doesn't even have the option to screw it up, and that includes things like
doing CSE at a source code level so that we don't see the compiler
re-doing accesses unnecessarily.
And I'm not saying we shouldn't do that. But "performance" is not an
either-or kind of situation, and we should:
- spend the time at a source code level: make it reasonably easy for the
compiler to generate good code, and use the right algorithms at a
higher level (and order structures etc so that they have good cache
behaviour).
- .. *and* expect the compiler to handle the cases we didn't do by hand
pretty well anyway. In particular, quite often, abstraction levels at a
software level means that we give compilers "stupid" code, because some
function may have a certain high-level abstraction rule, but then on a
particular architecture it's actually a no-op, and the compiler should
get to "untangle" our stupid code and generate good end results.
- .. *and* expect the hardware to be sane and do a good job even when the
compiler didn't generate perfect code or there were unlucky cache miss
patterns etc.
and if we do all of that, we'll get good performance. But you really do
want all three levels. It's not enough to be good at any one level (or
even any two).
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 11:59 ` Denys Vlasenko
2007-08-24 12:07 ` Andi Kleen
2007-08-24 12:12 ` Kenn Humborg
@ 2007-08-24 13:30 ` Satyam Sharma
2007-08-24 17:06 ` Christoph Lameter
2007-08-24 16:19 ` Luck, Tony
3 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-24 13:30 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher, Nick Piggin
Hi Denys,
On Fri, 24 Aug 2007, Denys Vlasenko wrote:
> On Thursday 16 August 2007 01:39, Satyam Sharma wrote:
> >
> > static inline void wait_for_init_deassert(atomic_t *deassert)
> > {
> > - while (!atomic_read(deassert));
> > + while (!atomic_read(deassert))
> > + cpu_relax();
> > return;
> > }
>
> For less-than-briliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.
This thread has been round-and-round with exactly the same discussions
:-) I had proposed few such variants to make a compiler barrier implicit
in atomic_{read,set} myself, but frankly, at least personally speaking
(now that I know better), I'm not so much in favour of implicit barriers
(compiler, memory or both) in atomic_{read,set}.
This might sound like an about-turn if you read my own postings to Nick
Piggin from a week back, but I do agree with most his opinions on the
matter now -- separation of barriers from atomic ops is actually good,
beneficial to certain code that knows what it's doing, explicit usage
of barriers stands out more clearly (most people here who deal with it
do know cpu_relax() is an explicit compiler barrier) compared to an
implicit usage in an atomic_read() or such variant ...
> IOW: "atomic_read" name quite unambiguously means "I will read
> this variable from main memory". Which is not true and creates
> potential for confusion and bugs.
I'd have to disagree here -- atomic ops are all about _atomicity_ of
memory accesses, not _making_ them happen (or visible to other CPUs)
_then and there_ itself. The latter are the job of barriers.
The behaviour (and expectations) are quite comprehensively covered in
atomic_ops.txt -- let alone atomic_{read,set}, even atomic_{inc,dec}
are permitted by archs' implementations to _not_ have any memory
barriers, for that matter. [It is unrelated that on x86 making them
SMP-safe requires the use of the LOCK prefix that also happens to be
an implicit memory barrier.]
An argument was also made about consistency of atomic_{read,set} w.r.t.
the other atomic ops -- but clearly, they are all already consistent!
All of them are atomic :-) The fact that atomic_{read,set} do _not_
require any inline asm or LOCK prefix whereas the others do, has to do
with the fact that unlike all others, atomic_{read,set} are not RMW ops
and hence guaranteed to be atomic just as they are in plain & simple C.
But if people do seem to have a mixed / confused notion of atomicity
and barriers, and if there's consensus, then as I'd said earlier, I
have no issues in going with the consensus (eg. having API variants).
Linus would be more difficult to convince, however, I suspect :-)
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 13:30 ` Satyam Sharma
@ 2007-08-24 17:06 ` Christoph Lameter
2007-08-24 20:26 ` Denys Vlasenko
0 siblings, 1 reply; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-24 17:06 UTC (permalink / raw)
To: Satyam Sharma
Cc: Denys Vlasenko, Heiko Carstens, Herbert Xu, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher, Nick Piggin
On Fri, 24 Aug 2007, Satyam Sharma wrote:
> But if people do seem to have a mixed / confused notion of atomicity
> and barriers, and if there's consensus, then as I'd said earlier, I
> have no issues in going with the consensus (eg. having API variants).
> Linus would be more difficult to convince, however, I suspect :-)
The confusion may be the result of us having barrier semantics in
atomic_read. If we take that out then we may avoid future confusions.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 17:06 ` Christoph Lameter
@ 2007-08-24 20:26 ` Denys Vlasenko
2007-08-24 20:34 ` Chris Snook
0 siblings, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-08-24 20:26 UTC (permalink / raw)
To: Christoph Lameter
Cc: Satyam Sharma, Heiko Carstens, Herbert Xu, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher, Nick Piggin
On Friday 24 August 2007 18:06, Christoph Lameter wrote:
> On Fri, 24 Aug 2007, Satyam Sharma wrote:
> > But if people do seem to have a mixed / confused notion of atomicity
> > and barriers, and if there's consensus, then as I'd said earlier, I
> > have no issues in going with the consensus (eg. having API variants).
> > Linus would be more difficult to convince, however, I suspect :-)
>
> The confusion may be the result of us having barrier semantics in
> atomic_read. If we take that out then we may avoid future confusions.
I think a better name may help. Nuke atomic_read() altogether.
n = atomic_value(x); // doesn't hint as strongly at reading as "atomic_read"
n = atomic_fetch(x); // yes, we _do_ touch RAM
n = atomic_read_uncached(x); // or this
How does that sound?
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 20:26 ` Denys Vlasenko
@ 2007-08-24 20:34 ` Chris Snook
0 siblings, 0 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-24 20:34 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Christoph Lameter, Satyam Sharma, Heiko Carstens, Herbert Xu,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher, Nick Piggin
Denys Vlasenko wrote:
> On Friday 24 August 2007 18:06, Christoph Lameter wrote:
>> On Fri, 24 Aug 2007, Satyam Sharma wrote:
>>> But if people do seem to have a mixed / confused notion of atomicity
>>> and barriers, and if there's consensus, then as I'd said earlier, I
>>> have no issues in going with the consensus (eg. having API variants).
>>> Linus would be more difficult to convince, however, I suspect :-)
>> The confusion may be the result of us having barrier semantics in
>> atomic_read. If we take that out then we may avoid future confusions.
>
> I think better name may help. Nuke atomic_read() altogether.
>
> n = atomic_value(x); // doesnt hint as strongly at reading as "atomic_read"
> n = atomic_fetch(x); // yes, we _do_ touch RAM
> n = atomic_read_uncached(x); // or this
>
> How does that sound?
atomic_value() vs. atomic_fetch() should be rather unambiguous.
atomic_read_uncached() begs the question of precisely which cache we are
avoiding, and could itself cause confusion.
So, if I were writing atomic.h from scratch, knowing what I know now, I think
I'd use atomic_value() and atomic_fetch(). The problem is that there are a lot
of existing users of atomic_read(), and we can't write a script to correctly
guess their intent. I'm not sure auditing all uses of atomic_read() is really
worth the comparatively minuscule benefits.
We could play it safe and convert them all to atomic_fetch(), or we could
acknowledge that changing the semantics 8 months ago was not at all disastrous,
and make them all atomic_value(), allowing people to use atomic_fetch() where
they really care.
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE: [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert()
2007-08-24 11:59 ` Denys Vlasenko
@ 2007-08-24 16:19 ` Luck, Tony
2007-08-24 12:12 ` Kenn Humborg
` (2 subsequent siblings)
3 siblings, 0 replies; 1651+ messages in thread
From: Luck, Tony @ 2007-08-24 16:19 UTC (permalink / raw)
To: Denys Vlasenko, Satyam Sharma
Cc: Heiko Carstens, Herbert Xu, Chris Snook, clameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
>> static inline void wait_for_init_deassert(atomic_t *deassert)
>> {
>> - while (!atomic_read(deassert));
>> + while (!atomic_read(deassert))
>> + cpu_relax();
>> return;
>> }
>
> For less-than-briliant people like me, it's totally non-obvious that
> cpu_relax() is needed for correctness here, not just to make P4 happy.
Not just P4 ... there are other threaded cpus where it is useful to
let the core know that this is a busy loop so it would be a good thing
to let other threads have priority.
Even on a non-threaded cpu the cpu_relax() could be useful in the
future to hint to the cpu that it could drop into a lower-power
state.
But I agree with your main point that the loop without the cpu_relax()
looks like it ought to work because atomic_read() ought to actually
go out and read memory each time around the loop.
-Tony
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 6:49 ` Herbert Xu
(?)
(?)
@ 2007-08-15 16:13 ` Chris Snook
2007-08-15 23:40 ` Herbert Xu
-1 siblings, 1 reply; 1651+ messages in thread
From: Chris Snook @ 2007-08-15 16:13 UTC (permalink / raw)
To: Herbert Xu
Cc: satyam, clameter, linux-kernel, linux-arch, torvalds, netdev,
akpm, ak, heiko.carstens, davem, schwidefsky, wensong, horms,
wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu wrote:
> Chris Snook <csnook@redhat.com> wrote:
>> Because atomic operations are generally used for synchronization, which requires
>> volatile behavior. Most such codepaths currently use an inefficient barrier().
>> Some forget to and we get bugs, because people assume that atomic_read()
>> actually reads something, and atomic_write() actually writes something. Worse,
>> these are architecture-specific, even compiler version-specific bugs that are
>> often difficult to track down.
>
> I'm yet to see a single example from the current tree where
> this patch series is the correct solution. So far the only
> example has been a buggy piece of code which has since been
> fixed with a cpu_relax.
Part of the motivation here is to fix heisenbugs. If I knew where they
were, I'd be posting patches for them. Unlike most bugs, where we want
to expose them as obviously as possible, these can be extremely
difficult to track down, and are often due to people assuming that the
atomic_* operations have the same semantics they've historically had.
Remember that until recently, all SMP architectures except s390 (which
very few kernel developers outside of IBM, Red Hat, and SuSE do much
work on) had volatile declarations for atomic_t. Removing the volatile
declarations from i386 and x86_64 may have created heisenbugs that won't
manifest themselves until GCC 6.0 comes out and people start compiling
kernels with -O5. We should have consistent semantics for atomic_*
operations.
The other motivation is to reduce the need for the barriers used to
prevent/fix such problems which clobber all your registers, and instead
force atomic_* operations to behave in the way they're actually used.
After the (resubmitted) patchset is merged, we'll be able to remove a
whole bunch of barriers, shrinking our source and our binaries, and
improving performance.
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 16:13 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
@ 2007-08-15 23:40 ` Herbert Xu
2007-08-15 23:51 ` Paul E. McKenney
2007-08-16 1:26 ` Segher Boessenkool
0 siblings, 2 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-15 23:40 UTC (permalink / raw)
To: Chris Snook
Cc: satyam, clameter, linux-kernel, linux-arch, torvalds, netdev,
akpm, ak, heiko.carstens, davem, schwidefsky, wensong, horms,
wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, Aug 15, 2007 at 12:13:12PM -0400, Chris Snook wrote:
>
> Part of the motivation here is to fix heisenbugs. If I knew where they
By the same token we should probably disable optimisations
altogether since that too can create heisenbugs.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 23:40 ` Herbert Xu
@ 2007-08-15 23:51 ` Paul E. McKenney
2007-08-16 1:30 ` Segher Boessenkool
2007-08-16 1:26 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 23:51 UTC (permalink / raw)
To: Herbert Xu
Cc: Chris Snook, satyam, clameter, linux-kernel, linux-arch, torvalds,
netdev, akpm, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 07:40:21AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 12:13:12PM -0400, Chris Snook wrote:
> >
> > Part of the motivation here is to fix heisenbugs. If I knew where they
>
> By the same token we should probably disable optimisations
> altogether since that too can create heisenbugs.
Precisely the point -- use of volatile (whether in casts or on asms)
in these cases is intended to disable those optimizations likely to
result in heisenbugs. But it is also intended to leave other
valuable optimizations in force.
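Concretely, the volatile-in-the-accessor approach the patch series proposes can be modeled in userspace like this (a sketch, not the actual kernel header; the atomic_t layout shown is the conventional one):

```c
/* Userspace model of the proposal: no volatile on the struct member,
 * only a volatile cast inside atomic_read(). Each atomic_read() is
 * then forced to re-load counter from memory, while other variables
 * in scope can still be register-cached and optimized normally. */
typedef struct { int counter; } atomic_t;

#define atomic_set(v, i)  (((v)->counter) = (i))
#define atomic_read(v)    (*(volatile int *)&(v)->counter)
```

A busy-wait such as `while (atomic_read(&x) == 0) cpu_relax();` then re-reads x on every pass without needing a register-clobbering barrier() inside the loop.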
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 23:51 ` Paul E. McKenney
@ 2007-08-16 1:30 ` Segher Boessenkool
2007-08-16 2:30 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 1:30 UTC (permalink / raw)
To: paulmck
Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
>>> Part of the motivation here is to fix heisenbugs. If I knew where
>>> they
>>
>> By the same token we should probably disable optimisations
>> altogether since that too can create heisenbugs.
>
> Precisely the point -- use of volatile (whether in casts or on asms)
> in these cases is intended to disable those optimizations likely to
> result in heisenbugs.
The only thing volatile on an asm does is create a side effect
on the asm statement; in effect, it tells the compiler "do not
remove this asm even if you don't need any of its outputs".
It's not disabling optimisation likely to result in bugs,
heisen- or otherwise; _not_ putting the volatile on an asm
that needs it simply _is_ a bug :-)
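For readers less familiar with GCC inline asm, the distinction can be sketched like so (userspace illustration, GCC syntax; the empty asm with a "memory" clobber is the classic compiler barrier):

```c
/* An asm marked volatile has a side effect the compiler must preserve:
 * it may not be deleted, merged, or hoisted even if its outputs look
 * unused. (In GCC, an asm with no output operands is implicitly
 * volatile; writing it out documents the intent.) */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

int shared;

int read_twice(void)
{
	int a = shared;
	compiler_barrier();	/* the second load must go back to memory */
	int b = shared;
	return a + b;
}
```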
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 1:30 ` Segher Boessenkool
@ 2007-08-16 2:30 ` Paul E. McKenney
2007-08-16 19:33 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 2:30 UTC (permalink / raw)
To: Segher Boessenkool
Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
On Thu, Aug 16, 2007 at 03:30:44AM +0200, Segher Boessenkool wrote:
> >>>Part of the motivation here is to fix heisenbugs. If I knew where
> >>>they
> >>
> >>By the same token we should probably disable optimisations
> >>altogether since that too can create heisenbugs.
> >
> >Precisely the point -- use of volatile (whether in casts or on asms)
> >in these cases is intended to disable those optimizations likely to
> >result in heisenbugs.
>
> The only thing volatile on an asm does is create a side effect
> on the asm statement; in effect, it tells the compiler "do not
> remove this asm even if you don't need any of its outputs".
>
> It's not disabling optimisation likely to result in bugs,
> heisen- or otherwise; _not_ putting the volatile on an asm
> that needs it simply _is_ a bug :-)
Yep. And the reason it is a bug is that it fails to disable
the relevant compiler optimizations. So I suspect that we might
actually be saying the same thing here.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:30 ` Paul E. McKenney
@ 2007-08-16 19:33 ` Segher Boessenkool
0 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 19:33 UTC (permalink / raw)
To: paulmck
Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
>> The only thing volatile on an asm does is create a side effect
>> on the asm statement; in effect, it tells the compiler "do not
>> remove this asm even if you don't need any of its outputs".
>>
>> It's not disabling optimisation likely to result in bugs,
>> heisen- or otherwise; _not_ putting the volatile on an asm
>> that needs it simply _is_ a bug :-)
>
> Yep. And the reason it is a bug is that it fails to disable
> the relevant compiler optimizations. So I suspect that we might
> actually be saying the same thing here.
We're not saying the same thing, but we do agree :-)
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 23:40 ` Herbert Xu
2007-08-15 23:51 ` Paul E. McKenney
@ 2007-08-16 1:26 ` Segher Boessenkool
2007-08-16 2:23 ` Nick Piggin
1 sibling, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 1:26 UTC (permalink / raw)
To: Herbert Xu
Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, davem, wensong, wjiang
>> Part of the motivation here is to fix heisenbugs. If I knew where
>> they
>
> By the same token we should probably disable optimisations
> altogether since that too can create heisenbugs.
Almost everything is a tradeoff; and so is this. I don't
believe most people would find disabling all compiler
optimisations an acceptable price to pay for some peace
of mind.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 1:26 ` Segher Boessenkool
@ 2007-08-16 2:23 ` Nick Piggin
2007-08-16 19:32 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-16 2:23 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Herbert Xu, heiko.carstens, horms, linux-kernel, rpjday, ak,
netdev, cfriesen, akpm, torvalds, jesper.juhl, linux-arch, zlynx,
satyam, clameter, schwidefsky, Chris Snook, davem, wensong,
wjiang
Segher Boessenkool wrote:
>>> Part of the motivation here is to fix heisenbugs. If I knew where they
>>
>>
>> By the same token we should probably disable optimisations
>> altogether since that too can create heisenbugs.
>
>
> Almost everything is a tradeoff; and so is this. I don't
> believe most people would find disabling all compiler
> optimisations an acceptable price to pay for some peace
> of mind.
So why is this a good tradeoff?
I also think that just adding things to APIs in the hope it might fix
up some bugs isn't really a good road to go down. Where do you stop?
On the actual proposal to make atomic operators volatile: I think the
better approach in the long term, for both maintainability of the
code and education of coders, is to make the use of barriers _more_
explicit rather than sprinkling these "just in case" ones around.
You may get rid of a few atomic_read heisenbugs (in the noise when
compared to all bugs), but if the coder was using a regular atomic
load, or a test_bit (which is also atomic), etc., then they're going
to have problems.
It would be better for Linux if everyone had better awareness of
barriers than to hide some of the cases where they're required.
A pretty large number of bugs I see in lock free code in the VM is
due to memory ordering problems. It's hard to find those bugs, or
even be aware when you're writing buggy code if you don't have some
feel for barriers.
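The explicit-barrier discipline Nick is arguing for looks roughly like this in a producer/consumer flag pattern (userspace sketch; __sync_synchronize() stands in for the kernel's smp_wmb()/smp_rmb() pairing):

```c
#include <stdbool.h>

int payload;
volatile bool ready;

void publish(int v)
{
	payload = v;
	__sync_synchronize();	/* make payload visible before ready */
	ready = true;
}

int consume(void)
{
	while (!ready)
		;		/* volatile read: re-checked every pass */
	__sync_synchronize();	/* pairs with the barrier in publish() */
	return payload;
}
```

The point is that the ordering requirement is visible at the call site, rather than being implied by the accessors.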
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:23 ` Nick Piggin
@ 2007-08-16 19:32 ` Segher Boessenkool
2007-08-17 2:19 ` Nick Piggin
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 19:32 UTC (permalink / raw)
To: Nick Piggin
Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
>>>> Part of the motivation here is to fix heisenbugs. If I knew where
>>>> they
>>>
>>>
>>> By the same token we should probably disable optimisations
>>> altogether since that too can create heisenbugs.
>> Almost everything is a tradeoff; and so is this. I don't
>> believe most people would find disabling all compiler
>> optimisations an acceptable price to pay for some peace
>> of mind.
>
> So why is this a good tradeoff?
It certainly is better than disabling all compiler optimisations!
> I also think that just adding things to APIs in the hope it might fix
> up some bugs isn't really a good road to go down. Where do you stop?
I look at it the other way: keeping the "volatile" semantics in
atomic_XXX() (or adding them to it, whatever) helps _prevent_ bugs;
certainly most people expect that behaviour, and also that behaviour
is *needed* in some places and no other interface provides that
functionality.
[some confusion about barriers wrt atomics snipped]
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 19:32 ` Segher Boessenkool
@ 2007-08-17 2:19 ` Nick Piggin
2007-08-17 3:16 ` Paul Mackerras
2007-08-17 17:37 ` Segher Boessenkool
0 siblings, 2 replies; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 2:19 UTC (permalink / raw)
To: Segher Boessenkool
Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
Segher Boessenkool wrote:
>>>>> Part of the motivation here is to fix heisenbugs. If I knew where
>>>>> they
>>>>
>>>>
>>>>
>>>> By the same token we should probably disable optimisations
>>>> altogether since that too can create heisenbugs.
>>>
>>> Almost everything is a tradeoff; and so is this. I don't
>>> believe most people would find disabling all compiler
>>> optimisations an acceptable price to pay for some peace
>>> of mind.
>>
>>
>> So why is this a good tradeoff?
>
>
> It certainly is better than disabling all compiler optimisations!
It's easy to be better than something really stupid :)
So i386 and x86-64 don't have volatiles there, and it saves them a
few K of kernel text. What you need to justify is why it is a good
tradeoff to make them volatile (which btw, is much harder to go
the other way after we let people make those assumptions).
>> I also think that just adding things to APIs in the hope it might fix
>> up some bugs isn't really a good road to go down. Where do you stop?
>
>
> I look at it the other way: keeping the "volatile" semantics in
> atomic_XXX() (or adding them to it, whatever) helps _prevent_ bugs;
Yeah, but we could add lots of things to help prevent bugs and
would never be included. I would also contend that it helps _hide_
bugs and encourages people to be lazy when thinking about these
things.
Also, you dismiss the fact that we'd actually be *adding* volatile
semantics back to the 2 most widely tested architectures (in terms
of test time, number of testers, variety of configurations, and
coverage of driver code). This is a very important difference from
just keeping volatile semantics because it is basically a one-way
API change.
> certainly most people expect that behaviour, and also that behaviour
> is *needed* in some places and no other interface provides that
> functionality.
I don't know that most people would expect that behaviour. Is there
any documentation anywhere that would suggest this?
Also, barrier() most definitely provides the required functionality.
It is overkill in some situations, but volatile is overkill in _most_
situations. If that's what you're worried about, we should add a new
ordering primitive.
> [some confusion about barriers wrt atomics snipped]
What were you confused about?
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 2:19 ` Nick Piggin
@ 2007-08-17 3:16 ` Paul Mackerras
2007-08-17 3:32 ` Nick Piggin
2007-08-17 3:42 ` Linus Torvalds
2007-08-17 17:37 ` Segher Boessenkool
1 sibling, 2 replies; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-17 3:16 UTC (permalink / raw)
To: Nick Piggin
Cc: Segher Boessenkool, heiko.carstens, horms, linux-kernel, rpjday,
ak, netdev, cfriesen, akpm, torvalds, jesper.juhl, linux-arch,
zlynx, satyam, clameter, schwidefsky, Chris Snook, Herbert Xu,
davem, wensong, wjiang
Nick Piggin writes:
> So i386 and x86-64 don't have volatiles there, and it saves them a
> few K of kernel text. What you need to justify is why it is a good
I'm really surprised it's as much as a few K. I tried it on powerpc
and it only saved 40 bytes (10 instructions) for a G5 config.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:16 ` Paul Mackerras
@ 2007-08-17 3:32 ` Nick Piggin
2007-08-17 3:50 ` Linus Torvalds
2007-08-17 3:42 ` Linus Torvalds
1 sibling, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 3:32 UTC (permalink / raw)
To: Paul Mackerras
Cc: Segher Boessenkool, heiko.carstens, horms, linux-kernel, rpjday,
ak, netdev, cfriesen, akpm, torvalds, jesper.juhl, linux-arch,
zlynx, satyam, clameter, schwidefsky, Chris Snook, Herbert Xu,
davem, wensong, wjiang
Paul Mackerras wrote:
> Nick Piggin writes:
>
>
>>So i386 and x86-64 don't have volatiles there, and it saves them a
>>few K of kernel text. What you need to justify is why it is a good
>
>
> I'm really surprised it's as much as a few K. I tried it on powerpc
> and it only saved 40 bytes (10 instructions) for a G5 config.
>
> Paul.
>
I'm surprised too. Numbers were from the "...use asm() like the other
atomic operations already do" thread. According to them,
text data bss dec hex filename
3434150 249176 176128 3859454 3ae3fe atomic_normal/vmlinux
3436203 249176 176128 3861507 3aec03 atomic_volatile/vmlinux
The first one is a stock kernel, the second is with atomic_read/set
cast to volatile. gcc-4.1 -- maybe if you have an earlier gcc it
won't optimise as much?
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:32 ` Nick Piggin
@ 2007-08-17 3:50 ` Linus Torvalds
2007-08-17 23:59 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-17 3:50 UTC (permalink / raw)
To: Nick Piggin
Cc: Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, satyam, clameter, schwidefsky, Chris Snook,
Herbert Xu, davem, wensong, wjiang
On Fri, 17 Aug 2007, Nick Piggin wrote:
>
> I'm surprised too. Numbers were from the "...use asm() like the other
> atomic operations already do" thread. According to them,
>
> text data bss dec hex filename
> 3434150 249176 176128 3859454 3ae3fe atomic_normal/vmlinux
> 3436203 249176 176128 3861507 3aec03 atomic_volatile/vmlinux
>
> The first one is a stock kernel, the second is with atomic_read/set
> cast to volatile. gcc-4.1 -- maybe if you have an earlier gcc it
> won't optimise as much?
No, see my earlier reply. "volatile" really *is* an incredible piece of
crap.
Just try it yourself:
volatile int i;
int j;

int testme(void)
{
	return i <= 1;
}

int testme2(void)
{
	return j <= 1;
}
and compile with all the optimizations you can.
I get:
testme:
	movl i(%rip), %eax
	subl $1, %eax
	setle %al
	movzbl %al, %eax
	ret

vs

testme2:
	xorl %eax, %eax
	cmpl $1, j(%rip)
	setle %al
	ret
(now, whether that "xorl + setle" is better than "setle + movzbl", I don't
really know - maybe it is. But that's not the point. The point is the
difference between
	movl i(%rip), %eax
	subl $1, %eax
and
	cmpl $1, j(%rip)
and imagine this being done for *every* single volatile access.
Just do a
git grep atomic_read
to see how atomics are actually used. A lot of them are exactly the above
kind of "compare against a value".
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:50 ` Linus Torvalds
@ 2007-08-17 23:59 ` Paul E. McKenney
2007-08-18 0:09 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 23:59 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nick Piggin, Paul Mackerras, Segher Boessenkool, heiko.carstens,
horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
jesper.juhl, linux-arch, zlynx, satyam, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, wensong, wjiang
On Thu, Aug 16, 2007 at 08:50:30PM -0700, Linus Torvalds wrote:
> Just try it yourself:
>
> volatile int i;
> int j;
>
> int testme(void)
> {
> return i <= 1;
> }
>
> int testme2(void)
> {
> return j <= 1;
> }
>
> and compile with all the optimizations you can.
>
> I get:
>
> testme:
> movl i(%rip), %eax
> subl $1, %eax
> setle %al
> movzbl %al, %eax
> ret
>
> vs
>
> testme2:
> xorl %eax, %eax
> cmpl $1, j(%rip)
> setle %al
> ret
>
> (now, whether that "xorl + setle" is better than "setle + movzbl", I don't
> really know - maybe it is. But that's not the point. The point is the
> difference between
>
> movl i(%rip), %eax
> subl $1, %eax
>
> and
>
> cmpl $1, j(%rip)
gcc bugzilla bug #33102, for whatever that ends up being worth. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 23:59 ` Paul E. McKenney
@ 2007-08-18 0:09 ` Herbert Xu
2007-08-18 1:08 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-18 0:09 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Linus Torvalds, Nick Piggin, Paul Mackerras, Segher Boessenkool,
heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, davem, wensong, wjiang
On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
>
> gcc bugzilla bug #33102, for whatever that ends up being worth. ;-)
I had totally forgotten that I'd already filed that bug more
than six years ago until they just closed yours as a duplicate
of mine :)
Good luck in getting it fixed!
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 0:09 ` Herbert Xu
@ 2007-08-18 1:08 ` Paul E. McKenney
2007-08-18 1:24 ` Christoph Lameter
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-18 1:08 UTC (permalink / raw)
To: Herbert Xu
Cc: Linus Torvalds, Nick Piggin, Paul Mackerras, Segher Boessenkool,
heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, davem, wensong, wjiang
On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> >
> > gcc bugzilla bug #33102, for whatever that ends up being worth. ;-)
>
> I had totally forgotten that I'd already filed that bug more
> than six years ago until they just closed yours as a duplicate
> of mine :)
>
> Good luck in getting it fixed!
Well, just got done re-opening it for the third time. And a local
gcc community member advised me not to give up too easily. But I
must admit that I am impressed with the speed with which it was
identified as a duplicate.
Should be entertaining! ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 1:08 ` Paul E. McKenney
@ 2007-08-18 1:24 ` Christoph Lameter
2007-08-18 1:41 ` Satyam Sharma
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-18 1:24 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Herbert Xu, Linus Torvalds, Nick Piggin, Paul Mackerras,
Segher Boessenkool, heiko.carstens, horms, linux-kernel, rpjday,
ak, netdev, cfriesen, akpm, jesper.juhl, linux-arch, zlynx,
satyam, schwidefsky, Chris Snook, davem, wensong, wjiang
On Fri, 17 Aug 2007, Paul E. McKenney wrote:
> On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> > On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> > >
> > > gcc bugzilla bug #33102, for whatever that ends up being worth. ;-)
> >
> > I had totally forgotten that I'd already filed that bug more
> > than six years ago until they just closed yours as a duplicate
> > of mine :)
> >
> > Good luck in getting it fixed!
>
> Well, just got done re-opening it for the third time. And a local
> gcc community member advised me not to give up too easily. But I
> must admit that I am impressed with the speed that it was identified
> as duplicate.
>
> Should be entertaining! ;-)
Right. ROTFL... volatile actually breaks atomic_t instead of making it
safe. x++ becomes a register load, an increment, and a register store.
Without volatile we can increment the memory directly. It seems that
volatile requires that the variable be loaded into a register first and
then operated upon. Understandable when you think about volatile being
used to access memory-mapped I/O registers, where an RMW operation could
be problematic.
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3506
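The codegen difference Christoph describes is easy to reproduce (compile with gcc -O2 -S and compare; the exact output of course varies by compiler version and target):

```c
volatile int vcount;
int pcount;

/* Typically compiles to a load, an add, and a store: the volatile
 * access rules force the value through a register. */
void inc_volatile(void) { vcount++; }

/* Typically compiles to a single "incl pcount(%rip)" on x86-64. */
void inc_plain(void)    { pcount++; }
```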
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 1:24 ` Christoph Lameter
@ 2007-08-18 1:41 ` Satyam Sharma
2007-08-18 4:13 ` Linus Torvalds
2007-08-18 21:56 ` Paul E. McKenney
2007-08-20 13:31 ` Chris Snook
2 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-18 1:41 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul E. McKenney, Herbert Xu, Linus Torvalds, Nick Piggin,
Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
wjiang
On Fri, 17 Aug 2007, Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
>
> > On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> > > On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> > > >
> > > > gcc bugzilla bug #33102, for whatever that ends up being worth. ;-)
> > >
> > > I had totally forgotten that I'd already filed that bug more
> > > than six years ago until they just closed yours as a duplicate
> > > of mine :)
> > >
> > > Good luck in getting it fixed!
> >
> > Well, just got done re-opening it for the third time. And a local
> > gcc community member advised me not to give up too easily. But I
> > must admit that I am impressed with the speed that it was identified
> > as duplicate.
> >
> > Should be entertaining! ;-)
>
> Right. ROTFL... volatile actually breaks atomic_t instead of making it
> safe. x++ becomes a register load, increment and a register store. Without
> volatile we can increment the memory directly.
No code does (or would do, or should do):
x.counter++;
on an "atomic_t x;" anyway.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 1:41 ` Satyam Sharma
@ 2007-08-18 4:13 ` Linus Torvalds
2007-08-18 13:36 ` Satyam Sharma
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-18 4:13 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Paul E. McKenney, Herbert Xu, Nick Piggin,
Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
wjiang
On Sat, 18 Aug 2007, Satyam Sharma wrote:
>
> No code does (or would do, or should do):
>
> x.counter++;
>
> on an "atomic_t x;" anyway.
That's just an example of a general problem.
No, you don't use "x.counter++". But you *do* use
if (atomic_read(&x) <= 1)
and loading into a register is stupid and pointless, when you could just
do it as a regular memory-operand to the cmp instruction.
And as far as the compiler is concerned, the problem is the 100% same:
combining operations with the volatile memop.
The fact is, a compiler that thinks that
movl mem,reg
cmpl $val,reg
is any better than
cmpl $val,mem
is just not a very good compiler. But when talking about "volatile",
that's exactly what you always get (and always have gotten - this is
not a regression, and I doubt gcc is alone in this).
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 4:13 ` Linus Torvalds
@ 2007-08-18 13:36 ` Satyam Sharma
2007-08-18 21:54 ` Paul E. McKenney
2007-08-24 12:19 ` Denys Vlasenko
2 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-18 13:36 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Lameter, Paul E. McKenney, Herbert Xu, Nick Piggin,
Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
wjiang
On Fri, 17 Aug 2007, Linus Torvalds wrote:
> On Sat, 18 Aug 2007, Satyam Sharma wrote:
> >
> > No code does (or would do, or should do):
> >
> > x.counter++;
> >
> > on an "atomic_t x;" anyway.
>
> That's just an example of a general problem.
>
> No, you don't use "x.counter++". But you *do* use
>
> if (atomic_read(&x) <= 1)
>
> and loading into a register is stupid and pointless, when you could just
> do it as a regular memory-operand to the cmp instruction.
True, but that makes this a bad/poor code generation issue with the
compiler, not something that affects the _correctness_ of atomic ops if
"volatile" is used for that counter object (as was suggested), because
we'd always use the atomic_inc() etc primitives to do increments, which
are always (should be!) implemented to be atomic.
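That is, the write side always goes through a primitive whose atomicity comes from its implementation, not from any volatile qualifier on the counter field. A userspace sketch, using a GCC builtin where i386 would use a "lock; incl" asm:

```c
typedef struct { int counter; } atomic_t;

/* Atomicity lives in the implementation (here a GCC atomic builtin;
 * in the kernel, arch-specific asm), independent of how counter is
 * declared. */
static inline void atomic_inc(atomic_t *v)
{
	__sync_fetch_and_add(&v->counter, 1);
}
```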
> And as far as the compiler is concerned, the problem is the 100% same:
> combining operations with the volatile memop.
>
> The fact is, a compiler that thinks that
>
> movl mem,reg
> cmpl $val,reg
>
> is any better than
>
> cmpl $val,mem
>
> is just not a very good compiler.
Absolutely, this is definitely a bug report worth opening with gcc. And
what you've said to explain this previously sounds definitely correct --
seeing "volatile" for any access does appear to just scare the hell out
of gcc and makes it generate such (poor) code.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 4:13 ` Linus Torvalds
2007-08-18 13:36 ` Satyam Sharma
@ 2007-08-18 21:54 ` Paul E. McKenney
2007-08-18 22:41 ` Linus Torvalds
2007-08-24 12:19 ` Denys Vlasenko
2 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-18 21:54 UTC (permalink / raw)
To: Linus Torvalds
Cc: Satyam Sharma, Christoph Lameter, Herbert Xu, Nick Piggin,
Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
wjiang
On Fri, Aug 17, 2007 at 09:13:35PM -0700, Linus Torvalds wrote:
>
>
> On Sat, 18 Aug 2007, Satyam Sharma wrote:
> >
> > No code does (or would do, or should do):
> >
> > x.counter++;
> >
> > on an "atomic_t x;" anyway.
>
> That's just an example of a general problem.
>
> No, you don't use "x.counter++". But you *do* use
>
> if (atomic_read(&x) <= 1)
>
> and loading into a register is stupid and pointless, when you could just
> do it as a regular memory-operand to the cmp instruction.
>
> And as far as the compiler is concerned, the problem is the 100% same:
> combining operations with the volatile memop.
>
> The fact is, a compiler that thinks that
>
> movl mem,reg
> cmpl $val,reg
>
> is any better than
>
> cmpl $val,mem
>
> is just not a very good compiler. But when talking about "volatile",
> that's exactly what you always get (and always have gotten - this is
> not a regression, and I doubt gcc is alone in this).
One of the gcc guys claimed that he thought that the two-instruction
sequence would be faster on some x86 machines. I pointed out that
there might be a concern about code size. I chose not to point out
that people might also care about the other x86 machines. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 21:54 ` Paul E. McKenney
@ 2007-08-18 22:41 ` Linus Torvalds
2007-08-18 23:19 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-18 22:41 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Satyam Sharma, Christoph Lameter, Herbert Xu, Nick Piggin,
Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
wjiang
On Sat, 18 Aug 2007, Paul E. McKenney wrote:
>
> One of the gcc guys claimed that he thought that the two-instruction
> sequence would be faster on some x86 machines. I pointed out that
> there might be a concern about code size. I chose not to point out
> that people might also care about the other x86 machines. ;-)
Some (very few) x86 uarchs do tend to prefer "load-store" like code
generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can
actually be faster on some of them. Not any that are relevant today,
though.
Also, that has nothing to do with volatile, and should be controlled by
optimization flags (like -mtune). In fact, I thought there was a separate
flag to do that (ie something like "-mload-store"), but I can't find it,
so maybe that's just my fevered brain..
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 22:41 ` Linus Torvalds
@ 2007-08-18 23:19 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-18 23:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: Satyam Sharma, Christoph Lameter, Herbert Xu, Nick Piggin,
Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, schwidefsky, Chris Snook, davem, wensong,
wjiang
On Sat, Aug 18, 2007 at 03:41:13PM -0700, Linus Torvalds wrote:
>
>
> On Sat, 18 Aug 2007, Paul E. McKenney wrote:
> >
> > One of the gcc guys claimed that he thought that the two-instruction
> > sequence would be faster on some x86 machines. I pointed out that
> > there might be a concern about code size. I chose not to point out
> > that people might also care about the other x86 machines. ;-)
>
> Some (very few) x86 uarchs do tend to prefer "load-store" like code
> generation, and doing a "mov [mem],reg + op reg" instead of "op [mem]" can
> actually be faster on some of them. Not any that are relevant today,
> though.
;-)
> Also, that has nothing to do with volatile, and should be controlled by
> optimization flags (like -mtune). In fact, I thought there was a separate
> flag to do that (ie something like "-mload-store"), but I can't find it,
> so maybe that's just my fevered brain..
Good point, will suggest this if the need arises.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 4:13 ` Linus Torvalds
2007-08-18 13:36 ` Satyam Sharma
2007-08-18 21:54 ` Paul E. McKenney
@ 2007-08-24 12:19 ` Denys Vlasenko
2007-08-24 17:19 ` Linus Torvalds
2 siblings, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-08-24 12:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney, Herbert Xu,
Nick Piggin, Paul Mackerras, Segher Boessenkool, heiko.carstens,
horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
jesper.juhl, linux-arch, zlynx, schwidefsky, Chris Snook, davem,
wensong, wjiang
On Saturday 18 August 2007 05:13, Linus Torvalds wrote:
> On Sat, 18 Aug 2007, Satyam Sharma wrote:
> > No code does (or would do, or should do):
> >
> > x.counter++;
> >
> > on an "atomic_t x;" anyway.
>
> That's just an example of a general problem.
>
> No, you don't use "x.counter++". But you *do* use
>
> if (atomic_read(&x) <= 1)
>
> and loading into a register is stupid and pointless, when you could just
> do it as a regular memory-operand to the cmp instruction.
It doesn't mean that (volatile int*) cast is bad, it means that current gcc
is bad (or "not good enough"). IOW: instead of avoiding volatile cast,
it's better to fix the compiler.
> And as far as the compiler is concerned, the problem is the 100% same:
> combining operations with the volatile memop.
>
> The fact is, a compiler that thinks that
>
> movl mem,reg
> cmpl $val,reg
>
> is any better than
>
> cmpl $val,mem
>
> is just not a very good compiler.
Linus, in all honesty gcc has many more cases of suboptimal code,
case of "volatile" is just one of many.
Off the top of my head:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28417
unsigned v;
void f(unsigned A) { v = ((unsigned long long)A) * 365384439 >> (27+32); }
gcc-4.1.1 -S -Os -fomit-frame-pointer t.c
f:
movl $365384439, %eax
mull 4(%esp)
movl %edx, %eax <===== ?
shrl $27, %eax
movl %eax, v
ret
Why is it moving %edx to %eax?
gcc-4.2.1 -S -Os -fomit-frame-pointer t.c
f:
movl $365384439, %eax
mull 4(%esp)
movl %edx, %eax <===== ?
xorl %edx, %edx <===== ??!
shrl $27, %eax
movl %eax, v
ret
Progress... Now we also zero out %edx afterwards for no apparent reason.
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-24 12:19 ` Denys Vlasenko
@ 2007-08-24 17:19 ` Linus Torvalds
0 siblings, 0 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-24 17:19 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney, Herbert Xu,
Nick Piggin, Paul Mackerras, Segher Boessenkool, heiko.carstens,
horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
jesper.juhl, linux-arch, zlynx, schwidefsky, Chris Snook, davem,
wensong, wjiang
On Fri, 24 Aug 2007, Denys Vlasenko wrote:
>
> > No, you don't use "x.counter++". But you *do* use
> >
> > if (atomic_read(&x) <= 1)
> >
> > and loading into a register is stupid and pointless, when you could just
> > do it as a regular memory-operand to the cmp instruction.
>
> It doesn't mean that (volatile int*) cast is bad, it means that current gcc
> is bad (or "not good enough"). IOW: instead of avoiding volatile cast,
> it's better to fix the compiler.
I would agree that fixing the compiler in this case would be a good thing,
even quite regardless of any "atomic_read()" discussion.
I just have a strong suspicion that "volatile" performance is so low down
the list of any C compiler person's interest, that it's never going to
happen. And quite frankly, I cannot blame the gcc guys for it.
That's especially as "volatile" really isn't a very good feature of the C
language, and is likely to get *less* interesting rather than more (as
user space starts to be more and more threaded, "volatile" gets less and
less useful).
[ Ie, currently, I think you can validly use "volatile" in a "sig_atomic_t"
kind of way, where there is a single thread, but with asynchronous
events. In that kind of situation, I think it's probably useful. But
once you get multiple threads, it gets pointless.
Sure: you could use "volatile" together with something like Dekker's or
Peterson's algorithm that doesn't depend on cache coherency (that's
basically what the C "volatile" keyword approximates: not atomic
accesses, but *uncached* accesses!) But let's face it, that's way past
insane. ]
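For reference, a minimal Peterson's-algorithm sketch (an editorial illustration, not from the thread). It is written with C11 seq_cst atomics rather than bare volatile, since volatile alone does not provide the store/load ordering the algorithm needs on modern SMP hardware:

```c
#include <stdatomic.h>

/* Two-thread mutual exclusion, Peterson style. seq_cst atomics
 * supply both the compiler ordering and the CPU memory barriers;
 * plain volatile would only give the former. */
static atomic_int flag[2];
static atomic_int turn;

static void peterson_lock(int self)
{
	int other = 1 - self;
	atomic_store(&flag[self], 1);
	atomic_store(&turn, other);
	/* Wait while the other thread is interested and it is its turn. */
	while (atomic_load(&flag[other]) && atomic_load(&turn) == other)
		; /* spin */
}

static void peterson_unlock(int self)
{
	atomic_store(&flag[self], 0);
}
```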
So I wouldn't expect "volatile" to ever really generate better code. It
might happen as a side effect of other improvements (eg, I might hope that
the SSA work would eventually lead to gcc having a much better defined
model of valid optimizations, and maybe better code generation for
volatile accesses falls cleanly out of that), but in the end, it's such
an ugly special case in C, and so seldom used, that I wouldn't depend on
it.
> Linus, in all honesty gcc has many more cases of suboptimal code,
> case of "volatile" is just one of many.
Well, the thing is, quite often, many of those "suboptimal code"
generations fall into two distinct classes:
- complex C code. I can't really blame the compiler too much for this.
Some things are *hard* to optimize, and for various scalability
reasons, you often end up having limits in the compiler where it
doesn't even _try_ doing certain optimizations if you have excessive
complexity.
- bad register allocation. Register allocation really is hard, and
sometimes gcc just does the "obviously wrong" thing, and you end up
having totally unnecessary spills.
> Off the top of my head:
Yes, "unsigned long long" with x86 has always generated atrocious code. In
fact, I would say that historically it was really *really* bad. These
days, gcc actually does a pretty good job, but I'm not surprised that it's
still quite possible to find cases where it did some optimization (in this
case, apparently noticing that "shift by >= 32 bits" causes the low
register to be pointless) and then missed *another* optimization (better
register use) because that optimization had been done *before* the first
optimization was done.
That's a *classic* example of compiler code generation issues, and quite
frankly, I think that's very different from the issue of "volatile".
Quite frankly, I'd like there to be more competition in the open source
compiler game, and that might cause some upheavals, but on the whole, gcc
actually does a pretty damn good job.
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 1:24 ` Christoph Lameter
2007-08-18 1:41 ` Satyam Sharma
@ 2007-08-18 21:56 ` Paul E. McKenney
2007-08-20 13:31 ` Chris Snook
2 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-18 21:56 UTC (permalink / raw)
To: Christoph Lameter
Cc: Herbert Xu, Linus Torvalds, Nick Piggin, Paul Mackerras,
Segher Boessenkool, heiko.carstens, horms, linux-kernel, rpjday,
ak, netdev, cfriesen, akpm, jesper.juhl, linux-arch, zlynx,
satyam, schwidefsky, Chris Snook, davem, wensong, wjiang
On Fri, Aug 17, 2007 at 06:24:15PM -0700, Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
>
> > On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
> > > On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
> > > >
> > > > gcc bugzilla bug #33102, for whatever that ends up being worth. ;-)
> > >
> > > I had totally forgotten that I'd already filed that bug more
> > > than six years ago until they just closed yours as a duplicate
> > > of mine :)
> > >
> > > Good luck in getting it fixed!
> >
> > Well, just got done re-opening it for the third time. And a local
> > gcc community member advised me not to give up too easily. But I
> > must admit that I am impressed with the speed that it was identified
> > as duplicate.
> >
> > Should be entertaining! ;-)
>
> Right. ROTFL... volatile actually breaks atomic_t instead of making it
> safe. x++ becomes a register load, increment and a register store. Without
> volatile we can increment the memory directly. It seems that volatile
> requires that the variable is loaded into a register first and then
> operated upon. Understandable when you think about volatile being used to
> access memory mapped I/O registers where a RMW operation could be
> problematic.
>
> See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=3506
Yep. The initial reaction was in fact to close my bug as a duplicate
of 3506. But I was not asking for atomicity, but rather for smaller
code to be generated, so I reopened it.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 1:24 ` Christoph Lameter
2007-08-18 1:41 ` Satyam Sharma
2007-08-18 21:56 ` Paul E. McKenney
@ 2007-08-20 13:31 ` Chris Snook
2007-08-20 22:04 ` Segher Boessenkool
2 siblings, 1 reply; 1651+ messages in thread
From: Chris Snook @ 2007-08-20 13:31 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul E. McKenney, Herbert Xu, Linus Torvalds, Nick Piggin,
Paul Mackerras, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, satyam, schwidefsky, davem, wensong, wjiang
Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
>
>> On Sat, Aug 18, 2007 at 08:09:13AM +0800, Herbert Xu wrote:
>>> On Fri, Aug 17, 2007 at 04:59:12PM -0700, Paul E. McKenney wrote:
>>>> gcc bugzilla bug #33102, for whatever that ends up being worth. ;-)
>>> I had totally forgotten that I'd already filed that bug more
>>> than six years ago until they just closed yours as a duplicate
>>> of mine :)
>>>
>>> Good luck in getting it fixed!
>> Well, just got done re-opening it for the third time. And a local
>> gcc community member advised me not to give up too easily. But I
>> must admit that I am impressed with the speed that it was identified
>> as duplicate.
>>
>> Should be entertaining! ;-)
>
> Right. ROTFL... volatile actually breaks atomic_t instead of making it
> safe. x++ becomes a register load, increment and a register store. Without
> volatile we can increment the memory directly. It seems that volatile
> requires that the variable is loaded into a register first and then
> operated upon. Understandable when you think about volatile being used to
> access memory mapped I/O registers where a RMW operation could be
> problematic.
So, if we want consistent behavior, we're pretty much screwed unless we use
inline assembler everywhere?
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 13:31 ` Chris Snook
@ 2007-08-20 22:04 ` Segher Boessenkool
2007-08-20 22:48 ` Russell King
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-20 22:04 UTC (permalink / raw)
To: Chris Snook
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
schwidefsky, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
>> Right. ROTFL... volatile actually breaks atomic_t instead of making
>> it safe. x++ becomes a register load, increment and a register store.
>> Without volatile we can increment the memory directly. It seems that
>> volatile requires that the variable is loaded into a register first
>> and then operated upon. Understandable when you think about volatile
>> being used to access memory mapped I/O registers where a RMW
>> operation could be problematic.
>
> So, if we want consistent behavior, we're pretty much screwed unless
> we use inline assembler everywhere?
Nah, this whole argument is flawed -- "without volatile" we still
*cannot* "increment the memory directly". On x86, you need a lock
prefix; on other archs, some other mechanism to make the memory
increment an *atomic* memory increment.
And no, RMW on MMIO isn't "problematic" at all, either.
An RMW op is a read op, a modify op, and a write op, all rolled
into one opcode. But three actual operations.
The advantages of asm code for atomic_{read,set} are:
1) all the other atomic ops are implemented that way already;
2) you have full control over the asm insns selected, in particular,
you can guarantee you *do* get an atomic op;
3) you don't need to use "volatile <data>" which generates
not-all-that-good code on archs like x86, and we want to get rid
of it anyway since it is problematic in many ways;
4) you don't need to use *(volatile <type>*)&<data>, which a) doesn't
exist in C; b) isn't documented or supported in GCC; c) has a recent
history of bugginess; d) _still uses volatile objects_; e) _still_
is problematic in almost all those same ways as in 3);
5) you can mix atomic and non-atomic accesses to the atomic_t, which
you cannot with the other alternatives.
The only disadvantage I know of is potentially slightly worse
instruction scheduling. This is a generic asm() problem: GCC
cannot see what actual insns are inside the asm() block.
Segher
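To make point 2) above concrete, here is a sketch (not the kernel's actual code) of an asm-based atomic increment on x86. The exact instruction emitted is pinned down in a way bare volatile cannot guarantee:

```c
/* Hypothetical stand-in for atomic_t. */
typedef struct { int counter; } atomic_sketch_t;

/* The lock prefix makes the memory increment atomic across CPUs.
 * The "+m" constraint ties the asm to the counter in memory, so
 * gcc cannot cache it in a register across this statement. */
static inline void atomic_inc_sketch(atomic_sketch_t *v)
{
	__asm__ __volatile__("lock; incl %0" : "+m" (v->counter));
}
```

This compiles only for x86/x86-64; other architectures would need their own asm, which is exactly point 1) above.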
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 22:04 ` Segher Boessenkool
@ 2007-08-20 22:48 ` Russell King
2007-08-20 23:02 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Russell King @ 2007-08-20 22:48 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Chris Snook, Christoph Lameter, Paul Mackerras, heiko.carstens,
horms, linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
schwidefsky, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
On Tue, Aug 21, 2007 at 12:04:17AM +0200, Segher Boessenkool wrote:
> And no, RMW on MMIO isn't "problematic" at all, either.
>
> An RMW op is a read op, a modify op, and a write op, all rolled
> into one opcode. But three actual operations.
Maybe for some CPUs, but not all. ARM for instance can't use the
load exclusive and store exclusive instructions to MMIO space.
This means placing atomic_t or bitops into MMIO space is a definite
no-go on ARM. It breaks.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 22:48 ` Russell King
@ 2007-08-20 23:02 ` Segher Boessenkool
2007-08-21 0:05 ` Paul E. McKenney
2007-08-21 7:05 ` Russell King
0 siblings, 2 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-20 23:02 UTC (permalink / raw)
To: Russell King
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
>> And no, RMW on MMIO isn't "problematic" at all, either.
>>
>> An RMW op is a read op, a modify op, and a write op, all rolled
>> into one opcode. But three actual operations.
>
> Maybe for some CPUs, but not all. ARM for instance can't use the
> load exclusive and store exclusive instructions to MMIO space.
Sure, your CPU doesn't have RMW instructions -- how to emulate
those if you don't have them is a totally different thing.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 23:02 ` Segher Boessenkool
@ 2007-08-21 0:05 ` Paul E. McKenney
2007-08-21 7:08 ` Russell King
2007-08-21 7:05 ` Russell King
1 sibling, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-21 0:05 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Russell King, Christoph Lameter, Paul Mackerras, heiko.carstens,
horms, linux-kernel, ak, netdev, cfriesen, akpm, rpjday,
Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
On Tue, Aug 21, 2007 at 01:02:01AM +0200, Segher Boessenkool wrote:
> >>And no, RMW on MMIO isn't "problematic" at all, either.
> >>
> >>An RMW op is a read op, a modify op, and a write op, all rolled
> >>into one opcode. But three actual operations.
> >
> >Maybe for some CPUs, but not all. ARM for instance can't use the
> >load exclusive and store exclusive instructions to MMIO space.
>
> Sure, your CPU doesn't have RMW instructions -- how to emulate
> those if you don't have them is a totally different thing.
I thought that ARM's load exclusive and store exclusive instructions
were its equivalent of LL and SC, which RISC machines typically use to
build atomic sequences of instructions -- and which normally cannot be
applied to MMIO space.
Thanx, Paul
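For readers unfamiliar with LL/SC: the usual pattern is a load-exclusive, modify, store-exclusive loop that retries if another CPU intervened. A portable sketch of that retry structure using GCC's compare-and-swap builtin (the actual ARM ldrex/strex version would be inline asm, and as noted cannot target MMIO space):

```c
/* CAS-based retry loop mirroring the LL/SC idiom:
 * load, compute the new value, attempt the store, retry on failure. */
static int atomic_add_return_sketch(int i, int *counter)
{
	int old, val;

	do {
		old = __atomic_load_n(counter, __ATOMIC_RELAXED);
		val = old + i;
	} while (!__atomic_compare_exchange_n(counter, &old, val, 0,
					      __ATOMIC_SEQ_CST,
					      __ATOMIC_SEQ_CST));
	return val;
}
```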
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 0:05 ` Paul E. McKenney
@ 2007-08-21 7:08 ` Russell King
0 siblings, 0 replies; 1651+ messages in thread
From: Russell King @ 2007-08-21 7:08 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Segher Boessenkool, Christoph Lameter, Paul Mackerras,
heiko.carstens, horms, linux-kernel, ak, netdev, cfriesen, akpm,
rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
On Mon, Aug 20, 2007 at 05:05:18PM -0700, Paul E. McKenney wrote:
> On Tue, Aug 21, 2007 at 01:02:01AM +0200, Segher Boessenkool wrote:
> > >>And no, RMW on MMIO isn't "problematic" at all, either.
> > >>
> > >>An RMW op is a read op, a modify op, and a write op, all rolled
> > >>into one opcode. But three actual operations.
> > >
> > >Maybe for some CPUs, but not all. ARM for instance can't use the
> > >load exclusive and store exclusive instructions to MMIO space.
> >
> > Sure, your CPU doesn't have RMW instructions -- how to emulate
> > those if you don't have them is a totally different thing.
>
> I thought that ARM's load exclusive and store exclusive instructions
> were its equivalent of LL and SC, which RISC machines typically use to
> build atomic sequences of instructions -- and which normally cannot be
> applied to MMIO space.
Absolutely correct.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 23:02 ` Segher Boessenkool
2007-08-21 0:05 ` Paul E. McKenney
@ 2007-08-21 7:05 ` Russell King
2007-08-21 9:33 ` Paul Mackerras
2007-08-21 14:39 ` Segher Boessenkool
1 sibling, 2 replies; 1651+ messages in thread
From: Russell King @ 2007-08-21 7:05 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
On Tue, Aug 21, 2007 at 01:02:01AM +0200, Segher Boessenkool wrote:
> >>And no, RMW on MMIO isn't "problematic" at all, either.
> >>
> >>An RMW op is a read op, a modify op, and a write op, all rolled
> >>into one opcode. But three actual operations.
> >
> >Maybe for some CPUs, but not all. ARM for instance can't use the
> >load exclusive and store exclusive instructions to MMIO space.
>
> Sure, your CPU doesn't have RMW instructions -- how to emulate
> those if you don't have them is a totally different thing.
Let me say it more clearly: On ARM, it is impossible to perform atomic
operations on MMIO space.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 7:05 ` Russell King
@ 2007-08-21 9:33 ` Paul Mackerras
2007-08-21 11:37 ` Andi Kleen
2007-08-21 14:48 ` Segher Boessenkool
2007-08-21 14:39 ` Segher Boessenkool
1 sibling, 2 replies; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-21 9:33 UTC (permalink / raw)
To: Russell King
Cc: Segher Boessenkool, Christoph Lameter, heiko.carstens, horms,
linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
Russell King writes:
> Let me say it more clearly: On ARM, it is impossible to perform atomic
> operations on MMIO space.
Actually, no one is suggesting that we try to do that at all.
The discussion about RMW ops on MMIO space started with a comment
attributed to the gcc developers that one reason why gcc on x86
doesn't use instructions that do RMW ops on volatile variables is that
volatile is used to mark MMIO addresses, and there was some
uncertainty about whether (non-atomic) RMW ops on x86 could be used on
MMIO. This is in regard to the question about why gcc on x86 always
moves a volatile variable into a register before doing anything to it.
So the whole discussion is irrelevant to ARM, PowerPC and any other
architecture except x86[-64].
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 9:33 ` Paul Mackerras
@ 2007-08-21 11:37 ` Andi Kleen
2007-08-21 14:48 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Andi Kleen @ 2007-08-21 11:37 UTC (permalink / raw)
To: Paul Mackerras
Cc: Russell King, Segher Boessenkool, Christoph Lameter,
heiko.carstens, horms, linux-kernel, Paul E. McKenney, ak, netdev,
cfriesen, akpm, rpjday, Nick Piggin, linux-arch, jesper.juhl,
satyam, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
On Tue, Aug 21, 2007 at 07:33:49PM +1000, Paul Mackerras wrote:
> So the whole discussion is irrelevant to ARM, PowerPC and any other
> architecture except x86[-64].
It's even irrelevant on x86 because all modifying operations on atomic_t
are coded in inline assembler and will always be RMW no matter
if atomic_t is volatile or not.
[ignoring atomic_set(x, atomic_read(x) + 1) which nobody should do]
The only issue is if atomic_t should have an implicit barrier or not.
My personal opinion is yes -- better safe than sorry. And any code
impact it may have is typically dwarfed by the next cache miss anyways,
so it doesn't matter much.
-Andi
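The "implicit barrier" in question is, at minimum, a compiler barrier rather than a CPU fence. A sketch of the classic construct (the kernel's barrier() is essentially this):

```c
/* Classic compiler barrier: an empty asm with a "memory" clobber
 * forces gcc to discard register-cached copies of memory values,
 * so loads after the barrier are actually re-done. No instruction
 * is emitted; this constrains the compiler, not the hardware. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static int read_twice(int *p)
{
	int a = *p;

	barrier();	/* without this, gcc may fold the two loads into one */
	return a + *p;
}
```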
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 9:33 ` Paul Mackerras
2007-08-21 11:37 ` Andi Kleen
@ 2007-08-21 14:48 ` Segher Boessenkool
2007-08-21 16:16 ` Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-21 14:48 UTC (permalink / raw)
To: Paul Mackerras
Cc: Russell King, Christoph Lameter, heiko.carstens, horms,
linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
>> Let me say it more clearly: On ARM, it is impossible to perform atomic
>> operations on MMIO space.
>
> Actually, no one is suggesting that we try to do that at all.
>
> The discussion about RMW ops on MMIO space started with a comment
> attributed to the gcc developers that one reason why gcc on x86
> doesn't use instructions that do RMW ops on volatile variables is that
> volatile is used to mark MMIO addresses, and there was some
> uncertainty about whether (non-atomic) RMW ops on x86 could be used on
> MMIO. This is in regard to the question about why gcc on x86 always
> moves a volatile variable into a register before doing anything to it.
This question is GCC PR33102, which was incorrectly closed as a
duplicate of PR3506 -- and *that* PR was closed because its reporter
seemed to claim the GCC-generated code for an increment on a volatile
(namely, three machine instructions: load, modify, store) was
incorrect, and it has to be one machine instruction.
> So the whole discussion is irrelevant to ARM, PowerPC and any other
> architecture except x86[-64].
And even there, it's not something the kernel can take advantage of
before GCC 4.4 is in widespread use, if then. Let's move on.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 14:48 ` Segher Boessenkool
@ 2007-08-21 16:16 ` Paul E. McKenney
2007-08-21 22:51 ` Valdis.Kletnieks
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-21 16:16 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Paul Mackerras, Russell King, Christoph Lameter, heiko.carstens,
horms, linux-kernel, ak, netdev, cfriesen, akpm, rpjday,
Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
On Tue, Aug 21, 2007 at 04:48:51PM +0200, Segher Boessenkool wrote:
> >>Let me say it more clearly: On ARM, it is impossible to perform atomic
> >>operations on MMIO space.
> >
> >Actually, no one is suggesting that we try to do that at all.
> >
> >The discussion about RMW ops on MMIO space started with a comment
> >attributed to the gcc developers that one reason why gcc on x86
> >doesn't use instructions that do RMW ops on volatile variables is that
> >volatile is used to mark MMIO addresses, and there was some
> >uncertainty about whether (non-atomic) RMW ops on x86 could be used on
> >MMIO. This is in regard to the question about why gcc on x86 always
> >moves a volatile variable into a register before doing anything to it.
>
> This question is GCC PR33102, which was incorrectly closed as a
> duplicate
> of PR3506 -- and *that* PR was closed because its reporter seemed to
> claim the GCC generated code for an increment on a volatile (namely,
> three
> machine instructions: load, modify, store) was incorrect, and it has to
> be one machine instruction.
>
> >So the whole discussion is irrelevant to ARM, PowerPC and any other
> >architecture except x86[-64].
>
> And even there, it's not something the kernel can take advantage of
> before GCC 4.4 is in widespread use, if then. Let's move on.
I agree that instant gratification is hard to come by when synching
up compiler and kernel versions. Nonetheless, it should be possible
to create APIs that are conditioned on the compiler version.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 16:16 ` Paul E. McKenney
@ 2007-08-21 22:51 ` Valdis.Kletnieks
2007-08-22 0:50 ` Paul E. McKenney
2007-08-22 21:38 ` Adrian Bunk
0 siblings, 2 replies; 1651+ messages in thread
From: Valdis.Kletnieks @ 2007-08-21 22:51 UTC (permalink / raw)
To: paulmck
Cc: Segher Boessenkool, Paul Mackerras, Russell King,
Christoph Lameter, heiko.carstens, horms, linux-kernel, ak,
netdev, cfriesen, akpm, rpjday, Nick Piggin, linux-arch,
jesper.juhl, satyam, zlynx, schwidefsky, Chris Snook, Herbert Xu,
davem, Linus Torvalds, wensong, wjiang
On Tue, 21 Aug 2007 09:16:43 PDT, "Paul E. McKenney" said:
> I agree that instant gratification is hard to come by when synching
> up compiler and kernel versions. Nonetheless, it should be possible
> to create APIs that are conditioned on the compiler version.
We've tried that, sort of. See the mess surrounding the whole
extern/static/inline/__whatever boondoggle, which seems to have
changed semantics in every single gcc release since 2.95 or so.
And recently mention was made that gcc4.4 will have *new* semantics
in this area. Yee. Hah.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 22:51 ` Valdis.Kletnieks
@ 2007-08-22 0:50 ` Paul E. McKenney
2007-08-22 21:38 ` Adrian Bunk
1 sibling, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-22 0:50 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: Segher Boessenkool, Paul Mackerras, Russell King,
Christoph Lameter, heiko.carstens, horms, linux-kernel, ak,
netdev, cfriesen, akpm, rpjday, Nick Piggin, linux-arch,
jesper.juhl, satyam, zlynx, schwidefsky, Chris Snook, Herbert Xu,
davem, Linus Torvalds, wensong, wjiang
On Tue, Aug 21, 2007 at 06:51:16PM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 21 Aug 2007 09:16:43 PDT, "Paul E. McKenney" said:
>
> > I agree that instant gratification is hard to come by when synching
> > up compiler and kernel versions. Nonetheless, it should be possible
> > to create APIs that are conditioned on the compiler version.
>
> We've tried that, sort of. See the mess surrounding the whole
> extern/static/inline/__whatever boondoggle, which seems to have
> changed semantics in every single gcc release since 2.95 or so.
>
> And recently mention was made that gcc4.4 will have *new* semantics
> in this area. Yee. Hah.
;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 22:51 ` Valdis.Kletnieks
2007-08-22 0:50 ` Paul E. McKenney
@ 2007-08-22 21:38 ` Adrian Bunk
1 sibling, 0 replies; 1651+ messages in thread
From: Adrian Bunk @ 2007-08-22 21:38 UTC (permalink / raw)
To: Valdis.Kletnieks
Cc: paulmck, Segher Boessenkool, Paul Mackerras, Russell King,
Christoph Lameter, heiko.carstens, horms, linux-kernel, ak,
netdev, cfriesen, akpm, rpjday, Nick Piggin, linux-arch,
jesper.juhl, satyam, zlynx, schwidefsky, Chris Snook, Herbert Xu,
davem, Linus Torvalds, wensong, wjiang
On Tue, Aug 21, 2007 at 06:51:16PM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Tue, 21 Aug 2007 09:16:43 PDT, "Paul E. McKenney" said:
>
> > I agree that instant gratification is hard to come by when synching
> > up compiler and kernel versions. Nonetheless, it should be possible
> > to create APIs that are conditioned on the compiler version.
>
> We've tried that, sort of. See the mess surrounding the whole
> extern/static/inline/__whatever boondoggle, which seems to have
> changed semantics in every single gcc release since 2.95 or so.
>...
There is exactly one semantics change in gcc in this area, and that is
the change of the "extern inline" semantics in gcc 4.3 to the
C99 semantics.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 7:05 ` Russell King
2007-08-21 9:33 ` Paul Mackerras
@ 2007-08-21 14:39 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-21 14:39 UTC (permalink / raw)
To: Russell King
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
linux-kernel, Paul E. McKenney, ak, netdev, cfriesen, akpm,
rpjday, Nick Piggin, linux-arch, jesper.juhl, satyam, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
>>>> And no, RMW on MMIO isn't "problematic" at all, either.
>>>>
>>>> An RMW op is a read op, a modify op, and a write op, all rolled
>>>> into one opcode. But three actual operations.
>>>
>>> Maybe for some CPUs, but not all. ARM for instance can't use the
>>> load exclusive and store exclusive instructions to MMIO space.
>>
>> Sure, your CPU doesn't have RMW instructions -- how to emulate
>> those if you don't have them is a totally different thing.
>
> Let me say it more clearly: On ARM, it is impossible to perform atomic
> operations on MMIO space.
It's all completely beside the point, see the other subthread, but...
Yeah, you can't do LL/SC to MMIO space; ARM isn't alone in that.
You could still implement atomic operations on MMIO space by taking
a lock elsewhere, in normal cacheable memory space. Why you would
do this is a separate question, you probably don't want it :-)
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:16 ` Paul Mackerras
2007-08-17 3:32 ` Nick Piggin
@ 2007-08-17 3:42 ` Linus Torvalds
2007-08-17 5:18 ` Paul E. McKenney
` (4 more replies)
1 sibling, 5 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-17 3:42 UTC (permalink / raw)
To: Paul Mackerras
Cc: Nick Piggin, Segher Boessenkool, heiko.carstens, horms,
linux-kernel, rpjday, ak, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, satyam, clameter, schwidefsky, Chris Snook,
Herbert Xu, davem, wensong, wjiang
On Fri, 17 Aug 2007, Paul Mackerras wrote:
>
> I'm really surprised it's as much as a few K. I tried it on powerpc
> and it only saved 40 bytes (10 instructions) for a G5 config.
One of the things that "volatile" generally screws up is a simple
volatile int i;
i++;
which a compiler will generally get horribly, horribly wrong.
In a reasonable world, gcc should just make that be (on x86)
addl $1,i(%rip)
on x86-64, which is indeed what it does without the volatile. But with the
volatile, the compiler gets really nervous, and doesn't dare do it in one
instruction, and thus generates crap like
movl i(%rip), %eax
addl $1, %eax
movl %eax, i(%rip)
instead. For no good reason, except that "volatile" just doesn't have any
good/clear semantics for the compiler, so most compilers will just make it
be "I will not touch this access in any way, shape, or form". Including
even trivially correct instruction optimization/combination.
This is one of the reasons why we should never use "volatile". It
pessimises code generation for no good reason - just because compilers
don't know what the heck it even means!
Now, people don't do "i++" on atomics (you'd use "atomic_inc()" for that),
but people *do* do things like
if (atomic_read(..) <= 1)
..
On ppc, things like that probably don't much matter. But on x86, it makes
a *huge* difference whether you do
movl i(%rip),%eax
cmpl $1,%eax
or if you can just use the value directly for the operation, like this:
cmpl $1,i(%rip)
which is again a totally obvious and totally safe optimization, but is
(again) something that gcc doesn't dare do, since "i" is volatile.
In other words: "volatile" is a horribly horribly bad way of doing things,
because it generates *worse*code*, for no good reason. You just don't see
it on powerpc, because it's already a load-store architecture, so there is
no "good code" for doing direct-to-memory operations.
Linus
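(A minimal stand-alone sketch of the difference Linus describes, for anyone who wants to inspect the generated assembly themselves. The variable names are made up for illustration; the observable result is identical in both cases, only the code gcc emits differs.)

```c
#include <assert.h>

static int plain_i;
static volatile int volatile_i;

void bump_both(void)
{
    /* Without volatile, gcc on x86-64 can emit a single
     *     addl $1, plain_i(%rip)                                  */
    plain_i++;

    /* With volatile, gcc typically emits a separate load/add/store:
     *     movl volatile_i(%rip), %eax
     *     addl $1, %eax
     *     movl %eax, volatile_i(%rip)                             */
    volatile_i++;
}
```

Compile with `gcc -O2 -S` and compare the two increments in the output.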
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:42 ` Linus Torvalds
@ 2007-08-17 5:18 ` Paul E. McKenney
2007-08-17 5:56 ` Satyam Sharma
` (3 subsequent siblings)
4 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 5:18 UTC (permalink / raw)
To: Linus Torvalds
Cc: Paul Mackerras, Nick Piggin, Segher Boessenkool, heiko.carstens,
horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
jesper.juhl, linux-arch, zlynx, satyam, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, wensong, wjiang
On Thu, Aug 16, 2007 at 08:42:23PM -0700, Linus Torvalds wrote:
>
>
> On Fri, 17 Aug 2007, Paul Mackerras wrote:
> >
> > I'm really surprised it's as much as a few K. I tried it on powerpc
> > and it only saved 40 bytes (10 instructions) for a G5 config.
>
> One of the things that "volatile" generally screws up is a simple
>
> volatile int i;
>
> i++;
>
> which a compiler will generally get horribly, horribly wrong.
>
> In a reasonable world, gcc should just make that be (on x86)
>
> addl $1,i(%rip)
>
> on x86-64, which is indeed what it does without the volatile. But with the
> volatile, the compiler gets really nervous, and doesn't dare do it in one
> instruction, and thus generates crap like
>
> movl i(%rip), %eax
> addl $1, %eax
> movl %eax, i(%rip)
Blech. Sounds like a chat with some gcc people is in order. Will
see what I can do.
Thanx, Paul
> instead. For no good reason, except that "volatile" just doesn't have any
> good/clear semantics for the compiler, so most compilers will just make it
> be "I will not touch this access in any way, shape, or form". Including
> even trivially correct instruction optimization/combination.
>
> This is one of the reasons why we should never use "volatile". It
> pessimises code generation for no good reason - just because compilers
> don't know what the heck it even means!
>
> Now, people don't do "i++" on atomics (you'd use "atomic_inc()" for that),
> but people *do* do things like
>
> if (atomic_read(..) <= 1)
> ..
>
> On ppc, things like that probably don't much matter. But on x86, it makes
> a *huge* difference whether you do
>
> movl i(%rip),%eax
> cmpl $1,%eax
>
> or if you can just use the value directly for the operation, like this:
>
> cmpl $1,i(%rip)
>
> which is again a totally obvious and totally safe optimization, but is
> (again) something that gcc doesn't dare do, since "i" is volatile.
>
> In other words: "volatile" is a horribly horribly bad way of doing things,
> because it generates *worse*code*, for no good reason. You just don't see
> it on powerpc, because it's already a load-store architecture, so there is
> no "good code" for doing direct-to-memory operations.
>
> Linus
> -
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:42 ` Linus Torvalds
2007-08-17 5:18 ` Paul E. McKenney
@ 2007-08-17 5:56 ` Satyam Sharma
2007-08-17 7:26 ` Nick Piggin
2007-08-17 22:49 ` Segher Boessenkool
2007-08-17 6:42 ` Geert Uytterhoeven
` (2 subsequent siblings)
4 siblings, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 5:56 UTC (permalink / raw)
To: Linus Torvalds
Cc: Paul Mackerras, Nick Piggin, Segher Boessenkool, heiko.carstens,
horms, Linux Kernel Mailing List, rpjday, ak, netdev, cfriesen,
Andrew Morton, jesper.juhl, linux-arch, zlynx, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
Hi Linus,
[ and others; I think there's a communication gap in a lot of this
thread, and a little summary would be useful. Hence this posting. ]
On Thu, 16 Aug 2007, Linus Torvalds wrote:
> On Fri, 17 Aug 2007, Paul Mackerras wrote:
> >
> > I'm really surprised it's as much as a few K. I tried it on powerpc
> > and it only saved 40 bytes (10 instructions) for a G5 config.
>
> One of the things that "volatile" generally screws up is a simple
>
> volatile int i;
>
> i++;
>
> which a compiler will generally get horribly, horribly wrong.
>
> [...] For no good reason, except that "volatile" just doesn't have any
> good/clear semantics for the compiler, so most compilers will just make it
> be "I will not touch this access in any way, shape, or form". Including
> even trivially correct instruction optimization/combination.
>
> This is one of the reasons why we should never use "volatile". It
> pessimises code generation for no good reason - just because compilers
> don't know what the heck it even means!
> [...]
> In other words: "volatile" is a horribly horribly bad way of doing things,
> because it generates *worse*code*, for no good reason. You just don't see
> it on powerpc, because it's already a load-store architecture, so there is
> no "good code" for doing direct-to-memory operations.
True, and I bet *everybody* on this list has already agreed for a very
long time that using "volatile" to type-qualify the _declaration_ of an
object itself is horribly bad (taste-wise, code-generation-wise, and
often even buggy for situations where real CPU barriers should've been
used instead).
However, the discussion on this thread (IIRC) began with only "giving
volatility semantics" to atomic ops. Now that is different, and may not
require the use of the "volatile" keyword (at least not in the declaration
of the object) itself.
Sadly, most arch's *still* do type-qualify the declaration of the
"counter" member of atomic_t as "volatile". This is probably a historic
hangover, and I suspect not yet rectified because of lethargy.
Anyway, some of the variants I can think of are:
[1]
#define atomic_read_volatile(v) \
({ \
forget((v)->counter); \
((v)->counter); \
})
where:
#define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
[ This is exactly equivalent to using "+m" in the constraints, as recently
explained on a GCC list somewhere, in response to the patch in my bitops
series a few weeks back where I thought "+m" was bogus. ]
[2]
#define atomic_read_volatile(v) (*(volatile int *)&(v)->counter)
This is something that does work. It has reasonably good semantics
guaranteed by the C standard in conjunction with how GCC currently
behaves (and how it has behaved for all supported versions). I haven't
checked whether it generates much different code than the first variant above,
(it probably will generate similar code to just declaring the object
as volatile, but would still be better in terms of code-clarity and
taste, IMHO), but in any case, we should pick whichever of these variants
works for us and generates good code.
[3]
static inline int atomic_read_volatile(atomic_t *v)
{
... arch-dependent __asm__ __volatile__ stuff ...
}
I can reasonably bet this variant would often generate worse code than
at least the variant "[1]" above.
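(Variant [2] above can be tried out in user space with no arch code at all; here is a minimal sketch, with atomic_t reduced to a bare struct purely for illustration:)

```c
#include <assert.h>

/* Minimal stand-in for the kernel's atomic_t -- illustration only. */
typedef struct { int counter; } atomic_t;

#define atomic_set(v, i)        ((v)->counter = (i))

/* Variant [2]: the object itself is NOT declared volatile; instead
 * each read forces a genuine load by casting through a volatile
 * pointer, so only this particular access has volatile semantics. */
#define atomic_read_volatile(v) (*(volatile int *)&(v)->counter)

int demo(void)
{
    atomic_t v;
    atomic_set(&v, 42);
    return atomic_read_volatile(&v); /* a real load, never optimized away */
}
```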
Now, why do we even require these "volatility" semantics variants?
Note, "volatility" semantics *know* / assume that it can have a meaning
_only_ as far as the compiler, so atomic_read_volatile() doesn't really
care about reading stale values from the cache for certain non-x86 archs, etc.
The first argument is "safety":
Use of atomic_read() (possibly in conjunction with other atomic ops) in
a lot of code out there in the kernel *assumes* the compiler will not
optimize away those ops. (which is possible given current definitions
of atomic_set and atomic_read on archs such as x86 in present code).
An additional argument that builds on this one says that by ensuring
the compiler will not elide or coalesce these ops, we could even avoid
potential heisenbugs in the future.
However, there is a counter-argument:
As Herbert Xu has often been making the point, there is *no* bug out
there involving "atomic_read" in busy-while-loops that should not have
a compiler barrier (or cpu_relax() in fact) anyway. As for non-busy-loops,
they would invariably call schedule() at some point (possibly directly)
and thus have an "implicit" compiler barrier by virtue of calling out to
a function that is not in scope of the current compilation unit (although
users in sched.c itself would probably require an explicit compiler
barrier).
The second pro-volatility-in-atomic-ops argument is performance:
(surprise!)
Using a full memory clobber compiler barrier in busy loops will disqualify
optimizations for loop invariants so it probably makes sense to *only*
make the compiler forget *that* particular address of the atomic counter
object, and none other. All 3 variants above would work nicely here.
So the final idea may be to have a cpu_relax_no_barrier() variant as a
rep;nop (pause) *without* an included full memory clobber, and replace
a lot of kernel busy-while-loops out there with:
- cpu_relax();
+ cpu_relax_no_barrier();
+ forget(x);
or may be just:
- cpu_relax();
+ cpu_relax_no_barrier();
because the "forget" / "volatility" / specific-variable-compiler-barrier
could be made implicit inside the atomic ops themselves.
This could especially make a difference for register-rich CPUs (probably
not x86) where using a full memory clobber will disqualify a hell of a lot of
compiler optimizations for loop-invariants.
On x86 itself, cpu_relax_no_barrier() could be:
#define cpu_relax_no_barrier() __asm__ __volatile__ ("rep;nop":::);
and still continue to do its job as it is doing presently.
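(Pulled together, the busy-wait shape proposed above looks like the sketch below. The names forget() and cpu_relax_no_barrier() come from this thread, not from any existing kernel API, and the finish()/done pair is a made-up driver for illustration.)

```c
#include <assert.h>

/* Per-variable compiler barrier: the "=m"/"m" asm makes gcc treat
 * 'a' as clobbered, so the next read of 'a' is a real reload -- of
 * 'a' alone, leaving other loop invariants cacheable in registers. */
#define forget(a) __asm__ __volatile__ ("" : "=m" (a) : "m" (a))

#if defined(__i386__) || defined(__x86_64__)
/* pause hint with no full "memory" clobber attached */
#define cpu_relax_no_barrier() __asm__ __volatile__ ("rep; nop")
#else
#define cpu_relax_no_barrier() do { } while (0)
#endif

static int done;

void finish(void) { done = 1; }

/* The proposed busy-wait shape: only 'done' is reloaded each pass. */
int wait_until_done(void)
{
    while (!done) {
        cpu_relax_no_barrier();
        forget(done);
    }
    return done;
}
```

In a real multi-CPU spin loop, CPU-level ordering and visibility are separate concerns from this compiler-only barrier.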
However, there is still a counter-argument:
As Herbert Xu and Christoph Lameter have often been saying, giving
"volatility" semantics to the atomic ops will disqualify compiler
optimizations such as eliding / coalescing of atomic ops, etc, and
probably some sections of code in the kernel (Christoph mentioned code
in SLUB, and I saw such code in sched) benefit from such optimizations.
Paul Mackerras has, otoh, mentioned that a lot of such places probably
don't need (or shouldn't use) atomic ops in the first place.
Alternatively, such callsites should probably just cache the atomic_read
in a local variable (which compiler could just as well make a register)
explicitly, and repeating atomic_read() isn't really necessary.
There could still be legitimate uses of atomic ops that don't care about
them being elided / coalesced, but given the loop-invariant-optimization
benefit, personally, I do see some benefit in the use of defining atomic
ops variants with "volatility" semantics (for only the atomic counter
object) but also having a non-volatile atomic ops API side-by-side for
performance critical users (probably sched, slub) that may require that.
Possibly, one of the two APIs above could turn out to be redundant, but
that's still very much the issue of debate presently.
Satyam
[ Sorry if I missed anything important, but this thread has been long
and noisy, although I've tried to keep up ... ]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 5:56 ` Satyam Sharma
@ 2007-08-17 7:26 ` Nick Piggin
2007-08-17 8:47 ` Satyam Sharma
2007-08-17 22:49 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 7:26 UTC (permalink / raw)
To: Satyam Sharma
Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
wjiang
Satyam Sharma wrote:
> #define atomic_read_volatile(v) \
> ({ \
> forget((v)->counter); \
> ((v)->counter); \
> })
>
> where:
*vomit* :)
Not only do I hate the keyword volatile, but the barrier is only a
one-sided affair so it's probable this is going to have slightly
different allowed reorderings than a real volatile access.
Also, why would you want to make these insane accessors for atomic_t
types? Just make sure everybody knows the basics of barriers, and they
can apply that knowledge to atomic_t and all other lockless memory
accesses as well.
> #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
I like order(x) better, but it's not the most perfect name either.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 7:26 ` Nick Piggin
@ 2007-08-17 8:47 ` Satyam Sharma
2007-08-17 9:15 ` Nick Piggin
2007-08-17 9:48 ` Paul Mackerras
0 siblings, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 8:47 UTC (permalink / raw)
To: Nick Piggin
Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
wjiang
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
>
> > #define atomic_read_volatile(v) \
> > ({ \
> > forget((v)->counter); \
> > ((v)->counter); \
> > })
> >
> > where:
>
> *vomit* :)
I wonder if this'll generate smaller and better code than _both_ the
other atomic_read_volatile() variants. Would need to build allyesconfig
on lots of diff arch's etc to test the theory though.
> Not only do I hate the keyword volatile, but the barrier is only a
> one-sided affair so it's probable this is going to have slightly
> different allowed reorderings than a real volatile access.
True ...
> Also, why would you want to make these insane accessors for atomic_t
> types? Just make sure everybody knows the basics of barriers, and they
> can apply that knowledge to atomic_t and all other lockless memory
> accesses as well.
Code that looks like:
while (!atomic_read(&v)) {
...
cpu_relax_no_barrier();
forget(v.counter);
^^^^^^^^
}
would be uglier. Also think about code such as:
a = atomic_read();
if (!a)
do_something();
forget();
a = atomic_read();
... /* some code that depends on value of a, obviously */
forget();
a = atomic_read();
...
So much explicit sprinkling of "forget()" looks ugly.
atomic_read_volatile()
on the other hand, looks neater. The "_volatile()" suffix makes it also
no less explicit than an explicit barrier-like macro that this primitive
is something "special", for code clarity purposes.
> > #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
>
> I like order(x) better, but it's not the most perfect name either.
forget(x) is just a stupid-placeholder-for-a-better-name. order(x) sounds
good but we could leave quibbling about function or macro names for later,
this thread is noisy as it is :-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 8:47 ` Satyam Sharma
@ 2007-08-17 9:15 ` Nick Piggin
2007-08-17 10:12 ` Satyam Sharma
2007-08-17 9:48 ` Paul Mackerras
1 sibling, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 9:15 UTC (permalink / raw)
To: Satyam Sharma
Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
wjiang
Satyam Sharma wrote:
>
> On Fri, 17 Aug 2007, Nick Piggin wrote:
>>Also, why would you want to make these insane accessors for atomic_t
>>types? Just make sure everybody knows the basics of barriers, and they
>>can apply that knowledge to atomic_t and all other lockless memory
>>accesses as well.
>
>
> Code that looks like:
>
> while (!atomic_read(&v)) {
> ...
> cpu_relax_no_barrier();
> forget(v.counter);
> ^^^^^^^^
> }
>
> would be uglier. Also think about code such as:
I think they would both be equally ugly, but the atomic_read_volatile
variant would be more prone to subtle bugs because of the weird
implementation.
And it would be more ugly than introducing an order(x) statement for
all memory operations, and adding an order_atomic() wrapper for it
for atomic types.
> a = atomic_read();
> if (!a)
> do_something();
>
> forget();
> a = atomic_read();
> ... /* some code that depends on value of a, obviously */
>
> forget();
> a = atomic_read();
> ...
>
> So much explicit sprinkling of "forget()" looks ugly.
Firstly, why is it ugly? It's nice because of those nice explicit
statements there that give us a good heads up and would have some
comments attached to them (also, lack of the word "volatile" is
always a plus).
Secondly, what sort of code would do such a thing? In most cases,
it is probably riddled with bugs anyway (unless it is doing a
really specific sequence of interrupts or something, but in that
case it is very likely to either require locking or busy waits
anyway -> ie. barriers).
> on the other hand, looks neater. The "_volatile()" suffix makes it also
> no less explicit than an explicit barrier-like macro that this primitive
> is something "special", for code clarity purposes.
Just don't use the word volatile, and have barriers both before
and after the memory operation, and I'm OK with it. I don't see
the point though, when you could just have a single barrier(x)
barrier function defined for all memory locations, rather than
this odd thing that only works for atomics (and would have to
be duplicated for atomic_set).
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 9:15 ` Nick Piggin
@ 2007-08-17 10:12 ` Satyam Sharma
2007-08-17 12:14 ` Nick Piggin
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:12 UTC (permalink / raw)
To: Nick Piggin
Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
wjiang
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
> >
> > On Fri, 17 Aug 2007, Nick Piggin wrote:
>
> > > Also, why would you want to make these insane accessors for atomic_t
> > > types? Just make sure everybody knows the basics of barriers, and they
> > > can apply that knowledge to atomic_t and all other lockless memory
> > > accesses as well.
> >
> >
> > Code that looks like:
> >
> > while (!atomic_read(&v)) {
> > ...
> > cpu_relax_no_barrier();
> > forget(v.counter);
> > ^^^^^^^^
> > }
> >
> > would be uglier. Also think about code such as:
>
> I think they would both be equally ugly,
You think both these are equivalent in terms of "looks":
|
while (!atomic_read(&v)) { | while (!atomic_read_xxx(&v)) {
... | ...
cpu_relax_no_barrier(); | cpu_relax_no_barrier();
order_atomic(&v); | }
} |
(where order_atomic() is an atomic_t
specific wrapper as you mentioned below)
?
Well, taste varies, but ...
> but the atomic_read_volatile
> variant would be more prone to subtle bugs because of the weird
> implementation.
What bugs?
> And it would be more ugly than introducing an order(x) statement for
> all memory operations, and adding an order_atomic() wrapper for it
> for atomic types.
Oh, that order() / forget() macro [forget() was named such by Chuck Ebbert
earlier in this thread where he first mentioned it, btw] could definitely
be generically introduced for any memory operations.
> > a = atomic_read();
> > if (!a)
> > do_something();
> >
> > forget();
> > a = atomic_read();
> > ... /* some code that depends on value of a, obviously */
> >
> > forget();
> > a = atomic_read();
> > ...
> >
> > So much explicit sprinkling of "forget()" looks ugly.
>
> Firstly, why is it ugly? It's nice because of those nice explicit
> statements there that give us a good heads up and would have some
> comments attached to them
atomic_read_xxx (where xxx = whatever naming sounds nice to you) would
obviously also give a heads up, and could also have some comments
attached to it.
> (also, lack of the word "volatile" is always a plus).
Ok, xxx != volatile.
> Secondly, what sort of code would do such a thing?
See the nodemgr_host_thread() that does something similar, though not
exactly same.
> > on the other hand, looks neater. The "_volatile()" suffix makes it also
> > no less explicit than an explicit barrier-like macro that this primitive
> > is something "special", for code clarity purposes.
>
> Just don't use the word volatile,
That sounds amazingly frivolous, but hey, why not. As I said, ok,
xxx != volatile.
> and have barriers both before and after the memory operation,
How could that lead to bugs? (if you can point to existing code,
but just some testcase / sample code would be fine as well).
> [...] I don't see
> the point though, when you could just have a single barrier(x)
> barrier function defined for all memory locations,
As I said, barrier() is too heavy-handed.
> rather than
> this odd thing that only works for atomics
Why would it work only for atomics? You could use that generic macro
for anything you well damn please.
> (and would have to
> be duplicated for atomic_set).
#define atomic_set_xxx for something similar. Big deal ... NOT.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 10:12 ` Satyam Sharma
@ 2007-08-17 12:14 ` Nick Piggin
2007-08-17 13:05 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 12:14 UTC (permalink / raw)
To: Satyam Sharma
Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
wjiang
Satyam Sharma wrote:
>
>On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>>I think they would both be equally ugly,
>>
>
>You think both these are equivalent in terms of "looks":
>
> |
>while (!atomic_read(&v)) { | while (!atomic_read_xxx(&v)) {
> ... | ...
> cpu_relax_no_barrier(); | cpu_relax_no_barrier();
> order_atomic(&v); | }
>} |
>
>(where order_atomic() is an atomic_t
>specific wrapper as you mentioned below)
>
>?
>
I think the LHS is better if your atomic_read_xxx primitive is using the
crazy one-sided barrier, because the LHS code you immediately know what
barriers are happening, and with the RHS you have to look at the
atomic_read_xxx
definition.
If your atomic_read_xxx implementation was more intuitive, then both are
pretty well equal. More lines != ugly code.
>>but the atomic_read_volatile
>>variant would be more prone to subtle bugs because of the weird
>>implementation.
>>
>
>What bugs?
>
You can't think for yourself? Your atomic_read_volatile contains a compiler
barrier to the atomic variable before the load. 2 such reads from different
locations look like this:
asm volatile("" : "+m" (v1));
atomic_read(&v1);
asm volatile("" : "+m" (v2));
atomic_read(&v2);
Which implies that the load of v1 can be reordered to occur after the load
of v2. Bet you didn't expect that?
>>Secondly, what sort of code would do such a thing?
>>
>
>See the nodemgr_host_thread() that does something similar, though not
>exactly same.
>
I'm sorry, all this waffling about made up code which might do this and
that is just a waste of time. Seriously, the thread is bloated enough
and never going to get anywhere with all this handwaving. If someone is
saving up all the really real and actually good arguments for why we
must have a volatile here, now is the time to use them.
>>and have barriers both before and after the memory operation,
>>
>
>How could that lead to bugs? (if you can point to existing code,
>but just some testcase / sample code would be fine as well).
>
See above.
>As I said, barrier() is too heavy-handed.
>
Typo. I meant: defined for a single memory location (ie. order(x)).
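(A sketch of what a single-location order(x), used two-sided as Nick prefers, could look like. This is a hypothetical accessor assembled from pieces mentioned in this thread -- the "+m" asm, a statement expression -- not an existing kernel primitive, and it only constrains the compiler; CPU reordering is a separate matter.)

```c
#include <assert.h>

typedef struct { int counter; } atomic_t; /* stand-in for illustration */

/* order(x): a compiler barrier scoped to a single memory location. */
#define order(x) __asm__ __volatile__ ("" : "+m" (x))

/* Two-sided read: the intent is that the load stays between the two
 * order() calls, so the compiler may neither hoist it above the first
 * nor sink it below the second. */
#define atomic_read_ordered(v)          \
({                                      \
    int __val;                          \
    order((v)->counter);                \
    __val = (v)->counter;               \
    order((v)->counter);                \
    __val;                              \
})

int demo_ordered(void)
{
    atomic_t v = { .counter = 7 };
    return atomic_read_ordered(&v);
}
```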
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 12:14 ` Nick Piggin
@ 2007-08-17 13:05 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 13:05 UTC (permalink / raw)
To: Nick Piggin
Cc: Linus Torvalds, Paul Mackerras, Segher Boessenkool,
heiko.carstens, horms, Linux Kernel Mailing List, rpjday, ak,
netdev, cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx,
clameter, schwidefsky, Chris Snook, Herbert Xu, davem, wensong,
wjiang
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
> [...]
> > You think both these are equivalent in terms of "looks":
> >
> > |
> > while (!atomic_read(&v)) { | while (!atomic_read_xxx(&v)) {
> > ... | ...
> > cpu_relax_no_barrier(); | cpu_relax_no_barrier();
> > order_atomic(&v); | }
> > } |
> >
> > (where order_atomic() is an atomic_t
> > specific wrapper as you mentioned below)
> >
> > ?
>
> I think the LHS is better if your atomic_read_xxx primitive is using the
> crazy one-sided barrier,
^^^^^
I'd say it's purposefully one-sided.
> because the LHS code you immediately know what
> barriers are happening, and with the RHS you have to look at the
> atomic_read_xxx
> definition.
No. As I said, the _xxx (whatever the heck you want to name it as) should
give the same heads-up that your "order_atomic" thing is supposed to give.
> If your atomic_read_xxx implementation was more intuitive, then both are
> pretty well equal. More lines != ugly code.
>
> > [...]
> > What bugs?
>
> You can't think for yourself? Your atomic_read_volatile contains a compiler
> barrier to the atomic variable before the load. 2 such reads from different
> locations look like this:
>
> asm volatile("" : "+m" (v1));
> atomic_read(&v1);
> asm volatile("" : "+m" (v2));
> atomic_read(&v2);
>
> Which implies that the load of v1 can be reordered to occur after the load
> of v2.
And how would that be a bug? (sorry, I really can't think for myself)
> > > Secondly, what sort of code would do such a thing?
> >
> > See the nodemgr_host_thread() that does something similar, though not
> > exactly same.
>
> I'm sorry, all this waffling about made up code which might do this and
> that is just a waste of time.
First, you could try looking at the code.
And by the way, as I've already said (why do you *require* people to have to
repeat things to you?) this isn't even about only existing code.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 8:47 ` Satyam Sharma
2007-08-17 9:15 ` Nick Piggin
@ 2007-08-17 9:48 ` Paul Mackerras
2007-08-17 10:23 ` Satyam Sharma
1 sibling, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-17 9:48 UTC (permalink / raw)
To: Satyam Sharma
Cc: Nick Piggin, Linus Torvalds, Segher Boessenkool, heiko.carstens,
horms, Linux Kernel Mailing List, rpjday, ak, netdev, cfriesen,
Andrew Morton, jesper.juhl, linux-arch, zlynx, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
Satyam Sharma writes:
> I wonder if this'll generate smaller and better code than _both_ the
> other atomic_read_volatile() variants. Would need to build allyesconfig
> on lots of diff arch's etc to test the theory though.
I'm sure it would be a tiny effect.
This whole thread is arguing about effects that are quite
insignificant. On the one hand we have the non-volatile proponents,
who want to let the compiler do extra optimizations - which amounts to
letting it elide maybe a dozen loads in the whole kernel, loads which
would almost always be L1 cache hits.
On the other hand we have the volatile proponents, who are concerned
that some code somewhere in the kernel might be buggy without the
volatile behaviour, and who also want to be able to remove some
barriers and thus save a few bytes of code and a few loads here and
there (and possibly some stores too).
Either way the effect on code size and execution time is minuscule.
In the end the strongest argument is actually that gcc generates
unnecessarily verbose code on x86[-64] for volatile accesses. Even
then we're only talking about ~2000 bytes, or less than 1 byte per
instance of atomic_read on average, about 0.06% of the kernel text
size.
The x86[-64] developers seem to be willing to bear the debugging cost
involved in having the non-volatile behaviour for atomic_read, in
order to save the 2kB. That's fine with me. Either way I think
somebody should audit all the uses of atomic_read, not just for
missing barriers, but also to find the places where it's used in a
racy manner. Then we can work out where the races matter and fix them
if they do.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 9:48 ` Paul Mackerras
@ 2007-08-17 10:23 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:23 UTC (permalink / raw)
To: Paul Mackerras
Cc: Nick Piggin, Linus Torvalds, Segher Boessenkool, heiko.carstens,
horms, Linux Kernel Mailing List, rpjday, ak, netdev, cfriesen,
Andrew Morton, jesper.juhl, linux-arch, zlynx, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
On Fri, 17 Aug 2007, Paul Mackerras wrote:
> Satyam Sharma writes:
>
> > I wonder if this'll generate smaller and better code than _both_ the
> > other atomic_read_volatile() variants. Would need to build allyesconfig
> > on lots of diff arch's etc to test the theory though.
>
> I'm sure it would be a tiny effect.
>
> This whole thread is arguing about effects that are quite
> insignificant.
Hmm, the fact that this thread became what it did, probably means that
most developers on this list do not mind thinking/arguing about effects
or optimizations that are otherwise "tiny". But yeah, they are tiny
nonetheless.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 5:56 ` Satyam Sharma
2007-08-17 7:26 ` Nick Piggin
@ 2007-08-17 22:49 ` Segher Boessenkool
2007-08-17 23:51 ` Satyam Sharma
1 sibling, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:49 UTC (permalink / raw)
To: Satyam Sharma
Cc: Paul Mackerras, heiko.carstens, horms, Linux Kernel Mailing List,
rpjday, ak, netdev, cfriesen, Nick Piggin, linux-arch,
jesper.juhl, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
> #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
>
> [ This is exactly equivalent to using "+m" in the constraints, as
> recently
> explained on a GCC list somewhere, in response to the patch in my
> bitops
> series a few weeks back where I thought "+m" was bogus. ]
[It wasn't explained on a GCC list in response to your patch, as
far as I can see -- if I missed it, please point me to an archived
version of it].
One last time: it isn't equivalent on older (but still supported
by Linux) versions of GCC. Current versions of GCC allow it, but
it has no documented behaviour at all, so use it at your own risk.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 22:49 ` Segher Boessenkool
@ 2007-08-17 23:51 ` Satyam Sharma
2007-08-17 23:55 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 23:51 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Paul Mackerras, heiko.carstens, horms, Linux Kernel Mailing List,
rpjday, ak, netdev, cfriesen, Nick Piggin, linux-arch,
jesper.juhl, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
On Sat, 18 Aug 2007, Segher Boessenkool wrote:
> > #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
> >
> > [ This is exactly equivalent to using "+m" in the constraints, as recently
> > explained on a GCC list somewhere, in response to the patch in my bitops
> > series a few weeks back where I thought "+m" was bogus. ]
>
> [It wasn't explained on a GCC list in response to your patch, as
> far as I can see -- if I missed it, please point me to an archived
> version of it].
http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01758.html
is a follow-up in the thread on the gcc-patches@gcc.gnu.org mailing list,
which began with:
http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01677.html
that was posted by Jan Hubicka, as he quotes in that initial posting,
after I had submitted:
http://lkml.org/lkml/2007/7/23/252
which was a (wrong) patch to "rectify" what I thought was the "bogus"
"+m" constraint, as per the quoted extract from gcc docs (that was
given in that (wrong) patch's changelog).
That's when _I_ came to know how GCC interprets "+m", but probably
this has been explained on those lists multiple times. Who cares,
anyway?
> One last time: it isn't equivalent on older (but still supported
> by Linux) versions of GCC. Current versions of GCC allow it, but
> it has no documented behaviour at all, so use it at your own risk.
True.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 23:51 ` Satyam Sharma
@ 2007-08-17 23:55 ` Segher Boessenkool
0 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 23:55 UTC (permalink / raw)
To: Satyam Sharma
Cc: Paul Mackerras, heiko.carstens, horms, Linux Kernel Mailing List,
rpjday, ak, netdev, cfriesen, Nick Piggin, linux-arch,
jesper.juhl, Andrew Morton, zlynx, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, Linus Torvalds, wensong, wjiang
>>> #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
>>>
>>> [ This is exactly equivalent to using "+m" in the constraints, as
>>> recently
>>> explained on a GCC list somewhere, in response to the patch in my
>>> bitops
>>> series a few weeks back where I thought "+m" was bogus. ]
>>
>> [It wasn't explained on a GCC list in response to your patch, as
>> far as I can see -- if I missed it, please point me to an archived
>> version of it].
>
> http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01758.html
Ah yes, that old thread, thank you.
> That's when _I_ came to know how GCC interprets "+m", but probably
> this has been explained on those lists multiple times. Who cares,
> anyway?
I just couldn't find the thread you meant; I thought I might have
missed it, that's all :-)
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:42 ` Linus Torvalds
2007-08-17 5:18 ` Paul E. McKenney
2007-08-17 5:56 ` Satyam Sharma
@ 2007-08-17 6:42 ` Geert Uytterhoeven
2007-08-17 8:52 ` Andi Kleen
2007-08-17 22:29 ` Segher Boessenkool
4 siblings, 0 replies; 1651+ messages in thread
From: Geert Uytterhoeven @ 2007-08-17 6:42 UTC (permalink / raw)
To: Linus Torvalds
Cc: Paul Mackerras, Nick Piggin, Segher Boessenkool, heiko.carstens,
horms, linux-kernel, rpjday, ak, netdev, cfriesen, akpm,
jesper.juhl, linux-arch, zlynx, satyam, clameter, schwidefsky,
Chris Snook, Herbert Xu, davem, wensong, wjiang
On Thu, 16 Aug 2007, Linus Torvalds wrote:
> On Fri, 17 Aug 2007, Paul Mackerras wrote:
> > I'm really surprised it's as much as a few K. I tried it on powerpc
> > and it only saved 40 bytes (10 instructions) for a G5 config.
>
> One of the things that "volatile" generally screws up is a simple
>
> volatile int i;
>
> i++;
>
> which a compiler will generally get horribly, horribly wrong.
>
> In a reasonable world, gcc should just make that be (on x86)
>
> addl $1,i(%rip)
>
> on x86-64, which is indeed what it does without the volatile. But with the
> volatile, the compiler gets really nervous, and doesn't dare do it in one
> instruction, and thus generates crap like
>
> movl i(%rip), %eax
> addl $1, %eax
> movl %eax, i(%rip)
>
> instead. For no good reason, except that "volatile" just doesn't have any
> good/clear semantics for the compiler, so most compilers will just make it
> be "I will not touch this access in any way, shape, or form". Including
> even trivially correct instruction optimization/combination.
Apart from having to fetch more bytes for the instructions (which does
matter), execution time is probably the same on modern processors, as they
convert the single instruction to RISC-style load, modify, store anyway.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:42 ` Linus Torvalds
` (2 preceding siblings ...)
2007-08-17 6:42 ` Geert Uytterhoeven
@ 2007-08-17 8:52 ` Andi Kleen
2007-08-17 10:08 ` Satyam Sharma
2007-08-17 22:29 ` Segher Boessenkool
4 siblings, 1 reply; 1651+ messages in thread
From: Andi Kleen @ 2007-08-17 8:52 UTC (permalink / raw)
To: Linus Torvalds
Cc: Paul Mackerras, Nick Piggin, Segher Boessenkool, heiko.carstens,
horms, linux-kernel, rpjday, netdev, cfriesen, akpm, jesper.juhl,
linux-arch, zlynx, satyam, clameter, schwidefsky, Chris Snook,
Herbert Xu, davem, wensong, wjiang
On Friday 17 August 2007 05:42, Linus Torvalds wrote:
> On Fri, 17 Aug 2007, Paul Mackerras wrote:
> > I'm really surprised it's as much as a few K. I tried it on powerpc
> > and it only saved 40 bytes (10 instructions) for a G5 config.
>
> One of the things that "volatile" generally screws up is a simple
>
> volatile int i;
>
> i++;
But for atomic_t people use atomic_inc() anyways which does this correctly.
It shouldn't really matter for atomic_t.
I'm worrying a bit that the volatile atomic_t change caused subtle code
breakage like these delay read loops people here pointed out.
Wouldn't it be safer to just re-add the volatile to atomic_read()
for 2.6.23? Or alternatively make it asm(), but volatile seems more
proven.
-Andi
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 8:52 ` Andi Kleen
@ 2007-08-17 10:08 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:08 UTC (permalink / raw)
To: Andi Kleen
Cc: Linus Torvalds, Paul Mackerras, Nick Piggin, Segher Boessenkool,
heiko.carstens, horms, Linux Kernel Mailing List, rpjday, netdev,
cfriesen, Andrew Morton, jesper.juhl, linux-arch, zlynx, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
On Fri, 17 Aug 2007, Andi Kleen wrote:
> On Friday 17 August 2007 05:42, Linus Torvalds wrote:
> > On Fri, 17 Aug 2007, Paul Mackerras wrote:
> > > I'm really surprised it's as much as a few K. I tried it on powerpc
> > > and it only saved 40 bytes (10 instructions) for a G5 config.
> >
> > One of the things that "volatile" generally screws up is a simple
> >
> > volatile int i;
> >
> > i++;
>
> But for atomic_t people use atomic_inc() anyways which does this correctly.
> It shouldn't really matter for atomic_t.
>
> I'm worrying a bit that the volatile atomic_t change caused subtle code
> breakage like these delay read loops people here pointed out.
Umm, I followed most of the thread, but which breakage is this?
> Wouldn't it be safer to just re-add the volatile to atomic_read()
> for 2.6.23? Or alternatively make it asm(), but volatile seems more
> proven.
The problem with volatile is not just trashy code generation (which also
definitely is a major problem), but definition holes, and implementation
inconsistencies. Making it asm() is not the only other alternative to
volatile either (read another reply to this mail), but considering most
of the thread has been about people not wanting even a
atomic_read_volatile() variant, making atomic_read() itself have volatile
semantics sounds ... strange :-)
PS: http://lkml.org/lkml/2007/8/15/407 was submitted a couple days back,
any word if you saw that?
I have another one for you:
[PATCH] i386, x86_64: __const_udelay() should not be marked inline
Because it can never get inlined in any callsite (each translation unit
is compiled separately for the kernel and so the implementation of
__const_udelay() would be invisible to all other callsites). In fact it
turns out, the correctness of callsites at arch/x86_64/kernel/crash.c:97
and arch/i386/kernel/crash.c:101 explicitly _depends_ upon it not being
inlined, and also it's an exported symbol (modules may want to call
mdelay() and udelay() that often becomes __const_udelay() after some
macro-ing in various headers). So let's not mark it as "inline" either.
Signed-off-by: Satyam Sharma <satyam@infradead.org>
---
arch/i386/lib/delay.c | 2 +-
arch/x86_64/lib/delay.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/i386/lib/delay.c b/arch/i386/lib/delay.c
index f6edb11..0082c99 100644
--- a/arch/i386/lib/delay.c
+++ b/arch/i386/lib/delay.c
@@ -74,7 +74,7 @@ void __delay(unsigned long loops)
delay_fn(loops);
}
-inline void __const_udelay(unsigned long xloops)
+void __const_udelay(unsigned long xloops)
{
int d0;
diff --git a/arch/x86_64/lib/delay.c b/arch/x86_64/lib/delay.c
index 2dbebd3..d0cd9cd 100644
--- a/arch/x86_64/lib/delay.c
+++ b/arch/x86_64/lib/delay.c
@@ -38,7 +38,7 @@ void __delay(unsigned long loops)
}
EXPORT_SYMBOL(__delay);
-inline void __const_udelay(unsigned long xloops)
+void __const_udelay(unsigned long xloops)
{
__delay(((xloops * HZ * cpu_data[raw_smp_processor_id()].loops_per_jiffy) >> 32) + 1);
}
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:42 ` Linus Torvalds
` (3 preceding siblings ...)
2007-08-17 8:52 ` Andi Kleen
@ 2007-08-17 22:29 ` Segher Boessenkool
4 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:29 UTC (permalink / raw)
To: Linus Torvalds
Cc: Paul Mackerras, heiko.carstens, horms, linux-kernel, rpjday, ak,
netdev, cfriesen, akpm, Nick Piggin, linux-arch, jesper.juhl,
satyam, zlynx, clameter, schwidefsky, Chris Snook, Herbert Xu,
davem, wensong, wjiang
> In a reasonable world, gcc should just make that be (on x86)
>
> addl $1,i(%rip)
>
> on x86-64, which is indeed what it does without the volatile. But with
> the
> volatile, the compiler gets really nervous, and doesn't dare do it in
> one
> instruction, and thus generates crap like
>
> movl i(%rip), %eax
> addl $1, %eax
> movl %eax, i(%rip)
>
> instead. For no good reason, except that "volatile" just doesn't have
> any
> good/clear semantics for the compiler, so most compilers will just
> make it
> be "I will not touch this access in any way, shape, or form". Including
> even trivially correct instruction optimization/combination.
It's just a (target-specific, perhaps) missed-optimisation kind
of bug in GCC. Care to file a bug report?
> but is
> (again) something that gcc doesn't dare do, since "i" is volatile.
Just nobody taught it it can do this; perhaps no one wanted to
add optimisations like that, maybe with a reasoning like "people
who hit the go-slow-in-unspecified-ways button should get what
they deserve" ;-)
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 2:19 ` Nick Piggin
2007-08-17 3:16 ` Paul Mackerras
@ 2007-08-17 17:37 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 17:37 UTC (permalink / raw)
To: Nick Piggin
Cc: heiko.carstens, horms, linux-kernel, rpjday, ak, netdev, cfriesen,
akpm, torvalds, jesper.juhl, linux-arch, zlynx, satyam, clameter,
schwidefsky, Chris Snook, Herbert Xu, davem, wensong, wjiang
>>>>>> Part of the motivation here is to fix heisenbugs. If I knew
>>>>>> where they
>>>>>
>>>>> By the same token we should probably disable optimisations
>>>>> altogether since that too can create heisenbugs.
>>>>
>>>> Almost everything is a tradeoff; and so is this. I don't
>>>> believe most people would find disabling all compiler
>>>> optimisations an acceptable price to pay for some peace
>>>> of mind.
>>>
>>>
>>> So why is this a good tradeoff?
>> It certainly is better than disabling all compiler optimisations!
>
> It's easy to be better than something really stupid :)
Sure, it wasn't me who made the comparison though.
> So i386 and x86-64 don't have volatiles there, and it saves them a
> few K of kernel text.
Which has to be investigated. A few kB is a lot more than expected.
> What you need to justify is why it is a good
> tradeoff to make them volatile (which btw, is much harder to go
> the other way after we let people make those assumptions).
My point is that people *already* made those assumptions. There
are two ways to clean up this mess:
1) Have the "volatile" semantics by default, change the users
that don't need it;
2) Have "non-volatile" semantics by default, change the users
that do need it.
Option 2) randomly breaks stuff all over the place, option 1)
doesn't. Yeah 1) could cause some extremely minor speed or
code size regression, but only temporarily until everything has
been audited.
>>> I also think that just adding things to APIs in the hope it might fix
>>> up some bugs isn't really a good road to go down. Where do you stop?
>> I look at it the other way: keeping the "volatile" semantics in
>> atomic_XXX() (or adding them to it, whatever) helps _prevent_ bugs;
>
> Yeah, but we could add lots of things to help prevent bugs and
> would never be included. I would also contend that it helps _hide_
> bugs and encourages people to be lazy when thinking about these
> things.
Sure. We aren't _adding_ anything here though, not on the platforms
where it is most likely to show up, anyway.
> Also, you dismiss the fact that we'd actually be *adding* volatile
> semantics back to the 2 most widely tested architectures (in terms
> of test time, number of testers, variety of configurations, and
> coverage of driver code).
I'm not dismissing that. x86 however is one of the few architectures
where mistakenly leaving out a "volatile" will not easily show up on
user testing, since the compiler will very often produce a memory
reference anyway because it has no registers to play with.
> This is a very important difference from
> just keeping volatile semantics because it is basically a one-way
> API change.
That's a good point. Maybe we should create _two_ new APIs, one
explicitly going each way.
>> certainly most people expect that behaviour, and also that behaviour
>> is *needed* in some places and no other interface provides that
>> functionality.
>
> I don't know that most people would expect that behaviour.
I didn't conduct any formal poll either :-)
> Is there any documentation anywhere that would suggest this?
Not really I think, no. But not the other way around, either.
Most uses of it seem to expect it though.
>> [some confusion about barriers wrt atomics snipped]
>
> What were you confused about?
Me? Not much.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-14 23:08 ` Satyam Sharma
2007-08-14 23:04 ` Chris Snook
@ 2007-08-14 23:26 ` Paul E. McKenney
2007-08-15 10:35 ` Stefan Richter
2 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-14 23:26 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, torvalds, netdev, Andrew Morton, ak, heiko.carstens,
davem, schwidefsky, wensong, horms, wjiang, cfriesen, zlynx,
rpjday, jesper.juhl, segher
On Wed, Aug 15, 2007 at 04:38:54AM +0530, Satyam Sharma wrote:
>
>
> On Tue, 14 Aug 2007, Christoph Lameter wrote:
>
> > On Thu, 9 Aug 2007, Chris Snook wrote:
> >
> > > This patchset makes the behavior of atomic_read uniform by removing the
> > > volatile keyword from all atomic_t and atomic64_t definitions that currently
> > > have it, and instead explicitly casts the variable as volatile in
> > > atomic_read(). This leaves little room for creative optimization by the
> > > compiler, and is in keeping with the principles behind "volatile considered
> > > harmful".
> >
> > volatile is generally harmful even in atomic_read(). Barriers control
> > visibility and AFAICT things are fine.
>
> Frankly, I don't see the need for this series myself either. Personal
> opinion (others may differ), but I consider "volatile" to be a sad /
> unfortunate wart in C (numerous threads on this list and on the gcc
> lists/bugzilla over the years stand testimony to this) and if we _can_
> steer clear of it, then why not -- why use this ill-defined primitive
> whose implementation has often differed over compiler versions and
> platforms? Granted, barrier() _is_ heavy-handed in that it makes the
> optimizer forget _everything_, but then somebody did post a forget()
> macro on this thread itself ...
>
> [ BTW, why do we want the compiler to not optimize atomic_read()'s in
> the first place? Atomic ops guarantee atomicity, which has nothing
> to do with "volatility" -- users that expect "volatility" from
> atomic ops are the ones who must be fixed instead, IMHO. ]
Interactions between mainline code and interrupt/NMI handlers on the same
CPU (for example, when both are using per-CPU variables). See examples
previously posted in this thread, or look at the rcu_read_lock() and
rcu_read_unlock() implementations in http://lkml.org/lkml/2007/8/7/280.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-14 23:08 ` Satyam Sharma
2007-08-14 23:04 ` Chris Snook
2007-08-14 23:26 ` Paul E. McKenney
@ 2007-08-15 10:35 ` Stefan Richter
2007-08-15 12:04 ` Herbert Xu
` (2 more replies)
2 siblings, 3 replies; 1651+ messages in thread
From: Stefan Richter @ 2007-08-15 10:35 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, torvalds, netdev, Andrew Morton, ak, heiko.carstens,
davem, schwidefsky, wensong, horms, wjiang, cfriesen, zlynx,
rpjday, jesper.juhl, segher, Herbert Xu, Paul E. McKenney
Satyam Sharma wrote:
> [ BTW, why do we want the compiler to not optimize atomic_read()'s in
> the first place? Atomic ops guarantee atomicity, which has nothing
> to do with "volatility" -- users that expect "volatility" from
> atomic ops are the ones who must be fixed instead, IMHO. ]
LDD3 says on page 125: "The following operations are defined for the
type [atomic_t] and are guaranteed to be atomic with respect to all
processors of an SMP computer."
Doesn't "atomic WRT all processors" require volatility?
--
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 10:35 ` Stefan Richter
@ 2007-08-15 12:04 ` Herbert Xu
2007-08-15 12:31 ` Satyam Sharma
2007-08-15 19:59 ` Christoph Lameter
2 siblings, 0 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-15 12:04 UTC (permalink / raw)
To: Stefan Richter
Cc: Satyam Sharma, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Paul E. McKenney
On Wed, Aug 15, 2007 at 12:35:31PM +0200, Stefan Richter wrote:
>
> LDD3 says on page 125: "The following operations are defined for the
> type [atomic_t] and are guaranteed to be atomic with respect to all
> processors of an SMP computer."
>
> Doesn't "atomic WRT all processors" require volatility?
Not at all. We also require this to be atomic without any
hint of volatility.
extern int foo;
foo = 4;
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 10:35 ` Stefan Richter
2007-08-15 12:04 ` Herbert Xu
@ 2007-08-15 12:31 ` Satyam Sharma
2007-08-15 13:08 ` Stefan Richter
` (3 more replies)
2007-08-15 19:59 ` Christoph Lameter
2 siblings, 4 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 12:31 UTC (permalink / raw)
To: Stefan Richter
Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher, Herbert Xu,
Paul E. McKenney
On Wed, 15 Aug 2007, Stefan Richter wrote:
> Satyam Sharma wrote:
> > [ BTW, why do we want the compiler to not optimize atomic_read()'s in
> > the first place? Atomic ops guarantee atomicity, which has nothing
> > to do with "volatility" -- users that expect "volatility" from
> > atomic ops are the ones who must be fixed instead, IMHO. ]
>
> LDD3 says on page 125: "The following operations are defined for the
> type [atomic_t] and are guaranteed to be atomic with respect to all
> processors of an SMP computer."
>
> Doesn't "atomic WRT all processors" require volatility?
No, it definitely doesn't. Why should it?
"Atomic w.r.t. all processors" is just your normal, simple "atomicity"
for SMP systems (ensure that that object is modified / set / replaced
in main memory atomically) and has nothing to do with "volatile"
behaviour.
"Volatile behaviour" itself isn't consistently defined (at least
definitely not consistently implemented in various gcc versions across
platforms), but it is /expected/ to mean something like: "ensure that
every such access actually goes all the way to memory, and is not
re-ordered w.r.t. other accesses, as far as the compiler can take
care of these". The last "as far as compiler can take care" disclaimer
comes about due to CPUs doing their own re-ordering nowadays.
For example (say on i386):
(A)
$ cat tp1.c
int a;
void func(void)
{
a = 10;
a = 20;
}
$ gcc -Os -S tp1.c
$ cat tp1.s
...
movl $20, a
...
(B)
$ cat tp2.c
volatile int a;
void func(void)
{
a = 10;
a = 20;
}
$ gcc -Os -S tp2.c
$ cat tp2.s
...
movl $10, a
movl $20, a
...
(C)
$ cat tp3.c
int a;
void func(void)
{
*(volatile int *)&a = 10;
*(volatile int *)&a = 20;
}
$ gcc -Os -S tp3.c
$ cat tp3.s
...
movl $10, a
movl $20, a
...
In (A) the compiler optimized "a = 10;" away, but the actual store
of the final value "20" to "a" was still "atomic". (B) and (C) also
exhibit "volatile" behaviour apart from the "atomicity".
But as others replied, it seems some callers out there depend upon
atomic ops exhibiting "volatile" behaviour as well, so that answers
my initial question, actually. I haven't looked at the code Paul
pointed me at, but I wonder if that "forget(x)" macro would help
those cases. I'd wish to avoid the "volatile" primitive, personally.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 12:31 ` Satyam Sharma
@ 2007-08-15 13:08 ` Stefan Richter
2007-08-15 13:11 ` Stefan Richter
2007-08-15 13:47 ` Satyam Sharma
2007-08-15 18:31 ` Segher Boessenkool
` (2 subsequent siblings)
3 siblings, 2 replies; 1651+ messages in thread
From: Stefan Richter @ 2007-08-15 13:08 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher, Herbert Xu,
Paul E. McKenney
Satyam Sharma wrote:
> On Wed, 15 Aug 2007, Stefan Richter wrote:
>> Doesn't "atomic WRT all processors" require volatility?
>
> No, it definitely doesn't. Why should it?
>
> "Atomic w.r.t. all processors" is just your normal, simple "atomicity"
> for SMP systems (ensure that that object is modified / set / replaced
> in main memory atomically) and has nothing to do with "volatile"
> behaviour.
>
> "Volatile behaviour" itself isn't consistently defined (at least
> definitely not consistently implemented in various gcc versions across
> platforms), but it is /expected/ to mean something like: "ensure that
> every such access actually goes all the way to memory, and is not
> re-ordered w.r.t. other accesses, as far as the compiler can take
> care of these". The last "as far as compiler can take care" disclaimer
> comes about due to CPUs doing their own re-ordering nowadays.
>
> For example (say on i386):
[...]
> In (A) the compiler optimized "a = 10;" away, but the actual store
> of the final value "20" to "a" was still "atomic". (B) and (C) also
> exhibit "volatile" behaviour apart from the "atomicity".
>
> But as others replied, it seems some callers out there depend upon
> atomic ops exhibiting "volatile" behaviour as well, so that answers
> my initial question, actually. I haven't looked at the code Paul
> pointed me at, but I wonder if that "forget(x)" macro would help
> those cases. I'd wish to avoid the "volatile" primitive, personally.
So, looking at load instead of store, do I understand correctly that in
your opinion
int b;
b = atomic_read(&a);
if (b)
do_something_time_consuming();
b = atomic_read(&a);
if (b)
do_something_more();
should be changed to explicitly forget(&a) after
do_something_time_consuming?
If so, how about the following:
static inline void A(atomic_t *a)
{
int b = atomic_read(a);
if (b)
do_something_time_consuming();
}
static inline void B(atomic_t *a)
{
int b = atomic_read(a);
if (b)
do_something_more();
}
static void C(atomic_t *a)
{
A(a);
B(b);
}
Would this need forget(a) after A(a)?
(Is the latter actually answered in C99 or is it compiler-dependent?)
--
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 13:08 ` Stefan Richter
@ 2007-08-15 13:11 ` Stefan Richter
2007-08-15 13:47 ` Satyam Sharma
1 sibling, 0 replies; 1651+ messages in thread
From: Stefan Richter @ 2007-08-15 13:11 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher, Herbert Xu,
Paul E. McKenney
I wrote:
> static inline void A(atomic_t *a)
> {
> int b = atomic_read(a);
> if (b)
> do_something_time_consuming();
> }
>
> static inline void B(atomic_t *a)
> {
> int b = atomic_read(a);
> if (b)
> do_something_more();
> }
>
> static void C(atomic_t *a)
> {
> A(a);
> B(b);
/* ^ typo */
B(a);
> }
>
> Would this need forget(a) after A(a)?
>
> (Is the latter actually answered in C99 or is it compiler-dependent?)
--
Stefan Richter
-=====-=-=== =--- -====
http://arcgraph.de/sr/
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 13:08 ` Stefan Richter
2007-08-15 13:11 ` Stefan Richter
@ 2007-08-15 13:47 ` Satyam Sharma
2007-08-15 14:25 ` Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 13:47 UTC (permalink / raw)
To: Stefan Richter
Cc: Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher, Herbert Xu,
Paul E. McKenney
On Wed, 15 Aug 2007, Stefan Richter wrote:
> Satyam Sharma wrote:
> > On Wed, 15 Aug 2007, Stefan Richter wrote:
> >> Doesn't "atomic WRT all processors" require volatility?
> >
> > No, it definitely doesn't. Why should it?
> >
> > "Atomic w.r.t. all processors" is just your normal, simple "atomicity"
> > for SMP systems (ensure that that object is modified / set / replaced
> > in main memory atomically) and has nothing to do with "volatile"
> > behaviour.
> >
> > "Volatile behaviour" itself isn't consistently defined (at least
> > definitely not consistently implemented in various gcc versions across
> > platforms), but it is /expected/ to mean something like: "ensure that
> > every such access actually goes all the way to memory, and is not
> re-ordered w.r.t. other accesses, as far as the compiler can take
> > care of these". The last "as far as compiler can take care" disclaimer
> > comes about due to CPUs doing their own re-ordering nowadays.
> >
> > For example (say on i386):
>
> [...]
>
> > In (A) the compiler optimized "a = 10;" away, but the actual store
> > of the final value "20" to "a" was still "atomic". (B) and (C) also
> > exhibit "volatile" behaviour apart from the "atomicity".
> >
> > But as others replied, it seems some callers out there depend upon
> > atomic ops exhibiting "volatile" behaviour as well, so that answers
> > my initial question, actually. I haven't looked at the code Paul
> > pointed me at, but I wonder if that "forget(x)" macro would help
> > those cases. I'd wish to avoid the "volatile" primitive, personally.
>
> So, looking at load instead of store, do I understand correctly that in
> your opinion
>
> int b;
>
> b = atomic_read(&a);
> if (b)
> do_something_time_consuming();
>
> b = atomic_read(&a);
> if (b)
> do_something_more();
>
> should be changed to explicitly forget(&a) after
> do_something_time_consuming?
No, I'd actually prefer something like what Christoph Lameter suggested,
i.e. users (such as above) who want "volatile"-like behaviour from atomic
ops can use alternative functions. How about something like:
#define atomic_read_volatile(v) \
({ \
	forget(&(v)->counter); \
	((v)->counter); \
})
Or possibly, implement these "volatile" atomic ops variants in inline asm
like the patch that Sebastian Siewior has submitted on another thread just
a while back.
Of course, if we find there are more callers in the kernel who want the
volatility behaviour than those who don't care, we can re-define the
existing ops to such variants, and re-name the existing definitions to
something else, say "atomic_read_nonvolatile" for all I care.
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 13:47 ` Satyam Sharma
@ 2007-08-15 14:25 ` Paul E. McKenney
2007-08-15 15:33 ` Herbert Xu
2007-08-15 17:55 ` Satyam Sharma
0 siblings, 2 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 14:25 UTC (permalink / raw)
To: Satyam Sharma
Cc: Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
> On Wed, 15 Aug 2007, Stefan Richter wrote:
> > Satyam Sharma wrote:
> > > On Wed, 15 Aug 2007, Stefan Richter wrote:
> > >> Doesn't "atomic WRT all processors" require volatility?
> > >
> > > No, it definitely doesn't. Why should it?
> > >
> > > "Atomic w.r.t. all processors" is just your normal, simple "atomicity"
> > > for SMP systems (ensure that that object is modified / set / replaced
> > > in main memory atomically) and has nothing to do with "volatile"
> > > behaviour.
> > >
> > > "Volatile behaviour" itself isn't consistently defined (at least
> > > definitely not consistently implemented in various gcc versions across
> > > platforms), but it is /expected/ to mean something like: "ensure that
> > > every such access actually goes all the way to memory, and is not
> > re-ordered w.r.t. other accesses, as far as the compiler can take
> > > care of these". The last "as far as compiler can take care" disclaimer
> > > comes about due to CPUs doing their own re-ordering nowadays.
> > >
> > > For example (say on i386):
> >
> > [...]
> >
> > > In (A) the compiler optimized "a = 10;" away, but the actual store
> > > of the final value "20" to "a" was still "atomic". (B) and (C) also
> > > exhibit "volatile" behaviour apart from the "atomicity".
> > >
> > > But as others replied, it seems some callers out there depend upon
> > > atomic ops exhibiting "volatile" behaviour as well, so that answers
> > > my initial question, actually. I haven't looked at the code Paul
> > > pointed me at, but I wonder if that "forget(x)" macro would help
> > > those cases. I'd wish to avoid the "volatile" primitive, personally.
> >
> > So, looking at load instead of store, do I understand correctly that in
> > your opinion
> >
> > int b;
> >
> > b = atomic_read(&a);
> > if (b)
> > do_something_time_consuming();
> >
> > b = atomic_read(&a);
> > if (b)
> > do_something_more();
> >
> > should be changed to explicitly forget(&a) after
> > do_something_time_consuming?
>
> No, I'd actually prefer something like what Christoph Lameter suggested,
> i.e. users (such as above) who want "volatile"-like behaviour from atomic
> ops can use alternative functions. How about something like:
>
> #define atomic_read_volatile(v) \
> ({ \
> 	forget(&(v)->counter); \
> 	((v)->counter); \
> })
Wouldn't the above "forget" the value, throw it away, then forget
that it forgot it, giving non-volatile semantics?
> Or possibly, implement these "volatile" atomic ops variants in inline asm
> like the patch that Sebastian Siewior has submitted on another thread just
> a while back.
Given that you are advocating a change (please keep in mind that
atomic_read() and atomic_set() had volatile semantics on almost all
platforms), care to give some example where these historical volatile
semantics are causing a problem?
> Of course, if we find there are more callers in the kernel who want the
> volatility behaviour than those who don't care, we can re-define the
> existing ops to such variants, and re-name the existing definitions to
> something else, say "atomic_read_nonvolatile" for all I care.
Do we really need another set of APIs? Can you give even one example
where the pre-existing volatile semantics are causing enough of a problem
to justify adding yet more atomic_*() APIs?
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 14:25 ` Paul E. McKenney
@ 2007-08-15 15:33 ` Herbert Xu
2007-08-15 16:08 ` Paul E. McKenney
2007-08-15 18:19 ` David Howells
2007-08-15 17:55 ` Satyam Sharma
1 sibling, 2 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-15 15:33 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Satyam Sharma, Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
>
> Do we really need another set of APIs? Can you give even one example
> where the pre-existing volatile semantics are causing enough of a problem
> to justify adding yet more atomic_*() APIs?
Let's turn this around. Can you give a single example where
the volatile semantics is needed in a legitimate way?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 15:33 ` Herbert Xu
@ 2007-08-15 16:08 ` Paul E. McKenney
2007-08-15 17:18 ` Satyam Sharma
2007-08-15 18:19 ` David Howells
1 sibling, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 16:08 UTC (permalink / raw)
To: Herbert Xu
Cc: Satyam Sharma, Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
> >
> > Do we really need another set of APIs? Can you give even one example
> > where the pre-existing volatile semantics are causing enough of a problem
> > to justify adding yet more atomic_*() APIs?
>
> Let's turn this around. Can you give a single example where
> the volatile semantics is needed in a legitimate way?
Sorry, but you are the one advocating for the change.
Nice try, though! ;-)
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 16:08 ` Paul E. McKenney
@ 2007-08-15 17:18 ` Satyam Sharma
2007-08-15 17:33 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 17:18 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Herbert Xu, Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
> > On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
> > >
> > > Do we really need another set of APIs? Can you give even one example
> > > where the pre-existing volatile semantics are causing enough of a problem
> > > to justify adding yet more atomic_*() APIs?
> >
> > Let's turn this around. Can you give a single example where
> > the volatile semantics is needed in a legitimate way?
>
> Sorry, but you are the one advocating for the change.
Not for i386 and x86_64 -- those have atomic ops without any "volatile"
semantics (currently as per existing definitions).
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 17:18 ` Satyam Sharma
@ 2007-08-15 17:33 ` Paul E. McKenney
2007-08-15 18:05 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 17:33 UTC (permalink / raw)
To: Satyam Sharma
Cc: Herbert Xu, Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> > On Wed, Aug 15, 2007 at 11:33:36PM +0800, Herbert Xu wrote:
> > > On Wed, Aug 15, 2007 at 07:25:16AM -0700, Paul E. McKenney wrote:
> > > >
> > > > Do we really need another set of APIs? Can you give even one example
> > > > where the pre-existing volatile semantics are causing enough of a problem
> > > > to justify adding yet more atomic_*() APIs?
> > >
> > > Let's turn this around. Can you give a single example where
> > > the volatile semantics is needed in a legitimate way?
> >
> > Sorry, but you are the one advocating for the change.
>
> Not for i386 and x86_64 -- those have atomic ops without any "volatile"
> semantics (currently as per existing definitions).
I claim unit volumes with arm, and the majority of the architectures, but
I cannot deny the popularity of i386 and x86_64 with many developers. ;-)
However, I am not aware of code in the kernel that would benefit
from the compiler coalescing multiple atomic_set() and atomic_read()
invocations, thus I don't see the downside to volatility in this case.
Are there some performance-critical code fragments that I am missing?
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 17:33 ` Paul E. McKenney
@ 2007-08-15 18:05 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 18:05 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Herbert Xu, Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> On Wed, Aug 15, 2007 at 10:48:28PM +0530, Satyam Sharma wrote:
> > [...]
> > Not for i386 and x86_64 -- those have atomic ops without any "volatile"
> > semantics (currently as per existing definitions).
>
> I claim unit volumes with arm, and the majority of the architectures, but
> I cannot deny the popularity of i386 and x86_64 with many developers. ;-)
Hmm, does arm really need that "volatile int counter;"? Hopefully RMK will
take a patch removing that "volatile" ... ;-)
> However, I am not aware of code in the kernel that would benefit
> from the compiler coalescing multiple atomic_set() and atomic_read()
> invocations, thus I don't see the downside to volatility in this case.
> Are there some performance-critical code fragments that I am missing?
I don't know, and yes, code with multiple atomic_set's and atomic_read's
getting optimized or coalesced does sound strange to start with. Anyway,
I'm not against "volatile semantics" per se. As replied elsewhere, I do
appreciate the motivation behind this series (to _avoid_ gotchas, not to
fix existing ones). Just that I'd like to avoid using "volatile", for
aforementioned reasons, especially given that there are perfectly
reasonable alternatives to achieve the same desired behaviour.
Satyam
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 15:33 ` Herbert Xu
2007-08-15 16:08 ` Paul E. McKenney
@ 2007-08-15 18:19 ` David Howells
2007-08-15 18:45 ` Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: David Howells @ 2007-08-15 18:19 UTC (permalink / raw)
To: Herbert Xu
Cc: dhowells, Paul E. McKenney, Satyam Sharma, Stefan Richter,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu <herbert@gondor.apana.org.au> wrote:
> Let's turn this around. Can you give a single example where
> the volatile semantics is needed in a legitimate way?
Accessing H/W registers? But apart from that...
David
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 18:19 ` David Howells
@ 2007-08-15 18:45 ` Paul E. McKenney
2007-08-15 23:41 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 18:45 UTC (permalink / raw)
To: David Howells
Cc: Herbert Xu, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote:
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> > Let's turn this around. Can you give a single example where
> > the volatile semantics is needed in a legitimate way?
>
> Accessing H/W registers? But apart from that...
Communicating between process context and interrupt/NMI handlers using
per-CPU variables.
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 18:45 ` Paul E. McKenney
@ 2007-08-15 23:41 ` Herbert Xu
2007-08-15 23:53 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-15 23:41 UTC (permalink / raw)
To: Paul E. McKenney
Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Wed, Aug 15, 2007 at 11:45:20AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote:
> > Herbert Xu <herbert@gondor.apana.org.au> wrote:
> >
> > > Let's turn this around. Can you give a single example where
> > > the volatile semantics is needed in a legitimate way?
> >
> > Accessing H/W registers? But apart from that...
>
> Communicating between process context and interrupt/NMI handlers using
> per-CPU variables.
Remember we're talking about atomic_read/atomic_set. Please
cite the actual file/function name you have in mind.
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 23:41 ` Herbert Xu
@ 2007-08-15 23:53 ` Paul E. McKenney
2007-08-16 0:12 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 23:53 UTC (permalink / raw)
To: Herbert Xu
Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 07:41:46AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 11:45:20AM -0700, Paul E. McKenney wrote:
> > On Wed, Aug 15, 2007 at 07:19:57PM +0100, David Howells wrote:
> > > Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > >
> > > > Let's turn this around. Can you give a single example where
> > > > the volatile semantics is needed in a legitimate way?
> > >
> > > Accessing H/W registers? But apart from that...
> >
> > Communicating between process context and interrupt/NMI handlers using
> > per-CPU variables.
>
> Remember we're talking about atomic_read/atomic_set. Please
> cite the actual file/function name you have in mind.
Yep, we are indeed talking about atomic_read()/atomic_set().
We have been through this issue already in this thread.
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 23:53 ` Paul E. McKenney
@ 2007-08-16 0:12 ` Herbert Xu
2007-08-16 0:23 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 0:12 UTC (permalink / raw)
To: Paul E. McKenney
Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote:
>
> > > Communicating between process context and interrupt/NMI handlers using
> > > per-CPU variables.
> >
> > Remember we're talking about atomic_read/atomic_set. Please
> > cite the actual file/function name you have in mind.
>
> Yep, we are indeed talking about atomic_read()/atomic_set().
>
> We have been through this issue already in this thread.
Sorry, but I must've missed it. Could you cite the file or
function for my benefit?
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:12 ` Herbert Xu
@ 2007-08-16 0:23 ` Paul E. McKenney
2007-08-16 0:30 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 0:23 UTC (permalink / raw)
To: Herbert Xu
Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote:
> >
> > > > Communicating between process context and interrupt/NMI handlers using
> > > > per-CPU variables.
> > >
> > > Remember we're talking about atomic_read/atomic_set. Please
> > > cite the actual file/function name you have in mind.
> >
> > Yep, we are indeed talking about atomic_read()/atomic_set().
> >
> > We have been through this issue already in this thread.
>
> Sorry, but I must've missed it. Could you cite the file or
> function for my benefit?
I might summarize the thread if there is interest, but I am not able to
do so right this minute.
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:23 ` Paul E. McKenney
@ 2007-08-16 0:30 ` Herbert Xu
2007-08-16 0:49 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 0:30 UTC (permalink / raw)
To: Paul E. McKenney
Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Wed, Aug 15, 2007 at 05:23:10PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote:
> > On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote:
> > >
> > > > > Communicating between process context and interrupt/NMI handlers using
> > > > > per-CPU variables.
> > > >
> > > > Remember we're talking about atomic_read/atomic_set. Please
> > > > cite the actual file/function name you have in mind.
> > >
> > > Yep, we are indeed talking about atomic_read()/atomic_set().
> > >
> > > We have been through this issue already in this thread.
> >
> > Sorry, but I must've missed it. Could you cite the file or
> > function for my benefit?
>
> I might summarize the thread if there is interest, but I am not able to
> do so right this minute.
Thanks. But I don't need a summary of the thread, I'm asking
for an extant code snippet in our kernel that benefits from
the volatile change and is not part of a busy-wait.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:30 ` Herbert Xu
@ 2007-08-16 0:49 ` Paul E. McKenney
2007-08-16 0:53 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 0:49 UTC (permalink / raw)
To: Herbert Xu
Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 05:23:10PM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 16, 2007 at 08:12:48AM +0800, Herbert Xu wrote:
> > > On Wed, Aug 15, 2007 at 04:53:35PM -0700, Paul E. McKenney wrote:
> > > >
> > > > > > Communicating between process context and interrupt/NMI handlers using
> > > > > > per-CPU variables.
> > > > >
> > > > > Remember we're talking about atomic_read/atomic_set. Please
> > > > > cite the actual file/function name you have in mind.
> > > >
> > > > Yep, we are indeed talking about atomic_read()/atomic_set().
> > > >
> > > > We have been through this issue already in this thread.
> > >
> > > Sorry, but I must've missed it. Could you cite the file or
> > > function for my benefit?
> >
> > I might summarize the thread if there is interest, but I am not able to
> > do so right this minute.
>
> Thanks. But I don't need a summary of the thread, I'm asking
> for an extant code snippet in our kernel that benefits from
> the volatile change and is not part of a busy-wait.
Sorry, can't help you there. I really do believe that the information
you need (as opposed to the specific item you are asking for) really
has been put forth in this thread.
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:49 ` Paul E. McKenney
@ 2007-08-16 0:53 ` Herbert Xu
2007-08-16 1:14 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 0:53 UTC (permalink / raw)
To: Paul E. McKenney
Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Wed, Aug 15, 2007 at 05:49:50PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote:
>
> > Thanks. But I don't need a summary of the thread, I'm asking
> > for an extant code snippet in our kernel that benefits from
> > the volatile change and is not part of a busy-wait.
>
> Sorry, can't help you there. I really do believe that the information
> you need (as opposed to the specific item you are asking for) really
> has been put forth in this thread.
That only leads me to believe that such a code snippet simply
does not exist.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:53 ` Herbert Xu
@ 2007-08-16 1:14 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 1:14 UTC (permalink / raw)
To: Herbert Xu
Cc: David Howells, Satyam Sharma, Stefan Richter, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 08:53:16AM +0800, Herbert Xu wrote:
> On Wed, Aug 15, 2007 at 05:49:50PM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 16, 2007 at 08:30:23AM +0800, Herbert Xu wrote:
> >
> > > Thanks. But I don't need a summary of the thread, I'm asking
> > > for an extant code snippet in our kernel that benefits from
> > > the volatile change and is not part of a busy-wait.
> >
> > Sorry, can't help you there. I really do believe that the information
> > you need (as opposed to the specific item you are asking for) really
> > has been put forth in this thread.
>
> That only leads me to believe that such a code snippet simply
> does not exist.
Whatever...
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 14:25 ` Paul E. McKenney
2007-08-15 15:33 ` Herbert Xu
@ 2007-08-15 17:55 ` Satyam Sharma
2007-08-15 19:07 ` Paul E. McKenney
2007-08-15 20:58 ` Segher Boessenkool
1 sibling, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 17:55 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
Hi Paul,
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
> > [...]
> > No, I'd actually prefer something like what Christoph Lameter suggested,
> > i.e. users (such as above) who want "volatile"-like behaviour from atomic
> > ops can use alternative functions. How about something like:
> >
> > #define atomic_read_volatile(v) \
> > ({ \
> > 	forget(&(v)->counter); \
> > 	((v)->counter); \
> > })
>
> Wouldn't the above "forget" the value, throw it away, then forget
> that it forgot it, giving non-volatile semantics?
Nope, I don't think so. I wrote the following trivial testcases:
[ See especially tp4.c and tp4.s (last example). ]
==============================================================================
$ cat tp1.c # Using volatile access casts
#define atomic_read(a) (*(volatile int *)&a)
int a;
void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==============================================================================
$ gcc -Os -S tp1.c; cat tp1.s
func:
pushl %ebp
movl %esp, %ebp
movl $0, a
.L2:
movl a, %eax
testl %eax, %eax
jne .L2
popl %ebp
ret
==============================================================================
$ cat tp2.c # Using nothing; gcc will optimize the whole loop away
#define forget(x)
#define atomic_read(a) \
({ \
forget(&(a)); \
(a); \
})
int a;
void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==============================================================================
$ gcc -Os -S tp2.c; cat tp2.s
func:
pushl %ebp
movl %esp, %ebp
popl %ebp
movl $0, a
ret
==============================================================================
$ cat tp3.c # Using a full memory clobber barrier
#define forget(x) asm volatile ("":::"memory")
#define atomic_read(a) \
({ \
forget(&(a)); \
(a); \
})
int a;
void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==============================================================================
$ gcc -Os -S tp3.c; cat tp3.s
func:
pushl %ebp
movl %esp, %ebp
movl $0, a
.L2:
cmpl $0, a
jne .L2
popl %ebp
ret
==============================================================================
$ cat tp4.c # Using a forget(var) macro
#define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
#define atomic_read(a) \
({ \
forget(a); \
(a); \
})
int a;
void func(void)
{
a = 0;
while (atomic_read(a))
;
}
==============================================================================
$ gcc -Os -S tp4.c; cat tp4.s
func:
pushl %ebp
movl %esp, %ebp
movl $0, a
.L2:
cmpl $0, a
jne .L2
popl %ebp
ret
==============================================================================
Possibly these were too trivial to expose any potential problems that you
may have been referring to, so it would be helpful if you could write a more
concrete example / sample code.
> > Or possibly, implement these "volatile" atomic ops variants in inline asm
> > like the patch that Sebastian Siewior has submitted on another thread just
> > a while back.
>
> Given that you are advocating a change (please keep in mind that
> atomic_read() and atomic_set() had volatile semantics on almost all
> platforms), care to give some example where these historical volatile
> semantics are causing a problem?
> [...]
> Can you give even one example
> where the pre-existing volatile semantics are causing enough of a problem
> to justify adding yet more atomic_*() APIs?
Will take this to the other sub-thread ...
> > Of course, if we find there are more callers in the kernel who want the
> > volatility behaviour than those who don't care, we can re-define the
> > existing ops to such variants, and re-name the existing definitions to
> > something else, say "atomic_read_nonvolatile" for all I care.
>
> Do we really need another set of APIs?
Well, if there's one set of users who do care about volatile behaviour,
and another set that doesn't, it only sounds correct to provide both
those APIs, instead of forcing one behaviour on all users.
Thanks,
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 17:55 ` Satyam Sharma
@ 2007-08-15 19:07 ` Paul E. McKenney
2007-08-15 21:07 ` Segher Boessenkool
2007-08-15 20:58 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-15 19:07 UTC (permalink / raw)
To: Satyam Sharma
Cc: Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, Aug 15, 2007 at 11:25:05PM +0530, Satyam Sharma wrote:
> Hi Paul,
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
>
> > On Wed, Aug 15, 2007 at 07:17:29PM +0530, Satyam Sharma wrote:
> > > [...]
> > > No, I'd actually prefer something like what Christoph Lameter suggested,
> > > i.e. users (such as above) who want "volatile"-like behaviour from atomic
> > > ops can use alternative functions. How about something like:
> > >
> > > #define atomic_read_volatile(v) \
> > > ({ \
> > > forget(&(v)->counter); \
> > > ((v)->counter); \
> > > })
> >
> > Wouldn't the above "forget" the value, throw it away, then forget
> > that it forgot it, giving non-volatile semantics?
>
> Nope, I don't think so. I wrote the following trivial testcases:
> [ See especially tp4.c and tp4.s (last example). ]
Right. I should have said "wouldn't the compiler be within its rights
to forget the value, throw it away, then forget that it forgot it".
The value coming out of the #define above is an unadorned ((v)->counter),
which has no volatile semantics.
> ==============================================================================
> $ cat tp1.c # Using volatile access casts
>
> #define atomic_read(a) (*(volatile int *)&a)
>
> int a;
>
> void func(void)
> {
> a = 0;
> while (atomic_read(a))
> ;
> }
> ==============================================================================
> $ gcc -Os -S tp1.c; cat tp1.s
>
> func:
> pushl %ebp
> movl %esp, %ebp
> movl $0, a
> .L2:
> movl a, %eax
> testl %eax, %eax
> jne .L2
> popl %ebp
> ret
> ==============================================================================
> $ cat tp2.c # Using nothing; gcc will optimize the whole loop away
>
> #define forget(x)
>
> #define atomic_read(a) \
> ({ \
> forget(&(a)); \
> (a); \
> })
>
> int a;
>
> void func(void)
> {
> a = 0;
> while (atomic_read(a))
> ;
> }
> ==============================================================================
> $ gcc -Os -S tp2.c; cat tp2.s
>
> func:
> pushl %ebp
> movl %esp, %ebp
> popl %ebp
> movl $0, a
> ret
> ==============================================================================
> $ cat tp3.c # Using a full memory clobber barrier
>
> #define forget(x) asm volatile ("":::"memory")
>
> #define atomic_read(a) \
> ({ \
> forget(&(a)); \
> (a); \
> })
>
> int a;
>
> void func(void)
> {
> a = 0;
> while (atomic_read(a))
> ;
> }
> ==============================================================================
> $ gcc -Os -S tp3.c; cat tp3.s
>
> func:
> pushl %ebp
> movl %esp, %ebp
> movl $0, a
> .L2:
> cmpl $0, a
> jne .L2
> popl %ebp
> ret
> ==============================================================================
> $ cat tp4.c # Using a forget(var) macro
>
> #define forget(a) __asm__ __volatile__ ("" :"=m" (a) :"m" (a))
>
> #define atomic_read(a) \
> ({ \
> forget(a); \
> (a); \
> })
>
> int a;
>
> void func(void)
> {
> a = 0;
> while (atomic_read(a))
> ;
> }
> ==============================================================================
> $ gcc -Os -S tp4.c; cat tp4.s
>
> func:
> pushl %ebp
> movl %esp, %ebp
> movl $0, a
> .L2:
> cmpl $0, a
> jne .L2
> popl %ebp
> ret
> ==============================================================================
>
> Possibly these were too trivial to expose any potential problems that you
> may have been referring to, so would be helpful if you could write a more
> concrete example / sample code.
The trick is to have a sufficiently complicated expression to force
the compiler to run out of registers. If the value is non-volatile,
it will refetch it (and expect it not to have changed, possibly being
disappointed by an interrupt handler running on that same CPU).
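(A user-space sketch of that scenario — simulate_interrupt() is a made-up
stand-in for an interrupt handler, not kernel code, and the volatile cast
is the tp1.c-style atomic_read from above:)

```c
#include <assert.h>

/* Sketch: the volatile cast forces a fresh load on every call, so a
 * modification made "behind the compiler's back" -- here a separate
 * function standing in for an interrupt handler -- is observed by the
 * next read instead of a stale register copy being reused. */
typedef struct { int counter; } atomic_t;

#define atomic_read(v)  (*(volatile int *)&(v)->counter)

static atomic_t a;

static void simulate_interrupt(void)   /* stand-in for an IRQ handler */
{
        a.counter = 42;
}

int demo(void)
{
        int before, after;

        a.counter = 0;
        before = atomic_read(&a);      /* must load: sees 0 */
        simulate_interrupt();          /* modifies a behind our back */
        after = atomic_read(&a);       /* must load again: sees 42 */
        return after - before;
}
```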
> > > Or possibly, implement these "volatile" atomic ops variants in inline asm
> > > like the patch that Sebastian Siewior has submitted on another thread just
> > > a while back.
> >
> > Given that you are advocating a change (please keep in mind that
> > atomic_read() and atomic_set() had volatile semantics on almost all
> > platforms), care to give some example where these historical volatile
> > semantics are causing a problem?
> > [...]
> > Can you give even one example
> > where the pre-existing volatile semantics are causing enough of a problem
> > to justify adding yet more atomic_*() APIs?
>
> Will take this to the other sub-thread ...
OK.
> > > Of course, if we find there are more callers in the kernel who want the
> > > volatility behaviour than those who don't care, we can re-define the
> > > existing ops to such variants, and re-name the existing definitions to
> > > something else, say "atomic_read_nonvolatile" for all I care.
> >
> > Do we really need another set of APIs?
>
> Well, if there's one set of users who do care about volatile behaviour,
> and another set that doesn't, it only sounds correct to provide both
> those APIs, instead of forcing one behaviour on all users.
Well, if the second set doesn't care, they should be OK with the volatile
behavior in this case.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 19:07 ` Paul E. McKenney
@ 2007-08-15 21:07 ` Segher Boessenkool
0 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-15 21:07 UTC (permalink / raw)
To: paulmck
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Satyam Sharma, Linux Kernel Mailing List, rpjday, netdev, ak,
cfriesen, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
>> Possibly these were too trivial to expose any potential problems that you
>> may have been referring to, so would be helpful if you could write a more
>> concrete example / sample code.
>
> The trick is to have a sufficiently complicated expression to force
> the compiler to run out of registers.
You can use -ffixed-XXX to keep the testcase simple.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 17:55 ` Satyam Sharma
2007-08-15 19:07 ` Paul E. McKenney
@ 2007-08-15 20:58 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-15 20:58 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
>>> Of course, if we find there are more callers in the kernel who want the
>>> volatility behaviour than those who don't care, we can re-define the
>>> existing ops to such variants, and re-name the existing definitions to
>>> something else, say "atomic_read_nonvolatile" for all I care.
>>
>> Do we really need another set of APIs?
>
> Well, if there's one set of users who do care about volatile behaviour,
> and another set that doesn't, it only sounds correct to provide both
> those APIs, instead of forcing one behaviour on all users.
But since there currently is only one such API, and there are
users expecting the stronger behaviour, the only sane thing to
do is let the API provide that behaviour. You can always add
a new API with weaker behaviour later, and move users that are
okay with it over to that new API.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 12:31 ` Satyam Sharma
2007-08-15 13:08 ` Stefan Richter
@ 2007-08-15 18:31 ` Segher Boessenkool
2007-08-15 19:40 ` Satyam Sharma
2007-08-15 23:22 ` Paul Mackerras
2007-08-16 3:37 ` Bill Fink
3 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-15 18:31 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
> "Volatile behaviour" itself isn't consistently defined (at least
> definitely not consistently implemented in various gcc versions across
> platforms),
It should be consistent across platforms; if not, file a bug please.
> but it is /expected/ to mean something like: "ensure that
> every such access actually goes all the way to memory, and is not
> re-ordered w.r.t. to other accesses, as far as the compiler can take
> care of these". The last "as far as compiler can take care" disclaimer
> comes about due to CPUs doing their own re-ordering nowadays.
You can *expect* whatever you want, but this isn't in line with
reality at all.
volatile _does not_ make accesses go all the way to memory.
volatile _does not_ prevent reordering wrt other accesses.
What volatile does are a) never optimise away a read (or write)
to the object, since the data can change in ways the compiler
cannot see; and b) never move stores to the object across a
sequence point. This does not mean other accesses cannot be
reordered wrt the volatile access.
If the abstract machine would do an access to a volatile-
qualified object, the generated machine code will do that
access too. But, for example, it can still be optimised
away by the compiler, if it can prove it is allowed to.
If you want stuff to go all the way to memory, you need some
architecture-specific flush sequence; to make a store globally
visible before another store, you need mb(); before some following
read, you need mb(); to prevent reordering you need a barrier.
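(A hedged illustration of the distinction being drawn: barrier() below is
the kernel's compiler-only barrier, while GCC's portable
__sync_synchronize() builtin stands in for mb(). The ordering guarantee
itself is not observable from a single thread, so the demo only shows the
pattern, not the effect:)

```c
#include <assert.h>

/* barrier() constrains only the compiler; a full memory barrier (here
 * approximated with GCC's __sync_synchronize()) also orders the
 * hardware's view of the stores. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static int flag, data;

int demo(void)
{
        data = 123;
        __sync_synchronize();   /* make the store to data visible first */
        flag = 1;
        barrier();              /* compiler may not reorder across this */
        return flag ? data : 0;
}
```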
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 18:31 ` Segher Boessenkool
@ 2007-08-15 19:40 ` Satyam Sharma
2007-08-15 20:42 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-15 19:40 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
On Wed, 15 Aug 2007, Segher Boessenkool wrote:
> > "Volatile behaviour" itself isn't consistently defined (at least
> > definitely not consistently implemented in various gcc versions across
> > platforms),
>
> It should be consistent across platforms; if not, file a bug please.
>
> > but it is /expected/ to mean something like: "ensure that
> > every such access actually goes all the way to memory, and is not
> > re-ordered w.r.t. to other accesses, as far as the compiler can take
^
(volatile)
(or, alternatively, "other accesses to the same volatile object" ...)
> > care of these". The last "as far as compiler can take care" disclaimer
> > comes about due to CPUs doing their own re-ordering nowadays.
>
> You can *expect* whatever you want, but this isn't in line with
> reality at all.
>
> volatile _does not_ prevent reordering wrt other accesses.
> [...]
> What volatile does are a) never optimise away a read (or write)
> to the object, since the data can change in ways the compiler
> cannot see; and b) never move stores to the object across a
> sequence point. This does not mean other accesses cannot be
> reordered wrt the volatile access.
>
> If the abstract machine would do an access to a volatile-
> qualified object, the generated machine code will do that
> access too. But, for example, it can still be optimised
> away by the compiler, if it can prove it is allowed to.
As (now) indicated above, I had meant multiple volatile accesses to
the same object, obviously.
BTW:
#define atomic_read(a) (*(volatile int *)&(a))
#define atomic_set(a,i) (*(volatile int *)&(a) = (i))
int a;
void func(void)
{
int b;
b = atomic_read(a);
atomic_set(a, 20);
b = atomic_read(a);
}
gives:
func:
pushl %ebp
movl a, %eax
movl %esp, %ebp
movl $20, a
movl a, %eax
popl %ebp
ret
so the first atomic_read() wasn't optimized away.
> volatile _does not_ make accesses go all the way to memory.
> [...]
> If you want stuff to go all the way to memory, you need some
> architecture-specific flush sequence; to make a store globally
> visible before another store, you need mb(); before some following
> read, you need mb(); to prevent reordering you need a barrier.
Sure, which explains the "as far as the compiler can take care" bit.
Poor phrase / choice of words, probably.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 19:40 ` Satyam Sharma
@ 2007-08-15 20:42 ` Segher Boessenkool
2007-08-16 1:23 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-15 20:42 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
>> What volatile does are a) never optimise away a read (or write)
>> to the object, since the data can change in ways the compiler
>> cannot see; and b) never move stores to the object across a
>> sequence point. This does not mean other accesses cannot be
>> reordered wrt the volatile access.
>>
>> If the abstract machine would do an access to a volatile-
>> qualified object, the generated machine code will do that
>> access too. But, for example, it can still be optimised
>> away by the compiler, if it can prove it is allowed to.
>
> As (now) indicated above, I had meant multiple volatile accesses to
> the same object, obviously.
Yes, accesses to volatile objects are never reordered with
respect to each other.
> BTW:
>
> #define atomic_read(a) (*(volatile int *)&(a))
> #define atomic_set(a,i) (*(volatile int *)&(a) = (i))
>
> int a;
>
> void func(void)
> {
> int b;
>
> b = atomic_read(a);
> atomic_set(a, 20);
> b = atomic_read(a);
> }
>
> gives:
>
> func:
> pushl %ebp
> movl a, %eax
> movl %esp, %ebp
> movl $20, a
> movl a, %eax
> popl %ebp
> ret
>
> so the first atomic_read() wasn't optimized away.
Of course. It is executed by the abstract machine, so
it will be executed by the actual machine. On the other
hand, try
b = 0;
if (b)
b = atomic_read(a);
or similar.
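(A runnable variant of that suggestion — the counting helper is made up
for illustration. At run time the branch is never taken, so the volatile
read never executes; with optimisation on, the compiler may elide it from
the generated code entirely:)

```c
#include <assert.h>

/* Even a volatile access may be removed when the compiler can prove the
 * code containing it is unreachable. */
static int reads;       /* counts how often the "volatile" read runs */

static int a = 5;

static int my_atomic_read(int *p)
{
        reads++;
        return *(volatile int *)p;
}

int demo(void)
{
        int b = 0;

        if (b)
                b = my_atomic_read(&a);   /* provably dead code */
        return reads;
}
```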
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 20:42 ` Segher Boessenkool
@ 2007-08-16 1:23 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 1:23 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Linux Kernel Mailing List, Paul E. McKenney, netdev, ak, cfriesen,
rpjday, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, davem, Linus Torvalds,
wensong, wjiang
On Wed, 15 Aug 2007, Segher Boessenkool wrote:
> [...]
> > BTW:
> >
> > #define atomic_read(a) (*(volatile int *)&(a))
> > #define atomic_set(a,i) (*(volatile int *)&(a) = (i))
> >
> > int a;
> >
> > void func(void)
> > {
> > int b;
> >
> > b = atomic_read(a);
> > atomic_set(a, 20);
> > b = atomic_read(a);
> > }
> >
> > gives:
> >
> > func:
> > pushl %ebp
> > movl a, %eax
> > movl %esp, %ebp
> > movl $20, a
> > movl a, %eax
> > popl %ebp
> > ret
> >
> > so the first atomic_read() wasn't optimized away.
>
> Of course. It is executed by the abstract machine, so
> it will be executed by the actual machine. On the other
> hand, try
>
> b = 0;
> if (b)
> b = atomic_read(a);
>
> or similar.
Yup, obviously. Volatile accesses (or any access to volatile objects),
or even "__volatile__ asms" (which gcc normally promises never to elide)
can always be optimized away in cases such as these, where the compiler
can trivially determine that the code in question is not reachable.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 12:31 ` Satyam Sharma
2007-08-15 13:08 ` Stefan Richter
2007-08-15 18:31 ` Segher Boessenkool
@ 2007-08-15 23:22 ` Paul Mackerras
2007-08-16 0:26 ` Christoph Lameter
2007-08-24 12:50 ` Denys Vlasenko
2007-08-16 3:37 ` Bill Fink
3 siblings, 2 replies; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-15 23:22 UTC (permalink / raw)
To: Satyam Sharma
Cc: Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
Satyam Sharma writes:
> > Doesn't "atomic WRT all processors" require volatility?
>
> No, it definitely doesn't. Why should it?
>
> "Atomic w.r.t. all processors" is just your normal, simple "atomicity"
> for SMP systems (ensure that that object is modified / set / replaced
> in main memory atomically) and has nothing to do with "volatile"
> behaviour.
Atomic variables are "volatile" in the sense that they are liable to
be changed at any time by mechanisms that are outside the knowledge of
the C compiler, namely, other CPUs, or this CPU executing an interrupt
routine.
In the kernel we use atomic variables in precisely those situations
where a variable is potentially accessed concurrently by multiple
CPUs, and where each CPU needs to see updates done by other CPUs in a
timely fashion. That is what they are for. Therefore the compiler
must not cache values of atomic variables in registers; each
atomic_read must result in a load and each atomic_set must result in a
store. Anything else will just lead to subtle bugs.
I have no strong opinion about whether or not the best way to achieve
this is through the use of the "volatile" C keyword. Segher's idea of
using asm instead seems like a good one to me.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 23:22 ` Paul Mackerras
@ 2007-08-16 0:26 ` Christoph Lameter
2007-08-16 0:34 ` Paul Mackerras
2007-08-16 0:39 ` Paul E. McKenney
2007-08-24 12:50 ` Denys Vlasenko
1 sibling, 2 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 0:26 UTC (permalink / raw)
To: Paul Mackerras
Cc: Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
On Thu, 16 Aug 2007, Paul Mackerras wrote:
> In the kernel we use atomic variables in precisely those situations
> where a variable is potentially accessed concurrently by multiple
> CPUs, and where each CPU needs to see updates done by other CPUs in a
> timely fashion. That is what they are for. Therefore the compiler
> must not cache values of atomic variables in registers; each
> atomic_read must result in a load and each atomic_set must result in a
> store. Anything else will just lead to subtle bugs.
This may have been the intent. However, today the visibility is controlled
using barriers. And we have barriers that we use with atomic operations.
Having volatile be the default just leads to confusion. Atomic read should
just read with no extras. Extras can be added by using variants like
atomic_read_volatile or so.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:26 ` Christoph Lameter
@ 2007-08-16 0:34 ` Paul Mackerras
2007-08-16 0:40 ` Christoph Lameter
2007-08-16 0:39 ` Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 0:34 UTC (permalink / raw)
To: Christoph Lameter
Cc: Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
Christoph Lameter writes:
> On Thu, 16 Aug 2007, Paul Mackerras wrote:
>
> > In the kernel we use atomic variables in precisely those situations
> > where a variable is potentially accessed concurrently by multiple
> > CPUs, and where each CPU needs to see updates done by other CPUs in a
> > timely fashion. That is what they are for. Therefore the compiler
> > must not cache values of atomic variables in registers; each
> > atomic_read must result in a load and each atomic_set must result in a
> > store. Anything else will just lead to subtle bugs.
>
> This may have been the intent. However, today the visibility is controlled
> using barriers. And we have barriers that we use with atomic operations.
Those barriers are for when we need ordering between atomic variables
and other memory locations. An atomic variable by itself doesn't and
shouldn't need any barriers for other CPUs to be able to see what's
happening to it.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:34 ` Paul Mackerras
@ 2007-08-16 0:40 ` Christoph Lameter
0 siblings, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 0:40 UTC (permalink / raw)
To: Paul Mackerras
Cc: Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
On Thu, 16 Aug 2007, Paul Mackerras wrote:
> Those barriers are for when we need ordering between atomic variables
> and other memory locations. An atomic variable by itself doesn't and
> shouldn't need any barriers for other CPUs to be able to see what's
> happening to it.
It does not need any barriers. As soon as one cpu acquires the
cacheline for write it will be invalidated in the caches of the others. So
the other cpu will have to refetch. No need for volatile.
The issue here may be that the compiler has fetched the atomic variable
earlier and put it into a register. However, that prefetching is limited
because it cannot cross functions calls etc. The only problem could be
loops where the compiler does not refetch the variable since it assumes
that it does not change and there are no function calls in the body of the
loop. But AFAIK these loops need cpu_relax and other measures anyways to
avoid bad effects from busy waiting.
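(A user-space simulation of that busy-wait pattern — the cpu_relax() here
is a stand-in, not the kernel's: a countdown plays the role of another CPU
eventually storing to the flag, and the compiler barrier inside it is what
forces the refetch in the loop:)

```c
#include <assert.h>

static int flag;
static int countdown = 3;

static void cpu_relax(void)             /* simulation, not the real one */
{
        __asm__ __volatile__("" ::: "memory");  /* forces refetch */
        if (--countdown == 0)
                flag = 1;               /* "other CPU" sets the flag */
}

int spin_until_set(void)
{
        int spins = 0;

        while (!flag) {                 /* reloaded each iteration */
                cpu_relax();
                spins++;
        }
        return spins;
}
```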
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:26 ` Christoph Lameter
2007-08-16 0:34 ` Paul Mackerras
@ 2007-08-16 0:39 ` Paul E. McKenney
2007-08-16 0:42 ` Christoph Lameter
1 sibling, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 0:39 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, Aug 15, 2007 at 05:26:34PM -0700, Christoph Lameter wrote:
> On Thu, 16 Aug 2007, Paul Mackerras wrote:
>
> > In the kernel we use atomic variables in precisely those situations
> > where a variable is potentially accessed concurrently by multiple
> > CPUs, and where each CPU needs to see updates done by other CPUs in a
> > timely fashion. That is what they are for. Therefore the compiler
> > must not cache values of atomic variables in registers; each
> > atomic_read must result in a load and each atomic_set must result in a
> > store. Anything else will just lead to subtle bugs.
>
> This may have been the intent. However, today the visibility is controlled
> using barriers. And we have barriers that we use with atomic operations.
> Having volatile be the default just leads to confusion. Atomic read should
> just read with no extras. Extras can be added by using variants like
> atomic_read_volatile or so.
Seems to me that we face greater chance of confusion without the
volatile than with, particularly as compiler optimizations become
more aggressive. Yes, we could simply disable optimization, but
optimization can be quite helpful.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:39 ` Paul E. McKenney
@ 2007-08-16 0:42 ` Christoph Lameter
2007-08-16 0:53 ` Paul E. McKenney
2007-08-16 1:51 ` Paul Mackerras
0 siblings, 2 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 0:42 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> Seems to me that we face greater chance of confusion without the
> volatile than with, particularly as compiler optimizations become
> more aggressive. Yes, we could simply disable optimization, but
> optimization can be quite helpful.
A volatile default would disable optimizations for atomic_read.
atomic_read without volatile would allow for full optimization by the
compiler. Seems that this is what one wants in many cases.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:42 ` Christoph Lameter
@ 2007-08-16 0:53 ` Paul E. McKenney
2007-08-16 0:59 ` Christoph Lameter
2007-08-16 1:51 ` Paul Mackerras
1 sibling, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 0:53 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, Aug 15, 2007 at 05:42:07PM -0700, Christoph Lameter wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
>
> > Seems to me that we face greater chance of confusion without the
> > volatile than with, particularly as compiler optimizations become
> > more aggressive. Yes, we could simply disable optimization, but
> > optimization can be quite helpful.
>
> A volatile default would disable optimizations for atomic_read.
> atomic_read without volatile would allow for full optimization by the
> compiler. Seems that this is what one wants in many cases.
The volatile cast should not disable all that many optimizations,
for example, it is much less hurtful than barrier(). Furthermore,
the main optimizations disabled (pulling atomic_read() and atomic_set()
out of loops) really do need to be disabled.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:53 ` Paul E. McKenney
@ 2007-08-16 0:59 ` Christoph Lameter
2007-08-16 1:14 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 0:59 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> The volatile cast should not disable all that many optimizations,
> for example, it is much less hurtful than barrier(). Furthermore,
> the main optimizations disabled (pulling atomic_read() and atomic_set()
> out of loops) really do need to be disabled.
In many cases you do not need a barrier. Having volatile there *will*
impact optimization because the compiler cannot use a register that may
contain the value that was fetched earlier. And the compiler cannot choose
freely when to fetch the value. The order of memory accesses is fixed if
you use volatile. If the variable is not volatile then the compiler can
arrange memory accesses any way it sees fit and thus generate better code.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:59 ` Christoph Lameter
@ 2007-08-16 1:14 ` Paul E. McKenney
2007-08-16 1:41 ` Christoph Lameter
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 1:14 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, Aug 15, 2007 at 05:59:41PM -0700, Christoph Lameter wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
>
> > The volatile cast should not disable all that many optimizations,
> > for example, it is much less hurtful than barrier(). Furthermore,
> > the main optimizations disabled (pulling atomic_read() and atomic_set()
> > out of loops) really do need to be disabled.
>
> In many cases you do not need a barrier. Having volatile there *will*
> impact optimization because the compiler cannot use a register that may
> contain the value that was fetched earlier. And the compiler cannot choose
> freely when to fetch the value. The order of memory accesses is fixed if
> you use volatile. If the variable is not volatile then the compiler can
> arrange memory accesses any way it sees fit and thus generate better code.
Understood. My point is not that the impact is precisely zero, but
rather that the impact on optimization is much less hurtful than the
problems that could arise otherwise, particularly as compilers become
more aggressive in their optimizations.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 1:14 ` Paul E. McKenney
@ 2007-08-16 1:41 ` Christoph Lameter
2007-08-16 2:15 ` Satyam Sharma
2007-08-16 2:32 ` Paul E. McKenney
0 siblings, 2 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 1:41 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, 15 Aug 2007, Paul E. McKenney wrote:
> Understood. My point is not that the impact is precisely zero, but
> rather that the impact on optimization is much less hurtful than the
> problems that could arise otherwise, particularly as compilers become
> more aggressive in their optimizations.
The problems arise because barriers are not used as required. Volatile
has wishy-washy semantics and somehow marries memory barriers with data
access. It is clearer to separate the two. Conceptual cleanliness usually
translates into better code. If one really wants the volatile then let's
make it explicit and use
atomic_read_volatile()
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 1:41 ` Christoph Lameter
@ 2007-08-16 2:15 ` Satyam Sharma
2007-08-16 2:08 ` Herbert Xu
2007-08-16 2:32 ` Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 2:15 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul E. McKenney, Paul Mackerras, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, 15 Aug 2007, Christoph Lameter wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
>
> > Understood. My point is not that the impact is precisely zero, but
> > rather that the impact on optimization is much less hurtful than the
> > problems that could arise otherwise, particularly as compilers become
> > more aggressive in their optimizations.
>
> The problems arise because barriers are not used as required. Volatile
> has wishy-washy semantics and somehow marries memory barriers with data
> access. It is clearer to separate the two. Conceptual cleanliness usually
> translates into better code. If one really wants the volatile then let's
> make it explicit and use
>
> atomic_read_volatile()
Completely agreed, again. To summarize again (I had done so ~100 mails
earlier in this thread too :-) ...
atomic_{read,set}_volatile() -- guarantees volatility along with
atomicity (the two _are_ different concepts after all, irrespective of
whether callsites normally want one with the other or not)
atomic_{read,set}_nonvolatile() -- only guarantees atomicity; the
compiler is free to elide / coalesce / optimize such accesses, can keep
the object in question cached in a local register, leads to smaller
text, etc.
As to which one should be the default atomic_read() is a question of
whether the majority of callsites (with more weight given to important /
hot codepaths, less to obscure ones) want a particular behaviour.
Do we have a consensus here? (hoping against hope, probably :-)
[ This thread has gotten completely out of hand ... for my mail client
alpine as well, it now seems. Reminds of that 1000+ GPLv3 fest :-) ]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:15 ` Satyam Sharma
@ 2007-08-16 2:08 ` Herbert Xu
2007-08-16 2:18 ` Christoph Lameter
2007-08-16 2:18 ` Chris Friesen
0 siblings, 2 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 2:08 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Paul E. McKenney, Paul Mackerras,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 07:45:44AM +0530, Satyam Sharma wrote:
>
> Completely agreed, again. To summarize again (I had done so ~100 mails
> earlier in this thread too :-) ...
>
> atomic_{read,set}_volatile() -- guarantees volatility along with
> atomicity (the two _are_ different concepts after all, irrespective of
> whether callsites normally want one with the other or not)
>
> atomic_{read,set}_nonvolatile() -- only guarantees atomicity; the
> compiler is free to elide / coalesce / optimize such accesses, can keep
> the object in question cached in a local register, leads to smaller
> text, etc.
>
> As to which one should be the default atomic_read() is a question of
> whether the majority of callsites (with more weight given to important /
> hot codepaths, less to obscure ones) want a particular behaviour.
>
> Do we have a consensus here? (hoping against hope, probably :-)
I can certainly agree with this.
But I have to say that I still don't know of a single place
where one would actually use the volatile variant.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:08 ` Herbert Xu
@ 2007-08-16 2:18 ` Christoph Lameter
2007-08-16 3:23 ` Paul Mackerras
2007-08-16 2:18 ` Chris Friesen
1 sibling, 1 reply; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 2:18 UTC (permalink / raw)
To: Herbert Xu
Cc: Satyam Sharma, Paul E. McKenney, Paul Mackerras, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, 16 Aug 2007, Herbert Xu wrote:
> > Do we have a consensus here? (hoping against hope, probably :-)
>
> I can certainly agree with this.
I agree too.
> But I have to say that I still don't know of a single place
> where one would actually use the volatile variant.
I suspect that what you say is true after we have looked at all callers.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:18 ` Christoph Lameter
@ 2007-08-16 3:23 ` Paul Mackerras
2007-08-16 3:33 ` Herbert Xu
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 3:23 UTC (permalink / raw)
To: Christoph Lameter
Cc: Herbert Xu, Satyam Sharma, Paul E. McKenney, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
Christoph Lameter writes:
> > But I have to say that I still don't know of a single place
> > where one would actually use the volatile variant.
>
> I suspect that what you say is true after we have looked at all callers.
It seems that there could be a lot of places where atomic_t is used in
a non-atomic fashion, and that those uses are either buggy, or there
is some lock held at the time which guarantees that other CPUs aren't
changing the value. In both cases there is no point in using
atomic_t; we might as well just use an ordinary int.
In particular, atomic_read seems to lend itself to buggy uses. People
seem to do things like:
atomic_add(something, &v);
if (atomic_read(&v) > something_else) ...
and expect that there is some relationship between the value that the
atomic_add stored and the value that the atomic_read will return,
which there isn't. People seem to think that using atomic_t magically
gets rid of races. It doesn't.
I'd go so far as to say that anywhere where you want a non-"volatile"
atomic_read, either your code is buggy, or else an int would work just
as well.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:23 ` Paul Mackerras
@ 2007-08-16 3:33 ` Herbert Xu
2007-08-16 3:48 ` Paul Mackerras
2007-08-16 18:48 ` Christoph Lameter
2007-08-16 19:44 ` Segher Boessenkool
2 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 3:33 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 01:23:06PM +1000, Paul Mackerras wrote:
>
> In particular, atomic_read seems to lend itself to buggy uses. People
> seem to do things like:
>
> atomic_add(something, &v);
> if (atomic_read(&v) > something_else) ...
If you're referring to the code in sk_stream_mem_schedule
then it's working as intended. The atomicity guarantees
that the atomic_add/atomic_sub won't be seen in parts by
other readers.
We certainly do not need to see other atomic_add/atomic_sub
operations immediately.
If you're referring to another code snippet please cite.
> I'd go so far as to say that anywhere where you want a non-"volatile"
> atomic_read, either your code is buggy, or else an int would work just
> as well.
An int won't work here because += and -= do not have the
atomicity guarantees that atomic_add/atomic_sub do. In
particular, this may cause an atomic_read on another CPU
to give a bogus reading.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:33 ` Herbert Xu
@ 2007-08-16 3:48 ` Paul Mackerras
2007-08-16 4:03 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 3:48 UTC (permalink / raw)
To: Herbert Xu
Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> If you're referring to the code in sk_stream_mem_schedule
> then it's working as intended. The atomicity guarantees
You mean it's intended that *sk->sk_prot->memory_pressure can end up
as 1 when sk->sk_prot->memory_allocated is small (less than
->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
than ->sysctl_mem[2])? Because that's the effect of the current code.
If so I wonder why you bother computing it.
> that the atomic_add/atomic_sub won't be seen in parts by
> other readers.
>
> We certainly do not need to see other atomic_add/atomic_sub
> operations immediately.
>
> If you're referring to another code snippet please cite.
>
> > I'd go so far as to say that anywhere where you want a non-"volatile"
> > atomic_read, either your code is buggy, or else an int would work just
> > as well.
>
> An int won't work here because += and -= do not have the
> atomicity guarantees that atomic_add/atomic_sub do. In
> particular, this may cause an atomic_read on another CPU
> to give a bogus reading.
The point is that guaranteeing the atomicity of the increment or
decrement does not suffice to make the code race-free. In this case
the race arises from the fact that reading ->memory_allocated and
setting *->memory_pressure are separate operations. To make that code
work properly you need a lock. And once you have the lock an ordinary
int would suffice for ->memory_allocated.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:48 ` Paul Mackerras
@ 2007-08-16 4:03 ` Herbert Xu
2007-08-16 4:34 ` Paul Mackerras
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 4:03 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 01:48:32PM +1000, Paul Mackerras wrote:
> Herbert Xu writes:
>
> > If you're referring to the code in sk_stream_mem_schedule
> > then it's working as intended. The atomicity guarantees
>
> You mean it's intended that *sk->sk_prot->memory_pressure can end up
> as 1 when sk->sk_prot->memory_allocated is small (less than
> ->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
> than ->sysctl_mem[2])? Because that's the effect of the current code.
> If so I wonder why you bother computing it.
You need to remember that there are three different limits:
minimum, pressure, and maximum. By default we should never
be in a situation where what you say can occur.
If you set all three limits to the same thing, then yes it
won't work as intended but it's still well-behaved.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 4:03 ` Herbert Xu
@ 2007-08-16 4:34 ` Paul Mackerras
2007-08-16 5:37 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 4:34 UTC (permalink / raw)
To: Herbert Xu
Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> > You mean it's intended that *sk->sk_prot->memory_pressure can end up
> > as 1 when sk->sk_prot->memory_allocated is small (less than
> > ->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
> > than ->sysctl_mem[2])? Because that's the effect of the current code.
> > If so I wonder why you bother computing it.
>
> You need to remember that there are three different limits:
> minimum, pressure, and maximum. By default we should never
> be in a situation where what you say can occur.
>
> If you set all three limits to the same thing, then yes it
> won't work as intended but it's still well-behaved.
I'm not talking about setting all three limits to the same thing.
I'm talking about this situation:
CPU 0 comes into __sk_stream_mem_reclaim, reads memory_allocated, but
then before it can do the store to *memory_pressure, CPUs 1-1023 all
go through sk_stream_mem_schedule, collectively increase
memory_allocated to more than sysctl_mem[2] and set *memory_pressure.
Finally CPU 0 gets to do its store and it sets *memory_pressure back
to 0, but by this stage memory_allocated is way larger than
sysctl_mem[2].
Yes, it's unlikely, but that is the nature of race conditions - they
are unlikely, and only show up at inconvenient times, never when
someone who could fix the bug is watching. :)
Similarly it would be possible for other CPUs to decrease
memory_allocated from greater than sysctl_mem[2] to less than
sysctl_mem[0] in the interval between when we read memory_allocated
and set *memory_pressure to 1. And it's quite possible for their
setting of *memory_pressure to 0 to happen before our setting of it to
1, so that it ends up at 1 when it should be 0.
Now, maybe it's the case that it doesn't really matter whether
*->memory_pressure is 0 or 1. But if so, why bother computing it at
all?
People seem to think that using atomic_t means they don't need to use
a spinlock. That's fine if there is only one variable involved, but
as soon as there's more than one, there's the possibility of a race,
whether or not you use atomic_t, and whether or not atomic_read has
"volatile" behaviour.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 4:34 ` Paul Mackerras
@ 2007-08-16 5:37 ` Herbert Xu
2007-08-16 6:00 ` Paul Mackerras
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 5:37 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 02:34:25PM +1000, Paul Mackerras wrote:
>
> I'm talking about this situation:
>
> CPU 0 comes into __sk_stream_mem_reclaim, reads memory_allocated, but
> then before it can do the store to *memory_pressure, CPUs 1-1023 all
> go through sk_stream_mem_schedule, collectively increase
> memory_allocated to more than sysctl_mem[2] and set *memory_pressure.
> Finally CPU 0 gets to do its store and it sets *memory_pressure back
> to 0, but by this stage memory_allocated is way larger than
> sysctl_mem[2].
It doesn't matter. The memory pressure flag is an *advisory*
flag. If we get it wrong the worst that'll happen is that we'd
waste some time doing work that'll be thrown away.
Please look at the places where it's used before jumping to
conclusions.
> Now, maybe it's the case that it doesn't really matter whether
> *->memory_pressure is 0 or 1. But if so, why bother computing it at
> all?
As long as we get it right most of the time (and I think you
would agree that we do get it right most of the time), then
this flag has achieved its purpose.
> People seem to think that using atomic_t means they don't need to use
> a spinlock. That's fine if there is only one variable involved, but
> as soon as there's more than one, there's the possibility of a race,
> whether or not you use atomic_t, and whether or not atomic_read has
> "volatile" behaviour.
In any case, this actually illustrates why the addition of
volatile is completely pointless. Even if this code was
broken, which it definitely is not, having the volatile
there wouldn't have helped at all.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 5:37 ` Herbert Xu
@ 2007-08-16 6:00 ` Paul Mackerras
2007-08-16 18:50 ` Christoph Lameter
0 siblings, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 6:00 UTC (permalink / raw)
To: Herbert Xu
Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> It doesn't matter. The memory pressure flag is an *advisory*
> flag. If we get it wrong the worst that'll happen is that we'd
> waste some time doing work that'll be thrown away.
Ah, so it's the "racy but I don't care because it's only an
optimization" case. That's fine. Somehow I find it hard to believe
that all the racy uses of atomic_read in the kernel are like that,
though. :)
> In any case, this actually illustrates why the addition of
> volatile is completely pointless. Even if this code was
> broken, which it definitely is not, having the volatile
> there wouldn't have helped at all.
Yes, adding volatile to racy code doesn't somehow make it race-free.
Neither does using atomic_t, despite what some seem to believe.
I have actually started going through all the uses of atomic_read in
the kernel. So far out of the first 100 I have found none where we
have two atomic_reads of the same variable and the compiler could
usefully use the value from the first as the result of the second.
But there's still > 2500 to go...
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 6:00 ` Paul Mackerras
@ 2007-08-16 18:50 ` Christoph Lameter
0 siblings, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 18:50 UTC (permalink / raw)
To: Paul Mackerras
Cc: Herbert Xu, Satyam Sharma, Paul E. McKenney, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, 16 Aug 2007, Paul Mackerras wrote:
> Herbert Xu writes:
>
> > It doesn't matter. The memory pressure flag is an *advisory*
> > flag. If we get it wrong the worst that'll happen is that we'd
> > waste some time doing work that'll be thrown away.
>
> Ah, so it's the "racy but I don't care because it's only an
> optimization" case. That's fine. Somehow I find it hard to believe
> that all the racy uses of atomic_read in the kernel are like that,
> though. :)
My use of atomic_read in SLUB is like that. Volatile does not magically
sync up reads.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:23 ` Paul Mackerras
2007-08-16 3:33 ` Herbert Xu
@ 2007-08-16 18:48 ` Christoph Lameter
2007-08-16 19:44 ` Segher Boessenkool
2 siblings, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 18:48 UTC (permalink / raw)
To: Paul Mackerras
Cc: Herbert Xu, Satyam Sharma, Paul E. McKenney, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, 16 Aug 2007, Paul Mackerras wrote:
>
> It seems that there could be a lot of places where atomic_t is used in
> a non-atomic fashion, and that those uses are either buggy, or there
> is some lock held at the time which guarantees that other CPUs aren't
> changing the value. In both cases there is no point in using
> atomic_t; we might as well just use an ordinary int.
The point of atomic_t is to do atomic *changes* to the variable.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:23 ` Paul Mackerras
2007-08-16 3:33 ` Herbert Xu
2007-08-16 18:48 ` Christoph Lameter
@ 2007-08-16 19:44 ` Segher Boessenkool
2 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 19:44 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Satyam Sharma, Linux Kernel Mailing List, Paul E. McKenney,
netdev, ak, cfriesen, rpjday, jesper.juhl, linux-arch,
Andrew Morton, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
> I'd go so far as to say that anywhere where you want a non-"volatile"
> atomic_read, either your code is buggy, or else an int would work just
> as well.
Indeed, the only way to implement a "non-volatile" atomic_read() is
essentially as a plain int (you can do some tricks so you cannot
assign to the result and stuff like that, but that's not the issue
here).
So if that were the behaviour we wanted, we could just get rid of the
whole atomic_read() thing, so no one can misuse it anymore.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:08 ` Herbert Xu
2007-08-16 2:18 ` Christoph Lameter
@ 2007-08-16 2:18 ` Chris Friesen
1 sibling, 0 replies; 1651+ messages in thread
From: Chris Friesen @ 2007-08-16 2:18 UTC (permalink / raw)
To: Herbert Xu
Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
Paul Mackerras, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, zlynx, rpjday, jesper.juhl, segher
Herbert Xu wrote:
> But I have to say that I still don't know of a single place
> where one would actually use the volatile variant.
Given that many of the existing users do currently have "volatile", are
you comfortable simply removing that behaviour from them? Are you sure
that you will not introduce any issues?
Forcing a re-read is only a performance penalty. Removing it can cause
behavioural changes.
I would be more comfortable making the default match the majority of the
current implementations (i.e., volatile semantics). Then, if someone
cares about performance they can explicitly validate the call path and
convert it over to the non-volatile version.
Correctness before speed...
Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 1:41 ` Christoph Lameter
2007-08-16 2:15 ` Satyam Sharma
@ 2007-08-16 2:32 ` Paul E. McKenney
1 sibling, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 2:32 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
On Wed, Aug 15, 2007 at 06:41:40PM -0700, Christoph Lameter wrote:
> On Wed, 15 Aug 2007, Paul E. McKenney wrote:
>
> > Understood. My point is not that the impact is precisely zero, but
> > rather that the impact on optimization is much less hurtful than the
> > problems that could arise otherwise, particularly as compilers become
> > more aggressive in their optimizations.
>
> The problems arise because barriers are not used as required. Volatile
> has wishy-washy semantics and somehow marries memory barriers with data
> access. It is clearer to separate the two. Conceptual cleanliness usually
> translates into better code. If one really wants the volatile then let's
> make it explicit and use
>
> atomic_read_volatile()
There are indeed architectures where you can cause gcc to emit memory
barriers in response to volatile. I am assuming that we are -not-
making gcc do this. Given this, then volatiles and memory barrier
instructions are orthogonal -- one controls the compiler, the other
controls the CPU.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 0:42 ` Christoph Lameter
2007-08-16 0:53 ` Paul E. McKenney
@ 2007-08-16 1:51 ` Paul Mackerras
2007-08-16 2:00 ` Herbert Xu
2007-08-16 2:07 ` Segher Boessenkool
1 sibling, 2 replies; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 1:51 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul E. McKenney, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu
Christoph Lameter writes:
> A volatile default would disable optimizations for atomic_read.
> atomic_read without volatile would allow for full optimization by the
> compiler. Seems that this is what one wants in many cases.
Name one such case.
An atomic_read should do a load from memory. If the programmer puts
an atomic_read() in the code then the compiler should emit a load for
it, not re-use a value returned by a previous atomic_read. I do not
believe it would ever be useful for the compiler to collapse two
atomic_read statements into a single load.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 1:51 ` Paul Mackerras
@ 2007-08-16 2:00 ` Herbert Xu
2007-08-16 2:05 ` Paul Mackerras
2007-08-16 2:07 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 2:00 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 11:51:42AM +1000, Paul Mackerras wrote:
>
> Name one such case.
See sk_stream_mem_schedule in net/core/stream.c:
	/* Under limit. */
	if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
		if (*sk->sk_prot->memory_pressure)
			*sk->sk_prot->memory_pressure = 0;
		return 1;
	}

	/* Over hard limit. */
	if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
		sk->sk_prot->enter_memory_pressure();
		goto suppress_allocation;
	}
We don't need to reload sk->sk_prot->memory_allocated here.
Now where is your example again?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:00 ` Herbert Xu
@ 2007-08-16 2:05 ` Paul Mackerras
2007-08-16 2:11 ` Herbert Xu
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 2:05 UTC (permalink / raw)
To: Herbert Xu
Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> See sk_stream_mem_schedule in net/core/stream.c:
>
> 	/* Under limit. */
> 	if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> 		if (*sk->sk_prot->memory_pressure)
> 			*sk->sk_prot->memory_pressure = 0;
> 		return 1;
> 	}
>
> 	/* Over hard limit. */
> 	if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> 		sk->sk_prot->enter_memory_pressure();
> 		goto suppress_allocation;
> 	}
>
> We don't need to reload sk->sk_prot->memory_allocated here.
Are you sure? How do you know some other CPU hasn't changed the value
in between?
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:05 ` Paul Mackerras
@ 2007-08-16 2:11 ` Herbert Xu
2007-08-16 2:35 ` Paul E. McKenney
2007-08-16 3:15 ` Paul Mackerras
2007-08-16 2:15 ` Christoph Lameter
2007-08-16 2:33 ` Satyam Sharma
2 siblings, 2 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 2:11 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 12:05:56PM +1000, Paul Mackerras wrote:
> Herbert Xu writes:
>
> > See sk_stream_mem_schedule in net/core/stream.c:
> >
> > /* Under limit. */
> > if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > if (*sk->sk_prot->memory_pressure)
> > *sk->sk_prot->memory_pressure = 0;
> > return 1;
> > }
> >
> > /* Over hard limit. */
> > if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> > sk->sk_prot->enter_memory_pressure();
> > goto suppress_allocation;
> > }
> >
> > We don't need to reload sk->sk_prot->memory_allocated here.
>
> Are you sure? How do you know some other CPU hasn't changed the value
> in between?
Yes I'm sure, because we don't care if others have increased
the reservation.
Note that even if we did we'd be using barriers so volatile
won't do us any good here.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:11 ` Herbert Xu
@ 2007-08-16 2:35 ` Paul E. McKenney
2007-08-16 3:15 ` Paul Mackerras
1 sibling, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 2:35 UTC (permalink / raw)
To: Herbert Xu
Cc: Paul Mackerras, Christoph Lameter, Satyam Sharma, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 10:11:05AM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 12:05:56PM +1000, Paul Mackerras wrote:
> > Herbert Xu writes:
> >
> > > See sk_stream_mem_schedule in net/core/stream.c:
> > >
> > > /* Under limit. */
> > > if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > > if (*sk->sk_prot->memory_pressure)
> > > *sk->sk_prot->memory_pressure = 0;
> > > return 1;
> > > }
> > >
> > > /* Over hard limit. */
> > > if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> > > sk->sk_prot->enter_memory_pressure();
> > > goto suppress_allocation;
> > > }
> > >
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> >
> > Are you sure? How do you know some other CPU hasn't changed the value
> > in between?
>
> Yes I'm sure, because we don't care if others have increased
> the reservation.
>
> Note that even if we did we'd be using barriers so volatile
> won't do us any good here.
If the load-coalescing is important to performance, why not load into
a local variable?
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:11 ` Herbert Xu
2007-08-16 2:35 ` Paul E. McKenney
@ 2007-08-16 3:15 ` Paul Mackerras
2007-08-16 3:43 ` Herbert Xu
1 sibling, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 3:15 UTC (permalink / raw)
To: Herbert Xu
Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> > Are you sure? How do you know some other CPU hasn't changed the value
> > in between?
>
> Yes I'm sure, because we don't care if others have increased
> the reservation.
But others can also reduce the reservation. Also, the code sets and
clears *sk->sk_prot->memory_pressure nonatomically with respect to the
reads of sk->sk_prot->memory_allocated, so in fact the code doesn't
guarantee any particular relationship between the two.
That code looks like a beautiful example of buggy, racy code where
someone has sprinkled magic fix-the-races dust (otherwise known as
atomic_t) around in a vain attempt to fix the races.
That's assuming that all that stuff actually serves any useful
purpose, of course, and that there isn't some lock held by the
callers. In the latter case it is pointless to use atomic_t.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:15 ` Paul Mackerras
@ 2007-08-16 3:43 ` Herbert Xu
0 siblings, 0 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 3:43 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, Paul E. McKenney, Satyam Sharma,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 01:15:05PM +1000, Paul Mackerras wrote:
>
> But others can also reduce the reservation. Also, the code sets and
> clears *sk->sk_prot->memory_pressure nonatomically with respect to the
> reads of sk->sk_prot->memory_allocated, so in fact the code doesn't
> guarantee any particular relationship between the two.
Yes others can reduce the reservation, but the point of this
is that the code doesn't care. We'll either see the value
before or after the reduction and in either case we'll do
something sensible.
The worst that can happen is when we're just below the hard
limit and multiple CPUs fail to allocate but that's not really
a problem because if the machine is making progress at all
then we will eventually scale back and allow these allocations
to succeed.
As to the non-atomic operation on memory_pressure, that's OK
because we only ever assign values to it and never do other
operations such as += or -=. Remember that int/long assignments
must be atomic or Linux won't run on your architecture.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:05 ` Paul Mackerras
2007-08-16 2:11 ` Herbert Xu
@ 2007-08-16 2:15 ` Christoph Lameter
2007-08-16 2:17 ` Christoph Lameter
2007-08-16 2:33 ` Satyam Sharma
2 siblings, 1 reply; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 2:15 UTC (permalink / raw)
To: Paul Mackerras
Cc: Herbert Xu, Paul E. McKenney, Satyam Sharma, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, 16 Aug 2007, Paul Mackerras wrote:
> > We don't need to reload sk->sk_prot->memory_allocated here.
>
> Are you sure? How do you know some other CPU hasn't changed the value
> in between?
The cpu knows because the cacheline was not invalidated.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:15 ` Christoph Lameter
@ 2007-08-16 2:17 ` Christoph Lameter
0 siblings, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 2:17 UTC (permalink / raw)
To: Paul Mackerras
Cc: Herbert Xu, Paul E. McKenney, Satyam Sharma, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Wed, 15 Aug 2007, Christoph Lameter wrote:
> On Thu, 16 Aug 2007, Paul Mackerras wrote:
>
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> >
> > Are you sure? How do you know some other CPU hasn't changed the value
> > in between?
>
> The cpu knows because the cacheline was not invalidated.
Crap, my statement above is wrong... We do not care that the
value was changed; otherwise we would have put a barrier in there.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:05 ` Paul Mackerras
2007-08-16 2:11 ` Herbert Xu
2007-08-16 2:15 ` Christoph Lameter
@ 2007-08-16 2:33 ` Satyam Sharma
2007-08-16 3:01 ` Satyam Sharma
2007-08-16 3:05 ` Paul Mackerras
2 siblings, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 2:33 UTC (permalink / raw)
To: Paul Mackerras
Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, 16 Aug 2007, Paul Mackerras wrote:
> Herbert Xu writes:
>
> > See sk_stream_mem_schedule in net/core/stream.c:
> >
> > /* Under limit. */
> > if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > if (*sk->sk_prot->memory_pressure)
> > *sk->sk_prot->memory_pressure = 0;
> > return 1;
> > }
> >
> > /* Over hard limit. */
> > if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> > sk->sk_prot->enter_memory_pressure();
> > goto suppress_allocation;
> > }
> >
> > We don't need to reload sk->sk_prot->memory_allocated here.
>
> Are you sure? How do you know some other CPU hasn't changed the value
> in between?
I can't speak for this particular case, but there could be similar code
examples elsewhere, where we do the atomic ops on an atomic_t object
inside a higher-level locking scheme that would take care of the kind of
problem you're referring to here. It would be useful for such or similar
code if the compiler kept the value of that atomic object in a register.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:33 ` Satyam Sharma
@ 2007-08-16 3:01 ` Satyam Sharma
2007-08-16 4:11 ` Paul Mackerras
2007-08-16 3:05 ` Paul Mackerras
1 sibling, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 3:01 UTC (permalink / raw)
To: Paul Mackerras
Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, 16 Aug 2007, Satyam Sharma wrote:
> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> > Herbert Xu writes:
> >
> > > See sk_stream_mem_schedule in net/core/stream.c:
> > >
> > > /* Under limit. */
> > > if (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0]) {
> > > if (*sk->sk_prot->memory_pressure)
> > > *sk->sk_prot->memory_pressure = 0;
> > > return 1;
> > > }
> > >
> > > /* Over hard limit. */
> > > if (atomic_read(sk->sk_prot->memory_allocated) > sk->sk_prot->sysctl_mem[2]) {
> > > sk->sk_prot->enter_memory_pressure();
> > > goto suppress_allocation;
> > > }
> > >
> > > We don't need to reload sk->sk_prot->memory_allocated here.
> >
> > Are you sure? How do you know some other CPU hasn't changed the value
> > in between?
>
> I can't speak for this particular case, but there could be similar code
> examples elsewhere, where we do the atomic ops on an atomic_t object
> inside a higher-level locking scheme that would take care of the kind of
> problem you're referring to here. It would be useful for such or similar
> code if the compiler kept the value of that atomic object in a register.
We might not be using atomic_t (and ops) if we already have a higher-level
locking scheme, actually. So as Herbert mentioned, such cases might just
not care. [ Too much of this thread, too little sleep, sorry! ]
Anyway, the problem, of course, is that this conversion to a stronger /
safer-by-default behaviour doesn't happen at zero cost to performance.
Converting atomic ops to "volatile" behaviour did add ~2K to kernel text
for archs such as i386 (possibly in important codepaths) that didn't have
those semantics already, so it would be constructive to actually look at
those differences and see if there were really any heisenbugs that got
rectified, or if there were legitimate optimizations that got wrongly
disabled. The onus lies on those proposing the modifications, I'd say ;-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:01 ` Satyam Sharma
@ 2007-08-16 4:11 ` Paul Mackerras
2007-08-16 5:39 ` Herbert Xu
2007-08-16 18:54 ` Christoph Lameter
0 siblings, 2 replies; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 4:11 UTC (permalink / raw)
To: Satyam Sharma
Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
Satyam Sharma writes:
> Anyway, the problem, of course, is that this conversion to a stronger /
> safer-by-default behaviour doesn't happen with zero cost to performance.
> Converting atomic ops to "volatile" behaviour did add ~2K to kernel text
> for archs such as i386 (possibly to important codepaths) that didn't have
> those semantics already so it would be constructive to actually look at
> those differences and see if there were really any heisenbugs that got
> rectified. Or if there were legitimate optimizations that got wrongly
> disabled. Onus lies on those proposing the modifications, I'd say ;-)
The uses of atomic_read where one might want it to allow caching of
the result seem to me to fall into 3 categories:
1. Places that are buggy because of a race arising from the way it's
used.
2. Places where there is a race but it doesn't matter because we're
doing some clever trick.
3. Places where there is some locking in place that eliminates any
potential race.
In case 1, adding volatile won't solve the race, of course, but it's
hard to argue that we shouldn't do something because it will slow down
buggy code. Case 2 is hopefully pretty rare and accompanied by large
comment blocks, and in those cases caching the result of atomic_read
explicitly in a local variable would probably make the code clearer.
And in case 3 there is no reason to use atomic_t at all; we might as
well just use an int.
So I don't see any good reason to make the atomic API more complex by
having "volatile" and "non-volatile" versions of atomic_read. It
should just have the "volatile" behaviour.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 4:11 ` Paul Mackerras
@ 2007-08-16 5:39 ` Herbert Xu
2007-08-16 6:56 ` Paul Mackerras
2007-08-16 18:54 ` Christoph Lameter
1 sibling, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 5:39 UTC (permalink / raw)
To: Paul Mackerras
Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 02:11:43PM +1000, Paul Mackerras wrote:
>
> The uses of atomic_read where one might want it to allow caching of
> the result seem to me to fall into 3 categories:
>
> 1. Places that are buggy because of a race arising from the way it's
> used.
>
> 2. Places where there is a race but it doesn't matter because we're
> doing some clever trick.
>
> 3. Places where there is some locking in place that eliminates any
> potential race.
Agreed.
> In case 1, adding volatile won't solve the race, of course, but it's
> hard to argue that we shouldn't do something because it will slow down
> buggy code. Case 2 is hopefully pretty rare and accompanied by large
> comment blocks, and in those cases caching the result of atomic_read
> explicitly in a local variable would probably make the code clearer.
> And in case 3 there is no reason to use atomic_t at all; we might as
> well just use an int.
Since adding volatile doesn't help any of the 3 cases, and
takes away optimisations from both 2 and 3, I wonder what
is the point of the addition after all?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 5:39 ` Herbert Xu
@ 2007-08-16 6:56 ` Paul Mackerras
2007-08-16 7:09 ` Herbert Xu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 6:56 UTC (permalink / raw)
To: Herbert Xu
Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> On Thu, Aug 16, 2007 at 02:11:43PM +1000, Paul Mackerras wrote:
> >
> > The uses of atomic_read where one might want it to allow caching of
> > the result seem to me to fall into 3 categories:
> >
> > 1. Places that are buggy because of a race arising from the way it's
> > used.
> >
> > 2. Places where there is a race but it doesn't matter because we're
> > doing some clever trick.
> >
> > 3. Places where there is some locking in place that eliminates any
> > potential race.
>
> Agreed.
>
> > In case 1, adding volatile won't solve the race, of course, but it's
> > hard to argue that we shouldn't do something because it will slow down
> > buggy code. Case 2 is hopefully pretty rare and accompanied by large
> > comment blocks, and in those cases caching the result of atomic_read
> > explicitly in a local variable would probably make the code clearer.
> > And in case 3 there is no reason to use atomic_t at all; we might as
> > well just use an int.
>
> Since adding volatile doesn't help any of the 3 cases, and
> takes away optimisations from both 2 and 3, I wonder what
> is the point of the addition after all?
Note that I said these are the cases _where one might want to allow
caching_, so of course adding volatile doesn't help _these_ cases.
There are of course other cases where one definitely doesn't want to
allow the compiler to cache the value, such as when polling an atomic
variable waiting for another CPU to change it, and from my inspection
so far these cases seem to be the majority.
The reasons for having "volatile" behaviour of atomic_read (whether or
not that is achieved by use of the "volatile" C keyword) are
- It matches the normal expectation based on the name "atomic_read"
- It matches the behaviour of the other atomic_* primitives
- It avoids bugs in the cases where "volatile" behaviour is required
To my mind these outweigh the small benefit for some code of the
non-volatile (caching-allowed) behaviour. In fact it's pretty minor
either way, and since x86[-64] has this behaviour, one can expect the
potential bugs in generic code to have mostly been found, although
perhaps not all of them since x86[-64] has less aggressive reordering
of memory accesses and fewer registers in which to cache things than
some other architectures.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 6:56 ` Paul Mackerras
@ 2007-08-16 7:09 ` Herbert Xu
2007-08-16 8:06 ` Stefan Richter
2007-08-16 14:48 ` Ilpo Järvinen
0 siblings, 2 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 7:09 UTC (permalink / raw)
To: Paul Mackerras
Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
Stefan Richter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 04:56:21PM +1000, Paul Mackerras wrote:
>
> Note that I said these are the cases _where one might want to allow
> caching_, so of course adding volatile doesn't help _these_ cases.
> There are of course other cases where one definitely doesn't want to
> allow the compiler to cache the value, such as when polling an atomic
> variable waiting for another CPU to change it, and from my inspection
> so far these cases seem to be the majority.
We've been through that already. If it's a busy-wait it
should use cpu_relax. If it's scheduling away that already
forces the compiler to reread anyway.
Do you have an actual example where volatile is needed?
> - It matches the normal expectation based on the name "atomic_read"
> - It matches the behaviour of the other atomic_* primitives
Can't argue since you left out what those expectations
or properties are.
> - It avoids bugs in the cases where "volatile" behaviour is required
Do you (or anyone else for that matter) have an example of this?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 7:09 ` Herbert Xu
@ 2007-08-16 8:06 ` Stefan Richter
2007-08-16 8:10 ` Herbert Xu
2007-08-16 14:48 ` Ilpo Järvinen
1 sibling, 1 reply; 1651+ messages in thread
From: Stefan Richter @ 2007-08-16 8:06 UTC (permalink / raw)
To: Herbert Xu
Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 04:56:21PM +1000, Paul Mackerras wrote:
>>
>> Note that I said these are the cases _where one might want to allow
>> caching_, so of course adding volatile doesn't help _these_ cases.
>> There are of course other cases where one definitely doesn't want to
>> allow the compiler to cache the value, such as when polling an atomic
>> variable waiting for another CPU to change it, and from my inspection
>> so far these cases seem to be the majority.
>
> We've been through that already. If it's a busy-wait it
> should use cpu_relax. If it's scheduling away that already
> forces the compiler to reread anyway.
>
> Do you have an actual example where volatile is needed?
>
>> - It matches the normal expectation based on the name "atomic_read"
>> - It matches the behaviour of the other atomic_* primitives
>
> Can't argue since you left out what those expectations
> or properties are.
We use atomic_t for data that is concurrently locklessly written and
read at arbitrary times. My naive expectation as driver author (driver
maintainer) is that all atomic_t accessors, including atomic_read, (and
atomic bitops) work with the then current value of the atomic data.
>> - It avoids bugs in the cases where "volatile" behaviour is required
>
> Do you (or anyone else for that matter) have an example of this?
The only code I somewhat know, the ieee1394 subsystem, was perhaps
authored and is currently maintained with the expectation that each
occurrence of atomic_read actually results in a load operation, i.e. is
not optimized away. This means all atomic_t (bus generation, packet and
buffer refcounts, and some other state variables)* and likewise all
atomic bitops in that subsystem.
If that assumption is wrong, then what is the API or language primitive
to force a load operation to occur?
*) Interesting what a quick LXR session in search for all atomic_t
usages in 'my' subsystem brings to light. I now noticed an apparently
unused struct member in the bitrotting pcilynx driver, and more
importantly, a pairing of two atomic_t variables in raw1394 that should
be audited for race conditions and for possible replacement by plain int.
--
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 8:06 ` Stefan Richter
@ 2007-08-16 8:10 ` Herbert Xu
2007-08-16 9:54 ` Stefan Richter
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 8:10 UTC (permalink / raw)
To: Stefan Richter
Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
> >
> > Do you (or anyone else for that matter) have an example of this?
>
> The only code I somewhat know, the ieee1394 subsystem, was perhaps
> authored and is currently maintained with the expectation that each
> occurrence of atomic_read actually results in a load operation, i.e. is
> not optimized away. This means all atomic_t (bus generation, packet and
> buffer refcounts, and some other state variables)* and likewise all
> atomic bitops in that subsystem.
Can you find an actual atomic_read code snippet there that is
broken without the volatile modifier?
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 8:10 ` Herbert Xu
@ 2007-08-16 9:54 ` Stefan Richter
2007-08-16 10:31 ` Stefan Richter
2007-08-16 10:35 ` Herbert Xu
2007-08-16 19:48 ` Chris Snook
2007-08-17 5:09 ` Paul Mackerras
2 siblings, 2 replies; 1651+ messages in thread
From: Stefan Richter @ 2007-08-16 9:54 UTC (permalink / raw)
To: Herbert Xu
Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
>> >
>> > Do you (or anyone else for that matter) have an example of this?
>>
>> The only code I somewhat know, the ieee1394 subsystem, was perhaps
>> authored and is currently maintained with the expectation that each
>> occurrence of atomic_read actually results in a load operation, i.e. is
>> not optimized away. This means all atomic_t (bus generation, packet and
>> buffer refcounts, and some other state variables)* and likewise all
>> atomic bitops in that subsystem.
>
> Can you find an actual atomic_read code snippet there that is
> broken without the volatile modifier?
What do I have to look for? atomic_read after another read or write
access to the same variable, in the same function scope? Or in the sum
of scopes of functions that could be merged by function inlining?
One example was discussed here earlier: the for (;;) loop in
nodemgr_host_thread. There, an msleep_interruptible implicitly acted as a
barrier (at the moment because it's in a different translation unit; if
it were in the same one, then because it hopefully has its own barriers).
So that happens to work, although such an implicit barrier is bad style:
better to enforce the desired behaviour (== a guaranteed load operation)
*explicitly*.
--
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 9:54 ` Stefan Richter
@ 2007-08-16 10:31 ` Stefan Richter
2007-08-16 10:42 ` Herbert Xu
2007-08-16 10:35 ` Herbert Xu
1 sibling, 1 reply; 1651+ messages in thread
From: Stefan Richter @ 2007-08-16 10:31 UTC (permalink / raw)
To: Herbert Xu
Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
I wrote:
> Herbert Xu wrote:
>> On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
[...]
>>> expectation that each
>>> occurrence of atomic_read actually results in a load operation, i.e. is
>>> not optimized away.
[...]
>> Can you find an actual atomic_read code snippet there that is
>> broken without the volatile modifier?
PS: Just to clarify, I'm not speaking for the volatile modifier. I'm
not speaking for any particular implementation of atomic_t and its
accessors at all. All I am saying is that
- we use atomically accessed data types because we concurrently but
locklessly access this data,
- hence a read access to this data that could be optimized away
makes *no sense at all*.
The only sensible read accessor to an atomic datatype is a read accessor
that will not be optimized away.
So, the architecture guys can implement atomic_read however they want
--- as long as it cannot be optimized away.*
PPS: If somebody has code where he can afford to let the compiler
coalesce atomic_read with a previous access to the same data, i.e.
doesn't need and doesn't want all guarantees that the atomic_read API
makes (or IMO should make), then he can replace the atomic_read by a
local temporary variable.
*) Exceptions:
if (known_to_be_false)
read_access(a);
and the like.
--
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 10:31 ` Stefan Richter
@ 2007-08-16 10:42 ` Herbert Xu
2007-08-16 16:34 ` Paul E. McKenney
2007-08-17 5:04 ` Paul Mackerras
0 siblings, 2 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 10:42 UTC (permalink / raw)
To: Stefan Richter
Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 12:31:03PM +0200, Stefan Richter wrote:
>
> PS: Just to clarify, I'm not speaking for the volatile modifier. I'm
> not speaking for any particular implementation of atomic_t and its
> accessors at all. All I am saying is that
> - we use atomically accessed data types because we concurrently but
> locklessly access this data,
> - hence a read access to this data that could be optimized away
> makes *no sense at all*.
No sane compiler can optimise away an atomic_read per se.
That's only possible if there's a preceding atomic_set or
atomic_read, with no barriers in the middle.
If that's the case, then one has to conclude that doing
away with the second read is acceptable, as otherwise
a memory (or at least a compiler) barrier should have been
used.
In fact, volatile doesn't guarantee that the memory gets
read anyway. You might be reading some stale value out
of the cache. Granted this doesn't happen on x86 but
when you're coding for the kernel you can't make such
assumptions.
So the point here is that if you don't mind getting a stale
value from the CPU cache when doing an atomic_read, then
surely you won't mind getting a stale value from the compiler
"cache".
> So, the architecture guys can implement atomic_read however they want
> --- as long as it cannot be optimized away.*
They can implement it however they want as long as it stays
atomic.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 10:42 ` Herbert Xu
@ 2007-08-16 16:34 ` Paul E. McKenney
2007-08-16 23:59 ` Herbert Xu
2007-08-17 3:15 ` Nick Piggin
2007-08-17 5:04 ` Paul Mackerras
1 sibling, 2 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 16:34 UTC (permalink / raw)
To: Herbert Xu
Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 06:42:50PM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 12:31:03PM +0200, Stefan Richter wrote:
> >
> > PS: Just to clarify, I'm not speaking for the volatile modifier. I'm
> > not speaking for any particular implementation of atomic_t and its
> > accessors at all. All I am saying is that
> > - we use atomically accessed data types because we concurrently but
> > locklessly access this data,
> > - hence a read access to this data that could be optimized away
> > makes *no sense at all*.
>
> No sane compiler can optimise away an atomic_read per se.
> That's only possible if there's a preceding atomic_set or
> atomic_read, with no barriers in the middle.
>
> If that's the case, then one has to conclude that doing
> away with the second read is acceptable, as otherwise
> a memory (or at least a compiler) barrier should have been
> used.
The compiler can also reorder non-volatile accesses. For an example
patch that cares about this, please see:
http://lkml.org/lkml/2007/8/7/280
This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
rcu_read_unlock() to ensure that accesses aren't reordered with respect
to interrupt handlers and NMIs/SMIs running on that same CPU.
> In fact, volatile doesn't guarantee that the memory gets
> read anyway. You might be reading some stale value out
> of the cache. Granted this doesn't happen on x86 but
> when you're coding for the kernel you can't make such
> assumptions.
>
> So the point here is that if you don't mind getting a stale
> value from the CPU cache when doing an atomic_read, then
> surely you won't mind getting a stale value from the compiler
> "cache".
Absolutely disagree. An interrupt/NMI/SMI handler running on the CPU
will see the same value (whether in cache or in store buffer) that
the mainline code will see. In this case, we don't care about CPU
misordering, only about compiler misordering. It is easy to see
other uses that combine communication with handlers on the current
CPU with communication among CPUs -- again, see prior messages in
this thread.
> > So, the architecture guys can implement atomic_read however they want
> > --- as long as it cannot be optimized away.*
>
> They can implement it however they want as long as it stays
> atomic.
Precisely. And volatility is a key property of "atomic". Let's please
not throw it away.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 16:34 ` Paul E. McKenney
@ 2007-08-16 23:59 ` Herbert Xu
2007-08-17 1:01 ` Paul E. McKenney
2007-08-17 3:15 ` Nick Piggin
1 sibling, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 23:59 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
>
> The compiler can also reorder non-volatile accesses. For an example
> patch that cares about this, please see:
>
> http://lkml.org/lkml/2007/8/7/280
>
> This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> rcu_read_unlock() to ensure that accesses aren't reordered with respect
> to interrupt handlers and NMIs/SMIs running on that same CPU.
Good, finally we have some code to discuss (even though it's
not actually in the kernel yet).
First of all, I think this illustrates that what you want
here has nothing to do with atomic ops. The ORDERED_WRT_IRQ
macro occurs a lot more times in your patch than atomic
reads/sets. So *assuming* that it was necessary at all,
then having an ordered variant of the atomic_read/atomic_set
ops could do just as well.
However, I still don't know which atomic_read/atomic_set in
your patch would be broken if there were no volatile. Could
you please point them out?
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 23:59 ` Herbert Xu
@ 2007-08-17 1:01 ` Paul E. McKenney
2007-08-17 7:39 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 1:01 UTC (permalink / raw)
To: Herbert Xu
Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
> >
> > The compiler can also reorder non-volatile accesses. For an example
> > patch that cares about this, please see:
> >
> > http://lkml.org/lkml/2007/8/7/280
> >
> > This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> > rcu_read_unlock() to ensure that accesses aren't reordered with respect
> > to interrupt handlers and NMIs/SMIs running on that same CPU.
>
> Good, finally we have some code to discuss (even though it's
> not actually in the kernel yet).
There was some earlier in this thread as well.
> First of all, I think this illustrates that what you want
> here has nothing to do with atomic ops. The ORDERED_WRT_IRQ
> macro occurs a lot more times in your patch than atomic
> reads/sets. So *assuming* that it was necessary at all,
> then having an ordered variant of the atomic_read/atomic_set
> ops could do just as well.
Indeed. If I could trust atomic_read()/atomic_set() to cause the compiler
to maintain ordering, then I could just use them instead of having to
create an ORDERED_WRT_IRQ(). (Or ACCESS_ONCE(), as it is called in a
different patch.)
> However, I still don't know which atomic_read/atomic_set in
> your patch would be broken if there were no volatile. Could
> you please point them out?
Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
atomic_read() and atomic_set(). Starting with __rcu_read_lock():
o If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
was ordered by the compiler after
"ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
suppose an NMI/SMI happened after the rcu_read_lock_nesting but
before the rcu_flipctr.
Then if there was an rcu_read_lock() in the SMI/NMI
handler (which is perfectly legal), the nested rcu_read_lock()
would believe that it could take the then-clause of the
enclosing "if" statement. But because the rcu_flipctr per-CPU
variable had not yet been incremented, an RCU updater would
be within its rights to assume that there were no RCU reads
in progress, thus possibly yanking a data structure out from
under the reader in the SMI/NMI function.
Fatal outcome. Note that only one CPU is involved here
because these are all either per-CPU or per-task variables.
o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
was ordered by the compiler to follow the
"ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
happened between the two, then an __rcu_read_lock() in the NMI/SMI
would incorrectly take the "else" clause of the enclosing "if"
statement. If some other CPU flipped the rcu_ctrlblk.completed
in the meantime, then the __rcu_read_lock() would (correctly)
write the new value into rcu_flipctr_idx.
Well and good so far. But the problem arises in
__rcu_read_unlock(), which then decrements the wrong counter.
Depending on exactly how subsequent events played out, this could
result in either prematurely ending grace periods or never-ending
grace periods, both of which are fatal outcomes.
And the following are not needed in the current version of the
patch, but will be in a future version that either avoids disabling
irqs or that dispenses with the smp_read_barrier_depends() that I
have 99% convinced myself is unneeded:
o nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
o idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
Furthermore, in that future version, irq handlers can cause the same
mischief that SMI/NMI handlers can in this version.
Next, looking at __rcu_read_unlock():
o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
was reordered by the compiler to follow the
"ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
then if an NMI/SMI containing an rcu_read_lock() occurs between
the two, this nested rcu_read_lock() would incorrectly believe
that it was protected by an enclosing RCU read-side critical
section as described in the first reversal discussed for
__rcu_read_lock() above. Again, fatal outcome.
This is what we have now. It is not hard to imagine situations that
interact with -both- interrupt handlers -and- other CPUs, as described
earlier.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 1:01 ` Paul E. McKenney
@ 2007-08-17 7:39 ` Satyam Sharma
2007-08-17 14:31 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 7:39 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
> > >
> > > The compiler can also reorder non-volatile accesses. For an example
> > > patch that cares about this, please see:
> > >
> > > http://lkml.org/lkml/2007/8/7/280
> > >
> > > This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> > > rcu_read_unlock() to ensure that accesses aren't reordered with respect
> > > to interrupt handlers and NMIs/SMIs running on that same CPU.
> >
> > Good, finally we have some code to discuss (even though it's
> > not actually in the kernel yet).
>
> There was some earlier in this thread as well.
Hmm, I never quite got what all this interrupt/NMI/SMI handling and
RCU business you mentioned earlier was all about, but now that you've
pointed to the actual code and issues with it ...
> > First of all, I think this illustrates that what you want
> > here has nothing to do with atomic ops. The ORDERED_WRT_IRQ
> > macro occurs a lot more times in your patch than atomic
> > reads/sets. So *assuming* that it was necessary at all,
> > then having an ordered variant of the atomic_read/atomic_set
> > ops could do just as well.
>
> Indeed. If I could trust atomic_read()/atomic_set() to cause the compiler
> to maintain ordering, then I could just use them instead of having to
> create an ORDERED_WRT_IRQ(). (Or ACCESS_ONCE(), as it is called in a
> different patch.)
+#define WHATEVER(x) (*(volatile typeof(x) *)&(x))
I suppose one could want volatile access semantics for stuff that's
a bit-field too, no?
Also, this gives *zero* "re-ordering" guarantees that your code wants
(as you've explained it below) -- neither w.r.t. CPU re-ordering (which
probably you don't care about) *nor* w.r.t. compiler re-ordering
(which you definitely _do_ care about).
> > However, I still don't know which atomic_read/atomic_set in
> > your patch would be broken if there were no volatile. Could
> > you please point them out?
>
> Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> atomic_read() and atomic_set(). Starting with __rcu_read_lock():
>
> o If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> was ordered by the compiler after
> "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> before the rcu_flipctr.
>
> Then if there was an rcu_read_lock() in the SMI/NMI
> handler (which is perfectly legal), the nested rcu_read_lock()
> would believe that it could take the then-clause of the
> enclosing "if" statement. But because the rcu_flipctr per-CPU
> variable had not yet been incremented, an RCU updater would
> be within its rights to assume that there were no RCU reads
> in progress, thus possibly yanking a data structure out from
> under the reader in the SMI/NMI function.
>
> Fatal outcome. Note that only one CPU is involved here
> because these are all either per-CPU or per-task variables.
Ok, so you don't care about CPU re-ordering. Still, I should let you know
that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
want is a full compiler optimization barrier().
[ Your code probably works now, and emits correct code, but that's
just because gcc did what it did. Nothing in any standard,
or in any documented behaviour of gcc, or anything about the real
(or expected) semantics of "volatile" is protecting the code here. ]
> o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> was ordered by the compiler to follow the
> "ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> happened between the two, then an __rcu_read_lock() in the NMI/SMI
> would incorrectly take the "else" clause of the enclosing "if"
> statement. If some other CPU flipped the rcu_ctrlblk.completed
> in the meantime, then the __rcu_read_lock() would (correctly)
> write the new value into rcu_flipctr_idx.
>
> Well and good so far. But the problem arises in
> __rcu_read_unlock(), which then decrements the wrong counter.
> Depending on exactly how subsequent events played out, this could
> result in either prematurely ending grace periods or never-ending
> grace periods, both of which are fatal outcomes.
>
> And the following are not needed in the current version of the
> patch, but will be in a future version that either avoids disabling
> irqs or that dispenses with the smp_read_barrier_depends() that I
> have 99% convinced myself is unneeded:
>
> o nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
>
> o idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
>
> Furthermore, in that future version, irq handlers can cause the same
> mischief that SMI/NMI handlers can in this version.
>
> Next, looking at __rcu_read_unlock():
>
> o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> was reordered by the compiler to follow the
> "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> then if an NMI/SMI containing an rcu_read_lock() occurs between
> the two, this nested rcu_read_lock() would incorrectly believe
> that it was protected by an enclosing RCU read-side critical
> section as described in the first reversal discussed for
> __rcu_read_lock() above. Again, fatal outcome.
>
> This is what we have now. It is not hard to imagine situations that
> interact with -both- interrupt handlers -and- other CPUs, as described
> earlier.
It's not about interrupt/SMI/NMI handlers at all! What you clearly want,
simply put, is that a certain stream of C statements must be emitted
by the compiler _as they are_ with no re-ordering optimizations! You must
*definitely* use barrier(), IMHO.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 7:39 ` Satyam Sharma
@ 2007-08-17 14:31 ` Paul E. McKenney
2007-08-17 18:31 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 14:31 UTC (permalink / raw)
To: Satyam Sharma
Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
>
>
> On Thu, 16 Aug 2007, Paul E. McKenney wrote:
>
> > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > On Thu, Aug 16, 2007 at 09:34:41AM -0700, Paul E. McKenney wrote:
> > > >
> > > > The compiler can also reorder non-volatile accesses. For an example
> > > > patch that cares about this, please see:
> > > >
> > > > http://lkml.org/lkml/2007/8/7/280
> > > >
> > > > This patch uses an ORDERED_WRT_IRQ() in rcu_read_lock() and
> > > > rcu_read_unlock() to ensure that accesses aren't reordered with respect
> > > > to interrupt handlers and NMIs/SMIs running on that same CPU.
> > >
> > > Good, finally we have some code to discuss (even though it's
> > > not actually in the kernel yet).
> >
> > There was some earlier in this thread as well.
>
> Hmm, I never quite got what all this interrupt/NMI/SMI handling and
> RCU business you mentioned earlier was all about, but now that you've
> pointed to the actual code and issues with it ...
Glad to help...
> > > First of all, I think this illustrates that what you want
> > > here has nothing to do with atomic ops. The ORDERED_WRT_IRQ
> > > macro occurs a lot more times in your patch than atomic
> > > reads/sets. So *assuming* that it was necessary at all,
> > > then having an ordered variant of the atomic_read/atomic_set
> > > ops could do just as well.
> >
> > Indeed. If I could trust atomic_read()/atomic_set() to cause the compiler
> > to maintain ordering, then I could just use them instead of having to
> > create an ORDERED_WRT_IRQ(). (Or ACCESS_ONCE(), as it is called in a
> > different patch.)
>
> +#define WHATEVER(x) (*(volatile typeof(x) *)&(x))
>
> I suppose one could want volatile access semantics for stuff that's
> a bit-field too, no?
One could, but this is not supported in general. So if you want that,
you need to use the usual bit-mask tricks and (for setting) atomic
operations.
> Also, this gives *zero* "re-ordering" guarantees that your code wants
> (as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> probably you don't care about) *nor* w.r.t. compiler re-ordering
> (which you definitely _do_ care about).
You are correct about CPU re-ordering (and about the fact that this
example doesn't care about it), but not about compiler re-ordering.
The compiler is prohibited from moving a volatile access across a sequence
point. One example of a sequence point is a statement boundary. Because
all of the volatile accesses in this code are separated by statement
boundaries, a conforming compiler is prohibited from reordering them.
> > > However, I still don't know which atomic_read/atomic_set in
> > > your patch would be broken if there were no volatile. Could
> > > you please point them out?
> >
> > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > atomic_read() and atomic_set(). Starting with __rcu_read_lock():
> >
> > o If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > was ordered by the compiler after
> > "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > before the rcu_flipctr.
> >
> > Then if there was an rcu_read_lock() in the SMI/NMI
> > handler (which is perfectly legal), the nested rcu_read_lock()
> > would believe that it could take the then-clause of the
> > enclosing "if" statement. But because the rcu_flipctr per-CPU
> > variable had not yet been incremented, an RCU updater would
> > be within its rights to assume that there were no RCU reads
> > in progress, thus possibly yanking a data structure out from
> > under the reader in the SMI/NMI function.
> >
> > Fatal outcome. Note that only one CPU is involved here
> > because these are all either per-CPU or per-task variables.
>
> Ok, so you don't care about CPU re-ordering. Still, I should let you know
> that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> want is a full compiler optimization barrier().
No. See above.
> [ Your code probably works now, and emits correct code, but that's
> just because gcc did what it did. Nothing in any standard,
> or in any documented behaviour of gcc, or anything about the real
> (or expected) semantics of "volatile" is protecting the code here. ]
Really? Why doesn't the prohibition against moving volatile accesses
across sequence points take care of this?
> > o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > was ordered by the compiler to follow the
> > "ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > would incorrectly take the "else" clause of the enclosing "if"
> > statement. If some other CPU flipped the rcu_ctrlblk.completed
> > in the meantime, then the __rcu_read_lock() would (correctly)
> > write the new value into rcu_flipctr_idx.
> >
> > Well and good so far. But the problem arises in
> > __rcu_read_unlock(), which then decrements the wrong counter.
> > Depending on exactly how subsequent events played out, this could
> > result in either prematurely ending grace periods or never-ending
> > grace periods, both of which are fatal outcomes.
> >
> > And the following are not needed in the current version of the
> > patch, but will be in a future version that either avoids disabling
> > irqs or that dispenses with the smp_read_barrier_depends() that I
> > have 99% convinced myself is unneeded:
> >
> > o nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> >
> > o idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> >
> > Furthermore, in that future version, irq handlers can cause the same
> > mischief that SMI/NMI handlers can in this version.
> >
> > Next, looking at __rcu_read_unlock():
> >
> > o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > was reordered by the compiler to follow the
> > "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > then if an NMI/SMI containing an rcu_read_lock() occurs between
> > the two, this nested rcu_read_lock() would incorrectly believe
> > that it was protected by an enclosing RCU read-side critical
> > section as described in the first reversal discussed for
> > __rcu_read_lock() above. Again, fatal outcome.
> >
> > This is what we have now. It is not hard to imagine situations that
> > interact with -both- interrupt handlers -and- other CPUs, as described
> > earlier.
>
> It's not about interrupt/SMI/NMI handlers at all! What you clearly want,
> simply put, is that a certain stream of C statements must be emitted
> by the compiler _as they are_ with no re-ordering optimizations! You must
> *definitely* use barrier(), IMHO.
Almost. I don't care about most of the operations, only about the loads
and stores marked volatile. Again, although the compiler is free to
reorder volatile accesses that occur -within- a single statement, it
is prohibited by the standard from moving volatile accesses from one
statement to another. Therefore, this code can legitimately use volatile.
Or am I missing something subtle?
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 14:31 ` Paul E. McKenney
@ 2007-08-17 18:31 ` Satyam Sharma
2007-08-17 18:56 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 18:31 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Fri, 17 Aug 2007, Paul E. McKenney wrote:
> On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
> >
> > On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> >
> > > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > >
> > > > First of all, I think this illustrates that what you want
> > > > here has nothing to do with atomic ops. The ORDERED_WRT_IRQ
> > > > macro occurs a lot more times in your patch than atomic
> > > > reads/sets. So *assuming* that it was necessary at all,
> > > > then having an ordered variant of the atomic_read/atomic_set
> > > > ops could do just as well.
> > >
> > > Indeed. If I could trust atomic_read()/atomic_set() to cause the compiler
> > > to maintain ordering, then I could just use them instead of having to
> > > create an ORDERED_WRT_IRQ(). (Or ACCESS_ONCE(), as it is called in a
> > > different patch.)
> >
> > +#define WHATEVER(x) (*(volatile typeof(x) *)&(x))
> > [...]
> > Also, this gives *zero* "re-ordering" guarantees that your code wants
> > (as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> > probably you don't care about) *nor* w.r.t. compiler re-ordering
> > (which you definitely _do_ care about).
>
> You are correct about CPU re-ordering (and about the fact that this
> example doesn't care about it), but not about compiler re-ordering.
>
> The compiler is prohibited from moving a volatile access across a sequence
> point. One example of a sequence point is a statement boundary. Because
> all of the volatile accesses in this code are separated by statement
> boundaries, a conforming compiler is prohibited from reordering them.
Yes, you're right, and I believe precisely this was discussed elsewhere
as well today.
But I'd call attention to what Herbert mentioned there. You're using
ORDERED_WRT_IRQ() on stuff that is _not_ defined to be an atomic_t at all:
* Member "completed" of struct rcu_ctrlblk is a long.
* Per-cpu variable rcu_flipctr is an array of ints.
* Members "rcu_read_lock_nesting" and "rcu_flipctr_idx" of
struct task_struct are ints.
So are you saying you're "having to use" this volatile-access macro
because you *couldn't* declare all the above as atomic_t and thus just
expect the right thing to happen by using the atomic ops API by default,
because it lacks volatile access semantics (on x86)?
If so, then I wonder if using the volatile access cast is really the
best way to achieve (at least in terms of code clarity) the kind of
re-ordering guarantees it wants there. (there could be alternative
solutions, such as using barrier(), or that at bottom of this mail)
What I mean is this: If you switch to atomic_t, and x86 switched to
make atomic_t have "volatile" semantics by default, the statements
would be simply a string of: atomic_inc(), atomic_add(), atomic_set(),
and atomic_read() statements, and nothing in there would clearly make
it *explicit* that the code is correct (and not buggy) simply because
of the re-ordering guarantees that the C "volatile" type-qualifier
keyword gives us as per the standard. But now we're firmly in
"subjective" territory, so you or anybody could legitimately disagree.
> > > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > > atomic_read() and atomic_set(). Starting with __rcu_read_lock():
> > >
> > > o If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > > was ordered by the compiler after
> > > "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > > suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > > before the rcu_flipctr.
> > >
> > > Then if there was an rcu_read_lock() in the SMI/NMI
> > > handler (which is perfectly legal), the nested rcu_read_lock()
> > > would believe that it could take the then-clause of the
> > > enclosing "if" statement. But because the rcu_flipctr per-CPU
> > > variable had not yet been incremented, an RCU updater would
> > > be within its rights to assume that there were no RCU reads
> > > in progress, thus possibly yanking a data structure out from
> > > under the reader in the SMI/NMI function.
> > >
> > > Fatal outcome. Note that only one CPU is involved here
> > > because these are all either per-CPU or per-task variables.
> >
> > Ok, so you don't care about CPU re-ordering. Still, I should let you know
> > that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> > want is a full compiler optimization barrier().
>
> No. See above.
True, *(volatile foo *)& _will_ work for this case.
But multiple calls to barrier() (granted, they would invalidate all
other optimizations too) would work as well, would they not?
[ Interestingly, if you declared all those objects mentioned earlier as
atomic_t, and x86(-64) switched to an __asm__ __volatile__ based variant
for atomic_{read,set}_volatile(), the bugs you want to avoid would still
be there. "volatile" the C language type-qualifier does have compiler
re-ordering semantics you mentioned earlier, but the "volatile" that
applies to inline asm()s gives no re-ordering guarantees. ]
> > > o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > > was ordered by the compiler to follow the
> > > "ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > > happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > > would incorrectly take the "else" clause of the enclosing "if"
> > > statement. If some other CPU flipped the rcu_ctrlblk.completed
> > > in the meantime, then the __rcu_read_lock() would (correctly)
> > > write the new value into rcu_flipctr_idx.
> > >
> > > Well and good so far. But the problem arises in
> > > __rcu_read_unlock(), which then decrements the wrong counter.
> > > Depending on exactly how subsequent events played out, this could
> > > result in either prematurely ending grace periods or never-ending
> > > grace periods, both of which are fatal outcomes.
> > >
> > > And the following are not needed in the current version of the
> > > patch, but will be in a future version that either avoids disabling
> > > irqs or that dispenses with the smp_read_barrier_depends() that I
> > > have 99% convinced myself is unneeded:
> > >
> > > o nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > >
> > > o idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > >
> > > Furthermore, in that future version, irq handlers can cause the same
> > > mischief that SMI/NMI handlers can in this version.
So don't remove the local_irq_save/restore, which is well-established and
well-understood for such cases (it doesn't help you with SMI/NMI,
admittedly). This isn't really about RCU or per-cpu vars as such, it's
just about racy code where you don't want to get hit by a concurrent
interrupt (it does turn out that doing things in a _particular order_ will
not cause fatal/buggy behaviour, but it's still a race issue, after all).
> > > Next, looking at __rcu_read_unlock():
> > >
> > > o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > > was reordered by the compiler to follow the
> > > "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > > then if an NMI/SMI containing an rcu_read_lock() occurs between
> > > the two, this nested rcu_read_lock() would incorrectly believe
> > > that it was protected by an enclosing RCU read-side critical
> > > section as described in the first reversal discussed for
> > > __rcu_read_lock() above. Again, fatal outcome.
> > >
> > > This is what we have now. It is not hard to imagine situations that
> > > interact with -both- interrupt handlers -and- other CPUs, as described
> > > earlier.
Unless somebody's going for a lockless implementation, such situations
normally use spin_lock_irqsave() based locking (or local_irq_save for
those who care only for current CPU) -- problem with the patch in question,
is that you want to prevent races with concurrent SMI/NMIs as well, which
is not something that a lot of code needs to consider.
[ Curiously, another thread is discussing something similar also:
http://lkml.org/lkml/2007/8/15/393 "RFC: do get_rtc_time() correctly" ]
Anyway, I didn't look at the code in that patch very much in detail, but
why couldn't you implement some kind of synchronization variable that lets
rcu_read_lock() or rcu_read_unlock() -- when being called from inside an
NMI or SMI handler -- know that it has concurrently interrupted an ongoing
rcu_read_{un}lock() and so must do things differently ... (?)
I'm also wondering if there's other code that's not using locking in the
kernel that faces similar issues, and what they've done to deal with it
(if anything). Such bugs would be subtle, and difficult to diagnose.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 18:31 ` Satyam Sharma
@ 2007-08-17 18:56 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 18:56 UTC (permalink / raw)
To: Satyam Sharma
Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Christoph Lameter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Sat, Aug 18, 2007 at 12:01:38AM +0530, Satyam Sharma wrote:
>
>
> On Fri, 17 Aug 2007, Paul E. McKenney wrote:
>
> > On Fri, Aug 17, 2007 at 01:09:08PM +0530, Satyam Sharma wrote:
> > >
> > > On Thu, 16 Aug 2007, Paul E. McKenney wrote:
> > >
> > > > On Fri, Aug 17, 2007 at 07:59:02AM +0800, Herbert Xu wrote:
> > > > >
> > > > > First of all, I think this illustrates that what you want
> > > > > here has nothing to do with atomic ops. The ORDERED_WRT_IRQ
> > > > > macro occurs a lot more times in your patch than atomic
> > > > > reads/sets. So *assuming* that it was necessary at all,
> > > > > then having an ordered variant of the atomic_read/atomic_set
> > > > > ops could do just as well.
> > > >
> > > > Indeed. If I could trust atomic_read()/atomic_set() to cause the compiler
> > > > to maintain ordering, then I could just use them instead of having to
> > > > create an ORDERED_WRT_IRQ(). (Or ACCESS_ONCE(), as it is called in a
> > > > different patch.)
> > >
> > > +#define WHATEVER(x) (*(volatile typeof(x) *)&(x))
> > > [...]
> > > Also, this gives *zero* "re-ordering" guarantees that your code wants
> > > as you've explained it below) -- neither w.r.t. CPU re-ordering (which
> > > probably you don't care about) *nor* w.r.t. compiler re-ordering
> > > (which you definitely _do_ care about).
> >
> > You are correct about CPU re-ordering (and about the fact that this
> > example doesn't care about it), but not about compiler re-ordering.
> >
> > The compiler is prohibited from moving a volatile access across a sequence
> > point. One example of a sequence point is a statement boundary. Because
> > all of the volatile accesses in this code are separated by statement
> > boundaries, a conforming compiler is prohibited from reordering them.
>
> Yes, you're right, and I believe precisely this was discussed elsewhere
> as well today.
>
> But I'd call attention to what Herbert mentioned there. You're using
> ORDERED_WRT_IRQ() on stuff that is _not_ defined to be an atomic_t at all:
>
> * Member "completed" of struct rcu_ctrlblk is a long.
> * Per-cpu variable rcu_flipctr is an array of ints.
> * Members "rcu_read_lock_nesting" and "rcu_flipctr_idx" of
> struct task_struct are ints.
>
> So are you saying you're "having to use" this volatile-access macro
> because you *couldn't* declare all the above as atomic_t and thus just
> expect the right thing to happen by using the atomic ops API by default,
> because it lacks volatile access semantics (on x86)?
>
> If so, then I wonder if using the volatile access cast is really the
> best way to achieve (at least in terms of code clarity) the kind of
> re-ordering guarantees it wants there. (there could be alternative
> solutions, such as using barrier(), or that at bottom of this mail)
>
> What I mean is this: If you switch to atomic_t, and x86 switched to
> make atomic_t have "volatile" semantics by default, the statements
> would be simply a string of: atomic_inc(), atomic_add(), atomic_set(),
> and atomic_read() statements, and nothing in there that clearly makes
> it *explicit* that the code is correct (and not buggy) simply because
> of the re-ordering guarantees that the C "volatile" type-qualifier
> keyword gives us as per the standard. But now we're firmly in
> "subjective" territory, so you or anybody could legitimately disagree.
In any case, given Linus's note, it appears that atomic_read() and
atomic_set() won't consistently have volatile semantics, at least
not while the compiler generates such ugly code for volatile accesses.
So I will continue with my current approach.
In any case, I will not be using atomic_inc() or atomic_add() in this
code, as doing so would more than double the overhead, even on machines
that are the most efficient at implementing atomic operations.
> > > > Suppose I tried replacing the ORDERED_WRT_IRQ() calls with
> > > > atomic_read() and atomic_set(). Starting with __rcu_read_lock():
> > > >
> > > > o If "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])++"
> > > > was ordered by the compiler after
> > > > "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1", then
> > > > suppose an NMI/SMI happened after the rcu_read_lock_nesting but
> > > > before the rcu_flipctr.
> > > >
> > > > Then if there was an rcu_read_lock() in the SMI/NMI
> > > > handler (which is perfectly legal), the nested rcu_read_lock()
> > > > would believe that it could take the then-clause of the
> > > > enclosing "if" statement. But because the rcu_flipctr per-CPU
> > > > variable had not yet been incremented, an RCU updater would
> > > > be within its rights to assume that there were no RCU reads
> > > > in progress, thus possibly yanking a data structure out from
> > > > under the reader in the SMI/NMI function.
> > > >
> > > > Fatal outcome. Note that only one CPU is involved here
> > > > because these are all either per-CPU or per-task variables.
> > >
> > > Ok, so you don't care about CPU re-ordering. Still, I should let you know
> > > that your ORDERED_WRT_IRQ() -- bad name, btw -- is still buggy. What you
> > > want is a full compiler optimization barrier().
> >
> > No. See above.
>
> True, *(volatile foo *)& _will_ work for this case.
>
> But multiple calls to barrier() (granted, would invalidate all other
> optimizations also) would work as well, would it not?
They work, but are a bit slower. So they do work, but not as well.
> [ Interestingly, if you declared all those objects mentioned earlier as
> atomic_t, and x86(-64) switched to an __asm__ __volatile__ based variant
> for atomic_{read,set}_volatile(), the bugs you want to avoid would still
> be there. "volatile" the C language type-qualifier does have compiler
> re-ordering semantics you mentioned earlier, but the "volatile" that
> applies to inline asm()s gives no re-ordering guarantees. ]
Well, that certainly would be a point in favor of "volatile" over inline
asms. ;-)
> > > > o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting + 1"
> > > > was ordered by the compiler to follow the
> > > > "ORDERED_WRT_IRQ(me->rcu_flipctr_idx) = idx", and an NMI/SMI
> > > > happened between the two, then an __rcu_read_lock() in the NMI/SMI
> > > > would incorrectly take the "else" clause of the enclosing "if"
> > > > statement. If some other CPU flipped the rcu_ctrlblk.completed
> > > > in the meantime, then the __rcu_read_lock() would (correctly)
> > > > write the new value into rcu_flipctr_idx.
> > > >
> > > > Well and good so far. But the problem arises in
> > > > __rcu_read_unlock(), which then decrements the wrong counter.
> > > > Depending on exactly how subsequent events played out, this could
> > > > result in either prematurely ending grace periods or never-ending
> > > > grace periods, both of which are fatal outcomes.
> > > >
> > > > And the following are not needed in the current version of the
> > > > patch, but will be in a future version that either avoids disabling
> > > > irqs or that dispenses with the smp_read_barrier_depends() that I
> > > > have 99% convinced myself is unneeded:
> > > >
> > > > o nesting = ORDERED_WRT_IRQ(me->rcu_read_lock_nesting);
> > > >
> > > > o idx = ORDERED_WRT_IRQ(rcu_ctrlblk.completed) & 0x1;
> > > >
> > > > Furthermore, in that future version, irq handlers can cause the same
> > > > mischief that SMI/NMI handlers can in this version.
>
> So don't remove the local_irq_save/restore, which is well-established and
> well-understood for such cases (it doesn't help you with SMI/NMI,
> admittedly). This isn't really about RCU or per-cpu vars as such, it's
> just about racy code where you don't want to get hit by a concurrent
> interrupt (it does turn out that doing things in a _particular order_ will
> not cause fatal/buggy behaviour, but it's still a race issue, after all).
The local_irq_save/restore are something like 30% of the overhead of
these two functions, so will be looking hard at getting rid of them.
Doing so allows the scheduling-clock interrupt to get into the mix,
and also allows preemption. The goal would be to find some trick that
suppresses preemption, fends off the grace-period-computation code
invoked from the scheduling-clock interrupt, and otherwise keeps
things on an even keel.
> > > > Next, looking at __rcu_read_unlock():
> > > >
> > > > o If "ORDERED_WRT_IRQ(me->rcu_read_lock_nesting) = nesting - 1"
> > > > was reordered by the compiler to follow the
> > > > "ORDERED_WRT_IRQ(__get_cpu_var(rcu_flipctr)[idx])--",
> > > > then if an NMI/SMI containing an rcu_read_lock() occurs between
> > > > the two, this nested rcu_read_lock() would incorrectly believe
> > > > that it was protected by an enclosing RCU read-side critical
> > > > section as described in the first reversal discussed for
> > > > __rcu_read_lock() above. Again, fatal outcome.
> > > >
> > > > This is what we have now. It is not hard to imagine situations that
> > > > interact with -both- interrupt handlers -and- other CPUs, as described
> > > > earlier.
>
> Unless somebody's going for a lockless implementation, such situations
> normally use spin_lock_irqsave() based locking (or local_irq_save for
> those who care only for current CPU) -- problem with the patch in question,
> is that you want to prevent races with concurrent SMI/NMIs as well, which
> is not something that a lot of code needs to consider.
Or that needs to resolve similar races with IRQs without disabling them.
One reason to avoid disabling IRQs is to avoid degrading scheduling
latency. In any case, I do agree that the amount of code that must
worry about this is quite small at the moment. I believe that it
will become more common, but would imagine that this belief might not
be universal. Yet, anyway. ;-)
> [ Curiously, another thread is discussing something similar also:
> http://lkml.org/lkml/2007/8/15/393 "RFC: do get_rtc_time() correctly" ]
>
> Anyway, I didn't look at the code in that patch very much in detail, but
> why couldn't you implement some kind of synchronization variable that lets
> rcu_read_lock() or rcu_read_unlock() -- when being called from inside an
> NMI or SMI handler -- know that it has concurrently interrupted an ongoing
> rcu_read_{un}lock() and so must do things differently ... (?)
Given some low-level details of the current implementation, I could
imagine manipulating rcu_read_lock_nesting on entry to and exit from
all NMI/SMI handlers, but would like to avoid that kind of architecture
dependency. I am not confident of locating all of them, for one thing...
> I'm also wondering if there's other code that's not using locking in the
> kernel that faces similar issues, and what they've done to deal with it
> (if anything). Such bugs would be subtle, and difficult to diagnose.
Agreed!
Thanx, Paul
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 16:34 ` Paul E. McKenney
2007-08-16 23:59 ` Herbert Xu
@ 2007-08-17 3:15 ` Nick Piggin
2007-08-17 4:02 ` Paul Mackerras
` (2 more replies)
1 sibling, 3 replies; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 3:15 UTC (permalink / raw)
To: paulmck
Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Satyam Sharma,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Paul E. McKenney wrote:
> On Thu, Aug 16, 2007 at 06:42:50PM +0800, Herbert Xu wrote:
>>In fact, volatile doesn't guarantee that the memory gets
>>read anyway. You might be reading some stale value out
>>of the cache. Granted this doesn't happen on x86 but
>>when you're coding for the kernel you can't make such
>>assumptions.
>>
>>So the point here is that if you don't mind getting a stale
>>value from the CPU cache when doing an atomic_read, then
>>surely you won't mind getting a stale value from the compiler
>>"cache".
>
>
> Absolutely disagree. An interrupt/NMI/SMI handler running on the CPU
> will see the same value (whether in cache or in store buffer) that
> the mainline code will see. In this case, we don't care about CPU
> misordering, only about compiler misordering. It is easy to see
> other uses that combine communication with handlers on the current
> CPU with communication among CPUs -- again, see prior messages in
> this thread.
I still don't agree with the underlying assumption that everybody
(or lots of kernel code) treats atomic accesses as volatile.
Nobody who does has managed to explain my logic problem either:
loads and stores to long and ptr have always been considered to be
atomic, test_bit is atomic; so why should atomic_t loads and stores be
another special subclass? (and yes, it is perfectly legitimate to
want a non-volatile read for a data type that you also want to do
atomic RMW operations on)
Why are people making these undocumented and just plain false
assumptions about atomic_t? If they're using lockless code (ie.
which they must be if using atomics), then they actually need to be
thinking much harder about memory ordering issues. If that is too
much for them, then they can just use locks.
>>>So, the architecture guys can implement atomic_read however they want
>>>--- as long as it cannot be optimized away.*
>>
>>They can implement it however they want as long as it stays
>>atomic.
>
>
> Precisely. And volatility is a key property of "atomic". Let's please
> not throw it away.
It isn't, though (at least not since i386 and x86-64 don't have it).
_Adding_ it is trivial, and can be done any time. Throwing it away
(ie. making the API weaker) is _hard_. So let's not add it without
really good reasons. It most definitely results in worse code
generation in practice.
I don't know why people would assume volatile of atomics. AFAIK, most
of the documentation is pretty clear that all the atomic stuff can be
reordered etc. except for those that modify and return a value.
It isn't even intuitive: `*lp = value` is like the most fundamental
atomic operation in Linux.
--
SUSE Labs, Novell Inc.
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:15 ` Nick Piggin
@ 2007-08-17 4:02 ` Paul Mackerras
2007-08-17 4:39 ` Nick Piggin
2007-08-17 7:25 ` Stefan Richter
2007-08-17 22:14 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
2 siblings, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-17 4:02 UTC (permalink / raw)
To: Nick Piggin
Cc: paulmck, Herbert Xu, Stefan Richter, Satyam Sharma,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Nick Piggin writes:
> Why are people making these undocumented and just plain false
> assumptions about atomic_t?
Well, it has only been false since December 2006. Prior to that
atomics *were* volatile on all platforms.
> If they're using lockless code (ie.
> which they must be if using atomics), then they actually need to be
> thinking much harder about memory ordering issues.
Indeed. I believe that most uses of atomic_read other than in polling
loops or debug printk statements are actually racy. In some cases the
race doesn't seem to matter, but I'm sure there are cases where it
does.
> If that is too
> much for them, then they can just use locks.
Why use locks when you can just sprinkle magic fix-the-races dust (aka
atomic_t) over your code? :) :)
> > Precisely. And volatility is a key property of "atomic". Let's please
> > not throw it away.
>
> It isn't, though (at least not since i386 and x86-64 don't have it).
Conceptually it is, because atomic_t is specifically for variables
which are liable to be modified by other CPUs, and volatile _means_
"liable to be changed by mechanisms outside the knowledge of the
compiler".
> _Adding_ it is trivial, and can be done any time. Throwing it away
> (ie. making the API weaker) is _hard_. So let's not add it without
Well, in one sense it's not that hard - Linus did it just 8 months ago
in commit f9e9dcb3. :)
> really good reasons. It most definitely results in worse code
> generation in practice.
0.0008% increase in kernel text size on powerpc according to my
measurement. :)
> I don't know why people would assume volatile of atomics. AFAIK, most
By making something an atomic_t you're saying "other CPUs are going to
be modifying this, so treat it specially". It's reasonable to assume
that special treatment extends to reading and setting it.
> of the documentation is pretty clear that all the atomic stuff can be
> reordered etc. except for those that modify and return a value.
Volatility isn't primarily about reordering (though as Linus says it
does restrict reordering to some extent).
Paul.
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 4:02 ` Paul Mackerras
@ 2007-08-17 4:39 ` Nick Piggin
0 siblings, 0 replies; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 4:39 UTC (permalink / raw)
To: Paul Mackerras
Cc: paulmck, Herbert Xu, Stefan Richter, Satyam Sharma,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Paul Mackerras wrote:
> Nick Piggin writes:
>
>
>>Why are people making these undocumented and just plain false
>>assumptions about atomic_t?
>
>
> Well, it has only been false since December 2006. Prior to that
> atomics *were* volatile on all platforms.
Hmm, although I don't think it has ever been guaranteed by the
API documentation (I concede documentation is often not treated
as the authoritative source here, but for atomic it is actually
very good and obviously indispensable as the memory ordering
reference).
>>If they're using lockless code (ie.
>>which they must be if using atomics), then they actually need to be
>>thinking much harder about memory ordering issues.
>
>
> Indeed. I believe that most uses of atomic_read other than in polling
> loops or debug printk statements are actually racy. In some cases the
> race doesn't seem to matter, but I'm sure there are cases where it
> does.
>
>
>>If that is too
>>much for them, then they can just use locks.
>
>
> Why use locks when you can just sprinkle magic fix-the-races dust (aka
> atomic_t) over your code? :) :)
I agree with your skepticism of a lot of lockless code. But I think
a lot of the more subtle race problems will not be fixed with volatile.
The big, dumb infinite loop bugs would be fixed, but they're pretty
trivial to debug and even audit for.
>>>Precisely. And volatility is a key property of "atomic". Let's please
>>>not throw it away.
>>
>>It isn't, though (at least not since i386 and x86-64 don't have it).
>
>
> Conceptually it is, because atomic_t is specifically for variables
> which are liable to be modified by other CPUs, and volatile _means_
> "liable to be changed by mechanisms outside the knowledge of the
> compiler".
Usually that is the case, yes. But also most of the time we don't
care that it has been changed and don't mind it being reordered or
eliminated.
One of the only places we really care about that at all is for
variables that are modified by the *same* CPU.
>>_Adding_ it is trivial, and can be done any time. Throwing it away
>>(ie. making the API weaker) is _hard_. So let's not add it without
>
>
> Well, in one sense it's not that hard - Linus did it just 8 months ago
> in commit f9e9dcb3. :)
Well it would have been harder if the documentation also guaranteed
that atomic_read/atomic_set was ordered. Or it would have been harder
for _me_ to make such a change, anyway ;)
>>really good reasons. It most definitely results in worse code
>>generation in practice.
>
>
> 0.0008% increase in kernel text size on powerpc according to my
> measurement. :)
I don't think you're making a bad choice by keeping it volatile on
powerpc and waiting for others to shake out more of the bugs. You
get to fix everybody else's memory ordering bugs :)
>>I don't know why people would assume volatile of atomics. AFAIK, most
>
>
> By making something an atomic_t you're saying "other CPUs are going to
> be modifying this, so treat it specially". It's reasonable to assume
> that special treatment extends to reading and setting it.
But I don't actually know what that "special treatment" is. Well
actually, I do know that operations will never result in a partial
modification being exposed. I also know that the operators that
do not modify and return are not guaranteed to have any sort of
ordering constraints.
--
SUSE Labs, Novell Inc.
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:15 ` Nick Piggin
2007-08-17 4:02 ` Paul Mackerras
@ 2007-08-17 7:25 ` Stefan Richter
2007-08-17 8:06 ` Nick Piggin
2007-08-17 22:14 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
2 siblings, 1 reply; 1651+ messages in thread
From: Stefan Richter @ 2007-08-17 7:25 UTC (permalink / raw)
To: Nick Piggin
Cc: paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Nick Piggin wrote:
> I don't know why people would assume volatile of atomics. AFAIK, most
> of the documentation is pretty clear that all the atomic stuff can be
> reordered etc. except for those that modify and return a value.
Which documentation is there?
For driver authors, there is LDD3. It doesn't specifically cover
effects of optimization on accesses to atomic_t.
For architecture port authors, there is Documentation/atomic_ops.txt.
Driver authors also can learn something from that document, as it
indirectly documents the atomic_t and bitops APIs.
Prompted by this thread, I reread this document, and indeed, the
sentence "Unlike the above routines, it is required that explicit memory
barriers are performed before and after [atomic_{inc,dec}_return]"
indicates that atomic_read (one of the "above routines") is very
different from all other atomic_t accessors that return values.
This is strange. Why is it that atomic_read stands out that way? IMO
this API imbalance is quite unexpected by many people. Wouldn't it be
beneficial to change the atomic_read API to behave the same like all
other atomic_t accessors that return values?
OK, it is also different from the other accessors that return data in so
far as it doesn't modify the data. But as driver "author", i.e. user of
the API, I can't see much use of an atomic_read that can be reordered
and, more importantly, can be optimized away by the compiler. Sure, now
that I learned of these properties I can start to audit code and insert
barriers where I believe they are needed, but this simply means that
almost all occurrences of atomic_read will get barriers (unless there
already are implicit but more or less obvious barriers like msleep).
--
Stefan Richter
-=====-=-=== =--- =---=
http://arcgraph.de/sr/
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 7:25 ` Stefan Richter
@ 2007-08-17 8:06 ` Nick Piggin
2007-08-17 8:58 ` Satyam Sharma
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 8:06 UTC (permalink / raw)
To: Stefan Richter
Cc: paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Stefan Richter wrote:
> Nick Piggin wrote:
>
>>I don't know why people would assume volatile of atomics. AFAIK, most
>>of the documentation is pretty clear that all the atomic stuff can be
>>reordered etc. except for those that modify and return a value.
>
>
> Which documentation is there?
Documentation/atomic_ops.txt
> For driver authors, there is LDD3. It doesn't specifically cover
> effects of optimization on accesses to atomic_t.
>
> For architecture port authors, there is Documentation/atomic_ops.txt.
> Driver authors also can learn something from that document, as it
> indirectly documents the atomic_t and bitops APIs.
>
"Semantics and Behavior of Atomic and Bitmask Operations" is
pretty direct :)
Sure, it says that it's for arch maintainers, but there is no
reason why users can't make use of it.
> Prompted by this thread, I reread this document, and indeed, the
> sentence "Unlike the above routines, it is required that explicit memory
> barriers are performed before and after [atomic_{inc,dec}_return]"
> indicates that atomic_read (one of the "above routines") is very
> different from all other atomic_t accessors that return values.
>
> This is strange. Why is it that atomic_read stands out that way? IMO
It is not just atomic_read of course. It is atomic_add,sub,inc,dec,set.
> this API imbalance is quite unexpected by many people. Wouldn't it be
> beneficial to change the atomic_read API to behave the same like all
> other atomic_t accessors that return values?
It is very consistent and well defined. Operations which both modify
the data _and_ return something are defined to have full barriers
before and after.
What do you want to add to the other atomic accessors? Full memory
barriers? Only compiler barriers? It's quite likely that if you think
some barriers will fix bugs, then there are other bugs lurking there
anyway.
Just use spinlocks if you're not absolutely clear about potential
races and memory ordering issues -- they're pretty cheap and simple.
> OK, it is also different from the other accessors that return data in so
> far as it doesn't modify the data. But as driver "author", i.e. user of
> the API, I can't see much use of an atomic_read that can be reordered
> and, more importantly, can be optimized away by the compiler.
It will return to you an atomic snapshot of the data (loaded from
memory at some point since the last compiler barrier). All you have
to be aware of is compiler barriers and the Linux SMP memory ordering
model, which should be a given if you are writing lockless code.
> Sure, now
> that I learned of these properties I can start to audit code and insert
> barriers where I believe they are needed, but this simply means that
> almost all occurrences of atomic_read will get barriers (unless there
> already are implicit but more or less obvious barriers like msleep).
You might find that these places that appear to need barriers are
buggy for other reasons anyway. Can you point to some in-tree code
we can have a look at?
--
SUSE Labs, Novell Inc.
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 8:06 ` Nick Piggin
@ 2007-08-17 8:58 ` Satyam Sharma
2007-08-17 9:15 ` Nick Piggin
2007-08-17 10:48 ` Stefan Richter
2007-08-18 14:35 ` LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures) Stefan Richter
2 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 8:58 UTC (permalink / raw)
To: Nick Piggin
Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Stefan Richter wrote:
> [...]
> Just use spinlocks if you're not absolutely clear about potential
> races and memory ordering issues -- they're pretty cheap and simple.
I fully agree with this. As Paul Mackerras mentioned elsewhere,
a lot of authors sprinkle atomic_t in code thinking they're somehow
done with *locking*. This is sad, and I wonder if it's time for a
Documentation/atomic-considered-dodgy.txt kind of document :-)
> > Sure, now
> > that I learned of these properties I can start to audit code and insert
> > barriers where I believe they are needed, but this simply means that
> > almost all occurrences of atomic_read will get barriers (unless there
> > already are implicit but more or less obvious barriers like msleep).
>
> You might find that these places that appear to need barriers are
> buggy for other reasons anyway. Can you point to some in-tree code
> we can have a look at?
Such code was mentioned elsewhere (query nodemgr_host_thread in cscope)
that managed to escape the requirement for a barrier only because of
some completely un-obvious compilation-unit-scope thing. But I find such
a non-explicit barrier quite bad taste. Stefan, do consider plunking an
explicit call to barrier() there.
Satyam
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 8:58 ` Satyam Sharma
@ 2007-08-17 9:15 ` Nick Piggin
2007-08-17 10:03 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 9:15 UTC (permalink / raw)
To: Satyam Sharma
Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Satyam Sharma wrote:
>
> On Fri, 17 Aug 2007, Nick Piggin wrote:
>>>Sure, now
>>>that I learned of these properties I can start to audit code and insert
>>>barriers where I believe they are needed, but this simply means that
>>>almost all occurrences of atomic_read will get barriers (unless there
>>>already are implicit but more or less obvious barriers like msleep).
>>
>>You might find that these places that appear to need barriers are
>>buggy for other reasons anyway. Can you point to some in-tree code
>>we can have a look at?
>
>
> Such code was mentioned elsewhere (query nodemgr_host_thread in cscope)
> that managed to escape the requirement for a barrier only because of
> some completely un-obvious compilation-unit-scope thing. But I find such
> a non-explicit barrier in quite bad taste. Stefan, do consider plunking an
> explicit call to barrier() there.
It is very obvious. msleep calls schedule() (ie. sleeps), which is
always a barrier.
The "unobvious" thing is that you wanted to know how the compiler knows
a function is a barrier -- answer is that if it does not *know* it is not
a barrier, it must assume it is a barrier. If the whole msleep call chain
including the scheduler were defined static in the current compilation
unit, then it would still be a barrier because it would actually be able
to see the barriers in schedule(void), if nothing else.
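The mechanism Nick describes hinges on the kernel's barrier() definition. A minimal userspace sketch of it (assuming a GCC-style compiler; the real definition lives in the kernel's compiler headers):

```c
#include <assert.h>

/* Sketch of the kernel's compiler barrier: an empty asm that claims to
 * clobber memory, so the compiler must assume all memory has changed and
 * may not reuse values cached in registers across this point. */
#define barrier() __asm__ __volatile__("" : : : "memory")

int shared;

/* The second load of 'shared' must be a real load: barrier() forbids the
 * compiler from simply returning the first load's value twice. */
int read_twice(void)
{
    int a = shared;
    barrier();
    int b = shared;
    return a + b;
}
```

An out-of-line call such as msleep() has the same effect at the call site, because the compiler cannot prove the callee contains no such barrier.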
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 9:15 ` Nick Piggin
@ 2007-08-17 10:03 ` Satyam Sharma
2007-08-17 11:50 ` Nick Piggin
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:03 UTC (permalink / raw)
To: Nick Piggin
Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
> >
> > On Fri, 17 Aug 2007, Nick Piggin wrote:
>
> > > > Sure, now
> > > > that I learned of these properties I can start to audit code and insert
> > > > barriers where I believe they are needed, but this simply means that
> > > > almost all occurrences of atomic_read will get barriers (unless there
> > > > already are implicit but more or less obvious barriers like msleep).
> > >
> > > You might find that these places that appear to need barriers are
> > > buggy for other reasons anyway. Can you point to some in-tree code
> > > we can have a look at?
> >
> >
> > Such code was mentioned elsewhere (query nodemgr_host_thread in cscope)
> > that managed to escape the requirement for a barrier only because of
> > some completely un-obvious compilation-unit-scope thing. But I find such
> > a non-explicit barrier in quite bad taste. Stefan, do consider plunking an
> > explicit call to barrier() there.
>
> It is very obvious. msleep calls schedule() (ie. sleeps), which is
> always a barrier.
Probably you didn't mean that, but no, schedule() is not a barrier because
it sleeps. It's a barrier because it's invisible.
> The "unobvious" thing is that you wanted to know how the compiler knows
> a function is a barrier -- answer is that if it does not *know* it is not
> a barrier, it must assume it is a barrier.
True, that's clearly what happens here. But you're definitely joking
that this is "obvious" in terms of code-clarity, right?
Just 5 minutes back you mentioned elsewhere you like seeing lots of
explicit calls to barrier() (with comments, no less, hmm? :-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 10:03 ` Satyam Sharma
@ 2007-08-17 11:50 ` Nick Piggin
2007-08-17 12:50 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 11:50 UTC (permalink / raw)
To: Satyam Sharma
Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Satyam Sharma wrote:
>
>On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>
>>Satyam Sharma wrote:
>>
>>It is very obvious. msleep calls schedule() (ie. sleeps), which is
>>always a barrier.
>>
>
>Probably you didn't mean that, but no, schedule() is not a barrier because
>it sleeps. It's a barrier because it's invisible.
>
Where did I say it is a barrier because it sleeps?
It is always a barrier because, at the lowest level, schedule() (and thus
anything that sleeps) is defined to always be a barrier. Regardless of
whatever obscure means the compiler might need to infer the barrier.
In other words, you can ignore those obscure details because schedule() is
always going to have an explicit barrier in it.
>>The "unobvious" thing is that you wanted to know how the compiler knows
>>a function is a barrier -- answer is that if it does not *know* it is not
>>a barrier, it must assume it is a barrier.
>>
>
>True, that's clearly what happens here. But you're definitely joking
>that this is "obvious" in terms of code-clarity, right?
>
No. If you accept that barrier() is implemented correctly, and you know
that sleeping is defined to be a barrier, then it's perfectly clear. You
don't have to know how the compiler "knows" that some function contains
a barrier.
>Just 5 minutes back you mentioned elsewhere you like seeing lots of
>explicit calls to barrier() (with comments, no less, hmm? :-)
>
Sure, but there are well known primitives which contain barriers, and
trivial recognisable code sequences for which you don't need comments.
Waiting loops using sleeps or cpu_relax() are prime examples.
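The waiting-loop idiom referred to above can be sketched in userspace, with C11 atomics standing in for atomic_t and a yield standing in for the kernel's cpu_relax() (a sketch under those assumptions, not kernel code):

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

atomic_int done;

/* Userspace stand-in for the kernel's cpu_relax(): just yield the CPU. */
static void cpu_relax(void)
{
    sched_yield();
}

static void *worker(void *arg)
{
    (void)arg;
    atomic_store(&done, 1);         /* the event the waiter spins on */
    return NULL;
}

/* The classic waiting loop: the atomic load in the loop condition is a
 * fresh load on every iteration, so no extra barrier() call is needed. */
int wait_for_done(void)
{
    pthread_t t;

    pthread_create(&t, NULL, worker, NULL);
    while (!atomic_load(&done))
        cpu_relax();
    pthread_join(t, NULL);
    return atomic_load(&done);
}
```

Such a loop is recognisable at a glance, which is why it needs no per-line commentary in real code.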
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 11:50 ` Nick Piggin
@ 2007-08-17 12:50 ` Satyam Sharma
2007-08-17 12:56 ` Nick Piggin
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 12:50 UTC (permalink / raw)
To: Nick Piggin
Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
> > On Fri, 17 Aug 2007, Nick Piggin wrote:
> > > Satyam Sharma wrote:
> > >
> > > It is very obvious. msleep calls schedule() (ie. sleeps), which is
> > > always a barrier.
> >
> > Probably you didn't mean that, but no, schedule() is not a barrier because
> > it sleeps. It's a barrier because it's invisible.
>
> Where did I say it is a barrier because it sleeps?
Just below. What you wrote:
> It is always a barrier because, at the lowest level, schedule() (and thus
> anything that sleeps) is defined to always be a barrier.
"It is always a barrier because, at the lowest level, anything that sleeps
is defined to always be a barrier".
> Regardless of
> whatever obscure means the compiler might need to infer the barrier.
>
> In other words, you can ignore those obscure details because schedule() is
> always going to have an explicit barrier in it.
I didn't quite understand what you said here, so I'll tell you what I think:
* foo() is a compiler barrier if the definition of foo() is invisible to
the compiler at a callsite.
* foo() is also a compiler barrier if the definition of foo() includes
a barrier, and it is inlined at the callsite.
If the above is wrong, or if there's something else at play as well,
do let me know.
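The two cases above can be made concrete in a short sketch (hedged: opaque_call()'s definition would normally live in another translation unit, and is defined here only so the example links):

```c
#include <assert.h>

#define barrier() __asm__ __volatile__("" : : : "memory")

/* Case 1: at the call site only this declaration need be visible, so the
 * compiler must assume the callee may read or write any memory -- an
 * implicit compiler barrier. (The empty definition exists only so this
 * sketch links; pretend it lives in another file.) */
void opaque_call(void);
void opaque_call(void) { }

/* Case 2: fully visible and inlinable, but still a barrier because its
 * body contains an explicit one. */
static inline void visible_call(void)
{
    barrier();
}

int flag;

int demo(void)
{
    flag = 1;
    opaque_call();      /* flag may have changed, as far as the compiler knows */
    visible_call();     /* explicit barrier: same effect */
    return flag;
}
```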
> > > The "unobvious" thing is that you wanted to know how the compiler knows
> > > a function is a barrier -- answer is that if it does not *know* it is not
> > > a barrier, it must assume it is a barrier.
> >
> > True, that's clearly what happens here. But you're definitely joking
> > that this is "obvious" in terms of code-clarity, right?
>
> No. If you accept that barrier() is implemented correctly, and you know
> that sleeping is defined to be a barrier,
Curiously, that's the second time you've said "sleeping is defined to
be a (compiler) barrier". How does the compiler even know if foo() is
a function that "sleeps"? Do compilers have some notion of "sleeping"
to ensure they automatically assume a compiler barrier whenever such
a function is called? Or are you saying that the compiler can see the
barrier() inside said function ... nopes, you're saying quite the
opposite below.
> then it's perfectly clear. You
> don't have to know how the compiler "knows" that some function contains
> a barrier.
I think I do, why not? I'd appreciate it if you could elaborate on this.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 12:50 ` Satyam Sharma
@ 2007-08-17 12:56 ` Nick Piggin
2007-08-18 2:15 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 12:56 UTC (permalink / raw)
To: Satyam Sharma
Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Satyam Sharma wrote:
>
>On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>
>>Satyam Sharma wrote:
>>
>>>On Fri, 17 Aug 2007, Nick Piggin wrote:
>>>
>>>>Satyam Sharma wrote:
>>>>
>>>>It is very obvious. msleep calls schedule() (ie. sleeps), which is
>>>>always a barrier.
>>>>
>>>Probably you didn't mean that, but no, schedule() is not a barrier because
>>>it sleeps. It's a barrier because it's invisible.
>>>
>>Where did I say it is a barrier because it sleeps?
>>
>
>Just below. What you wrote:
>
>
>>It is always a barrier because, at the lowest level, schedule() (and thus
>>anything that sleeps) is defined to always be a barrier.
>>
>
>"It is always a barrier because, at the lowest level, anything that sleeps
>is defined to always be a barrier".
>
... because it must call schedule and schedule is a barrier.
>>Regardless of
>>whatever obscure means the compiler might need to infer the barrier.
>>
>>In other words, you can ignore those obscure details because schedule() is
>>always going to have an explicit barrier in it.
>>
>
>I didn't quite understand what you said here, so I'll tell you what I think:
>
>* foo() is a compiler barrier if the definition of foo() is invisible to
> the compiler at a callsite.
>
>* foo() is also a compiler barrier if the definition of foo() includes
> a barrier, and it is inlined at the callsite.
>
>If the above is wrong, or if there's something else at play as well,
>do let me know.
>
Right.
>>>>The "unobvious" thing is that you wanted to know how the compiler knows
>>>>a function is a barrier -- answer is that if it does not *know* it is not
>>>>a barrier, it must assume it is a barrier.
>>>>
>>>True, that's clearly what happens here. But you're definitely joking
>>>that this is "obvious" in terms of code-clarity, right?
>>>
>>No. If you accept that barrier() is implemented correctly, and you know
>>that sleeping is defined to be a barrier,
>>
>
>Curiously, that's the second time you've said "sleeping is defined to
>be a (compiler) barrier".
>
_In Linux,_ sleeping is defined to be a compiler barrier.
>How does the compiler even know if foo() is
>a function that "sleeps"? Do compilers have some notion of "sleeping"
>to ensure they automatically assume a compiler barrier whenever such
>a function is called? Or are you saying that the compiler can see the
>barrier() inside said function ... nopes, you're saying quite the
>opposite below.
>
You're getting too worried about the compiler implementation. Start
by assuming that it does work ;)
>>then it's perfectly clear. You
>>don't have to know how the compiler "knows" that some function contains
>>a barrier.
>>
>
>I think I do, why not? I'd appreciate it if you could elaborate on this.
>
If a function is not completely visible to the compiler (so it can't
determine whether a barrier could be in it or not), then it must always
assume it will contain a barrier so it always does the right thing.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 12:56 ` Nick Piggin
@ 2007-08-18 2:15 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-18 2:15 UTC (permalink / raw)
To: Nick Piggin
Cc: Stefan Richter, paulmck, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
>
> > I didn't quite understand what you said here, so I'll tell you what I think:
> >
> > * foo() is a compiler barrier if the definition of foo() is invisible to
> > the compiler at a callsite.
> >
> > * foo() is also a compiler barrier if the definition of foo() includes
> > a barrier, and it is inlined at the callsite.
> >
> > If the above is wrong, or if there's something else at play as well,
> > do let me know.
>
> [...]
> If a function is not completely visible to the compiler (so it can't
> determine whether a barrier could be in it or not), then it must always
> assume it will contain a barrier so it always does the right thing.
Yup, that's what I'd said just a few sentences above, as you can see. I
was actually asking for "elaboration" on "how a compiler determines that
function foo() (say foo == schedule), even when it cannot see that it has
a barrier(), as you'd mentioned, is a 'sleeping' function", and
whether compilers have a "notion of sleep to automatically assume a
compiler barrier whenever such a sleeping function foo() is called". But
I think you've already qualified the discussion to this kernel, so okay,
I shouldn't nit-pick anymore.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 8:06 ` Nick Piggin
2007-08-17 8:58 ` Satyam Sharma
@ 2007-08-17 10:48 ` Stefan Richter
2007-08-17 10:58 ` Stefan Richter
2007-08-18 14:35 ` LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures) Stefan Richter
2 siblings, 1 reply; 1651+ messages in thread
From: Stefan Richter @ 2007-08-17 10:48 UTC (permalink / raw)
To: Nick Piggin
Cc: paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Nick Piggin wrote:
> Stefan Richter wrote:
>> For architecture port authors, there is Documentation/atomic_ops.txt.
>> Driver authors also can learn something from that document, as it
>> indirectly documents the atomic_t and bitops APIs.
>
> "Semantics and Behavior of Atomic and Bitmask Operations" is
> pretty direct :)
"Indirect", "pretty direct"... It's subjective.
(It is not API documentation; it is an implementation specification.)
> Sure, it says that it's for arch maintainers, but there is no
> reason why users can't make use of it.
>
>
>> Prompted by this thread, I reread this document, and indeed, the
>> sentence "Unlike the above routines, it is required that explicit memory
>> barriers are performed before and after [atomic_{inc,dec}_return]"
>> indicates that atomic_read (one of the "above routines") is very
>> different from all other atomic_t accessors that return values.
>>
>> This is strange. Why is it that atomic_read stands out that way? IMO
>
> It is not just atomic_read of course. It is atomic_add,sub,inc,dec,set.
Yes, but unlike these, atomic_read returns a value.
Without me (the API user) providing extra barriers, that value may
become something else whenever someone touches code in the vicinity of
the atomic_read.
>> this API imbalance is quite unexpected by many people. Wouldn't it be
>> beneficial to change the atomic_read API to behave the same like all
>> other atomic_t accessors that return values?
>
> It is very consistent and well defined. Operations which both modify
> the data _and_ return something are defined to have full barriers
> before and after.
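The modify-and-return class described above maps onto GCC's __sync builtins, which are documented to imply a full memory barrier; a userspace sketch of the distinction (not the kernel's per-architecture code):

```c
#include <assert.h>

typedef struct { int counter; } atomic_t;

/* atomic_read: a plain load, with no barrier implied on either side. */
int atomic_read(const atomic_t *v)
{
    return v->counter;
}

/* atomic_inc_return: modifies the data _and_ returns a value, so it is
 * defined to act as a full memory barrier before and after the
 * operation; __sync_add_and_fetch provides exactly that. */
int atomic_inc_return(atomic_t *v)
{
    return __sync_add_and_fetch(&v->counter, 1);
}
```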
You are right, atomic_read is not only different from accessors that
don't return values, it is also different from all other accessors that
return values (because they all also modify the value). There is just
no actual API documentation, which contributes to the issue that some
people (or at least one: me) learn a little bit late how special
atomic_read is.
> What do you want to add to the other atomic accessors? Full memory
> barriers? Only compiler barriers? It's quite likely that if you think
> some barriers will fix bugs, then there are other bugs lurking there
> anyway.
A lot of different though related issues are discussed in this thread,
but I personally am only occupied by one particular thing: What kind of
return values do I get from atomic_read.
> Just use spinlocks if you're not absolutely clear about potential
> races and memory ordering issues -- they're pretty cheap and simple.
Probably good advice, as it generally is whenever driver guys consider lockless
algorithms.
>> OK, it is also different from the other accessors that return data in so
>> far as it doesn't modify the data. But as driver "author", i.e. user of
>> the API, I can't see much use of an atomic_read that can be reordered
>> and, more importantly, can be optimized away by the compiler.
>
> It will return to you an atomic snapshot of the data (loaded from
> memory at some point since the last compiler barrier). All you have
to be aware of is compiler barriers and the Linux SMP memory ordering
> model, which should be a given if you are writing lockless code.
OK, that's what I slowly realized during this discussion, and I
appreciate the explanations that were given here.
>> Sure, now
>> that I learned of these properties I can start to audit code and insert
>> barriers where I believe they are needed, but this simply means that
>> almost all occurrences of atomic_read will get barriers (unless there
>> already are implicit but more or less obvious barriers like msleep).
>
> You might find that these places that appear to need barriers are
> buggy for other reasons anyway. Can you point to some in-tree code
> we can have a look at?
I could, or could not, if I were through with auditing the code. I
remembered one case and posted it (nodemgr_host_thread) which was safe
because msleep_interruptible provided the necessary barrier there, and
this implicit barrier is not in danger of being removed by future patches.
--
Stefan Richter
-=====-=-=== =--- =---=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 10:48 ` Stefan Richter
@ 2007-08-17 10:58 ` Stefan Richter
0 siblings, 0 replies; 1651+ messages in thread
From: Stefan Richter @ 2007-08-17 10:58 UTC (permalink / raw)
To: Nick Piggin
Cc: paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
I wrote:
> Nick Piggin wrote:
>> You might find that these places that appear to need barriers are
>> buggy for other reasons anyway. Can you point to some in-tree code
>> we can have a look at?
>
> I could, or could not, if I were through with auditing the code. I
> remembered one case and posted it (nodemgr_host_thread) which was safe
> because msleep_interruptible provided the necessary barrier there, and
> this implicit barrier is not in danger of being removed by future patches.
PS, just in case anybody holds his breath for more example code from me,
I don't plan to continue with an actual audit of the drivers I maintain.
It's an important issue, but my current time budget will restrict me to
look at it ad hoc, per case. (Open bugs have higher priority than
potential bugs.)
--
Stefan Richter
-=====-=-=== =--- =---=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures)
2007-08-17 8:06 ` Nick Piggin
2007-08-17 8:58 ` Satyam Sharma
2007-08-17 10:48 ` Stefan Richter
@ 2007-08-18 14:35 ` Stefan Richter
2007-08-20 13:28 ` Chris Snook
2 siblings, 1 reply; 1651+ messages in thread
From: Stefan Richter @ 2007-08-18 14:35 UTC (permalink / raw)
To: Jonathan Corbet, Greg Kroah-Hartman
Cc: Nick Piggin, paulmck, Herbert Xu, Paul Mackerras, Satyam Sharma,
Christoph Lameter, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Nick Piggin wrote:
> Stefan Richter wrote:
>> Nick Piggin wrote:
>>
>>> I don't know why people would assume volatile of atomics. AFAIK, most
>>> of the documentation is pretty clear that all the atomic stuff can be
>>> reordered etc. except for those that modify and return a value.
>>
>>
>> Which documentation is there?
>
> Documentation/atomic_ops.txt
>
>
>> For driver authors, there is LDD3. It doesn't specifically cover
>> effects of optimization on accesses to atomic_t.
>>
>> For architecture port authors, there is Documentation/atomic_ops.txt.
>> Driver authors also can learn something from that document, as it
>> indirectly documents the atomic_t and bitops APIs.
>>
>
> "Semantics and Behavior of Atomic and Bitmask Operations" is
> pretty direct :)
>
> Sure, it says that it's for arch maintainers, but there is no
> reason why users can't make use of it.
Note, LDD3 page 238 says: "It is worth noting that most of the other
kernel primitives dealing with synchronization, such as spinlock and
atomic_t operations, also function as memory barriers."
I don't know about Linux 2.6.10 against which LDD3 was written, but
currently only _some_ atomic_t operations function as memory barriers.
Besides, judging from some posts in this thread, saying that atomic_t
operations dealt with synchronization may not be entirely precise.
--
Stefan Richter
-=====-=-=== =--- =--=-
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures)
2007-08-18 14:35 ` LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures) Stefan Richter
@ 2007-08-20 13:28 ` Chris Snook
0 siblings, 0 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-20 13:28 UTC (permalink / raw)
To: Stefan Richter
Cc: Jonathan Corbet, Greg Kroah-Hartman, Nick Piggin, paulmck,
Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher
Stefan Richter wrote:
> Nick Piggin wrote:
>> Stefan Richter wrote:
>>> Nick Piggin wrote:
>>>
>>>> I don't know why people would assume volatile of atomics. AFAIK, most
>>>> of the documentation is pretty clear that all the atomic stuff can be
>>>> reordered etc. except for those that modify and return a value.
>>>
>>> Which documentation is there?
>> Documentation/atomic_ops.txt
>>
>>
>>> For driver authors, there is LDD3. It doesn't specifically cover
>>> effects of optimization on accesses to atomic_t.
>>>
>>> For architecture port authors, there is Documentation/atomic_ops.txt.
>>> Driver authors also can learn something from that document, as it
>>> indirectly documents the atomic_t and bitops APIs.
>>>
>> "Semantics and Behavior of Atomic and Bitmask Operations" is
>> pretty direct :)
>>
>> Sure, it says that it's for arch maintainers, but there is no
>> reason why users can't make use of it.
>
>
> Note, LDD3 page 238 says: "It is worth noting that most of the other
> kernel primitives dealing with synchronization, such as spinlock and
> atomic_t operations, also function as memory barriers."
>
> I don't know about Linux 2.6.10 against which LDD3 was written, but
> currently only _some_ atomic_t operations function as memory barriers.
>
> Besides, judging from some posts in this thread, saying that atomic_t
> operations dealt with synchronization may not be entirely precise.
atomic_t is often used as the basis for implementing more sophisticated
synchronization mechanisms, such as rwlocks. Whether or not they are designed
for that purpose, the atomic_* operations are de facto synchronization primitives.
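For illustration, a toy test-and-set spinlock built on an atomic flag, the kind of higher-level primitive meant above (a userspace C11 sketch, not the kernel's spinlock implementation):

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct { atomic_flag locked; } spinlock_t;

/* Spin until the atomic test-and-set (an RMW operation) observes the
 * flag clear; acquire ordering keeps the critical section after the lock. */
void spin_lock(spinlock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;   /* a real kernel would cpu_relax() here */
}

/* Release ordering keeps the critical section before the unlock. */
void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

int counter;

void locked_inc(spinlock_t *l)
{
    spin_lock(l);
    counter++;
    spin_unlock(l);
}
```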
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:15 ` Nick Piggin
2007-08-17 4:02 ` Paul Mackerras
2007-08-17 7:25 ` Stefan Richter
@ 2007-08-17 22:14 ` Segher Boessenkool
2 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:14 UTC (permalink / raw)
To: Nick Piggin
Cc: paulmck, Christoph Lameter, Paul Mackerras, heiko.carstens,
Stefan Richter, horms, Satyam Sharma, Linux Kernel Mailing List,
rpjday, netdev, ak, cfriesen, jesper.juhl, linux-arch,
Andrew Morton, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
> (and yes, it is perfectly legitimate to
> want a non-volatile read for a data type that you also want to do
> atomic RMW operations on)
...which is undefined behaviour in C (and GCC) when that data is
declared volatile, which is a good argument against implementing
atomics that way in itself.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 10:42 ` Herbert Xu
2007-08-16 16:34 ` Paul E. McKenney
@ 2007-08-17 5:04 ` Paul Mackerras
1 sibling, 0 replies; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-17 5:04 UTC (permalink / raw)
To: Herbert Xu
Cc: Stefan Richter, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> So the point here is that if you don't mind getting a stale
> value from the CPU cache when doing an atomic_read, then
> surely you won't mind getting a stale value from the compiler
> "cache".
No, that particular argument is bogus, because there is a cache
coherency protocol operating to keep the CPU cache coherent with
stores from other CPUs, but there isn't any such protocol (nor should
there be) for a register used as a "cache".
(Linux requires SMP systems to keep any CPU caches coherent as far as
accesses by other CPUs are concerned. It doesn't support any SMP
systems that are not cache-coherent as far as CPU accesses are
concerned. It does support systems with non-cache-coherent DMA.)
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 9:54 ` Stefan Richter
2007-08-16 10:31 ` Stefan Richter
@ 2007-08-16 10:35 ` Herbert Xu
1 sibling, 0 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-16 10:35 UTC (permalink / raw)
To: Stefan Richter
Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 11:54:44AM +0200, Stefan Richter wrote:
>
> One example was discussed here earlier: The for (;;) loop in
> nodemgr_host_thread. There an msleep_interruptible implicitly acted as
a barrier (at the moment because it's in a different translation unit; if
> it were the same, then because it hopefully has own barriers). So that
> happens to work, although such an implicit barrier is bad style: Better
> enforce the desired behaviour (== guaranteed load operation) *explicitly*.
Hmm, it's not bad style at all. Let's assume that everything
is in the same scope. Such a loop must either call a function
that busy-waits, which should always have a cpu_relax or
something equivalent, or it'll call a function that schedules
away, which immediately invalidates any values the compiler might
have cached for the atomic_read.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 8:10 ` Herbert Xu
2007-08-16 9:54 ` Stefan Richter
@ 2007-08-16 19:48 ` Chris Snook
2007-08-17 0:02 ` Herbert Xu
2007-08-17 5:09 ` Paul Mackerras
2 siblings, 1 reply; 1651+ messages in thread
From: Chris Snook @ 2007-08-16 19:48 UTC (permalink / raw)
To: Herbert Xu
Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
>>> Do you (or anyone else for that matter) have an example of this?
>> The only code I somewhat know, the ieee1394 subsystem, was perhaps
>> authored and is currently maintained with the expectation that each
>> occurrence of atomic_read actually results in a load operation, i.e. is
>> not optimized away. This means all atomic_t (bus generation, packet and
>> buffer refcounts, and some other state variables)* and likewise all
>> atomic bitops in that subsystem.
>
> Can you find an actual atomic_read code snippet there that is
> broken without the volatile modifier?
A whole bunch of atomic_read uses will be broken without the volatile
modifier once we start removing barriers that aren't needed if volatile
behavior is guaranteed.
barrier() clobbers all your registers. volatile atomic_read() only
clobbers one register, and more often than not it's a register you
wanted to clobber anyway.
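A minimal sketch of that difference. The volatile-cast form is the behaviour this patchset gives atomic_read(); the plain form is what some architectures had before:

```c
#include <assert.h>

typedef struct { int counter; } atomic_t;

/* Plain load: the compiler may hoist this out of a loop or reuse a
 * previously loaded register copy. */
int atomic_read_plain(const atomic_t *v)
{
    return v->counter;
}

/* Volatile cast: forces a fresh load on every call, but touches only
 * this one location -- unlike barrier(), which forces the compiler to
 * discard *all* register-cached memory values. */
int atomic_read_volatile(const atomic_t *v)
{
    return *(const volatile int *)&v->counter;
}
```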
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 19:48 ` Chris Snook
@ 2007-08-17 0:02 ` Herbert Xu
2007-08-17 2:04 ` Chris Snook
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-17 0:02 UTC (permalink / raw)
To: Chris Snook
Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 03:48:54PM -0400, Chris Snook wrote:
>
> >Can you find an actual atomic_read code snippet there that is
> >broken without the volatile modifier?
>
> A whole bunch of atomic_read uses will be broken without the volatile
> modifier once we start removing barriers that aren't needed if volatile
> behavior is guaranteed.
Could you please cite the file/function names so we can
see whether removing the barrier makes sense?
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 0:02 ` Herbert Xu
@ 2007-08-17 2:04 ` Chris Snook
2007-08-17 2:13 ` Herbert Xu
2007-08-17 2:31 ` Nick Piggin
0 siblings, 2 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-17 2:04 UTC (permalink / raw)
To: Herbert Xu
Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 03:48:54PM -0400, Chris Snook wrote:
>>> Can you find an actual atomic_read code snippet there that is
>>> broken without the volatile modifier?
>> A whole bunch of atomic_read uses will be broken without the volatile
>> modifier once we start removing barriers that aren't needed if volatile
>> behavior is guaranteed.
>
> Could you please cite the file/function names so we can
> see whether removing the barrier makes sense?
>
> Thanks,
At a glance, several architectures' implementations of smp_call_function() have
one or more legitimate atomic_read() busy-waits that shouldn't be using
cpu_relax(). Some of them do work in the loop.
I'm sure there are plenty more examples that various maintainers could find in
their own code.
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 2:04 ` Chris Snook
@ 2007-08-17 2:13 ` Herbert Xu
2007-08-17 2:31 ` Nick Piggin
1 sibling, 0 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-08-17 2:13 UTC (permalink / raw)
To: Chris Snook
Cc: Stefan Richter, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 10:04:24PM -0400, Chris Snook wrote:
>
> >Could you please cite the file/function names so we can
> >see whether removing the barrier makes sense?
>
> At a glance, several architectures' implementations of smp_call_function()
> have one or more legitimate atomic_read() busy-waits that shouldn't be
> using cpu_relax(). Some of them do work in the loop.
Care to name one so we can discuss it?
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 2:04 ` Chris Snook
2007-08-17 2:13 ` Herbert Xu
@ 2007-08-17 2:31 ` Nick Piggin
1 sibling, 0 replies; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 2:31 UTC (permalink / raw)
To: Chris Snook
Cc: Herbert Xu, Stefan Richter, Paul Mackerras, Satyam Sharma,
Christoph Lameter, Paul E. McKenney, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Chris Snook wrote:
> Herbert Xu wrote:
>
>> On Thu, Aug 16, 2007 at 03:48:54PM -0400, Chris Snook wrote:
>>
>>>> Can you find an actual atomic_read code snippet there that is
>>>> broken without the volatile modifier?
>>>
>>> A whole bunch of atomic_read uses will be broken without the volatile
>>> modifier once we start removing barriers that aren't needed if
>>> volatile behavior is guaranteed.
>>
>>
>> Could you please cite the file/function names so we can
>> see whether removing the barrier makes sense?
>>
>> Thanks,
>
>
> At a glance, several architectures' implementations of
> smp_call_function() have one or more legitimate atomic_read() busy-waits
> that shouldn't be using cpu_relax(). Some of them do work in the loop.
sh looks like the only one there that would be broken (and that's only
because they don't have a cpu_relax there, but it should be added anyway).
Sure, if we removed volatile from other architectures, it would be wise
to audit arch code because arch maintainers do sometimes make assumptions
about their implementation details... however we can assume most generic
code is safe without volatile.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 8:10 ` Herbert Xu
2007-08-16 9:54 ` Stefan Richter
2007-08-16 19:48 ` Chris Snook
@ 2007-08-17 5:09 ` Paul Mackerras
2007-08-17 5:32 ` Herbert Xu
2 siblings, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-17 5:09 UTC (permalink / raw)
To: Herbert Xu
Cc: Stefan Richter, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> Can you find an actual atomic_read code snippet there that is
> broken without the volatile modifier?
There are some in arch-specific code, for example line 1073 of
arch/mips/kernel/smtc.c. On mips, cpu_relax() is just barrier(), so
the empty loop body is ok provided that atomic_read actually does the
load each time around the loop.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 5:09 ` Paul Mackerras
@ 2007-08-17 5:32 ` Herbert Xu
2007-08-17 5:41 ` Paul Mackerras
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-17 5:32 UTC (permalink / raw)
To: Paul Mackerras
Cc: Stefan Richter, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, Aug 17, 2007 at 03:09:57PM +1000, Paul Mackerras wrote:
> Herbert Xu writes:
>
> > Can you find an actual atomic_read code snippet there that is
> > broken without the volatile modifier?
>
> There are some in arch-specific code, for example line 1073 of
> arch/mips/kernel/smtc.c. On mips, cpu_relax() is just barrier(), so
> the empty loop body is ok provided that atomic_read actually does the
> load each time around the loop.
A barrier() is all you need to force the compiler to reread
the value.
The people advocating volatile in this thread are talking
about code that doesn't use barrier()/cpu_relax().
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 5:32 ` Herbert Xu
@ 2007-08-17 5:41 ` Paul Mackerras
2007-08-17 8:28 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-17 5:41 UTC (permalink / raw)
To: Herbert Xu
Cc: Stefan Richter, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Herbert Xu writes:
> On Fri, Aug 17, 2007 at 03:09:57PM +1000, Paul Mackerras wrote:
> > Herbert Xu writes:
> >
> > > Can you find an actual atomic_read code snippet there that is
> > > broken without the volatile modifier?
> >
> > There are some in arch-specific code, for example line 1073 of
> > arch/mips/kernel/smtc.c. On mips, cpu_relax() is just barrier(), so
> > the empty loop body is ok provided that atomic_read actually does the
> > load each time around the loop.
>
> A barrier() is all you need to force the compiler to reread
> the value.
>
> The people advocating volatile in this thread are talking
> about code that doesn't use barrier()/cpu_relax().
Did you look at it? Here it is:
/* Someone else is initializing in parallel - let 'em finish */
while (atomic_read(&idle_hook_initialized) < 1000)
;
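For comparison, a hedged sketch of how the same wait stays correct even with a plain (non-volatile) read, by putting cpu_relax() in the loop body; all names below are illustrative stand-ins, not the actual smtc.c code:

```c
#include <assert.h>

#define barrier() __asm__ __volatile__("" ::: "memory")
/* On mips, cpu_relax() is just barrier(); modelled the same way here. */
#define cpu_relax() barrier()

typedef struct { int counter; } atomic_t;

/* Plain read, no volatile: only safe in a loop whose body contains a
 * compiler barrier that forces the value to be reloaded. */
static inline int atomic_read_plain(const atomic_t *v)
{
	return v->counter;
}

static void wait_for_init(const atomic_t *initialized)
{
	while (atomic_read_plain(initialized) < 1000)
		cpu_relax();	/* compiler must re-read the counter */
}
```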
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 5:41 ` Paul Mackerras
@ 2007-08-17 8:28 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 8:28 UTC (permalink / raw)
To: Paul Mackerras
Cc: Herbert Xu, Stefan Richter, Christoph Lameter, Paul E. McKenney,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Fri, 17 Aug 2007, Paul Mackerras wrote:
> Herbert Xu writes:
>
> > On Fri, Aug 17, 2007 at 03:09:57PM +1000, Paul Mackerras wrote:
> > > Herbert Xu writes:
> > >
> > > > Can you find an actual atomic_read code snippet there that is
> > > > broken without the volatile modifier?
> > >
> > > There are some in arch-specific code, for example line 1073 of
> > > arch/mips/kernel/smtc.c. On mips, cpu_relax() is just barrier(), so
> > > the empty loop body is ok provided that atomic_read actually does the
> > > load each time around the loop.
> >
> > A barrier() is all you need to force the compiler to reread
> > the value.
> >
> > The people advocating volatile in this thread are talking
> > about code that doesn't use barrier()/cpu_relax().
>
> Did you look at it? Here it is:
>
> /* Someone else is initializing in parallel - let 'em finish */
> while (atomic_read(&idle_hook_initialized) < 1000)
> ;
Honestly, this thread is suffering from HUGE communication gaps.
What Herbert (obviously) meant there was that "this loop could've
been okay _without_ using volatile-semantics-atomic_read() also, if
only it used cpu_relax()".
That does work, because cpu_relax() is _at least_ barrier() on all
archs (on some it also emits some arch-dependent "pause" kind of
instruction).
Now, saying that "MIPS does not have such an instruction so I won't
use cpu_relax() for arch-dependent-busy-while-loops in arch/mips/"
sounds like a wrong argument, because: tomorrow, such arch's _may_
introduce such an instruction, so naturally, at that time we'd
change cpu_relax() appropriately (in reality, we would actually
*re-define* cpu_relax() and ensure that the correct version gets
pulled in depending on whether the callsite code is legacy or only
for the newer such CPUs of said arch, whatever), but loops such as
this would remain un-changed, because they never used cpu_relax()!
OTOH an argument that said the following would've made a stronger case:
"I don't want to use cpu_relax() because that's a full memory
clobber barrier() and I have loop-invariants / other variables
around in that code that I *don't* want the compiler to forget
just because it used cpu_relax(), and hence I will not use
cpu_relax() but instead make my atomic_read() itself have
"volatility" semantics. Not just that, but I will introduce a
cpu_relax_no_barrier() on MIPS, that would be a no-op #define
for now, but which may not be so forever, and continue to use
that in such busy loops."
In general, please read the thread-summary I've tried to do at:
http://lkml.org/lkml/2007/8/17/25
Feel free to continue / comment / correct stuff from there, there's
too much confusion and circular-arguments happening on this thread
otherwise.
[ I might've made an incorrect statement there about
"volatile" w.r.t. cache on non-x86 archs, I think. ]
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 7:09 ` Herbert Xu
2007-08-16 8:06 ` Stefan Richter
@ 2007-08-16 14:48 ` Ilpo Järvinen
2007-08-16 16:19 ` Stefan Richter
` (2 more replies)
1 sibling, 3 replies; 1651+ messages in thread
From: Ilpo Järvinen @ 2007-08-16 14:48 UTC (permalink / raw)
To: Herbert Xu
Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On Thu, 16 Aug 2007, Herbert Xu wrote:
> We've been through that already. If it's a busy-wait it
> should use cpu_relax.
I looked around a bit by using some command lines and ended up wondering
if these are equal to busy-wait case (and should be fixed) or not:
./drivers/telephony/ixj.c
6674: while (atomic_read(&j->DSPWrite) > 0)
6675- atomic_dec(&j->DSPWrite);
...besides that, there are a couple more similar cases in the same file
(with braces)...
--
i.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 14:48 ` Ilpo Järvinen
@ 2007-08-16 16:19 ` Stefan Richter
2007-08-16 19:55 ` Chris Snook
2007-08-16 19:55 ` Chris Snook
2 siblings, 0 replies; 1651+ messages in thread
From: Stefan Richter @ 2007-08-16 16:19 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Ilpo Järvinen wrote:
> I looked around a bit by using some command lines and ended up wondering
> if these are equal to busy-wait case (and should be fixed) or not:
>
> ./drivers/telephony/ixj.c
> 6674: while (atomic_read(&j->DSPWrite) > 0)
> 6675- atomic_dec(&j->DSPWrite);
>
> ...besides that, there are a couple more similar cases in the same file
> (with braces)...
Generally, ixj.c has several occurrences of paired atomic writes and
atomic reads which potentially do not do what the author wanted.
--
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 14:48 ` Ilpo Järvinen
2007-08-16 16:19 ` Stefan Richter
@ 2007-08-16 19:55 ` Chris Snook
2007-08-16 20:20 ` Christoph Lameter
2007-08-16 21:08 ` Luck, Tony
2007-08-16 19:55 ` Chris Snook
2 siblings, 2 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-16 19:55 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Ilpo Järvinen wrote:
> On Thu, 16 Aug 2007, Herbert Xu wrote:
>
>> We've been through that already. If it's a busy-wait it
>> should use cpu_relax.
>
> I looked around a bit by using some command lines and ended up wondering
> if these are equal to busy-wait case (and should be fixed) or not:
>
> ./drivers/telephony/ixj.c
> 6674: while (atomic_read(&j->DSPWrite) > 0)
> 6675- atomic_dec(&j->DSPWrite);
>
> ...besides that, there are a couple more similar cases in the same file
> (with braces)...
atomic_dec() already has volatile behavior everywhere, so this is
semantically okay, but this code (and any like it) should be calling
cpu_relax() each iteration through the loop, unless there's a compelling
reason not to. I'll allow that for some hardware drivers (possibly this
one) such a compelling reason may exist, but hardware-independent core
subsystems probably have no excuse.
If the maintainer of this code doesn't see a compelling reason to add
cpu_relax() in this loop, then it should be patched.
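A hedged userspace sketch of that pattern with cpu_relax() added; the atomic helpers below are single-threaded stand-ins for illustration, not the driver's real code:

```c
#include <assert.h>

#define barrier() __asm__ __volatile__("" ::: "memory")
#define cpu_relax() barrier()

typedef struct { int counter; } atomic_t;

static inline int atomic_read_sketch(atomic_t *v)
{
	return *(volatile int *)&v->counter;
}

static inline void atomic_dec_sketch(atomic_t *v)
{
	*(volatile int *)&v->counter -= 1;	/* models atomic_dec() */
}

/* Drain the counter to zero, relaxing each time around the loop. */
static void drain(atomic_t *dsp_write)
{
	while (atomic_read_sketch(dsp_write) > 0) {
		atomic_dec_sketch(dsp_write);
		cpu_relax();
	}
}
```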
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 19:55 ` Chris Snook
@ 2007-08-16 20:20 ` Christoph Lameter
2007-08-17 1:02 ` Paul E. McKenney
` (2 more replies)
2007-08-16 21:08 ` Luck, Tony
1 sibling, 3 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 20:20 UTC (permalink / raw)
To: Chris Snook
Cc: Ilpo Järvinen, Herbert Xu, Paul Mackerras, Satyam Sharma,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, 16 Aug 2007, Chris Snook wrote:
> atomic_dec() already has volatile behavior everywhere, so this is semantically
> okay, but this code (and any like it) should be calling cpu_relax() each
> iteration through the loop, unless there's a compelling reason not to. I'll
> allow that for some hardware drivers (possibly this one) such a compelling
> reason may exist, but hardware-independent core subsystems probably have no
> excuse.
No it does not have any volatile semantics. atomic_dec() can be reordered
at will by the compiler within the current basic block if you do not add a
barrier.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 20:20 ` Christoph Lameter
@ 2007-08-17 1:02 ` Paul E. McKenney
2007-08-17 1:28 ` Herbert Xu
2007-08-17 2:16 ` Paul Mackerras
2007-08-17 17:41 ` Segher Boessenkool
2 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 1:02 UTC (permalink / raw)
To: Christoph Lameter
Cc: Chris Snook, Ilpo Järvinen, Herbert Xu, Paul Mackerras,
Satyam Sharma, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Thu, Aug 16, 2007 at 01:20:26PM -0700, Christoph Lameter wrote:
> On Thu, 16 Aug 2007, Chris Snook wrote:
>
> > atomic_dec() already has volatile behavior everywhere, so this is semantically
> > okay, but this code (and any like it) should be calling cpu_relax() each
> > iteration through the loop, unless there's a compelling reason not to. I'll
> > allow that for some hardware drivers (possibly this one) such a compelling
> > reason may exist, but hardware-independent core subsystems probably have no
> > excuse.
>
> No it does not have any volatile semantics. atomic_dec() can be reordered
> at will by the compiler within the current basic block if you do not add a
> barrier.
Yep. Or you can use atomic_dec_return() instead of using a barrier.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 1:02 ` Paul E. McKenney
@ 2007-08-17 1:28 ` Herbert Xu
2007-08-17 5:07 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-17 1:28 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Christoph Lameter, Chris Snook, Ilpo Järvinen,
Paul Mackerras, Satyam Sharma, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On Thu, Aug 16, 2007 at 06:02:32PM -0700, Paul E. McKenney wrote:
>
> Yep. Or you can use atomic_dec_return() instead of using a barrier.
Or you could use smp_mb__{before,after}_atomic_dec.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 1:28 ` Herbert Xu
@ 2007-08-17 5:07 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 5:07 UTC (permalink / raw)
To: Herbert Xu
Cc: Christoph Lameter, Chris Snook, Ilpo Järvinen,
Paul Mackerras, Satyam Sharma, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On Fri, Aug 17, 2007 at 09:28:00AM +0800, Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 06:02:32PM -0700, Paul E. McKenney wrote:
> >
> > Yep. Or you can use atomic_dec_return() instead of using a barrier.
>
> Or you could use smp_mb__{before,after}_atomic_dec.
Yep. That would be an example of a barrier, either in the
atomic_dec() itself or in the smp_mb...().
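The alternatives mentioned in this subthread can be sketched side by side; this is a hedged single-threaded model in which the smp_mb stubs merely stand in for the real arch-specific barriers:

```c
#include <assert.h>

#define barrier() __asm__ __volatile__("" ::: "memory")
/* Stubs standing in for the real SMP memory barriers. */
#define smp_mb__before_atomic_dec()	barrier()
#define smp_mb__after_atomic_dec()	barrier()

typedef struct { int counter; } atomic_t;

static inline void atomic_dec_sketch(atomic_t *v)
{
	*(volatile int *)&v->counter -= 1;	/* models atomic_dec() */
}

/* A dec that returns the new value; such read-modify-write variants
 * imply full ordering on real hardware. */
static inline int atomic_dec_return_sketch(atomic_t *v)
{
	atomic_dec_sketch(v);
	return *(volatile int *)&v->counter;
}

static void demo(atomic_t *v)
{
	/* Explicit barrier() around a plain atomic_dec() */
	barrier();
	atomic_dec_sketch(v);
	barrier();

	/* The dedicated before/after barrier hooks */
	smp_mb__before_atomic_dec();
	atomic_dec_sketch(v);
	smp_mb__after_atomic_dec();
}
```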
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 20:20 ` Christoph Lameter
2007-08-17 1:02 ` Paul E. McKenney
@ 2007-08-17 2:16 ` Paul Mackerras
2007-08-17 3:03 ` Linus Torvalds
2007-08-17 17:41 ` Segher Boessenkool
2 siblings, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-17 2:16 UTC (permalink / raw)
To: Christoph Lameter
Cc: Chris Snook, Ilpo Järvinen, Herbert Xu, Satyam Sharma,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Christoph Lameter writes:
> No it does not have any volatile semantics. atomic_dec() can be reordered
> at will by the compiler within the current basic block if you do not add a
> barrier.
Volatile doesn't mean it can't be reordered; volatile means the
accesses can't be eliminated.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 2:16 ` Paul Mackerras
@ 2007-08-17 3:03 ` Linus Torvalds
2007-08-17 3:43 ` Paul Mackerras
2007-08-17 22:09 ` Segher Boessenkool
0 siblings, 2 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-17 3:03 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, Chris Snook, Ilpo Järvinen, Herbert Xu,
Satyam Sharma, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Paul Mackerras wrote:
>
> Volatile doesn't mean it can't be reordered; volatile means the
> accesses can't be eliminated.
It also does limit re-ordering.
Of course, since *normal* accesses aren't necessarily limited wrt
re-ordering, the question then becomes one of "with regard to *what* does
it limit re-ordering?".
A C compiler that re-orders two different volatile accesses that have a
sequence point in between them is pretty clearly a buggy compiler. So at a
minimum, it limits re-ordering wrt other volatiles (assuming sequence
points exists). It also means that the compiler cannot move it
speculatively across conditionals, but other than that it's starting to
get fuzzy.
In general, I'd *much* rather we used barriers. Anything that "depends" on
volatile is pretty much set up to be buggy. But I'm certainly also willing
to have that volatile inside "atomic_read/atomic_set()" if it avoids code
that would otherwise break - ie if it hides a bug.
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:03 ` Linus Torvalds
@ 2007-08-17 3:43 ` Paul Mackerras
2007-08-17 3:53 ` Herbert Xu
2007-08-17 22:09 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-17 3:43 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Lameter, Chris Snook, Ilpo Järvinen, Herbert Xu,
Satyam Sharma, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Linus Torvalds writes:
> In general, I'd *much* rather we used barriers. Anything that "depends" on
> volatile is pretty much set up to be buggy. But I'm certainly also willing
> to have that volatile inside "atomic_read/atomic_set()" if it avoids code
> that would otherwise break - ie if it hides a bug.
The cost of doing so seems to me to be well down in the noise - 44
bytes of extra kernel text on a ppc64 G5 config, and I don't believe
the extra few cycles for the occasional extra load would be measurable
(they should all hit in the L1 dcache). I don't mind if x86[-64] have
atomic_read/set be nonvolatile and find all the missing barriers, but
for now on powerpc, I think that not having to find those missing
barriers is worth the 0.00076% increase in kernel text size.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:43 ` Paul Mackerras
@ 2007-08-17 3:53 ` Herbert Xu
2007-08-17 6:26 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-17 3:53 UTC (permalink / raw)
To: Paul Mackerras
Cc: Linus Torvalds, Christoph Lameter, Chris Snook, Ilpo Järvinen,
Satyam Sharma, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:
>
> The cost of doing so seems to me to be well down in the noise - 44
> bytes of extra kernel text on a ppc64 G5 config, and I don't believe
> the extra few cycles for the occasional extra load would be measurable
> (they should all hit in the L1 dcache). I don't mind if x86[-64] have
> atomic_read/set be nonvolatile and find all the missing barriers, but
> for now on powerpc, I think that not having to find those missing
> barriers is worth the 0.00076% increase in kernel text size.
BTW, the sort of missing barriers that triggered this thread
aren't that subtle. It'll result in a simple lock-up if the
loop condition holds upon entry. At which point it's fairly
straightforward to find the culprit.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:53 ` Herbert Xu
@ 2007-08-17 6:26 ` Satyam Sharma
2007-08-17 8:38 ` Nick Piggin
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 6:26 UTC (permalink / raw)
To: Herbert Xu
Cc: Paul Mackerras, Linus Torvalds, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Herbert Xu wrote:
> On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:
> >
> > The cost of doing so seems to me to be well down in the noise - 44
> > bytes of extra kernel text on a ppc64 G5 config, and I don't believe
> > the extra few cycles for the occasional extra load would be measurable
> > (they should all hit in the L1 dcache). I don't mind if x86[-64] have
> > atomic_read/set be nonvolatile and find all the missing barriers, but
> > for now on powerpc, I think that not having to find those missing
> > barriers is worth the 0.00076% increase in kernel text size.
>
> BTW, the sort of missing barriers that triggered this thread
> aren't that subtle. It'll result in a simple lock-up if the
> loop condition holds upon entry. At which point it's fairly
> straightforward to find the culprit.
Not necessarily. A barrier-less buggy code such as below:
atomic_set(&v, 0);
... /* some initial code */
while (atomic_read(&v))
;
... /* code that MUST NOT be executed unless v becomes non-zero */
(where v->counter has no volatile access semantics)
could be generated by the compiler to simply *elide* or *do away* with
the loop itself, thereby making the:
"/* code that MUST NOT be executed unless v becomes non-zero */"
to be executed even when v is zero! That is subtle indeed, and causes
no hard lockups.
Granted, the above IS buggy code. But, the stated objective is to avoid
heisenbugs. And we have driver / subsystem maintainers such as Stefan
coming up and admitting that often a lot of code that's written to use
atomic_read() does assume the read will not be elided by the compiler.
See, I agree, "volatility" semantics != what we often want. However, if
what we want is compiler barrier, for only the object under consideration,
"volatility" semantics aren't really "nonsensical" or anything.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 6:26 ` Satyam Sharma
@ 2007-08-17 8:38 ` Nick Piggin
2007-08-17 9:14 ` Satyam Sharma
2007-08-17 11:08 ` Stefan Richter
0 siblings, 2 replies; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 8:38 UTC (permalink / raw)
To: Satyam Sharma
Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Satyam Sharma wrote:
>
> On Fri, 17 Aug 2007, Herbert Xu wrote:
>
>
>>On Fri, Aug 17, 2007 at 01:43:27PM +1000, Paul Mackerras wrote:
>>
>>BTW, the sort of missing barriers that triggered this thread
>>aren't that subtle. It'll result in a simple lock-up if the
>>loop condition holds upon entry. At which point it's fairly
>>straightforward to find the culprit.
>
>
> Not necessarily. A barrier-less buggy code such as below:
>
> atomic_set(&v, 0);
>
> ... /* some initial code */
>
> while (atomic_read(&v))
> ;
>
> ... /* code that MUST NOT be executed unless v becomes non-zero */
>
> (where v->counter has no volatile access semantics)
>
> could be generated by the compiler to simply *elide* or *do away* with
> the loop itself, thereby making the:
>
> "/* code that MUST NOT be executed unless v becomes non-zero */"
>
> to be executed even when v is zero! That is subtle indeed, and causes
> no hard lockups.
Then I presume you mean
while (!atomic_read(&v))
;
Which is just the same old infinite loop bug solved with cpu_relax().
These are pretty trivial to audit and fix, and also to debug, I would
think.
> Granted, the above IS buggy code. But, the stated objective is to avoid
> heisenbugs.
Anyway, why are you making up code snippets that are buggy in other
ways in order to support this assertion being made that lots of kernel
code supposedly depends on volatile semantics. Just reference the
actual code.
> And we have driver / subsystem maintainers such as Stefan
> coming up and admitting that often a lot of code that's written to use
> atomic_read() does assume the read will not be elided by the compiler.
So these are broken on i386 and x86-64?
Are they definitely safe on SMP and weakly ordered machines with
just a simple compiler barrier there? Because I would not be
surprised if there are a lot of developers who don't really know
what to assume when it comes to memory ordering issues.
This is not a dig at driver writers: we still have memory ordering
problems in the VM too (and probably most of the subtle bugs in
lockless VM code are memory ordering ones). Let's not make up a
false sense of security and hope that sprinkling volatile around
will allow people to write bug-free lockless code. If a writer
can't be bothered reading API documentation and learning the Linux
memory model, they can still be productive writing safely locked
code.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 8:38 ` Nick Piggin
@ 2007-08-17 9:14 ` Satyam Sharma
2007-08-17 9:31 ` Nick Piggin
2007-08-17 11:08 ` Stefan Richter
1 sibling, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 9:14 UTC (permalink / raw)
To: Nick Piggin
Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
> [...]
> > Granted, the above IS buggy code. But, the stated objective is to avoid
> > heisenbugs.
^^^^^^^^^^
> Anyway, why are you making up code snippets that are buggy in other
> ways in order to support this assertion being made that lots of kernel
> code supposedly depends on volatile semantics. Just reference the
> actual code.
Because the point is *not* about existing bugs in kernel code. At some
point Chris Snook (who started this thread) did write that "If I knew
of the existing bugs in the kernel, I would be sending patches for them,
not this series" or something to that effect.
The point is about *author expectations*. If people do expect atomic_read()
(or a variant thereof) to have volatile semantics, why not give them such
a variant?
And by the way, the point is *also* about the fact that cpu_relax(), as
of today, implies a full memory clobber, which is not what a lot of such
loops want. (due to stuff mentioned elsewhere, summarized in that summary)
> > And we have driver / subsystem maintainers such as Stefan
> > coming up and admitting that often a lot of code that's written to use
> > atomic_read() does assume the read will not be elided by the compiler.
^^^^^^^^^^^^^
(so it's about compiler barrier expectations only, though I fully agree
that those who're using atomic_t as if it were some magic thing that lets
them write lockless code are sorrily mistaken.)
> So these are broken on i386 and x86-64?
Possibly, but the point is not about existing bugs, as mentioned above.
Some such bugs have been found nonetheless -- reminds me, can somebody
please apply http://www.gossamer-threads.com/lists/linux/kernel/810674 ?
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 9:14 ` Satyam Sharma
@ 2007-08-17 9:31 ` Nick Piggin
2007-08-17 10:55 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 9:31 UTC (permalink / raw)
To: Satyam Sharma
Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Satyam Sharma wrote:
>
> On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>
>>Satyam Sharma wrote:
>>[...]
>>
>>>Granted, the above IS buggy code. But, the stated objective is to avoid
>>>heisenbugs.
>
> ^^^^^^^^^^
>
>
>>Anyway, why are you making up code snippets that are buggy in other
>>ways in order to support this assertion being made that lots of kernel
>>code supposedly depends on volatile semantics. Just reference the
>>actual code.
>
>
> Because the point is *not* about existing bugs in kernel code. At some
> point Chris Snook (who started this thread) did write that "If I knew
> of the existing bugs in the kernel, I would be sending patches for them,
> not this series" or something to that effect.
>
> The point is about *author expectations*. If people do expect atomic_read()
> (or a variant thereof) to have volatile semantics, why not give them such
> a variant?
Because they should be thinking about them in terms of barriers, over
which the compiler / CPU is not to reorder accesses or cache memory
operations, rather than "special" "volatile" accesses. Linux's whole
memory ordering and locking model is completely geared around the
former.
> And by the way, the point is *also* about the fact that cpu_relax(), as
> of today, implies a full memory clobber, which is not what a lot of such
> loops want. (due to stuff mentioned elsewhere, summarized in that summary)
That's not the point, because as I also mentioned, the logical extension
to Linux's barrier API to handle this is the order(x) macro. Again, not
special volatile accessors.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 9:31 ` Nick Piggin
@ 2007-08-17 10:55 ` Satyam Sharma
2007-08-17 12:39 ` Nick Piggin
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 10:55 UTC (permalink / raw)
To: Nick Piggin
Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
> > [...]
> > The point is about *author expectations*. If people do expect atomic_read()
> > (or a variant thereof) to have volatile semantics, why not give them such
> > a variant?
>
> Because they should be thinking about them in terms of barriers, over
> which the compiler / CPU is not to reorder accesses or cache memory
> operations, rather than "special" "volatile" accesses.
This is obviously just a taste thing. Whether to have that forget(x)
barrier as something author should explicitly sprinkle appropriately
in appropriate places in the code by himself or use a primitive that
includes it itself.
I'm not saying "taste matters aren't important" (they are), but I'm really
skeptical if most folks would find the former tasteful.
> > And by the way, the point is *also* about the fact that cpu_relax(), as
> > of today, implies a full memory clobber, which is not what a lot of such
> > loops want. (due to stuff mentioned elsewhere, summarized in that summary)
>
> That's not the point,
That's definitely the point, why not. This is why "barrier()", being
heavy-handed, is not the best option.
> because as I also mentioned, the logical extension
> to Linux's barrier API to handle this is the order(x) macro. Again, not
> special volatile accessors.
Sure, that forget(x) macro _is_ proposed to be made part of the generic
API. Doesn't explain why not to define/use primitives that have volatility
semantics in themselves, though (taste matters apart).
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 10:55 ` Satyam Sharma
@ 2007-08-17 12:39 ` Nick Piggin
2007-08-17 13:36 ` Satyam Sharma
2007-08-17 16:48 ` Linus Torvalds
0 siblings, 2 replies; 1651+ messages in thread
From: Nick Piggin @ 2007-08-17 12:39 UTC (permalink / raw)
To: Satyam Sharma
Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Satyam Sharma wrote:
>
>On Fri, 17 Aug 2007, Nick Piggin wrote:
>
>
>>Because they should be thinking about them in terms of barriers, over
>>which the compiler / CPU is not to reorder accesses or cache memory
>>operations, rather than "special" "volatile" accesses.
>>
>
>This is obviously just a taste thing. Whether to have that forget(x)
>barrier as something author should explicitly sprinkle appropriately
>in appropriate places in the code by himself or use a primitive that
>includes it itself.
>
That's not obviously just taste to me. Not when the primitive has many
(perhaps, the majority) of uses that do not require said barriers. And
this is not solely about the code generation (which, as Paul says, is
relatively minor even on x86). I prefer people to think explicitly
about barriers in their lockless code.
>I'm not saying "taste matters aren't important" (they are), but I'm really
>skeptical if most folks would find the former tasteful.
>
So I /do/ have better taste than most folks? Thanks! :-)
>>>And by the way, the point is *also* about the fact that cpu_relax(), as
>>>of today, implies a full memory clobber, which is not what a lot of such
>>>loops want. (due to stuff mentioned elsewhere, summarized in that summary)
>>>
>>That's not the point,
>>
>
>That's definitely the point, why not. This is why "barrier()", being
>heavy-handed, is not the best option.
>
That is _not_ the point (of why a volatile atomic_read is good) because
there has already been an alternative posted that better conforms with
the Linux barrier API and is much more widely useful and more usable. If
you are so worried about barrier() being too heavyweight, then you're
off to a poor start by wanting to add a few K of kernel text by making
atomic_read volatile.
>>because as I also mentioned, the logical extension
>>to Linux's barrier API to handle this is the order(x) macro. Again, not
>>special volatile accessors.
>>
>
>Sure, that forget(x) macro _is_ proposed to be made part of the generic
>API. Doesn't explain why not to define/use primitives that have volatility
>semantics in themselves, though (taste matters apart).
>
If you follow the discussion.... You were thinking of a reason why the
semantics *should* be changed or added, and I was rebutting your argument
that it must be used when a full barrier() is too heavy (ie. by pointing
out that order() has superior semantics anyway).
Why do I keep repeating the same things? I'll not continue bloating this
thread until a new valid point comes up...
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 12:39 ` Nick Piggin
@ 2007-08-17 13:36 ` Satyam Sharma
2007-08-17 16:48 ` Linus Torvalds
1 sibling, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 13:36 UTC (permalink / raw)
To: Nick Piggin
Cc: Herbert Xu, Paul Mackerras, Linus Torvalds, Christoph Lameter,
Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Nick Piggin wrote:
> Satyam Sharma wrote:
>
> > On Fri, 17 Aug 2007, Nick Piggin wrote:
> >
> > > Because they should be thinking about them in terms of barriers, over
> > > which the compiler / CPU is not to reorder accesses or cache memory
> > > operations, rather than "special" "volatile" accesses.
> >
> > This is obviously just a taste thing. Whether to have that forget(x)
> > barrier as something author should explicitly sprinkle appropriately
> > in appropriate places in the code by himself or use a primitive that
> > includes it itself.
>
> That's not obviously just taste to me. Not when the primitive has many
> (perhaps, the majority) of uses that do not require said barriers. And
> this is not solely about the code generation (which, as Paul says, is
> relatively minor even on x86).
See, you do *require* people to repeat the same things to you!
As has been written about enough times already, and if you followed the
discussion on this thread, I am *not* proposing that atomic_read()'s
semantics be changed to have any extra barriers. What is proposed is a
different atomic_read_xxx() variant thereof, that those can use who do
want that.
Now whether to have a kind of barrier ("volatile", whatever) in the
atomic_read_xxx() itself, or whether to make the code writer himself to
explicitly write the order(x) appropriately in appropriate places in the
code _is_ a matter of taste.
> > That's definitely the point, why not. This is why "barrier()", being
> > heavy-handed, is not the best option.
>
> That is _not_ the point [...]
Again, you're requiring me to repeat things that were already made evident
on this thread (if you follow it).
This _is_ the point, because a lot of loops out there (too many of them,
I WILL NOT bother citing file_name:line_number) end up having to use a
barrier just because they're using a loop-exit-condition that depends
on a value returned by atomic_read(). It would be good for them if they
used an atomic_read_xxx() primitive that gave these "volatility" semantics
without junking compiler optimizations for other memory references.
> because there has already been an alternative posted
Whether that alternative (explicitly using forget(x), or wrappers thereof,
such as the "order_atomic" you proposed) is better than other alternatives
(such as atomic_read_xxx() which includes the volatility behaviour in
itself) is still open, and precisely what we started discussing just one
mail back.
(The above was also mostly stuff I had to repeat for you, sadly.)
> that better conforms with Linux barrier
> API and is much more widely useful and more usable.
I don't think so.
(Now *this* _is_ the "taste-dependent matter" that I mentioned earlier.)
> If you are so worried
> about
> barrier() being too heavyweight, then you're off to a poor start by wanting to
> add a few K of kernel text by making atomic_read volatile.
Repeating myself, for the N'th time, NO, I DON'T want to make atomic_read
have "volatile" semantics.
> > > because as I also mentioned, the logical extension
> > > to Linux's barrier API to handle this is the order(x) macro. Again, not
> > > special volatile accessors.
> >
> > Sure, that forget(x) macro _is_ proposed to be made part of the generic
> > API. Doesn't explain why not to define/use primitives that have volatility
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > semantics in themselves, though (taste matters apart).
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> If you follow the discussion.... You were thinking of a reason why the
> semantics *should* be changed or added, and I was rebutting your argument
> that it must be used when a full barrier() is too heavy (ie. by pointing
> out that order() has superior semantics anyway).
Amazing. Either you have reading comprehension problems, or else, please
try reading this thread (or at least this sub-thread) again. I don't want
_you_ blaming _me_ for having to repeat things to you all over again.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 12:39 ` Nick Piggin
2007-08-17 13:36 ` Satyam Sharma
@ 2007-08-17 16:48 ` Linus Torvalds
2007-08-17 18:50 ` Chris Friesen
` (2 more replies)
1 sibling, 3 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-17 16:48 UTC (permalink / raw)
To: Nick Piggin
Cc: Satyam Sharma, Herbert Xu, Paul Mackerras, Christoph Lameter,
Chris Snook, Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Nick Piggin wrote:
>
> That's not obviously just taste to me. Not when the primitive has many
> (perhaps, the majority) of uses that do not require said barriers. And
> this is not solely about the code generation (which, as Paul says, is
> relatively minor even on x86). I prefer people to think explicitly
> about barriers in their lockless code.
Indeed.
I think the important issues are:
- "volatile" itself is simply a badly/weakly defined issue. The semantics
of it as far as the compiler is concerned are really not very good, and
in practice tends to boil down to "I will generate so bad code that
nobody can accuse me of optimizing anything away".
- "volatile" - regardless of how well or badly defined it is - is purely
a compiler thing. It has absolutely no meaning for the CPU itself, so
it at no point implies any CPU barriers. As a result, even if the
compiler generates crap code and doesn't re-order anything, there's
nothing that says what the CPU will do.
- in other words, the *only* possible meaning for "volatile" is a purely
single-CPU meaning. And if you only have a single CPU involved in the
process, the "volatile" is by definition pointless (because even
without a volatile, the compiler is required to make the C code appear
consistent as far as a single CPU is concerned).
So, let's take the example *buggy* code where we use "volatile" to wait
for other CPU's:
atomic_set(&var, 0);
while (!atomic_read(&var))
/* nothing */;
which generates an endless loop if we don't have atomic_read() imply
volatile.
The point here is that it's buggy whether the volatile is there or not!
Exactly because the user expects multi-processing behaviour, but
"volatile" doesn't actually give any real guarantees about it. Another CPU
may have done:
external_ptr = kmalloc(..);
/* Setup is now complete, inform the waiter */
atomic_inc(&var);
but the fact is, since the other CPU isn't serialized in any way, the
"while-loop" (even in the presence of "volatile") doesn't actually work
right! Whatever the "atomic_read()" was waiting for may not have
completed, because we have no barriers!
So if "volatile" makes a difference, it is invariably a sign of a bug in
serialization (the one exception is for IO - we use "volatile" to avoid
having to use inline asm for IO on x86 - and for "random values" like
jiffies).
So the question should *not* be whether "volatile" actually fixes bugs. It
*never* fixes a bug. But what it can do is to hide the obvious ones. In
other words, adding a volatile in the above kind of situation of
"atomic_read()" will certainly turn an obvious bug into something that
works "practically all of the time".
So anybody who argues for "volatile" fixing bugs is fundamentally
incorrect. It does NO SUCH THING. By arguing that, such people only show
that they have no idea what they are talking about.
So the only reason to add back "volatile" to the atomic_read() sequence is
not to fix bugs, but to _hide_ the bugs better. They're still there, they
are just a lot harder to trigger, and tend to be a lot subtler.
And hey, sometimes "hiding bugs well enough" is ok. In this case, I'd
argue that we've successfully *not* had the volatile there for eight
months on x86-64, and that should tell people something.
(Does _removing_ the volatile fix bugs? No - callers still need to think
about barriers etc, and lots of people don't. So I'm not claiming that
removing volatile fixes any bugs either, but I *am* claiming that:
- removing volatile makes some bugs easier to see (which is mostly a good
thing: they were there before, anyway).
- removing volatile generates better code (which is a good thing, even if
it's just 0.1%)
- removing volatile removes a huge mental *bug* that lots of people seem
to have, as shown by this whole thread. Anybody who thinks that
"volatile" actually fixes anything has a gaping hole in their head, and
we should remove volatile just to make sure that nobody thinks that it
means something that it doesn't mean!
In other words, this whole discussion has just convinced me that we should
*not* add back "volatile" to "atomic_read()" - I was willing to do it for
practical and "hide the bugs" reasons, but having seen people argue for
it, thinking that it actually fixes something, I'm now convinced that the
*last* thing we should do is to encourage that kind of superstitious
thinking.
"volatile" is like a black cat crossing the road. Sure, it affects
*something* (at a minimum: before, the black cat was on one side of the
road, afterwards it is on the other side of the road), but it has no
bigger and longer-lasting direct effects.
People who think "volatile" really matters are just fooling themselves.
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 16:48 ` Linus Torvalds
@ 2007-08-17 18:50 ` Chris Friesen
2007-08-17 18:54 ` Arjan van de Ven
2007-08-17 19:08 ` Linus Torvalds
2007-08-20 13:15 ` Chris Snook
2007-09-09 18:02 ` Denys Vlasenko
2 siblings, 2 replies; 1651+ messages in thread
From: Chris Friesen @ 2007-08-17 18:50 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, zlynx, rpjday, jesper.juhl, segher
Linus Torvalds wrote:
> - in other words, the *only* possible meaning for "volatile" is a purely
> single-CPU meaning. And if you only have a single CPU involved in the
> process, the "volatile" is by definition pointless (because even
> without a volatile, the compiler is required to make the C code appear
> consistent as far as a single CPU is concerned).
I assume you mean "except for IO-related code and 'random' values like
jiffies" as you mention later on? I assume other values set in
interrupt handlers would count as "random" from a volatility perspective?
> So anybody who argues for "volatile" fixing bugs is fundamentally
> incorrect. It does NO SUCH THING. By arguing that, such people only show
> that you have no idea what they are talking about.
What about reading values modified in interrupt handlers, as in your
"random" case? Or is this a bug where the user of atomic_read() is
invalidly expecting a read each time it is called?
Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 18:50 ` Chris Friesen
@ 2007-08-17 18:54 ` Arjan van de Ven
2007-08-17 19:49 ` Paul E. McKenney
2007-08-17 19:08 ` Linus Torvalds
1 sibling, 1 reply; 1651+ messages in thread
From: Arjan van de Ven @ 2007-08-17 18:54 UTC (permalink / raw)
To: Chris Friesen
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
jesper.juhl, segher
On Fri, 2007-08-17 at 12:50 -0600, Chris Friesen wrote:
> Linus Torvalds wrote:
>
> > - in other words, the *only* possible meaning for "volatile" is a purely
> > single-CPU meaning. And if you only have a single CPU involved in the
> > process, the "volatile" is by definition pointless (because even
> > without a volatile, the compiler is required to make the C code appear
> > consistent as far as a single CPU is concerned).
>
> I assume you mean "except for IO-related code and 'random' values like
> jiffies" as you mention later on? I assume other values set in
> interrupt handlers would count as "random" from a volatility perspective?
>
> > So anybody who argues for "volatile" fixing bugs is fundamentally
> > incorrect. It does NO SUCH THING. By arguing that, such people only show
> > that you have no idea what they are talking about.
>
> What about reading values modified in interrupt handlers, as in your
> "random" case? Or is this a bug where the user of atomic_read() is
> invalidly expecting a read each time it is called?
the interrupt handler case is an SMP case since you do not know
beforehand what cpu your interrupt handler will run on.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 18:54 ` Arjan van de Ven
@ 2007-08-17 19:49 ` Paul E. McKenney
2007-08-17 19:49 ` Arjan van de Ven
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 19:49 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Chris Friesen, Linus Torvalds, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
jesper.juhl, segher
On Fri, Aug 17, 2007 at 11:54:33AM -0700, Arjan van de Ven wrote:
>
> On Fri, 2007-08-17 at 12:50 -0600, Chris Friesen wrote:
> > Linus Torvalds wrote:
> >
> > > - in other words, the *only* possible meaning for "volatile" is a purely
> > > single-CPU meaning. And if you only have a single CPU involved in the
> > > process, the "volatile" is by definition pointless (because even
> > > without a volatile, the compiler is required to make the C code appear
> > > consistent as far as a single CPU is concerned).
> >
> > I assume you mean "except for IO-related code and 'random' values like
> > jiffies" as you mention later on? I assume other values set in
> > interrupt handlers would count as "random" from a volatility perspective?
> >
> > > So anybody who argues for "volatile" fixing bugs is fundamentally
> > > incorrect. It does NO SUCH THING. By arguing that, such people only show
> > > that you have no idea what they are talking about.
> >
> > What about reading values modified in interrupt handlers, as in your
> > "random" case? Or is this a bug where the user of atomic_read() is
> > invalidly expecting a read each time it is called?
>
> the interrupt handler case is an SMP case since you do not know
> beforehand what cpu your interrupt handler will run on.
With the exception of per-CPU variables, yes.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 19:49 ` Paul E. McKenney
@ 2007-08-17 19:49 ` Arjan van de Ven
2007-08-17 20:12 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Arjan van de Ven @ 2007-08-17 19:49 UTC (permalink / raw)
To: paulmck
Cc: Chris Friesen, Linus Torvalds, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
jesper.juhl, segher
On Fri, 2007-08-17 at 12:49 -0700, Paul E. McKenney wrote:
> > > What about reading values modified in interrupt handlers, as in your
> > > "random" case? Or is this a bug where the user of atomic_read() is
> > > invalidly expecting a read each time it is called?
> >
> > the interrupt handler case is an SMP case since you do not know
> > beforehand what cpu your interrupt handler will run on.
>
> With the exception of per-CPU variables, yes.
if you're spinning waiting for a per-CPU variable to get changed by an
interrupt handler... you have bigger problems than "volatile" ;-)
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 19:49 ` Arjan van de Ven
@ 2007-08-17 20:12 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-17 20:12 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Chris Friesen, Linus Torvalds, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, zlynx, rpjday,
jesper.juhl, segher
On Fri, Aug 17, 2007 at 12:49:00PM -0700, Arjan van de Ven wrote:
>
> On Fri, 2007-08-17 at 12:49 -0700, Paul E. McKenney wrote:
> > > > What about reading values modified in interrupt handlers, as in your
> > > > "random" case? Or is this a bug where the user of atomic_read() is
> > > > invalidly expecting a read each time it is called?
> > >
> > > the interrupt handler case is an SMP case since you do not know
> > > beforehand what cpu your interrupt handler will run on.
> >
> > With the exception of per-CPU variables, yes.
>
> if you're spinning waiting for a per-CPU variable to get changed by an
> interrupt handler... you have bigger problems than "volatile" ;-)
That would be true, if you were doing that. But you might instead be
simply making sure that the mainline actions were seen in order by the
interrupt handler. My current example is the NMI-safe rcu_read_lock()
implementation for realtime. Not the common case, I will admit, but
still real. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 18:50 ` Chris Friesen
2007-08-17 18:54 ` Arjan van de Ven
@ 2007-08-17 19:08 ` Linus Torvalds
1 sibling, 0 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-17 19:08 UTC (permalink / raw)
To: Chris Friesen
Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, zlynx, rpjday, jesper.juhl, segher
On Fri, 17 Aug 2007, Chris Friesen wrote:
>
> I assume you mean "except for IO-related code and 'random' values like
> jiffies" as you mention later on?
Yes. There *are* valid uses for "volatile", but they have remained the
same for the last few years:
- "jiffies"
- internal per-architecture IO implementations that can do them as normal
stores.
> I assume other values set in interrupt handlers would count as "random"
> from a volatility perspective?
I don't really see any valid case. I can imagine that you have your own
"jiffy" counter in a driver, but what's the point, really? I'd suggest not
using volatile, and using barriers instead.
>
> > So anybody who argues for "volatile" fixing bugs is fundamentally
> > incorrect. It does NO SUCH THING. By arguing that, such people only
> > show that you have no idea what they are talking about.
> What about reading values modified in interrupt handlers, as in your
> "random" case? Or is this a bug where the user of atomic_read() is
> invalidly expecting a read each time it is called?
Quite frankly, the biggest reason for using "volatile" on jiffies was
really historic. So even the "random" case is not really a very strong
one. You'll notice that anybody who is actually careful will be using
sequence locks for the jiffy accesses, if only because the *full* jiffy
count is actually a 64-bit value, and so you cannot get it atomically on a
32-bit architecture even on a single CPU (ie a timer interrupt might
happen in between reading the low and the high word, so "volatile" is only
used for the low 32 bits).
So even for jiffies, we actually have:
extern u64 __jiffy_data jiffies_64;
extern unsigned long volatile __jiffy_data jiffies;
where the *real* jiffies is not volatile: the volatile one is using linker
tricks to alias the low 32 bits:
- arch/i386/kernel/vmlinux.lds.S:
...
jiffies = jiffies_64;
...
and the only reason we do all these games is (a) it works and (b) it's
legacy.
Note how I do *not* say "(c) it's a good idea".
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 16:48 ` Linus Torvalds
2007-08-17 18:50 ` Chris Friesen
@ 2007-08-20 13:15 ` Chris Snook
2007-08-20 13:32 ` Herbert Xu
2007-08-21 5:46 ` Linus Torvalds
2007-09-09 18:02 ` Denys Vlasenko
2 siblings, 2 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-20 13:15 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
Christoph Lameter, Ilpo Jarvinen, Paul E. McKenney,
Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
Linus Torvalds wrote:
> So the only reason to add back "volatile" to the atomic_read() sequence is
> not to fix bugs, but to _hide_ the bugs better. They're still there, they
> are just a lot harder to trigger, and tend to be a lot subtler.
What about barrier removal? With consistent semantics we could optimize a fair
amount of code. Whether or not that constitutes "premature" optimization is
open to debate, but there's no question we could reduce our register wiping in
some places.
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 13:15 ` Chris Snook
@ 2007-08-20 13:32 ` Herbert Xu
2007-08-20 13:38 ` Chris Snook
2007-08-21 5:46 ` Linus Torvalds
1 sibling, 1 reply; 1651+ messages in thread
From: Herbert Xu @ 2007-08-20 13:32 UTC (permalink / raw)
To: Chris Snook
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Paul Mackerras,
Christoph Lameter, Ilpo Jarvinen, Paul E. McKenney,
Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On Mon, Aug 20, 2007 at 09:15:11AM -0400, Chris Snook wrote:
> Linus Torvalds wrote:
> >So the only reason to add back "volatile" to the atomic_read() sequence is
> >not to fix bugs, but to _hide_ the bugs better. They're still there, they
> >are just a lot harder to trigger, and tend to be a lot subtler.
>
> What about barrier removal? With consistent semantics we could optimize a
> fair amount of code. Whether or not that constitutes "premature"
> optimization is open to debate, but there's no question we could reduce our
> register wiping in some places.
If you've been reading all of Linus's emails you should be
thinking about adding memory barriers, and not removing
compiler barriers.
He's just told you that code of the kind
while (!atomic_read(cond))
;
do_something()
probably needs a memory barrier (not just compiler) so that
do_something() doesn't see stale cache content that occurred
before cond flipped.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 13:32 ` Herbert Xu
@ 2007-08-20 13:38 ` Chris Snook
2007-08-20 22:07 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Chris Snook @ 2007-08-20 13:38 UTC (permalink / raw)
To: Herbert Xu
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Paul Mackerras,
Christoph Lameter, Ilpo Jarvinen, Paul E. McKenney,
Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
Herbert Xu wrote:
> On Mon, Aug 20, 2007 at 09:15:11AM -0400, Chris Snook wrote:
>> Linus Torvalds wrote:
>>> So the only reason to add back "volatile" to the atomic_read() sequence is
>>> not to fix bugs, but to _hide_ the bugs better. They're still there, they
>>> are just a lot harder to trigger, and tend to be a lot subtler.
>> What about barrier removal? With consistent semantics we could optimize a
>> fair amount of code. Whether or not that constitutes "premature"
>> optimization is open to debate, but there's no question we could reduce our
>> register wiping in some places.
>
> If you've been reading all of Linus's emails you should be
> thinking about adding memory barriers, and not removing
> compiler barriers.
>
> He's just told you that code of the kind
>
> while (!atomic_read(cond))
> ;
>
> do_something()
>
> probably needs a memory barrier (not just compiler) so that
> do_something() doesn't see stale cache content that occurred
> before cond flipped.
Such code generally doesn't care precisely when it gets the update, just that
the update is atomic, and it doesn't loop forever. Regardless, I'm convinced we
just need to do it all in assembly.
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 13:38 ` Chris Snook
@ 2007-08-20 22:07 ` Segher Boessenkool
0 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-20 22:07 UTC (permalink / raw)
To: Chris Snook
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, Stefan Richter,
horms, Satyam Sharma, Ilpo Jarvinen, Linux Kernel Mailing List,
David Miller, Paul E. McKenney, ak, Netdev, cfriesen, rpjday,
jesper.juhl, linux-arch, Andrew Morton, zlynx, schwidefsky,
Herbert Xu, Linus Torvalds, wensong, Nick Piggin, wjiang
> Such code generally doesn't care precisely when it gets the update,
> just that the update is atomic, and it doesn't loop forever.
Yes, it _does_ care that it gets the update _at all_, and preferably
as early as possible.
> Regardless, I'm convinced we just need to do it all in assembly.
So do you want "volatile asm" or "plain asm", for atomic_read()?
The asm version has two ways to go about it too...
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-20 13:15 ` Chris Snook
2007-08-20 13:32 ` Herbert Xu
@ 2007-08-21 5:46 ` Linus Torvalds
2007-08-21 7:04 ` David Miller
1 sibling, 1 reply; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-21 5:46 UTC (permalink / raw)
To: Chris Snook
Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
Christoph Lameter, Ilpo Jarvinen, Paul E. McKenney,
Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On Mon, 20 Aug 2007, Chris Snook wrote:
>
> What about barrier removal? With consistent semantics we could optimize a
> fair amount of code. Whether or not that constitutes "premature" optimization
> is open to debate, but there's no question we could reduce our register wiping
> in some places.
Why do people think that barriers are expensive? They really aren't.
Especially the regular compiler barrier is basically zero cost. Any
reasonable compiler will just flush the stuff it holds in registers that
isn't already automatic local variables, and for regular kernel code, that
tends to basically be nothing at all.
Ie a "barrier()" is likely _cheaper_ than the code generation downside
from using "volatile".
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 5:46 ` Linus Torvalds
@ 2007-08-21 7:04 ` David Miller
2007-08-21 13:50 ` Chris Snook
0 siblings, 1 reply; 1651+ messages in thread
From: David Miller @ 2007-08-21 7:04 UTC (permalink / raw)
To: torvalds
Cc: csnook, piggin, satyam, herbert, paulus, clameter, ilpo.jarvinen,
paulmck, stefanr, linux-kernel, linux-arch, netdev, akpm, ak,
heiko.carstens, schwidefsky, wensong, horms, wjiang, cfriesen,
zlynx, rpjday, jesper.juhl, segher
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)
> Ie a "barrier()" is likely _cheaper_ than the code generation downside
> from using "volatile".
Assuming GCC were ever better about the code generation badness
with volatile that has been discussed here, I much prefer
we tell GCC "this memory piece changed" rather than "every
piece of memory has changed" which is what the barrier() does.
I happened to have been scanning a lot of assembler lately to
track down a gcc-4.2 miscompilation on sparc64, and the barriers
do hurt quite a bit in some places. Instead of keeping unrelated
variables around cached in local registers, it reloads everything.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 7:04 ` David Miller
@ 2007-08-21 13:50 ` Chris Snook
2007-08-21 14:59 ` Segher Boessenkool
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-21 13:50 UTC (permalink / raw)
To: David Miller
Cc: torvalds, piggin, satyam, herbert, paulus, clameter,
ilpo.jarvinen, paulmck, stefanr, linux-kernel, linux-arch, netdev,
akpm, ak, heiko.carstens, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
David Miller wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)
>
>> Ie a "barrier()" is likely _cheaper_ than the code generation downside
>> from using "volatile".
>
> Assuming GCC were ever better about the code generation badness
> with volatile that has been discussed here, I much prefer
> we tell GCC "this memory piece changed" rather than "every
> piece of memory has changed" which is what the barrier() does.
>
> I happened to have been scanning a lot of assembler lately to
> track down a gcc-4.2 miscompilation on sparc64, and the barriers
> do hurt quite a bit in some places. Instead of keeping unrelated
> variables around cached in local registers, it reloads everything.
Moore's law is definitely working against us here. Register counts,
pipeline depths, core counts, and clock multipliers are all increasing
in the long run. At some point in the future, barrier() will be
universally regarded as a hammer too big for most purposes. Whether or
not removing it now constitutes premature optimization is arguable, but
I think we should allow such optimization to happen (or not happen) in
architecture-dependent code, and provide a consistent API that doesn't
require the use of such things in arch-independent code where it might
turn into a totally superfluous performance killer depending on what
hardware it gets compiled for.
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 13:50 ` Chris Snook
@ 2007-08-21 14:59 ` Segher Boessenkool
2007-08-21 16:31 ` Satyam Sharma
2007-08-21 16:43 ` Linus Torvalds
2 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-21 14:59 UTC (permalink / raw)
To: Chris Snook
Cc: paulmck, heiko.carstens, ilpo.jarvinen, horms, linux-kernel,
David Miller, rpjday, netdev, ak, piggin, akpm, torvalds,
cfriesen, jesper.juhl, linux-arch, paulus, herbert, satyam,
clameter, stefanr, schwidefsky, zlynx, wensong, wjiang
> At some point in the future, barrier() will be universally regarded as
> a hammer too big for most purposes. Whether or not removing it now
You can't just remove it, it is needed in some places; you want to
replace it in most places with a more fine-grained "compiler barrier",
I presume?
> constitutes premature optimization is arguable, but I think we should
> allow such optimization to happen (or not happen) in
> architecture-dependent code, and provide a consistent API that doesn't
> require the use of such things in arch-independent code where it might
> turn into a totally superfluous performance killer depending on what
> hardware it gets compiled for.
Explicit barrier()s won't be too hard to replace -- but what to do
about the implicit barrier()s in rmb() etc. etc. -- *those* will be
hard to get rid of, if only because it is hard enough to teach driver
authors about how to use those primitives *already*. It is far from
clear what a good interface like that would look like, anyway.
Probably we should first start experimenting with a forget()-style
micro-barrier (but please, find a better name), and see if a nice
usage pattern shows up that can be turned into an API.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 13:50 ` Chris Snook
2007-08-21 14:59 ` Segher Boessenkool
@ 2007-08-21 16:31 ` Satyam Sharma
2007-08-21 16:43 ` Linus Torvalds
2 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-21 16:31 UTC (permalink / raw)
To: Chris Snook
Cc: David Miller, Linus Torvalds, piggin, herbert, paulus, clameter,
ilpo.jarvinen, paulmck, stefanr, Linux Kernel Mailing List,
linux-arch, netdev, Andrew Morton, ak, heiko.carstens,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Tue, 21 Aug 2007, Chris Snook wrote:
> David Miller wrote:
> > From: Linus Torvalds <torvalds@linux-foundation.org>
> > Date: Mon, 20 Aug 2007 22:46:47 -0700 (PDT)
> >
> > > Ie a "barrier()" is likely _cheaper_ than the code generation downside
> > > from using "volatile".
> >
> > Assuming GCC were ever better about the code generation badness
> > with volatile that has been discussed here, I much prefer
> > we tell GCC "this memory piece changed" rather than "every
> > piece of memory has changed" which is what the barrier() does.
> >
> > I happened to have been scanning a lot of assembler lately to
> > track down a gcc-4.2 miscompilation on sparc64, and the barriers
> > do hurt quite a bit in some places. Instead of keeping unrelated
> > variables around cached in local registers, it reloads everything.
>
> Moore's law is definitely working against us here. Register counts, pipeline
> depths, core counts, and clock multipliers are all increasing in the long run.
> At some point in the future, barrier() will be universally regarded as a
> hammer too big for most purposes.
I do agree, and the important point to note is that the benefits of a
/lighter/ compiler barrier, such as what David referred to above, _can_
be had without having to do anything with the "volatile" keyword at all.
And such a primitive has already been mentioned/proposed on this thread.
But this is all tangential to the core question at hand -- whether to have
implicit (compiler, possibly "light-weight" of the kind referred above)
barrier semantics in atomic ops that do not have them, or not.
I was lately looking in the kernel for _actual_ code that uses atomic_t
and benefits from the lack of any implicit barrier, with the compiler
being free to cache the atomic_t in a register. Now that often does _not_
happen, because all other ops (implemented in asm with LOCK prefix on x86)
_must_ therefore constrain the atomic_t to memory anyway. So typically all
atomic ops code sequences end up operating on memory.
Then I did locate sched.c:select_nohz_load_balancer() -- it repeatedly
references the same atomic_t object, and the code that I saw generated
(with CC_OPTIMIZE_FOR_SIZE=y) did cache it in a register for a sequence of
instructions. It uses atomic_cmpxchg, thereby not requiring explicit
memory barriers anywhere in the code, and is an example of an atomic_t
user that is safe, and yet benefits from its memory loads/stores being
elided/coalesced by the compiler.
# at this point, %eax holds num_online_cpus() and
# %ebx holds cpus_weight(nohz.cpu_mask)
# the variable "cpu" is in %esi
0xc1018e1d: cmp %eax,%ebx # if No.A.
0xc1018e1f: mov 0xc134d900,%eax # first atomic_read()
0xc1018e24: jne 0xc1018e36
0xc1018e26: cmp %esi,%eax # if No.B.
0xc1018e28: jne 0xc1018e80 # returns with 0
0xc1018e2a: movl $0xffffffff,0xc134d900 # atomic_set(-1), and ...
0xc1018e34: jmp 0xc1018e80 # ... returns with 0
0xc1018e36: cmp $0xffffffff,%eax # if No.C. (NOTE!)
0xc1018e39: jne 0xc1018e46
0xc1018e3b: lock cmpxchg %esi,0xc134d900 # atomic_cmpxchg()
0xc1018e43: inc %eax
0xc1018e44: jmp 0xc1018e48
0xc1018e46: cmp %esi,%eax # if No.D. (NOTE!)
0xc1018e48: jne 0xc1018e80 # if !=, default return 0 (if No.E.)
0xc1018e4a: jmp 0xc1018e84 # otherwise (==) returns with 1
The above is:
if (cpus_weight(nohz.cpu_mask) == num_online_cpus()) { /* if No.A. */
if (atomic_read(&nohz.load_balancer) == cpu) /* if No.B. */
atomic_set(&nohz.load_balancer, -1); /* XXX */
return 0;
}
if (atomic_read(&nohz.load_balancer) == -1) { /* if No.C. */
/* make me the ilb owner */
if (atomic_cmpxchg(&nohz.load_balancer, -1, cpu) == -1) /* if No.E. */
return 1;
} else if (atomic_read(&nohz.load_balancer) == cpu) /* if No.D. */
return 1;
...
...
return 0; /* default return from function */
As you can see, the atomic_read()'s of "if"s Nos. B, C, and D, were _all_
coalesced into a single memory reference "mov 0xc134d900,%eax" at the
top of the function, and then "if"s Nos. C and D simply used the value
from %eax itself. But that's perfectly safe, such is the logic of this
function. It uses cmpxchg _whenever_ updating the value in the memory
atomic_t and then returns appropriately. The _only_ point that a casual
reader may find racy is that marked /* XXX */ above -- atomic_read()
followed by atomic_set() with no barrier in between. But even that is ok,
because if one thread ever finds that condition to succeed, it is 100%
guaranteed no other thread on any other CPU will find _any_ condition
to be true, thereby avoiding any race in the modification of that value.
BTW it does sound reasonable that a lot of atomic_t users that want a
compiler barrier probably also want a memory barrier. Do we make _that_
implicit too? Quite clearly, making _either_ one of those implicit in
atomic_{read,set} (in any form of implementation -- a forget() macro
based, *(volatile int *)& based, or inline asm based) would end up
harming code such as that cited above.
Lastly, the most obvious reason that should be considered against implicit
barriers in atomic ops is that it isn't "required" -- atomicity does not
imply any barrier after all, and making such a distinction would actually
be a healthy separation that helps people think more clearly when writing
lockless code.
[ But the "authors' expectations" / heisenbugs argument also holds some
water ... for that, we can have a _variant_ in the API for atomic ops
that has implicit compiler/memory barriers, to make it easier on those
who want that behaviour. But let us not penalize code that knows what
it is doing by changing the default to that, please. ]
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-21 13:50 ` Chris Snook
2007-08-21 14:59 ` Segher Boessenkool
2007-08-21 16:31 ` Satyam Sharma
@ 2007-08-21 16:43 ` Linus Torvalds
2 siblings, 0 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-08-21 16:43 UTC (permalink / raw)
To: Chris Snook
Cc: David Miller, piggin, satyam, herbert, paulus, clameter,
ilpo.jarvinen, paulmck, stefanr, linux-kernel, linux-arch, netdev,
akpm, ak, heiko.carstens, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Tue, 21 Aug 2007, Chris Snook wrote:
>
> Moore's law is definitely working against us here. Register counts, pipeline
> depths, core counts, and clock multipliers are all increasing in the long run.
> At some point in the future, barrier() will be universally regarded as a
> hammer too big for most purposes.
Note that "barrier()" is purely a compiler barrier. It has zero impact on
the CPU pipeline itself, and also has zero impact on anything that gcc
knows isn't visible in memory (ie local variables that don't have their
address taken), so barrier() really is pretty cheap.
Now, it's possible that gcc messes up in some circumstances, and that the
memory clobber will cause gcc to also do things like flush local registers
unnecessarily to their stack slots, but quite frankly, if that happens,
it's a gcc problem, and I also have to say that I've not seen that myself.
So in a very real sense, "barrier()" will just make sure that there is a
stronger sequence point for the compiler where things are stable. In most
cases it has absolutely zero performance impact - apart from the
-intended- impact of making sure that the compiler doesn't re-order or
cache stuff around it.
And sure, we could make it more finegrained, and also introduce a
per-variable barrier, but the fact is, people _already_ have problems with
thinking about these kinds of things, and adding new abstraction issues
with subtle semantics is the last thing we want.
So I really think you'd want to show a real example of real code that
actually gets noticeably slower or bigger.
In removing "volatile", we have shown that. It may not have made a big
difference on powerpc, but it makes a real difference on x86 - and more
importantly, it removes something that people clearly don't know how it
works, and incorrectly expect to just fix bugs.
[ There are *other* barriers - the ones that actually add memory barriers
to the CPU - that really can be quite expensive. The good news is that
the expense is going down rather than up: both Intel and AMD are not
only removing the need for some of them (ie "smp_rmb()" will become a
compiler-only barrier), but we're _also_ seeing the whole "pipeline
flush" approach go away, and be replaced by the CPU itself actually
being better - so even the actual CPU pipeline barriers are getting
cheaper, not more expensive. ]
For example, did anybody even _test_ how expensive "barrier()" is? Just
as a lark, I did
#undef barrier
#define barrier() do { } while (0)
in kernel/sched.c (which only has three of them in it, but hey, that's
more than most files), and there were _zero_ code generation downsides.
One instruction was moved (and a few line numbers changed), so it wasn't
like the assembly language was identical, but the point is, barrier()
simply doesn't have the same kinds of downsides that "volatile" has.
(That may not be true on other architectures or in other source files, of
course. This *does* depend on code generation details. But anybody who
thinks that "barrier()" is fundamentally expensive is simply incorrect. It
is *fundamentally* a no-op).
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 16:48 ` Linus Torvalds
2007-08-17 18:50 ` Chris Friesen
2007-08-20 13:15 ` Chris Snook
@ 2007-09-09 18:02 ` Denys Vlasenko
2007-09-09 18:18 ` Arjan van de Ven
2 siblings, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-09-09 18:02 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
On Friday 17 August 2007 17:48, Linus Torvalds wrote:
>
> On Fri, 17 Aug 2007, Nick Piggin wrote:
> >
> > That's not obviously just taste to me. Not when the primitive has many
> > (perhaps, the majority) of uses that do not require said barriers. And
> > this is not solely about the code generation (which, as Paul says, is
> > relatively minor even on x86). I prefer people to think explicitly
> > about barriers in their lockless code.
>
> Indeed.
>
> I think the important issues are:
>
> - "volatile" itself is simply a badly/weakly defined issue. The semantics
> of it as far as the compiler is concerned are really not very good, and
> in practice tends to boil down to "I will generate so bad code that
> nobody can accuse me of optimizing anything away".
>
> - "volatile" - regardless of how well or badly defined it is - is purely
> a compiler thing. It has absolutely no meaning for the CPU itself, so
> it at no point implies any CPU barriers. As a result, even if the
> compiler generates crap code and doesn't re-order anything, there's
> nothing that says what the CPU will do.
>
> - in other words, the *only* possible meaning for "volatile" is a purely
> single-CPU meaning. And if you only have a single CPU involved in the
> process, the "volatile" is by definition pointless (because even
> without a volatile, the compiler is required to make the C code appear
> consistent as far as a single CPU is concerned).
>
> So, let's take the example *buggy* code where we use "volatile" to wait
> for other CPU's:
>
> atomic_set(&var, 0);
> while (!atomic_read(&var))
> /* nothing */;
>
>
> which generates an endless loop if we don't have atomic_read() imply
> volatile.
>
> The point here is that it's buggy whether the volatile is there or not!
> Exactly because the user expects multi-processing behaviour, but
> "volatile" doesn't actually give any real guarantees about it. Another CPU
> may have done:
>
> external_ptr = kmalloc(..);
> /* Setup is now complete, inform the waiter */
> atomic_inc(&var);
>
> but the fact is, since the other CPU isn't serialized in any way, the
> "while-loop" (even in the presense of "volatile") doesn't actually work
> right! Whatever the "atomic_read()" was waiting for may not have
> completed, because we have no barriers!
Why is all this fixation on "volatile"? I don't think
people want "volatile" keyword per se, they want atomic_read(&x) to
_always_ compile into a memory-accessing instruction, not register access.
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-09 18:02 ` Denys Vlasenko
@ 2007-09-09 18:18 ` Arjan van de Ven
2007-09-10 10:56 ` Denys Vlasenko
0 siblings, 1 reply; 1651+ messages in thread
From: Arjan van de Ven @ 2007-09-09 18:18 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
zlynx, rpjday, jesper.juhl, segher
On Sun, 9 Sep 2007 19:02:54 +0100
Denys Vlasenko <vda.linux@googlemail.com> wrote:
> Why is all this fixation on "volatile"? I don't think
> people want "volatile" keyword per se, they want atomic_read(&x) to
> _always_ compile into a memory-accessing instruction, not register
> access.
and ... why is that?
is there any valid, non-buggy code sequence that makes that a
reasonable requirement?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-09 18:18 ` Arjan van de Ven
@ 2007-09-10 10:56 ` Denys Vlasenko
2007-09-10 11:15 ` Herbert Xu
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Denys Vlasenko @ 2007-09-10 10:56 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
zlynx, rpjday, jesper.juhl, segher
On Sunday 09 September 2007 19:18, Arjan van de Ven wrote:
> On Sun, 9 Sep 2007 19:02:54 +0100
> Denys Vlasenko <vda.linux@googlemail.com> wrote:
>
> > Why is all this fixation on "volatile"? I don't think
> > people want "volatile" keyword per se, they want atomic_read(&x) to
> > _always_ compile into a memory-accessing instruction, not register
> > access.
>
> and ... why is that?
> is there any valid, non-buggy code sequence that makes that a
> reasonable requirement?
Well, if you insist on having it again:
Waiting for atomic value to be zero:
while (atomic_read(&x))
continue;
gcc may happily convert it into:
reg = atomic_read(&x);
while (reg)
continue;
Expecting every driver writer to remember that atomic_read is not in fact
a "read from memory" is naive. That won't happen. Face it, majority of
driver authors are a bit less talented than Ingo Molnar or Arjan van de Ven ;)
The name of the macro is saying that it's a read.
We are confusing users here.
It's doubly confusing that cpu_relax(), which says _nothing_ about barriers
in its name, is actually a barrier you need to insert here.
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 10:56 ` Denys Vlasenko
@ 2007-09-10 11:15 ` Herbert Xu
2007-09-10 12:22 ` Kyle Moffett
2007-09-10 14:51 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Arjan van de Ven
2 siblings, 0 replies; 1651+ messages in thread
From: Herbert Xu @ 2007-09-10 11:15 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
zlynx, rpjday, jesper.juhl, segher
On Mon, Sep 10, 2007 at 11:56:29AM +0100, Denys Vlasenko wrote:
>
> Expecting every driver writer to remember that atomic_read is not in fact
> a "read from memory" is naive. That won't happen. Face it, majority of
> driver authors are a bit less talented than Ingo Molnar or Arjan van de Ven ;)
> The name of the macro is saying that it's a read.
> We are confusing users here.
For driver authors who're too busy to learn the intricacies
of atomic operations, we have the plain old spin lock which
then lets you use normal data structures such as u32 safely.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 10:56 ` Denys Vlasenko
2007-09-10 11:15 ` Herbert Xu
@ 2007-09-10 12:22 ` Kyle Moffett
2007-09-10 13:38 ` Denys Vlasenko
2007-09-10 14:51 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Arjan van de Ven
2 siblings, 1 reply; 1651+ messages in thread
From: Kyle Moffett @ 2007-09-10 12:22 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Sep 10, 2007, at 06:56:29, Denys Vlasenko wrote:
> On Sunday 09 September 2007 19:18, Arjan van de Ven wrote:
>> On Sun, 9 Sep 2007 19:02:54 +0100
>> Denys Vlasenko <vda.linux@googlemail.com> wrote:
>>
>>> Why is all this fixation on "volatile"? I don't think people want
>>> "volatile" keyword per se, they want atomic_read(&x) to _always_
>>> compile into a memory-accessing instruction, not register access.
>>
>> and ... why is that? is there any valid, non-buggy code sequence
>> that makes that a reasonable requirement?
>
> Well, if you insist on having it again:
>
> Waiting for atomic value to be zero:
>
> while (atomic_read(&x))
> continue;
>
> gcc may happily convert it into:
>
> reg = atomic_read(&x);
> while (reg)
> continue;
Bzzt. Even if you fixed gcc to actually convert it to a busy loop on
a memory variable, you STILL HAVE A BUG as it may *NOT* be gcc that
does the conversion, it may be that the CPU does the caching of the
memory value. GCC has no mechanism to do cache-flushes or memory-
barriers except through our custom inline assembly. Also, you
probably want a cpu_relax() in there somewhere to avoid overheating
the CPU. Thirdly, on a large system it may take some arbitrarily
large amount of time for cache-propagation to update the value of the
variable in your local CPU cache. Finally, if atomics are based on
spinlock+interrupt-disable then you will sit in a tight busy-loop of
spin_lock_irqsave()->spin_unlock_irqrestore(). Depending on
your system's internal model this may practically lock up your core
because the spin_lock() will take the cacheline for exclusive access
and doing that in a loop can prevent any other CPU from doing any
operation on it! And since your IRQs are disabled, you don't even have
the small window in which an IRQ would interrupt the loop long enough
for the update to take place.
The earlier code segment of:
> while(atomic_read(&x) > 0)
> atomic_dec(&x);
is *completely* buggy because you could very easily have 4 CPUs doing
this on an atomic variable with a value of 1 and end up with it at
negative 3 by the time you are done. Moreover all the alternatives
are also buggy, with the sole exception of this rather obvious-
seeming one:
> atomic_set(&x, 0);
You simply CANNOT use an atomic_t as your sole synchronizing
primitive, it doesn't work! You virtually ALWAYS want to use an
atomic_t in the following types of situations:
(A) As an object refcount. The value is never read except as part of
an atomic_dec_return(). Why aren't you using "struct kref"?
(B) As an atomic value counter (number of processes, for example).
Just "reading" the value is racy anyways, if you want to enforce a
limit or something then use atomic_inc_return(), check the result,
and use atomic_dec() if it's too big. If you just want to report the
statistics then you are getting an instantaneous point-in-time value anyways.
(C) As an optimization value (statistics-like, but exact accuracy
isn't important).
Atomics are NOT A REPLACEMENT for the proper kernel subsystem, like
completions, mutexes, semaphores, spinlocks, krefs, etc. It's not
useful for synchronization, only for keeping track of simple integer
RMW values. Note that atomic_read() and atomic_set() aren't very
useful RMW primitives (read-nomodify-nowrite and read-set-zero-
write). Code which assumes anything else is probably buggy in other
ways too.
So while I see no real reason for the "volatile" on the atomics, I
also see no real reason why it's terribly harmful. Regardless of the
"volatile" on the operation the CPU is perfectly happy to cache it
anyways so it doesn't buy you any actual "always-access-memory"
guarantees. If you are just interested in it as an optimization you
could probably just read the properly-aligned integer counter
directly, an atomic read on most CPUs.
If you really need it to hit main memory *every* *single* *time*
(Why? Are you using it instead of the proper kernel subsystem?)
then you probably need a custom inline assembly helper anyways.
Cheers,
Kyle Moffett
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 12:22 ` Kyle Moffett
@ 2007-09-10 13:38 ` Denys Vlasenko
2007-09-10 14:16 ` Denys Vlasenko
0 siblings, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-09-10 13:38 UTC (permalink / raw)
To: Kyle Moffett
Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Monday 10 September 2007 13:22, Kyle Moffett wrote:
> On Sep 10, 2007, at 06:56:29, Denys Vlasenko wrote:
> > On Sunday 09 September 2007 19:18, Arjan van de Ven wrote:
> >> On Sun, 9 Sep 2007 19:02:54 +0100
> >> Denys Vlasenko <vda.linux@googlemail.com> wrote:
> >>
> >>> Why is all this fixation on "volatile"? I don't think people want
> >>> "volatile" keyword per se, they want atomic_read(&x) to _always_
> >>> compile into a memory-accessing instruction, not register access.
> >>
> >> and ... why is that? is there any valid, non-buggy code sequence
> >> that makes that a reasonable requirement?
> >
> > Well, if you insist on having it again:
> >
> > Waiting for atomic value to be zero:
> >
> > while (atomic_read(&x))
> > continue;
> >
> > gcc may happily convert it into:
> >
> > reg = atomic_read(&x);
> > while (reg)
> > continue;
>
> Bzzt. Even if you fixed gcc to actually convert it to a busy loop on
> a memory variable, you STILL HAVE A BUG as it may *NOT* be gcc that
> does the conversion, it may be that the CPU does the caching of the
> memory value. GCC has no mechanism to do cache-flushes or memory-
> barriers except through our custom inline assembly.
The CPU can cache the value all right, but it cannot use that cached value
*forever*: it has to react to invalidate cycles on the shared bus
and re-fetch the new data.
IOW: an atomic_read(&x) which compiles down to a memory accessor
will work properly.
> the CPU. Thirdly, on a large system it may take some arbitrarily
> large amount of time for cache-propagation to update the value of the
> variable in your local CPU cache.
Yes, but "arbitrarily large amount of time" is actually measured
in nanoseconds here. Let's say 1000ns max for hundreds of CPUs?
> Also, you
> probably want a cpu_relax() in there somewhere to avoid overheating
> the CPU.
Yes, but
1. The CPU shouldn't overheat (in the sense of getting damaged);
it will only use more power than needed.
2. cpu_relax() just throttles down my CPU, so it's only a performance
optimization. Wait, no, it's a barrier too.
Wow, "cpu_relax" is a barrier? How am I supposed to know
that without reading lkml flamewars and/or header files?
Let's try reading the headers. asm-x86_64/processor.h:

#define cpu_relax() rep_nop()

So, is it a barrier? No clue yet.

/* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
static inline void rep_nop(void)
{
	__asm__ __volatile__("rep;nop": : :"memory");
}
The comment explicitly says that it is "a good thing" (it doesn't say
that it is mandatory) and says NOTHING about barriers!
The barrier semantics are never mentioned; they are hidden in the
"memory" clobber.
Do you think that's obvious enough for the average driver writer?
I think not, especially since he is unlikely even to start
suspecting that it is a memory barrier, given the name "cpu_relax".
> You simply CANNOT use an atomic_t as your sole synchronizing
> primitive, it doesn't work! You virtually ALWAYS want to use an
> atomic_t in the following types of situations:
>
> (A) As an object refcount. The value is never read except as part of
> an atomic_dec_return(). Why aren't you using "struct kref"?
>
> (B) As an atomic value counter (number of processes, for example).
> Just "reading" the value is racy anyways, if you want to enforce a
> limit or something then use atomic_inc_return(), check the result,
> and use atomic_dec() if it's too big. If you just want to return the
> statistics then you are going to be instantaneous-point-in-time anyways.
>
> (C) As an optimization value (statistics-like, but exact accuracy
> isn't important).
>
> Atomics are NOT A REPLACEMENT for the proper kernel subsystem, like
> completions, mutexes, semaphores, spinlocks, krefs, etc. It's not
> useful for synchronization, only for keeping track of simple integer
> RMW values. Note that atomic_read() and atomic_set() aren't very
> useful RMW primitives (read-nomodify-nowrite and read-set-zero-
> write). Code which assumes anything else is probably buggy in other
> ways too.
You are basically trying to educate me on how to use atomics properly.
You don't need to, as I am (currently) not a driver author.
I am saying that people who are already using atomic_read()
(and who unfortunately did not read your explanation above)
will still sometimes use atomic_read() as a way to read an atomic value
*from memory*, and will create nasty heisenbugs for you to debug.
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 13:38 ` Denys Vlasenko
@ 2007-09-10 14:16 ` Denys Vlasenko
2007-09-10 15:09 ` Linus Torvalds
0 siblings, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-09-10 14:16 UTC (permalink / raw)
To: Kyle Moffett
Cc: Arjan van de Ven, Linus Torvalds, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Monday 10 September 2007 14:38, Denys Vlasenko wrote:
> You are basically trying to educate me on how to use atomics properly.
> You don't need to, as I am (currently) not a driver author.
>
> I am saying that people who are already using atomic_read()
> (and who unfortunately did not read your explanation above)
> will still sometimes use atomic_read() as a way to read an atomic value
> *from memory*, and will create nasty heisenbugs for you to debug.
static inline int
qla2x00_wait_for_loop_ready(scsi_qla_host_t *ha)
{
	int return_status = QLA_SUCCESS;
	unsigned long loop_timeout;
	scsi_qla_host_t *pha = to_qla_parent(ha);

	/* wait for 5 min at the max for loop to be ready */
	loop_timeout = jiffies + (MAX_LOOP_TIMEOUT * HZ);

	while ((!atomic_read(&pha->loop_down_timer) &&
	    atomic_read(&pha->loop_state) == LOOP_DOWN) ||
	    atomic_read(&pha->loop_state) != LOOP_READY) {
		if (atomic_read(&pha->loop_state) == LOOP_DEAD) {
			return_status = QLA_FUNCTION_FAILED;
			break;
		}
		msleep(1000);
		if (time_after_eq(jiffies, loop_timeout)) {
			return_status = QLA_FUNCTION_FAILED;
			break;
		}
	}
	return (return_status);
}
Is the above correct or buggy? Correct, because msleep() is a barrier.
Is that obvious? No.
static void
qla2x00_rst_aen(scsi_qla_host_t *ha)
{
	if (ha->flags.online && !ha->flags.reset_active &&
	    !atomic_read(&ha->loop_down_timer) &&
	    !(test_bit(ABORT_ISP_ACTIVE, &ha->dpc_flags))) {
		do {
			clear_bit(RESET_MARKER_NEEDED, &ha->dpc_flags);

			/*
			 * Issue marker command only when we are going to start
			 * the I/O.
			 */
			ha->marker_needed = 1;
		} while (!atomic_read(&ha->loop_down_timer) &&
		    (test_bit(RESET_MARKER_NEEDED, &ha->dpc_flags)));
	}
}
Is the above correct? I honestly don't know. Correct, because set_bit is
a barrier on _all_ memory? Will it break if set_bit is changed
to be a barrier only on its operand? Probably yes.
drivers/kvm/kvm_main.c
	while (atomic_read(&completed) != needed) {
		cpu_relax();
		barrier();
	}
Obviously author did not know that cpu_relax is already a barrier.
See why I think driver authors will be confused?
arch/x86_64/kernel/crash.c
static void nmi_shootdown_cpus(void)
{
	...
	msecs = 1000; /* Wait at most a second for the other cpus to stop */
	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
		mdelay(1);
		msecs--;
	}
	...
}
Is mdelay(1) a barrier? Yes, because it is a function on x86_64.
Exactly the same code will be buggy on an arch where
mdelay(1) == udelay(1000) and udelay is implemented
as an inline busy-wait.
arch/sparc64/kernel/smp.c
	/* Wait for response */
	while (atomic_read(&data.finished) != cpus)
		cpu_relax();
...later in the same file...
	while (atomic_read(&smp_capture_registry) != ncpus)
		rmb();
I'm confused. Do we need cpu_relax() or rmb()? Does cpu_relax() imply rmb()?
(No it doesn't). Which of those two while loops needs correcting?
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 14:16 ` Denys Vlasenko
@ 2007-09-10 15:09 ` Linus Torvalds
2007-09-10 16:46 ` Denys Vlasenko
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Linus Torvalds @ 2007-09-10 15:09 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Kyle Moffett, Arjan van de Ven, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Mon, 10 Sep 2007, Denys Vlasenko wrote:
>
> static inline int
> qla2x00_wait_for_loop_ready(scsi_qla_host_t *ha)
> {
> int return_status = QLA_SUCCESS;
> unsigned long loop_timeout ;
> scsi_qla_host_t *pha = to_qla_parent(ha);
>
> /* wait for 5 min at the max for loop to be ready */
> loop_timeout = jiffies + (MAX_LOOP_TIMEOUT * HZ);
>
> while ((!atomic_read(&pha->loop_down_timer) &&
> atomic_read(&pha->loop_state) == LOOP_DOWN) ||
> atomic_read(&pha->loop_state) != LOOP_READY) {
> if (atomic_read(&pha->loop_state) == LOOP_DEAD) {
...
> Is the above correct or buggy? Correct, because msleep() is a barrier.
> Is that obvious? No.
It's *buggy*. But it has nothing to do with any msleep() in the loop, or
anything else.
And more importantly, it would be equally buggy even *with* a "volatile"
atomic_read().
Why is this so hard for people to understand? You're all acting like
morons.
The reason it is buggy has absolutely nothing to do with whether the read
is done or not, it has to do with the fact that the CPU may re-order the
reads *regardless* of whether the read is done in some specific order by
the compiler or not! In effect, there is zero ordering between all those
three reads, and if you don't have memory barriers (or a lock or other
serialization), that code is buggy.
So stop this idiotic discussion thread already. The above kind of code
needs memory barriers to be non-buggy. The whole "volatile or not"
discussion is totally idiotic, and pointless, and anybody who doesn't
understand that by now needs to just shut up and think about it more,
rather than make this discussion drag out even further.
The fact is, "volatile" *only* makes things worse. It generates worse
code, and never fixes any real bugs. This is a *fact*.
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 15:09 ` Linus Torvalds
@ 2007-09-10 16:46 ` Denys Vlasenko
2007-09-10 19:59 ` Kyle Moffett
2007-09-10 18:59 ` Christoph Lameter
2007-09-10 23:19 ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
2 siblings, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-09-10 16:46 UTC (permalink / raw)
To: Linus Torvalds
Cc: Kyle Moffett, Arjan van de Ven, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Monday 10 September 2007 16:09, Linus Torvalds wrote:
> On Mon, 10 Sep 2007, Denys Vlasenko wrote:
> > static inline int
> > qla2x00_wait_for_loop_ready(scsi_qla_host_t *ha)
> > {
> > int return_status = QLA_SUCCESS;
> > unsigned long loop_timeout ;
> > scsi_qla_host_t *pha = to_qla_parent(ha);
> >
> > /* wait for 5 min at the max for loop to be ready */
> > loop_timeout = jiffies + (MAX_LOOP_TIMEOUT * HZ);
> >
> > while ((!atomic_read(&pha->loop_down_timer) &&
> > atomic_read(&pha->loop_state) == LOOP_DOWN) ||
> > atomic_read(&pha->loop_state) != LOOP_READY) {
> > if (atomic_read(&pha->loop_state) == LOOP_DEAD) {
> ...
> > Is the above correct or buggy? Correct, because msleep() is a barrier.
> > Is that obvious? No.
>
> It's *buggy*. But it has nothing to do with any msleep() in the loop, or
> anything else.
>
> And more importantly, it would be equally buggy even *with* a "volatile"
> atomic_read().
I am not saying that this code is okay, this isn't the point.
(The code is in fact awful for several more reasons).
My point is that people are confused as to what atomic_read()
exactly means, and this is bad. Same for cpu_relax().
The first one says "read", and the second one doesn't say "barrier".
This is real code from current kernel which demonstrates this:
"I don't know that cpu_relax() is a barrier already":
drivers/kvm/kvm_main.c
	while (atomic_read(&completed) != needed) {
		cpu_relax();
		barrier();
	}
"I think that atomic_read() is a read from memory and therefore
I don't need a barrier":
arch/x86_64/kernel/crash.c
	msecs = 1000; /* Wait at most a second for the other cpus to stop */
	while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) {
		mdelay(1);
		msecs--;
	}
Since neither camp seems to be giving up, I am proposing renaming
them to something less confusing, to make everybody happy:
cpu_relax_barrier()
atomic_value(&x)
atomic_fetch(&x)
I'm not a native English speaker; do these sound better?
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 16:46 ` Denys Vlasenko
@ 2007-09-10 19:59 ` Kyle Moffett
0 siblings, 0 replies; 1651+ messages in thread
From: Kyle Moffett @ 2007-09-10 19:59 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Linus Torvalds, Arjan van de Ven, Nick Piggin, Satyam Sharma,
Herbert Xu, Paul Mackerras, Christoph Lameter, Chris Snook,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Sep 10, 2007, at 12:46:33, Denys Vlasenko wrote:
> My point is that people are confused as to what atomic_read()
> exactly means, and this is bad. Same for cpu_relax(). First one
> says "read", and second one doesn't say "barrier".
Q&A:
Q: When is it OK to use atomic_read()?
A: You are asking the question, so never.
Q: But I need to check the value of the atomic at this point in time...
A: Your code is buggy if it needs to do that on an atomic_t for
anything other than debugging or optimization. Use either
atomic_*_return() or a lock and some normal integers.
Q: "So why can't the atomic_read DTRT magically?"
A: Because "the right thing" depends on the situation and is usually
best done with something other than atomic_t.
If somebody can post some non-buggy code which is correctly using
atomic_read() *and* depends on the compiler generating extra
nonsensical loads due to "volatile" then the issue *might* be
reconsidered. This also includes samples of code which uses
atomic_read() and needs memory barriers (so that we can fix the buggy
code, not so we can change atomic_read()). So far the only code
samples anybody has posted are buggy regardless of whether or not the
value and/or accessors are flagged "volatile" or not. And hey, maybe
the volatile ops *should* be implemented in inline ASM for future-
proof-ness, but that's a separate issue.
Cheers,
Kyle Moffett
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 15:09 ` Linus Torvalds
2007-09-10 16:46 ` Denys Vlasenko
@ 2007-09-10 18:59 ` Christoph Lameter
2007-09-10 23:19 ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
2 siblings, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-09-10 18:59 UTC (permalink / raw)
To: Linus Torvalds
Cc: Denys Vlasenko, Kyle Moffett, Arjan van de Ven, Nick Piggin,
Satyam Sharma, Herbert Xu, Paul Mackerras, Chris Snook,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Mon, 10 Sep 2007, Linus Torvalds wrote:
> The fact is, "volatile" *only* makes things worse. It generates worse
> code, and never fixes any real bugs. This is a *fact*.
Yes, let's just drop the volatiles now! We need a patch that gets rid of
them.... Volunteers?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH] Document non-semantics of atomic_read() and atomic_set()
2007-09-10 15:09 ` Linus Torvalds
2007-09-10 16:46 ` Denys Vlasenko
2007-09-10 18:59 ` Christoph Lameter
@ 2007-09-10 23:19 ` Chris Snook
2007-09-10 23:44 ` Paul E. McKenney
2007-09-11 19:35 ` Christoph Lameter
2 siblings, 2 replies; 1651+ messages in thread
From: Chris Snook @ 2007-09-10 23:19 UTC (permalink / raw)
To: Linus Torvalds
Cc: Denys Vlasenko, Kyle Moffett, Arjan van de Ven, Nick Piggin,
Satyam Sharma, Herbert Xu, Paul Mackerras, Christoph Lameter,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
From: Chris Snook <csnook@redhat.com>
Unambiguously document the fact that atomic_read() and atomic_set()
do not imply any ordering or memory access, and that callers are
obligated to explicitly invoke barriers as needed to ensure that
changes to atomic variables are visible in all contexts that need
to see them.
Signed-off-by: Chris Snook <csnook@redhat.com>
--- a/Documentation/atomic_ops.txt 2007-07-08 19:32:17.000000000 -0400
+++ b/Documentation/atomic_ops.txt 2007-09-10 19:02:50.000000000 -0400
@@ -12,7 +12,11 @@
C integer type will fail. Something like the following should
suffice:
- typedef struct { volatile int counter; } atomic_t;
+ typedef struct { int counter; } atomic_t;
+
+ Historically, counter has been declared volatile. This is now
+discouraged. See Documentation/volatile-considered-harmful.txt for the
+complete rationale.
The first operations to implement for atomic_t's are the
initializers and plain reads.
@@ -42,6 +46,22 @@
which simply reads the current value of the counter.
+*** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***
+
+Some architectures may choose to use the volatile keyword, barriers, or
+inline assembly to guarantee some degree of immediacy for atomic_read()
+and atomic_set(). This is not uniformly guaranteed, and may change in
+the future, so all users of atomic_t should treat atomic_read() and
+atomic_set() as simple C assignment statements that may be reordered or
+optimized away entirely by the compiler or processor, and explicitly
+invoke the appropriate compiler and/or memory barrier for each use case.
+Failure to do so will result in code that may suddenly break when used with
+different architectures or compiler optimizations, or even changes in
+unrelated code which changes how the compiler optimizes the section
+accessing atomic_t variables.
+
+*** YOU HAVE BEEN WARNED! ***
+
Now, we move onto the actual atomic operation interfaces.
void atomic_add(int i, atomic_t *v);
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] Document non-semantics of atomic_read() and atomic_set()
2007-09-10 23:19 ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
@ 2007-09-10 23:44 ` Paul E. McKenney
2007-09-11 19:35 ` Christoph Lameter
1 sibling, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-09-10 23:44 UTC (permalink / raw)
To: Chris Snook
Cc: Linus Torvalds, Denys Vlasenko, Kyle Moffett, Arjan van de Ven,
Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
Christoph Lameter, Ilpo Jarvinen, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
On Mon, Sep 10, 2007 at 07:19:44PM -0400, Chris Snook wrote:
> From: Chris Snook <csnook@redhat.com>
>
> Unambiguously document the fact that atomic_read() and atomic_set()
> do not imply any ordering or memory access, and that callers are
> obligated to explicitly invoke barriers as needed to ensure that
> changes to atomic variables are visible in all contexts that need
> to see them.
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Chris Snook <csnook@redhat.com>
>
> --- a/Documentation/atomic_ops.txt 2007-07-08 19:32:17.000000000 -0400
> +++ b/Documentation/atomic_ops.txt 2007-09-10 19:02:50.000000000 -0400
> @@ -12,7 +12,11 @@
> C integer type will fail. Something like the following should
> suffice:
>
> - typedef struct { volatile int counter; } atomic_t;
> + typedef struct { int counter; } atomic_t;
> +
> + Historically, counter has been declared volatile. This is now
> +discouraged. See Documentation/volatile-considered-harmful.txt for the
> +complete rationale.
>
> The first operations to implement for atomic_t's are the
> initializers and plain reads.
> @@ -42,6 +46,22 @@
>
> which simply reads the current value of the counter.
>
> +*** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***
> +
> +Some architectures may choose to use the volatile keyword, barriers, or
> +inline assembly to guarantee some degree of immediacy for atomic_read()
> +and atomic_set(). This is not uniformly guaranteed, and may change in
> +the future, so all users of atomic_t should treat atomic_read() and
> +atomic_set() as simple C assignment statements that may be reordered or
> +optimized away entirely by the compiler or processor, and explicitly
> +invoke the appropriate compiler and/or memory barrier for each use case.
> +Failure to do so will result in code that may suddenly break when used with
> +different architectures or compiler optimizations, or even changes in
> +unrelated code which changes how the compiler optimizes the section
> +accessing atomic_t variables.
> +
> +*** YOU HAVE BEEN WARNED! ***
> +
> Now, we move onto the actual atomic operation interfaces.
>
> void atomic_add(int i, atomic_t *v);
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] Document non-semantics of atomic_read() and atomic_set()
2007-09-10 23:19 ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
2007-09-10 23:44 ` Paul E. McKenney
@ 2007-09-11 19:35 ` Christoph Lameter
1 sibling, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-09-11 19:35 UTC (permalink / raw)
To: Chris Snook
Cc: Linus Torvalds, Denys Vlasenko, Kyle Moffett, Arjan van de Ven,
Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
Ilpo Jarvinen, Paul E. McKenney, Stefan Richter,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Acked-by: Christoph Lameter <clameter@sgi.com>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 10:56 ` Denys Vlasenko
2007-09-10 11:15 ` Herbert Xu
2007-09-10 12:22 ` Kyle Moffett
@ 2007-09-10 14:51 ` Arjan van de Ven
2007-09-10 14:38 ` Denys Vlasenko
2 siblings, 1 reply; 1651+ messages in thread
From: Arjan van de Ven @ 2007-09-10 14:51 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
zlynx, rpjday, jesper.juhl, segher
On Mon, 10 Sep 2007 11:56:29 +0100
Denys Vlasenko <vda.linux@googlemail.com> wrote:
>
> Well, if you insist on having it again:
>
> Waiting for atomic value to be zero:
>
> while (atomic_read(&x))
> continue;
>
and this I would say is buggy code all the way.
Not in terms of pure C semantics, but at the "busy waiting is buggy"
semantics level and the "I'm inventing my own locking" semantics level.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 14:51 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Arjan van de Ven
@ 2007-09-10 14:38 ` Denys Vlasenko
2007-09-10 17:02 ` Arjan van de Ven
0 siblings, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-09-10 14:38 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
zlynx, rpjday, jesper.juhl, segher
On Monday 10 September 2007 15:51, Arjan van de Ven wrote:
> On Mon, 10 Sep 2007 11:56:29 +0100
> Denys Vlasenko <vda.linux@googlemail.com> wrote:
>
> >
> > Well, if you insist on having it again:
> >
> > Waiting for atomic value to be zero:
> >
> > while (atomic_read(&x))
> > continue;
> >
>
> and this I would say is buggy code all the way.
>
> Not from a pure C level semantics, but from a "busy waiting is buggy"
> semantics level and a "I'm inventing my own locking" semantics level.
After inspecting arch/*, I cannot agree with you; otherwise almost
all major architectures are using
"conceptually buggy busy-waiting":
arch/alpha
arch/i386
arch/ia64
arch/m32r
arch/mips
arch/parisc
arch/powerpc
arch/sh
arch/sparc64
arch/um
arch/x86_64
All of the above contain busy-waiting on atomic_read.
Including these loops without barriers:
arch/mips/kernel/smtc.c
	while (atomic_read(&idle_hook_initialized) < 1000)
		;
arch/mips/sgi-ip27/ip27-nmi.c
	while (atomic_read(&nmied_cpus) != num_online_cpus());
[Well maybe num_online_cpus() is a barrier, I didn't check]
arch/sh/kernel/smp.c
	if (wait)
		while (atomic_read(&smp_fn_call.finished) != (nr_cpus - 1));
Bugs?
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 14:38 ` Denys Vlasenko
@ 2007-09-10 17:02 ` Arjan van de Ven
0 siblings, 0 replies; 1651+ messages in thread
From: Arjan van de Ven @ 2007-09-10 17:02 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
zlynx, rpjday, jesper.juhl, segher
On Mon, 10 Sep 2007 15:38:23 +0100
Denys Vlasenko <vda.linux@googlemail.com> wrote:
> On Monday 10 September 2007 15:51, Arjan van de Ven wrote:
> > On Mon, 10 Sep 2007 11:56:29 +0100
> > Denys Vlasenko <vda.linux@googlemail.com> wrote:
> >
> > >
> > > Well, if you insist on having it again:
> > >
> > > Waiting for atomic value to be zero:
> > >
> > > while (atomic_read(&x))
> > > continue;
> > >
> >
> > and this I would say is buggy code all the way.
> >
> > Not from a pure C level semantics, but from a "busy waiting is
> > buggy" semantics level and a "I'm inventing my own locking"
> > semantics level.
>
> After inspecting arch/*, I cannot agree with you.
the arch/ people obviously are allowed to do their own locking stuff...
BECAUSE THEY HAVE TO IMPLEMENT THAT!
the arch maintainers know EXACTLY how their hw behaves (well, we hope)
so they tend to be the exception to many rules in the kernel....
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 8:38 ` Nick Piggin
2007-08-17 9:14 ` Satyam Sharma
@ 2007-08-17 11:08 ` Stefan Richter
1 sibling, 0 replies; 1651+ messages in thread
From: Stefan Richter @ 2007-08-17 11:08 UTC (permalink / raw)
To: Nick Piggin
Cc: Satyam Sharma, Herbert Xu, Paul Mackerras, Linus Torvalds,
Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
Linux Kernel Mailing List, linux-arch, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Nick Piggin wrote:
> Satyam Sharma wrote:
>> And we have driver / subsystem maintainers such as Stefan
>> coming up and admitting that often a lot of code that's written to use
>> atomic_read() does assume the read will not be elided by the compiler.
>
> So these are broken on i386 and x86-64?
The ieee1394 and firewire subsystems have open, undiagnosed bugs, also
on i386 and x86-64. But whether there is any bug because of wrong
assumptions about atomic_read among them, I don't know. I don't know
which assumptions the authors made, I only know that I wasn't aware of
all the properties of atomic_read until now.
> Are they definitely safe on SMP and weakly ordered machines with
> just a simple compiler barrier there? Because I would not be
> surprised if there are a lot of developers who don't really know
> what to assume when it comes to memory ordering issues.
>
> This is not a dig at driver writers: we still have memory ordering
> problems in the VM too (and probably most of the subtle bugs in
> lockless VM code are memory ordering ones). Let's not make up a
> false sense of security and hope that sprinkling volatile around
> will allow people to write bug-free lockless code. If a writer
> can't be bothered reading API documentation
...or, if there is none, the implementation specification (as in the case of
the atomic ops), or, if there is none, the implementation (as in the case of
some infrastructure code here and there)...
> and learning the Linux memory model, they can still be productive
> writing safely locked code.
Provided they are aware that they might not have the full picture of the
lockless primitives. :-)
--
Stefan Richter
-=====-=-=== =--- =---=
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 3:03 ` Linus Torvalds
2007-08-17 3:43 ` Paul Mackerras
@ 2007-08-17 22:09 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:09 UTC (permalink / raw)
To: Linus Torvalds
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Satyam Sharma, Ilpo Järvinen,
Linux Kernel Mailing List, David Miller, Paul E. McKenney, ak,
Netdev, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, schwidefsky, Chris Snook, Herbert Xu, wensong, wjiang
> Of course, since *normal* accesses aren't necessarily limited wrt
> re-ordering, the question then becomes one of "with regard to *what*
> does
> it limit re-ordering?".
>
> A C compiler that re-orders two different volatile accesses that have a
> sequence point in between them is pretty clearly a buggy compiler. So
> at a
> minimum, it limits re-ordering wrt other volatiles (assuming sequence
> points exists). It also means that the compiler cannot move it
> speculatively across conditionals, but other than that it's starting to
> get fuzzy.
This is actually really well-defined in C, not fuzzy at all.
"Volatile accesses" are a side effect, and no side effects can
be reordered with respect to sequence points. The side effects
that matter in the kernel environment are: 1) accessing a volatile
object; 2) modifying an object; 3) volatile asm(); 4) calling a
function that does any of these.
We certainly should avoid volatile whenever possible, but "because
it's fuzzy wrt reordering" is not a reason -- all alternatives have
exactly the same issues.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 20:20 ` Christoph Lameter
2007-08-17 1:02 ` Paul E. McKenney
2007-08-17 2:16 ` Paul Mackerras
@ 2007-08-17 17:41 ` Segher Boessenkool
2007-08-17 18:38 ` Satyam Sharma
2007-09-10 18:59 ` Christoph Lameter
2 siblings, 2 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 17:41 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Mackerras, heiko.carstens, horms, Stefan Richter,
Satyam Sharma, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
>> atomic_dec() already has volatile behavior everywhere, so this is
>> semantically
>> okay, but this code (and any like it) should be calling cpu_relax()
>> each
>> iteration through the loop, unless there's a compelling reason not
>> to. I'll
>> allow that for some hardware drivers (possibly this one) such a
>> compelling
>> reason may exist, but hardware-independent core subsystems probably
>> have no
>> excuse.
>
> No it does not have any volatile semantics. atomic_dec() can be
> reordered
> at will by the compiler within the current basic unit if you do not
> add a
> barrier.
"volatile" has nothing to do with reordering. atomic_dec() writes
to memory, so it _does_ have "volatile semantics", implicitly, as
long as the compiler cannot optimise the atomic variable away
completely -- any store counts as a side effect.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 17:41 ` Segher Boessenkool
@ 2007-08-17 18:38 ` Satyam Sharma
2007-08-17 23:17 ` Segher Boessenkool
2007-09-10 18:59 ` Christoph Lameter
1 sibling, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 18:38 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
On Fri, 17 Aug 2007, Segher Boessenkool wrote:
> > > atomic_dec() already has volatile behavior everywhere, so this is
> > > semantically
> > > okay, but this code (and any like it) should be calling cpu_relax() each
> > > iteration through the loop, unless there's a compelling reason not to.
> > > I'll
> > > allow that for some hardware drivers (possibly this one) such a compelling
> > > reason may exist, but hardware-independent core subsystems probably have
> > > no
> > > excuse.
> >
> > No it does not have any volatile semantics. atomic_dec() can be reordered
> > at will by the compiler within the current basic unit if you do not add a
> > barrier.
>
> "volatile" has nothing to do with reordering.
If you're talking of "volatile" the type-qualifier keyword, then
http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
otherwise.
> atomic_dec() writes
> to memory, so it _does_ have "volatile semantics", implicitly, as
> long as the compiler cannot optimise the atomic variable away
> completely -- any store counts as a side effect.
I don't think an atomic_dec() implemented as an inline "asm volatile"
or one that uses a "forget" macro would have the same re-ordering
guarantees as an atomic_dec() that uses a volatile access cast.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 18:38 ` Satyam Sharma
@ 2007-08-17 23:17 ` Segher Boessenkool
2007-08-17 23:55 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 23:17 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
>>> No it does not have any volatile semantics. atomic_dec() can be
>>> reordered
>>> at will by the compiler within the current basic unit if you do not
>>> add a
>>> barrier.
>>
>> "volatile" has nothing to do with reordering.
>
> If you're talking of "volatile" the type-qualifier keyword, then
> http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
> otherwise.
I'm not sure what in that mail you mean, but anyway...
Yes, of course, the fact that "volatile" creates a side effect
prevents certain things from being reordered wrt the atomic_dec();
but the atomic_dec() has a side effect *already* so the volatile
doesn't change anything.
>> atomic_dec() writes
>> to memory, so it _does_ have "volatile semantics", implicitly, as
>> long as the compiler cannot optimise the atomic variable away
>> completely -- any store counts as a side effect.
>
> I don't think an atomic_dec() implemented as an inline "asm volatile"
> or one that uses a "forget" macro would have the same re-ordering
> guarantees as an atomic_dec() that uses a volatile access cast.
The "asm volatile" implementation does have exactly the same
reordering guarantees as the "volatile cast" thing, if that is
implemented by GCC in the "obvious" way. Even a "plain" asm()
will do the same.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 23:17 ` Segher Boessenkool
@ 2007-08-17 23:55 ` Satyam Sharma
2007-08-18 0:04 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 23:55 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
On Sat, 18 Aug 2007, Segher Boessenkool wrote:
> > > > No it does not have any volatile semantics. atomic_dec() can be
> > > > reordered
> > > > at will by the compiler within the current basic unit if you do not add
> > > > a
> > > > barrier.
> > >
> > > "volatile" has nothing to do with reordering.
> >
> > If you're talking of "volatile" the type-qualifier keyword, then
> > http://lkml.org/lkml/2007/8/16/231 (and sub-thread below it) shows
> > otherwise.
>
> I'm not sure what in that mail you mean, but anyway...
>
> Yes, of course, the fact that "volatile" creates a side effect
> prevents certain things from being reordered wrt the atomic_dec();
> but the atomic_dec() has a side effect *already* so the volatile
> doesn't change anything.
That's precisely what that sub-thread (read down to the last mail
there, and not the first mail only) shows. So yes, "volatile" does
have something to do with re-ordering (as guaranteed by the C
standard).
> > > atomic_dec() writes
> > > to memory, so it _does_ have "volatile semantics", implicitly, as
> > > long as the compiler cannot optimise the atomic variable away
> > > completely -- any store counts as a side effect.
> >
> > I don't think an atomic_dec() implemented as an inline "asm volatile"
> > or one that uses a "forget" macro would have the same re-ordering
> > guarantees as an atomic_dec() that uses a volatile access cast.
>
> The "asm volatile" implementation does have exactly the same
> reordering guarantees as the "volatile cast" thing,
I don't think so.
> if that is
> implemented by GCC in the "obvious" way. Even a "plain" asm()
> will do the same.
Read the relevant GCC documentation.
[ of course, if the (latest) GCC documentation is *yet again*
wrong, then alright, not much I can do about it, is there. ]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 23:55 ` Satyam Sharma
@ 2007-08-18 0:04 ` Segher Boessenkool
2007-08-18 1:56 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-18 0:04 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
>>>> atomic_dec() writes
>>>> to memory, so it _does_ have "volatile semantics", implicitly, as
>>>> long as the compiler cannot optimise the atomic variable away
>>>> completely -- any store counts as a side effect.
>>>
>>> I don't think an atomic_dec() implemented as an inline "asm volatile"
>>> or one that uses a "forget" macro would have the same re-ordering
>>> guarantees as an atomic_dec() that uses a volatile access cast.
>>
>> The "asm volatile" implementation does have exactly the same
>> reordering guarantees as the "volatile cast" thing,
>
> I don't think so.
"asm volatile" creates a side effect. Side effects aren't
allowed to be reordered wrt sequence points. This is exactly
the same reason as why "volatile accesses" cannot be reordered.
>> if that is
>> implemented by GCC in the "obvious" way. Even a "plain" asm()
>> will do the same.
>
> Read the relevant GCC documentation.
I did, yes.
> [ of course, if the (latest) GCC documentation is *yet again*
> wrong, then alright, not much I can do about it, is there. ]
There was (and is) nothing wrong about the "+m" documentation, if
that is what you are talking about. It could be extended now, to
allow "+m" -- but that takes more than just "fixing" the documentation.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 0:04 ` Segher Boessenkool
@ 2007-08-18 1:56 ` Satyam Sharma
2007-08-18 2:15 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-18 1:56 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
On Sat, 18 Aug 2007, Segher Boessenkool wrote:
> > > > > atomic_dec() writes
> > > > > to memory, so it _does_ have "volatile semantics", implicitly, as
> > > > > long as the compiler cannot optimise the atomic variable away
> > > > > completely -- any store counts as a side effect.
> > > >
> > > > I don't think an atomic_dec() implemented as an inline "asm volatile"
> > > > or one that uses a "forget" macro would have the same re-ordering
> > > > guarantees as an atomic_dec() that uses a volatile access cast.
> > >
> > > The "asm volatile" implementation does have exactly the same
> > > reordering guarantees as the "volatile cast" thing,
> >
> > I don't think so.
>
> "asm volatile" creates a side effect.
Yeah.
> Side effects aren't
> allowed to be reordered wrt sequence points.
Yeah.
> This is exactly
> the same reason as why "volatile accesses" cannot be reordered.
No, the code in that sub-thread I earlier pointed you at *WAS* written
such that there was a sequence point after all the uses of that volatile
access cast macro, and _therefore_ we were safe from re-ordering
(behaviour guaranteed by the C standard).
But you seem to be missing the simple and basic fact that:
(something_that_has_side_effects || statement)
!= something_that_is_a_sequence_point
Now, one cannot fantasize that "volatile asms" are also sequence points.
In fact such an argument would be sadly mistaken, because "sequence
points" are defined by the C standard and it'd be horribly wrong to
even _try_ claiming that the C standard knows about "volatile asms".
> > > if that is
> > > implemented by GCC in the "obvious" way. Even a "plain" asm()
> > > will do the same.
> >
> > Read the relevant GCC documentation.
>
> I did, yes.
No, you didn't read:
http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
Read the bit about the need for artificial dependencies, and the example
given there:
asm volatile("mtfsf 255,%0" : : "f" (fpenv));
sum = x + y;
The docs explicitly say the addition can be moved before the "volatile
asm". Hopefully, as you know, (x + y) is a C "expression" and hence
a "sequence point" as defined by the standard. So the "volatile asm"
should've happened before it, right? Wrong.
I know there is also stuff written about "side-effects" there which
_could_ give the same semantic w.r.t. sequence points as the volatile
access casts, but hey, it's GCC's own documentation, you obviously can't
find fault with _me_ if there's wrong stuff written in there. Say that
to GCC ...
See, "volatile" C keyword, for all its ill-definition and dodgy
semantics, is still at least given somewhat of a treatment in the C
standard (whose quality is ... ummm, sadly not always good and clear,
but unsurprisingly, still about 5,482 orders-of-magnitude times
better than GCC docs). Semantics of "volatile" as applies to inline
asm, OTOH? You're completely relying on the compiler for that ...
> > [ of course, if the (latest) GCC documentation is *yet again*
> > wrong, then alright, not much I can do about it, is there. ]
>
> There was (and is) nothing wrong about the "+m" documentation, if
> that is what you are talking about. It could be extended now, to
> allow "+m" -- but that takes more than just "fixing" the documentation.
No, there was (and is) _everything_ wrong about the "+" documentation as
applies to memory-constrained operands. I don't give a whit if it's
some workaround in their gimplifier, or the other, that makes it possible
to use "+m" (like the current kernel code does). The docs suggest
otherwise, so there's obviously a clear disconnect between the docs and
actual GCC behaviour.
[ You seem to often take issue with _amazingly_ petty and pedantic things,
by the way :-) ]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 1:56 ` Satyam Sharma
@ 2007-08-18 2:15 ` Segher Boessenkool
2007-08-18 3:33 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-18 2:15 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
>>>> The "asm volatile" implementation does have exactly the same
>>>> reordering guarantees as the "volatile cast" thing,
>>>
>>> I don't think so.
>>
>> "asm volatile" creates a side effect.
>
> Yeah.
>
>> Side effects aren't
>> allowed to be reordered wrt sequence points.
>
> Yeah.
>
>> This is exactly
>> the same reason as why "volatile accesses" cannot be reordered.
>
> No, the code in that sub-thread I earlier pointed you at *WAS* written
> such that there was a sequence point after all the uses of that
> volatile
> access cast macro, and _therefore_ we were safe from re-ordering
> (behaviour guaranteed by the C standard).
And exactly the same is true for the "asm" version.
> Now, one cannot fantasize that "volatile asms" are also sequence
> points.
Sure you can do that. I don't though.
> In fact such an argument would be sadly mistaken, because "sequence
> points" are defined by the C standard and it'd be horribly wrong to
> even _try_ claiming that the C standard knows about "volatile asms".
That's nonsense. GCC can extend the C standard any way they
bloody well please -- witness the fact that they added an
extra class of side effects...
>>> Read the relevant GCC documentation.
>>
>> I did, yes.
>
> No, you didn't read:
>
> http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
>
> Read the bit about the need for artificial dependencies, and the
> example
> given there:
>
> asm volatile("mtfsf 255,%0" : : "f" (fpenv));
> sum = x + y;
>
> The docs explicitly say the addition can be moved before the "volatile
> asm". Hopefully, as you know, (x + y) is a C "expression" and hence
> a "sequence point" as defined by the standard.
The _end of a full expression_ is a sequence point, not every
expression. And that is irrelevant here anyway.
It is perfectly fine to compute x+y any time before the
assignment; the C compiler is allowed to compute it _after_
the assignment even, if it could figure out how ;-)
x+y does not contain a side effect, you know.
> I know there is also stuff written about "side-effects" there which
> _could_ give the same semantic w.r.t. sequence points as the volatile
> access casts,
s/could/does/
> but hey, it's GCC's own documentation, you obviously can't
> find fault with _me_ if there's wrong stuff written in there. Say that
> to GCC ...
There's nothing wrong there.
> See, "volatile" C keyword, for all its ill-definition and dodgy
> semantics, is still at least given somewhat of a treatment in the C
> standard (whose quality is ... ummm, sadly not always good and clear,
> but unsurprisingly, still about 5,482 orders-of-magnitude times
> better than GCC docs).
If you find any problems/shortcomings in the GCC documentation,
please file a PR, don't go whine on some unrelated mailing lists.
Thank you.
> Semantics of "volatile" as applies to inline
> asm, OTOH? You're completely relying on the compiler for that ...
Yes, and? GCC promises the behaviour it has documented.
>>> [ of course, if the (latest) GCC documentation is *yet again*
>>> wrong, then alright, not much I can do about it, is there. ]
>>
>> There was (and is) nothing wrong about the "+m" documentation, if
>> that is what you are talking about. It could be extended now, to
>> allow "+m" -- but that takes more than just "fixing" the
>> documentation.
>
> No, there was (and is) _everything_ wrong about the "+" documentation
> as
> applies to memory-constrained operands. I don't give a whit if it's
> some workaround in their gimplifier, or the other, that makes it
> possible
> to use "+m" (like the current kernel code does). The docs suggest
> otherwise, so there's obviously a clear disconnect between the docs and
> actual GCC behaviour.
The documentation simply doesn't say "+m" is allowed. The code to
allow it was added for the benefit of people who do not read the
documentation. Documentation for "+m" might get added later if it
is decided this [the code, not the documentation] is a sane thing
to have (which isn't directly obvious).
> [ You seem to often take issue with _amazingly_ petty and pedantic
> things,
> by the way :-) ]
If you're talking details, you better get them right. Handwaving is
fine with me as long as you're not purporting that you're not.
And I simply cannot stand false assertions.
You can always ignore me if _you_ take issue with _that_ :-)
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 2:15 ` Segher Boessenkool
@ 2007-08-18 3:33 ` Satyam Sharma
2007-08-18 5:18 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-18 3:33 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
On Sat, 18 Aug 2007, Segher Boessenkool wrote:
> > > > > The "asm volatile" implementation does have exactly the same
> > > > > reordering guarantees as the "volatile cast" thing,
> > > >
> > > > I don't think so.
> > >
> > > "asm volatile" creates a side effect.
> >
> > Yeah.
> >
> > > Side effects aren't
> > > allowed to be reordered wrt sequence points.
> >
> > Yeah.
> >
> > > This is exactly
> > > the same reason as why "volatile accesses" cannot be reordered.
> >
> > No, the code in that sub-thread I earlier pointed you at *WAS* written
> > such that there was a sequence point after all the uses of that volatile
> > access cast macro, and _therefore_ we were safe from re-ordering
> > (behaviour guaranteed by the C standard).
>
> And exactly the same is true for the "asm" version.
>
> > Now, one cannot fantasize that "volatile asms" are also sequence points.
>
> Sure you can do that. I don't though.
>
> > In fact such an argument would be sadly mistaken, because "sequence
> > points" are defined by the C standard and it'd be horribly wrong to
> > even _try_ claiming that the C standard knows about "volatile asms".
>
> That's nonsense. GCC can extend the C standard any way they
> bloody well please -- witness the fact that they added an
> extra class of side effects...
>
> > > > Read the relevant GCC documentation.
> > >
> > > I did, yes.
> >
> > No, you didn't read:
> >
> > http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
> >
> > Read the bit about the need for artificial dependencies, and the example
> > given there:
> >
> > asm volatile("mtfsf 255,%0" : : "f" (fpenv));
> > sum = x + y;
> >
> > The docs explicitly say the addition can be moved before the "volatile
> > asm". Hopefully, as you know, (x + y) is a C "expression" and hence
> > a "sequence point" as defined by the standard.
>
> The _end of a full expression_ is a sequence point, not every
> expression. And that is irrelevant here anyway.
>
> It is perfectly fine to compute x+y any time before the
> assignment; the C compiler is allowed to compute it _after_
> the assignment even, if it could figure out how ;-)
>
> x+y does not contain a side effect, you know.
>
> > I know there is also stuff written about "side-effects" there which
> > _could_ give the same semantic w.r.t. sequence points as the volatile
> > access casts,
>
> s/could/does/
>
> > but hey, it's GCC's own documentation, you obviously can't
> > find fault with _me_ if there's wrong stuff written in there. Say that
> > to GCC ...
>
> There's nothing wrong there.
>
> > See, "volatile" C keyword, for all its ill-definition and dodgy
> > semantics, is still at least given somewhat of a treatment in the C
> > standard (whose quality is ... ummm, sadly not always good and clear,
> > but unsurprisingly, still about 5,482 orders-of-magnitude times
> > better than GCC docs).
>
> If you find any problems/shortcomings in the GCC documentation,
> please file a PR, don't go whine on some unrelated mailing lists.
> Thank you.
>
> > Semantics of "volatile" as applies to inline
> > asm, OTOH? You're completely relying on the compiler for that ...
>
> Yes, and? GCC promises the behaviour it has documented.
LOTS there, which obviously isn't correct, but which I'll reply to later,
easier stuff first. (call this "handwaving" if you want, but don't worry,
I /will/ bother myself to reply)
> > > > [ of course, if the (latest) GCC documentation is *yet again*
> > > > wrong, then alright, not much I can do about it, is there. ]
> > >
> > > There was (and is) nothing wrong about the "+m" documentation, if
> > > that is what you are talking about. It could be extended now, to
> > > allow "+m" -- but that takes more than just "fixing" the documentation.
> >
> > No, there was (and is) _everything_ wrong about the "+" documentation as
> > applies to memory-constrained operands. I don't give a whit if it's
> > some workaround in their gimplifier, or the other, that makes it possible
> > to use "+m" (like the current kernel code does). The docs suggest
> > otherwise, so there's obviously a clear disconnect between the docs and
> > actual GCC behaviour.
>
> The documentation simply doesn't say "+m" is allowed. The code to
> allow it was added for the benefit of people who do not read the
> documentation. Documentation for "+m" might get added later if it
> is decided this [the code, not the documentation] is a sane thing
> to have (which isn't directly obvious).
Huh?
"If the (current) documentation doesn't match up with the (current)
code, then _at least one_ of them has to be (as of current) wrong."
I wonder how could you even try to disagree with that.
And I didn't go whining about this ... you asked me. (I think I'd said
something to the effect of GCC docs are often wrong, which is true,
but probably you feel saying that is "not allowed" on non-gcc lists?)
As for the "PR" you're requesting me to file with GCC for this, that
gcc-patches@ thread did precisely that and more (submitted a patch to
said documentation -- and no, saying "documentation might get added
later" is totally bogus and nonsensical -- documentation exists to
document current behaviour, not past). But come on, this is wholly
petty. I wouldn't have replied, really, if you weren't so provoking.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 3:33 ` Satyam Sharma
@ 2007-08-18 5:18 ` Segher Boessenkool
2007-08-18 13:20 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-18 5:18 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
>> The documentation simply doesn't say "+m" is allowed. The code to
>> allow it was added for the benefit of people who do not read the
>> documentation. Documentation for "+m" might get added later if it
>> is decided this [the code, not the documentation] is a sane thing
>> to have (which isn't directly obvious).
>
> Huh?
>
> "If the (current) documentation doesn't match up with the (current)
> code, then _at least one_ of them has to be (as of current) wrong."
>
> I wonder how could you even try to disagree with that.
Easy.
The GCC documentation you're referring to is the user's manual.
See the blurb on the first page:
"This manual documents how to use the GNU compilers, as well as their
features and incompatibilities, and how to report bugs. It corresponds
to GCC version 4.3.0. The internals of the GNU compilers, including
how to port them to new targets and some information about how to write
front ends for new languages, are documented in a separate manual."
_How to use_. This documentation doesn't describe in minute detail
everything the compiler does (see the source code for that -- no, it
isn't described in the internals manual either).
If it doesn't tell you how to use "+m", and even tells you _not_ to
use it, maybe that is what it means to say? It doesn't mean "+m"
doesn't actually do something. It also doesn't mean it does what
you think it should do. It might do just that of course. But treating
writing C code as an empirical science isn't such a smart idea.
> And I didn't go whining about this ... you asked me. (I think I'd said
> something to the effect of GCC docs are often wrong,
No need to guess at what you said, even if you managed to delete
your own mail already, there are plenty of free web-based archives
around. You said:
>> See, "volatile" C keyword, for all its ill-definition and dodgy
> semantics, is still at least given somewhat of a treatment in the C
> standard (whose quality is ... ummm, sadly not always good and clear,
> but unsurprisingly, still about 5,482 orders-of-magnitude times
> better than GCC docs).
and that to me reads as complaining that the ISO C standard "isn't
very good" and that the GCC documentation is 10**5482 times worse
even. Which of course is hyperbole and cannot be true. It also
isn't helpful in any way or form for anyone on this list. I call
that whining.
> which is true,
Yes, documentation of that size often has shortcomings. No surprise
there. However, great effort is made to make it better documentation,
and especially to keep it up to date; if you find any errors or
omissions, please report them. There are many ways to do that,
see the GCC homepage.</end-of-marketing-blurb>
> but probably you feel saying that is "not allowed" on non-gcc lists?)
You're allowed to say whatever you want. Let's have a quote again
shall we? I said:
> If you find any problems/shortcomings in the GCC documentation,
> please file a PR, don't go whine on some unrelated mailing lists.
> Thank you.
I read that as a friendly request, not a prohibition. Well maybe
not actually friendly, more a bit angry. A request, either way.
> As for the "PR"
"Problem report", a bugzilla ticket. Sorry for using terminology
unknown to you.
> you're requesting me to file with GCC for this, that
> gcc-patches@ thread did precisely that
Actually not -- PRs make sure issues aren't forgotten (although
they might gather dust, sure). But yes, submitting patches is a
Great Thing(tm).
> and more (submitted a patch to
> said documentation -- and no, saying "documentation might get added
> later" is totally bogus and nonsensical -- documentation exists to
> document current behaviour, not past).
When code like you want to write becomes a supported feature, that
will be reflected in the user manual. It is completely nonsensical
to expect everything that is *not* a supported feature to be mentioned
there.
> I wouldn't have replied, really, if you weren't so provoking.
Hey, maybe that character trait is good for something, then.
Now to build a business plan around it...
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-18 5:18 ` Segher Boessenkool
@ 2007-08-18 13:20 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-18 13:20 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, zlynx, Andrew Morton,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
[ LOL, you _are_ shockingly petty! ]
On Sat, 18 Aug 2007, Segher Boessenkool wrote:
> > > The documentation simply doesn't say "+m" is allowed. The code to
> > > allow it was added for the benefit of people who do not read the
> > > documentation. Documentation for "+m" might get added later if it
> > > is decided this [the code, not the documentation] is a sane thing
> > > to have (which isn't directly obvious).
> >
> > Huh?
> >
> > "If the (current) documentation doesn't match up with the (current)
> > code, then _at least one_ of them has to be (as of current) wrong."
> >
> > I wonder how could you even try to disagree with that.
>
> Easy.
>
> The GCC documentation you're referring to is the user's manual.
> See the blurb on the first page:
>
> "This manual documents how to use the GNU compilers, as well as their
> features and incompatibilities, and how to report bugs. It corresponds
> to GCC version 4.3.0. The internals of the GNU compilers, including
> how to port them to new targets and some information about how to write
> front ends for new languages, are documented in a separate manual."
>
> _How to use_. This documentation doesn't describe in minute detail
> everything the compiler does (see the source code for that -- no, it
> isn't described in the internals manual either).
Wow, now that's a nice "disclaimer". By your (poor) standards of writing
documentation, one can as well write any factually incorrect stuff that
one wants in a document once you've got such a blurb in place :-)
> If it doesn't tell you how to use "+m", and even tells you _not_ to
> use it, maybe that is what it means to say? It doesn't mean "+m"
> doesn't actually do something. It also doesn't mean it does what
> you think it should do. It might do just that of course. But treating
> writing C code as an empirical science isn't such a smart idea.
Oh, really? Considering how much is (left out of being) documented, often
one would reasonably have to experimentally see (with testcases) how the
compiler behaves for some given code. Well, at least _I_ do it often
(several others on this list do as well), and I think there's everything
smart about it rather than having to read gcc sources -- I'd be surprised
(unless you have infinite free time on your hands, which does look like
the case actually) if someone actually prefers reading gcc sources first
to know what/how gcc does something for some given code, rather than
simply writing it out, compiling and looking at the generated code (saves
time for
those who don't have an infinite amount of it).
> > And I didn't go whining about this ... you asked me. (I think I'd said
> > something to the effect of GCC docs are often wrong,
>
> No need to guess at what you said, even if you managed to delete
> your own mail already, there are plenty of free web-based archives
> around. You said:
>
> > See, "volatile" C keyword, for all its ill-definition and dodgy
> > semantics, is still at least given somewhat of a treatment in the C
> > standard (whose quality is ... ummm, sadly not always good and clear,
> > but unsurprisingly, still about 5,482 orders-of-magnitude times
> > better than GCC docs).
Try _reading_ what I said there, for a change, dude. I'd originally only
said "unless GCC's docs is yet again wrong" ... then _you_ asked me what,
after which this discussion began and I wrote the above [which I fully
agree with -- so what if I used hyperbole in my sentence (yup, that was
intended, and obviously, exaggeration), am I not even allowed to do that?
Man, you're a Nazi or what ...] I didn't go whining about it on my own as
you'd earlier suggested, until _you_ asked me.
[ Ick, I somehow managed to reply this ... this is such a ...
*disgustingly* petty argument you made here. ]
> > which is true,
>
> Yes, documentation of that size often has shortcomings. No surprise
> there. However, great effort is made to make it better documentation,
> and especially to keep it up to date; if you find any errors or
> omissions, please report them. There are many ways to do that,
> see the GCC homepage.</end-of-marketing-blurb>
^^^^^^^^^^^^^^^^^^^^^^
Looks like you even get paid :-)
> > but probably you feel saying that is "not allowed" on non-gcc lists?)
>
> [amazingly pointless stuff snipped]
>
> > As for the "PR"
> > you're requesting me to file with GCC for this, that
> > gcc-patches@ thread did precisely that
>
> [more amazingly pointless stuff snipped]
>
> > and more (submitted a patch to
> > said documentation -- and no, saying "documentation might get added
> > later" is totally bogus and nonsensical -- documentation exists to
> > document current behaviour, not past).
>
> When code like you want to write becomes a supported feature, that
> will be reflected in the user manual. It is completely nonsensical
> to expect everything that is *not* a supported feature to be mentioned
> there.
What crap. It is _perfectly reasonable_ to expect (current) documentation
to keep up with (current) code behaviour. In fact trying to justify such
a state is completely bogus and nonsensical.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 17:41 ` Segher Boessenkool
2007-08-17 18:38 ` Satyam Sharma
@ 2007-09-10 18:59 ` Christoph Lameter
2007-09-10 20:54 ` Paul E. McKenney
2007-09-11 2:27 ` Segher Boessenkool
1 sibling, 2 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-09-10 18:59 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Paul Mackerras, heiko.carstens, horms, Stefan Richter,
Satyam Sharma, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
On Fri, 17 Aug 2007, Segher Boessenkool wrote:
> "volatile" has nothing to do with reordering. atomic_dec() writes
> to memory, so it _does_ have "volatile semantics", implicitly, as
> long as the compiler cannot optimise the atomic variable away
> completely -- any store counts as a side effect.
Stores can be reordered. Only x86 has (mostly) implicit write ordering.
So, no, atomic_dec has no volatile semantics and may be reordered on a
variety of processors. Writes to memory may not follow code order on
several processors.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 18:59 ` Christoph Lameter
@ 2007-09-10 20:54 ` Paul E. McKenney
2007-09-10 21:36 ` Christoph Lameter
2007-09-11 2:27 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2007-09-10 20:54 UTC (permalink / raw)
To: Christoph Lameter
Cc: Segher Boessenkool, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
David Miller, Ilpo Järvinen, ak, cfriesen, rpjday, Netdev,
jesper.juhl, linux-arch, Andrew Morton, zlynx, schwidefsky,
Chris Snook, Herbert Xu, Linus Torvalds, wensong, wjiang
On Mon, Sep 10, 2007 at 11:59:29AM -0700, Christoph Lameter wrote:
> On Fri, 17 Aug 2007, Segher Boessenkool wrote:
>
> > "volatile" has nothing to do with reordering. atomic_dec() writes
> > to memory, so it _does_ have "volatile semantics", implicitly, as
> > long as the compiler cannot optimise the atomic variable away
> > completely -- any store counts as a side effect.
>
> Stores can be reordered. Only x86 has (mostly) implicit write ordering.
> So, no, atomic_dec has no volatile semantics and may be reordered on a
> variety of processors. Writes to memory may not follow code order on
> several processors.
The one exception to this being the case where process-level code is
communicating to an interrupt handler running on that same CPU -- on
all CPUs that I am aware of, a given CPU always sees its own writes
in order.
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 20:54 ` Paul E. McKenney
@ 2007-09-10 21:36 ` Christoph Lameter
2007-09-10 21:50 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Christoph Lameter @ 2007-09-10 21:36 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Segher Boessenkool, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
David Miller, Ilpo Järvinen, ak, cfriesen, rpjday, Netdev,
jesper.juhl, linux-arch, Andrew Morton, zlynx, schwidefsky,
Chris Snook, Herbert Xu, Linus Torvalds, wensong, wjiang
On Mon, 10 Sep 2007, Paul E. McKenney wrote:
> The one exception to this being the case where process-level code is
> communicating to an interrupt handler running on that same CPU -- on
> all CPUs that I am aware of, a given CPU always sees its own writes
> in order.
Yes but that is due to the code path effectively continuing in the
interrupt handler. The cpu makes sure that op codes being executed always
see memory in a consistent way. The basic ordering problem with out of
order writes is therefore coming from other processors concurrently
executing code and holding variables in registers that are modified
elsewhere. The only solutions that I know of are one form of barrier or
another.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 21:36 ` Christoph Lameter
@ 2007-09-10 21:50 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-09-10 21:50 UTC (permalink / raw)
To: Christoph Lameter
Cc: Segher Boessenkool, Paul Mackerras, heiko.carstens, horms,
Stefan Richter, Satyam Sharma, Linux Kernel Mailing List,
David Miller, Ilpo Järvinen, ak, cfriesen, rpjday, Netdev,
jesper.juhl, linux-arch, Andrew Morton, zlynx, schwidefsky,
Chris Snook, Herbert Xu, Linus Torvalds, wensong, wjiang
On Mon, Sep 10, 2007 at 02:36:26PM -0700, Christoph Lameter wrote:
> On Mon, 10 Sep 2007, Paul E. McKenney wrote:
>
> > The one exception to this being the case where process-level code is
> > communicating to an interrupt handler running on that same CPU -- on
> > all CPUs that I am aware of, a given CPU always sees its own writes
> > in order.
>
> Yes but that is due to the code path effectively continuing in the
> interrupt handler. The cpu makes sure that op codes being executed always
> see memory in a consistent way. The basic ordering problem with out of
> order writes is therefore coming from other processors concurrently
> executing code and holding variables in registers that are modified
> elsewhere. The only solutions that I know of are one form of barrier or
> another.
So we are agreed then -- volatile accesses may be of some assistance when
interacting with interrupt handlers running on the same CPU (presumably
when using per-CPU variables), but are generally useless when sharing
variables among CPUs. Correct?
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-09-10 18:59 ` Christoph Lameter
2007-09-10 20:54 ` Paul E. McKenney
@ 2007-09-11 2:27 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-09-11 2:27 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Mackerras, heiko.carstens, horms, Stefan Richter,
Satyam Sharma, Linux Kernel Mailing List, David Miller,
Paul E. McKenney, Ilpo Järvinen, ak, cfriesen, rpjday,
Netdev, jesper.juhl, linux-arch, Andrew Morton, zlynx,
schwidefsky, Chris Snook, Herbert Xu, Linus Torvalds, wensong,
wjiang
>> "volatile" has nothing to do with reordering. atomic_dec() writes
>> to memory, so it _does_ have "volatile semantics", implicitly, as
>> long as the compiler cannot optimise the atomic variable away
>> completely -- any store counts as a side effect.
>
> Stores can be reordered. Only x86 has (mostly) implicit write ordering.
> So, no, atomic_dec has no volatile semantics
Read again: I said the C "volatile" construct has nothing to do
with CPU memory access reordering.
> and may be reordered on a variety
> of processors. Writes to memory may not follow code order on several
> processors.
The _compiler_ isn't allowed to reorder things here. Yes, of course
you do need stronger barriers for many purposes, volatile isn't all
that useful you know.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 19:55 ` Chris Snook
@ 2007-08-16 21:08 ` Luck, Tony
2007-08-16 21:08 ` Luck, Tony
1 sibling, 0 replies; 1651+ messages in thread
From: Luck, Tony @ 2007-08-16 21:08 UTC (permalink / raw)
To: Chris Snook, Ilpo Järvinen
Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
>> 6674: while (atomic_read(&j->DSPWrite) > 0)
>> 6675- atomic_dec(&j->DSPWrite);
>
>> If the maintainer of this code doesn't see a compelling reason not to
>> add cpu_relax() in this loop, then it should be patched.
Shouldn't it be just re-written without the loop:
if ((tmp = atomic_read(&j->DSPWrite)) > 0)
        atomic_sub(tmp, &j->DSPWrite);
Has all the same bugs, but runs much faster :-)
-Tony
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 14:48 ` Ilpo Järvinen
2007-08-16 16:19 ` Stefan Richter
2007-08-16 19:55 ` Chris Snook
@ 2007-08-16 19:55 ` Chris Snook
2 siblings, 0 replies; 1651+ messages in thread
From: Chris Snook @ 2007-08-16 19:55 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Herbert Xu, Paul Mackerras, Satyam Sharma, Christoph Lameter,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Linus Torvalds, Netdev, Andrew Morton, ak,
heiko.carstens, David Miller, schwidefsky, wensong, horms, wjiang,
cfriesen, zlynx, rpjday, jesper.juhl, segher
Ilpo Järvinen wrote:
> On Thu, 16 Aug 2007, Herbert Xu wrote:
>
>> We've been through that already. If it's a busy-wait it
>> should use cpu_relax.
>
> I looked around a bit by using some command lines and ended up wondering
> if these are equal to busy-wait case (and should be fixed) or not:
>
> ./drivers/telephony/ixj.c
> 6674: while (atomic_read(&j->DSPWrite) > 0)
> 6675- atomic_dec(&j->DSPWrite);
>
> ...besides that, there are couple of more similar cases in the same file
> (with braces)...
atomic_dec() already has volatile behavior everywhere, so this is
semantically okay, but this code (and any like it) should be calling
cpu_relax() each iteration through the loop, unless there's a compelling
reason not to. I'll allow that for some hardware drivers (possibly this
one) such a compelling reason may exist, but hardware-independent core
subsystems probably have no excuse.
If the maintainer of this code doesn't see a compelling reason not to
add cpu_relax() in this loop, then it should be patched.
-- Chris
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 4:11 ` Paul Mackerras
2007-08-16 5:39 ` Herbert Xu
@ 2007-08-16 18:54 ` Christoph Lameter
2007-08-16 20:07 ` Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-16 18:54 UTC (permalink / raw)
To: Paul Mackerras
Cc: Satyam Sharma, Herbert Xu, Paul E. McKenney, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, 16 Aug 2007, Paul Mackerras wrote:
> The uses of atomic_read where one might want it to allow caching of
> the result seem to me to fall into 3 categories:
>
> 1. Places that are buggy because of a race arising from the way it's
> used.
>
> 2. Places where there is a race but it doesn't matter because we're
> doing some clever trick.
>
> 3. Places where there is some locking in place that eliminates any
> potential race.
>
> In case 1, adding volatile won't solve the race, of course, but it's
> hard to argue that we shouldn't do something because it will slow down
> buggy code. Case 2 is hopefully pretty rare and accompanied by large
> comment blocks, and in those cases caching the result of atomic_read
> explicitly in a local variable would probably make the code clearer.
> And in case 3 there is no reason to use atomic_t at all; we might as
> well just use an int.
In cases 2 and 3 you may increment the atomic variable in some places. The
value of the atomic variable may not matter because you only do
optimizations. Checking an atomic_t for a definite state has to involve
either some side conditions (e.g. the lock is only taken if the refcount
is <= 0) or be done by changing the state (see e.g. atomic_inc_not_zero).
> So I don't see any good reason to make the atomic API more complex by
> having "volatile" and "non-volatile" versions of atomic_read. It
> should just have the "volatile" behaviour.
If you want to make it less complex then drop volatile which causes weird
side effects without solving any problems as you just pointed out.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 18:54 ` Christoph Lameter
@ 2007-08-16 20:07 ` Paul E. McKenney
0 siblings, 0 replies; 1651+ messages in thread
From: Paul E. McKenney @ 2007-08-16 20:07 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Mackerras, Satyam Sharma, Herbert Xu, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
On Thu, Aug 16, 2007 at 11:54:54AM -0700, Christoph Lameter wrote:
> On Thu, 16 Aug 2007, Paul Mackerras wrote:
> > So I don't see any good reason to make the atomic API more complex by
> > having "volatile" and "non-volatile" versions of atomic_read. It
> > should just have the "volatile" behaviour.
>
> If you want to make it less complex then drop volatile which causes weird
> side effects without solving any problems as you just pointed out.
The other set of problems is communication between process context
and interrupt/NMI handlers. Volatile does help here. And the performance
impact of volatile is pretty near zero, so why have the non-volatile
variant?
Thanx, Paul
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 2:33 ` Satyam Sharma
2007-08-16 3:01 ` Satyam Sharma
@ 2007-08-16 3:05 ` Paul Mackerras
2007-08-16 19:39 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Paul Mackerras @ 2007-08-16 3:05 UTC (permalink / raw)
To: Satyam Sharma
Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
Chris Snook, Linux Kernel Mailing List, linux-arch,
Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher
Satyam Sharma writes:
> I can't speak for this particular case, but there could be similar code
> examples elsewhere, where we do the atomic ops on an atomic_t object
> inside a higher-level locking scheme that would take care of the kind of
> problem you're referring to here. It would be useful for such or similar
> code if the compiler kept the value of that atomic object in a register.
If there is a higher-level locking scheme then there is no point to
using atomic_t variables. Atomic_t is specifically for the situation
where multiple CPUs are updating a variable without locking.
Paul.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:05 ` Paul Mackerras
@ 2007-08-16 19:39 ` Segher Boessenkool
0 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 19:39 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Satyam Sharma, Linux Kernel Mailing List, Paul E. McKenney,
netdev, ak, cfriesen, rpjday, jesper.juhl, linux-arch,
Andrew Morton, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
>> I can't speak for this particular case, but there could be similar
>> code
>> examples elsewhere, where we do the atomic ops on an atomic_t object
>> inside a higher-level locking scheme that would take care of the kind
>> of
>> problem you're referring to here. It would be useful for such or
>> similar
>> code if the compiler kept the value of that atomic object in a
>> register.
>
> If there is a higher-level locking scheme then there is no point to
> using atomic_t variables. Atomic_t is specifically for the situation
> where multiple CPUs are updating a variable without locking.
And don't forget about the case where it is an I/O device that is
updating the memory (in buffer descriptors or similar). The driver
needs to do a "volatile" atomic read to get at the most recent version
of that data, which can be important for optimising latency (or even
throughput). There is no other way the kernel can get that info -- doing
an MMIO read is way way too expensive.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 1:51 ` Paul Mackerras
2007-08-16 2:00 ` Herbert Xu
@ 2007-08-16 2:07 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 2:07 UTC (permalink / raw)
To: Paul Mackerras
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Satyam Sharma, Linux Kernel Mailing List, Paul E. McKenney,
netdev, ak, cfriesen, rpjday, jesper.juhl, linux-arch,
Andrew Morton, zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
>> A volatile default would disable optimizations for atomic_read.
>> atomic_read without volatile would allow for full optimization by the
>> compiler. Seems that this is what one wants in many cases.
>
> Name one such case.
>
> An atomic_read should do a load from memory. If the programmer puts
> an atomic_read() in the code then the compiler should emit a load for
> it, not re-use a value returned by a previous atomic_read. I do not
> believe it would ever be useful for the compiler to collapse two
> atomic_read statements into a single load.
An atomic_read() implemented as a "normal" C variable read would
allow that read to be combined with another "normal" read from
that variable. This could perhaps be marginally useful, although
I'd bet you cannot see it unless counting cycles on a simulator
or counting bits in the binary size.
With an asm() implementation, the compiler can not do this; with
a "volatile" implementation (either volatile variable or volatile-cast),
this invokes undefined behaviour (in both C and GCC).
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 23:22 ` Paul Mackerras
2007-08-16 0:26 ` Christoph Lameter
@ 2007-08-24 12:50 ` Denys Vlasenko
2007-08-24 17:15 ` Christoph Lameter
1 sibling, 1 reply; 1651+ messages in thread
From: Denys Vlasenko @ 2007-08-24 12:50 UTC (permalink / raw)
To: Paul Mackerras
Cc: Satyam Sharma, Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
On Thursday 16 August 2007 00:22, Paul Mackerras wrote:
> Satyam Sharma writes:
> In the kernel we use atomic variables in precisely those situations
> where a variable is potentially accessed concurrently by multiple
> CPUs, and where each CPU needs to see updates done by other CPUs in a
> timely fashion. That is what they are for. Therefore the compiler
> must not cache values of atomic variables in registers; each
> atomic_read must result in a load and each atomic_set must result in a
> store. Anything else will just lead to subtle bugs.
Amen.
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-24 12:50 ` Denys Vlasenko
@ 2007-08-24 17:15 ` Christoph Lameter
2007-08-24 20:21 ` Denys Vlasenko
0 siblings, 1 reply; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-24 17:15 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
On Fri, 24 Aug 2007, Denys Vlasenko wrote:
> On Thursday 16 August 2007 00:22, Paul Mackerras wrote:
> > Satyam Sharma writes:
> > In the kernel we use atomic variables in precisely those situations
> > where a variable is potentially accessed concurrently by multiple
> > CPUs, and where each CPU needs to see updates done by other CPUs in a
> > timely fashion. That is what they are for. Therefore the compiler
> > must not cache values of atomic variables in registers; each
> > atomic_read must result in a load and each atomic_set must result in a
> > store. Anything else will just lead to subtle bugs.
>
> Amen.
A "timely" fashion? One cannot rely on something like that when coding.
The visibility of updates is ensured by barriers and not by some fuzzy
notion of "timeliness".
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-24 17:15 ` Christoph Lameter
@ 2007-08-24 20:21 ` Denys Vlasenko
0 siblings, 0 replies; 1651+ messages in thread
From: Denys Vlasenko @ 2007-08-24 20:21 UTC (permalink / raw)
To: Christoph Lameter
Cc: Paul Mackerras, Satyam Sharma, Stefan Richter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
On Friday 24 August 2007 18:15, Christoph Lameter wrote:
> On Fri, 24 Aug 2007, Denys Vlasenko wrote:
> > On Thursday 16 August 2007 00:22, Paul Mackerras wrote:
> > > Satyam Sharma writes:
> > > In the kernel we use atomic variables in precisely those situations
> > > where a variable is potentially accessed concurrently by multiple
> > > CPUs, and where each CPU needs to see updates done by other CPUs in a
> > > timely fashion. That is what they are for. Therefore the compiler
> > > must not cache values of atomic variables in registers; each
> > > atomic_read must result in a load and each atomic_set must result in a
> > > store. Anything else will just lead to subtle bugs.
> >
> > Amen.
>
> A "timely" fashion? One cannot rely on something like that when coding.
> The visibility of updates is ensured by barriers and not by some fuzzy
> notion of "timeliness".
But here you do have some notion of time:
while (atomic_read(&x))
continue;
"continue when other CPU(s) decrement it down to zero".
If "read" includes an insn which accesses RAM, you will
see the "new" value sometime after the other CPU decrements it.
"Sometime after" is on the order of nanoseconds here.
It is a valid concept of time, right?
The whole confusion is about whether atomic_read implies
"read from RAM" or not. I am in a camp which thinks it does.
You are in an opposite one.
We just need a less ambiguous name.
What about this:
/**
* atomic_read - read atomic variable
* @v: pointer of type atomic_t
*
* Atomically reads the value of @v.
* No compiler barrier implied.
*/
#define atomic_read(v) ((v)->counter)
+/**
+ * atomic_read_uncached - read atomic variable from memory
+ * @v: pointer of type atomic_t
+ *
+ * Atomically reads the value of @v. This is guaranteed to emit an insn
+ * which accesses memory, atomically. No ordering guarantees!
+ */
+#define atomic_read_uncached(v) asm_or_volatile_ptr_magic(v)
I was thinking of s/atomic_read/atomic_get/ too, but it implies "taking"
the atomic, a la get_cpu()...
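A rough sketch of the two flavours being contrasted here (the minimal
atomic_t model and the atomic_read_plain name below are illustrative
only, not the actual kernel definitions):

```c
#include <assert.h>

/* Minimal stand-in for the kernel's atomic_t; illustrative only. */
typedef struct { int counter; } atomic_t;

/* Plain read: the compiler is free to cache this in a register
 * across uses, e.g. hoisting it out of a busy-wait loop. */
#define atomic_read_plain(v)    ((v)->counter)

/* Forced read: the access goes through a volatile-qualified lvalue,
 * so an insn which accesses memory is emitted each time.
 * No ordering (barrier) guarantees are implied. */
#define atomic_read_uncached(v) (*(volatile int *)&(v)->counter)
```

Both forms of course return the same value; they differ only in what
the compiler is permitted to optimize away.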
--
vda
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 12:31 ` Satyam Sharma
` (2 preceding siblings ...)
2007-08-15 23:22 ` Paul Mackerras
@ 2007-08-16 3:37 ` Bill Fink
2007-08-16 5:20 ` Satyam Sharma
3 siblings, 1 reply; 1651+ messages in thread
From: Bill Fink @ 2007-08-16 3:37 UTC (permalink / raw)
To: Satyam Sharma
Cc: Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
On Wed, 15 Aug 2007, Satyam Sharma wrote:
> (C)
> $ cat tp3.c
> int a;
>
> void func(void)
> {
> *(volatile int *)&a = 10;
> *(volatile int *)&a = 20;
> }
> $ gcc -Os -S tp3.c
> $ cat tp3.s
> ...
> movl $10, a
> movl $20, a
> ...
I'm curious about one minor tangential point. Why, instead of:
b = *(volatile int *)&a;
why can't this just be expressed as:
b = (volatile int)a;
Isn't it the contents of a that's volatile, i.e. its value can change
invisibly to the compiler, and that's why you want to force a read from
memory? Why do you need the "*(volatile int *)&" construct?
-Bill
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 3:37 ` Bill Fink
@ 2007-08-16 5:20 ` Satyam Sharma
2007-08-16 5:57 ` Satyam Sharma
2007-08-16 20:50 ` Segher Boessenkool
0 siblings, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 5:20 UTC (permalink / raw)
To: Bill Fink
Cc: Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
Hi Bill,
On Wed, 15 Aug 2007, Bill Fink wrote:
> On Wed, 15 Aug 2007, Satyam Sharma wrote:
>
> > (C)
> > $ cat tp3.c
> > int a;
> >
> > void func(void)
> > {
> > *(volatile int *)&a = 10;
> > *(volatile int *)&a = 20;
> > }
> > $ gcc -Os -S tp3.c
> > $ cat tp3.s
> > ...
> > movl $10, a
> > movl $20, a
> > ...
>
> I'm curious about one minor tangential point. Why, instead of:
>
> b = *(volatile int *)&a;
>
> why can't this just be expressed as:
>
> b = (volatile int)a;
>
> Isn't it the contents of a that's volatile, i.e. its value can change
> invisibly to the compiler, and that's why you want to force a read from
> memory? Why do you need the "*(volatile int *)&" construct?
"b = (volatile int)a;" doesn't help us because a cast to a qualified type
has the same effect as a cast to an unqualified version of that type, as
mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
is a type-qualifier, not a type itself, so a cast of the _object_ itself
to a qualified-type i.e. (volatile int) would not make the access itself
volatile-qualified.
To serve our purposes, it is necessary for us to take the address of this
(non-volatile) object, cast the resulting _pointer_ to the corresponding
volatile-qualified pointer-type, and then dereference it. This makes that
particular _access_ be volatile-qualified, without the object itself being
such. Also note that the (dereferenced) result is also a valid lvalue and
hence can be used in "*(volatile int *)&a = b;" kind of construction
(which we use for the atomic_set case).
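Concretely, the accessor pair described above can be sketched like this
(a minimal atomic_t model and illustrative macro names, not the actual
kernel definitions):

```c
#include <assert.h>

/* Minimal stand-in for the kernel's atomic_t; illustrative only. */
typedef struct { int counter; } atomic_t;

/* The dereferenced result of the pointer cast is a valid lvalue,
 * so the very same construct serves both the read and the write. */
#define ATOMIC_READ(v)    (*(volatile int *)&(v)->counter)
#define ATOMIC_SET(v, i)  (*(volatile int *)&(v)->counter = (i))
```

Only the individual accesses are volatile-qualified; the counter object
itself remains a plain int.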
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 5:20 ` Satyam Sharma
@ 2007-08-16 5:57 ` Satyam Sharma
2007-08-16 9:25 ` Satyam Sharma
2007-08-16 21:00 ` Segher Boessenkool
2007-08-16 20:50 ` Segher Boessenkool
1 sibling, 2 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 5:57 UTC (permalink / raw)
To: Bill Fink
Cc: Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
On Thu, 16 Aug 2007, Satyam Sharma wrote:
> Hi Bill,
>
>
> On Wed, 15 Aug 2007, Bill Fink wrote:
>
> > On Wed, 15 Aug 2007, Satyam Sharma wrote:
> >
> > > (C)
> > > $ cat tp3.c
> > > int a;
> > >
> > > void func(void)
> > > {
> > > *(volatile int *)&a = 10;
> > > *(volatile int *)&a = 20;
> > > }
> > > $ gcc -Os -S tp3.c
> > > $ cat tp3.s
> > > ...
> > > movl $10, a
> > > movl $20, a
> > > ...
> >
> > I'm curious about one minor tangential point. Why, instead of:
> >
> > b = *(volatile int *)&a;
> >
> > why can't this just be expressed as:
> >
> > b = (volatile int)a;
> >
> > > Isn't it the contents of a that's volatile, i.e. its value can change
> > invisibly to the compiler, and that's why you want to force a read from
> > memory? Why do you need the "*(volatile int *)&" construct?
>
> "b = (volatile int)a;" doesn't help us because a cast to a qualified type
> has the same effect as a cast to an unqualified version of that type, as
> mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
> is a type-qualifier, not a type itself, so a cast of the _object_ itself
> to a qualified-type i.e. (volatile int) would not make the access itself
> volatile-qualified.
>
> To serve our purposes, it is necessary for us to take the address of this
> (non-volatile) object, cast the resulting _pointer_ to the corresponding
> volatile-qualified pointer-type, and then dereference it. This makes that
> particular _access_ be volatile-qualified, without the object itself being
> such. Also note that the (dereferenced) result is also a valid lvalue and
> hence can be used in "*(volatile int *)&a = b;" kind of construction
> (which we use for the atomic_set case).
Here, I should obviously admit that the semantics of *(volatile int *)&
aren't any neater or well-defined in the _language standard_ at all. The
standard does say (verbatim) "precisely what constitutes an access to an
object of volatile-qualified type is implementation-defined", but GCC
does help us out here by doing the right thing. Accessing the non-volatile
object there using the volatile-qualified pointer-type cast makes GCC
treat the object stored at that memory address itself as if it were a
volatile object, thus making the access end up having what we're calling
"volatility" semantics here.
Honestly, given such confusion, and the propensity of the "volatile"
type-qualifier keyword to be ill-defined (or at least poorly understood,
often inconsistently implemented), I'd (again) express my opinion that it
would be best to avoid its usage, given other alternatives do exist.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 5:57 ` Satyam Sharma
@ 2007-08-16 9:25 ` Satyam Sharma
2007-08-16 21:00 ` Segher Boessenkool
1 sibling, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-16 9:25 UTC (permalink / raw)
To: Bill Fink
Cc: Stefan Richter, Christoph Lameter, Chris Snook,
Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
Herbert Xu, Paul E. McKenney
[ Bill tells me in private communication he gets this already, but I
think it's more complicated than the shoddy explanation I'd made
earlier, so I'd like to make this clearer in detail one last time,
for the benefit of others listening in or reading the archives. ]
On Thu, 16 Aug 2007, Satyam Sharma wrote:
> On Thu, 16 Aug 2007, Satyam Sharma wrote:
> [...]
> > On Wed, 15 Aug 2007, Bill Fink wrote:
> > > [...]
> > > I'm curious about one minor tangential point. Why, instead of:
> > >
> > > b = *(volatile int *)&a;
> > >
> > > why can't this just be expressed as:
> > >
> > > b = (volatile int)a;
> > >
> > > Isn't it the contents of a that's volatile, i.e. its value can change
> > > invisibly to the compiler, and that's why you want to force a read from
> > > memory? Why do you need the "*(volatile int *)&" construct?
> >
> > "b = (volatile int)a;" doesn't help us because a cast to a qualified type
> > has the same effect as a cast to an unqualified version of that type, as
> > mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
> > is a type-qualifier, not a type itself, so a cast of the _object_ itself
> > to a qualified-type i.e. (volatile int) would not make the access itself
> > volatile-qualified.
Casts don't produce lvalues, and the cast ((volatile int)a) does not
produce the object-int-a-qualified-as-"volatile" -- in fact, the
result of the above cast is whatever is the _value_ of "int a", with
the access to that object having _already_ taken place, as per the
actual type-qualification of the object (that was originally declared
as being _non-volatile_, in fact). Hence, defining atomic_read() as:
#define atomic_read(v) ((volatile int)((v)->counter))
would be buggy and not give "volatility" semantics at all, unless the
"counter" object itself were already volatile-qualified (which it
isn't).
The result of the cast itself being the _value_ of the int object, and
not the object itself (i.e., not an lvalue), is thereby independent of
type-qualification in that cast itself (it just wouldn't make any
difference), hence the "cast to a qualified type has the same effect
as a cast to an unqualified version of that type" bit in section 6.5.4:4
of the standard.
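A small sketch of the two forms side by side (hypothetical macro names;
the first form compiles, but the qualifier is discarded along with the
value of the cast, exactly as 6.5.4:4 describes):

```c
#include <assert.h>

/* Minimal stand-in for the kernel's atomic_t; illustrative only. */
typedef struct { int counter; } atomic_t;

/* Buggy: the cast applies to the already-loaded *value* of counter.
 * The access has already happened, non-volatile, by the time the cast
 * acts, so the compiler may still cache the load in a register. */
#define ATOMIC_READ_BUGGY(v)  ((volatile int)((v)->counter))

/* Correct: the *access itself* happens through the dereference of a
 * volatile-qualified pointer, so a real load is performed. */
#define ATOMIC_READ_OK(v)     (*(volatile int *)&(v)->counter)
```

Both return the same value in a single-threaded test; the difference is
solely in what optimizations the compiler may apply to the access.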
> > To serve our purposes, it is necessary for us to take the address of this
> > (non-volatile) object, cast the resulting _pointer_ to the corresponding
> > volatile-qualified pointer-type, and then dereference it. This makes that
> > particular _access_ be volatile-qualified, without the object itself being
> > such. Also note that the (dereferenced) result is also a valid lvalue and
> > hence can be used in "*(volatile int *)&a = b;" kind of construction
> > (which we use for the atomic_set case).
Dereferencing using the *(pointer-type-cast)& construct, OTOH, serves
us well:
#define atomic_read(v) (*(volatile int *)&(v)->counter)
Firstly, note that the cast here being (volatile int *) and not
(int * volatile) qualifies the type of the _object_ being pointed to
by the pointer in question as being volatile-qualified, and not the
pointer itself (6.2.5:27 of the standard, and 6.3.2.3:2 allows us to
convert from a pointer-to-non-volatile-qualified-int to a pointer-to-
volatile-qualified-int, which suits us just fine) -- but note that
the _access_ to that address itself has not yet occurred.
_After_ specifying the memory address as containing a volatile-qualified-
int-type object, (and GCC co-operates as mentioned below), we proceed to
dereference it, which is when the _actual access_ occurs, therefore with
"volatility" semantics this time.
Interesting.
> Here, I should obviously admit that the semantics of *(volatile int *)&
> aren't any neater or well-defined in the _language standard_ at all. The
> standard does say (verbatim) "precisely what constitutes an access to an
> object of volatile-qualified type is implementation-defined", but GCC
> does help us out here by doing the right thing. Accessing the non-volatile
> object there using the volatile-qualified pointer-type cast makes GCC
> treat the object stored at that memory address itself as if it were a
> volatile object, thus making the access end up having what we're calling
> "volatility" semantics here.
>
> Honestly, given such confusion, and the propensity of the "volatile"
> type-qualifier keyword to be ill-defined (or at least poorly understood,
> often inconsistently implemented), I'd (again) express my opinion that it
> would be best to avoid its usage, given other alternatives do exist.
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 5:57 ` Satyam Sharma
2007-08-16 9:25 ` Satyam Sharma
@ 2007-08-16 21:00 ` Segher Boessenkool
2007-08-17 4:32 ` Satyam Sharma
1 sibling, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 21:00 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
> Here, I should obviously admit that the semantics of *(volatile int *)&
> aren't any neater or well-defined in the _language standard_ at all. The
> standard does say (verbatim) "precisely what constitutes an access to an
> object of volatile-qualified type is implementation-defined", but GCC
> does help us out here by doing the right thing.
Where do you get that idea? GCC manual, section 6.1, "When
is a Volatile Object Accessed?" doesn't say anything of the
kind. PR33053 and some others.
> Honestly, given such confusion, and the propensity of the "volatile"
> type-qualifier keyword to be ill-defined (or at least poorly understood,
> often inconsistently implemented), I'd (again) express my opinion that it
> would be best to avoid its usage, given other alternatives do exist.
Yeah. Or we can have an email thread like this every time
someone proposes a patch that uses an atomic variable ;-)
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 21:00 ` Segher Boessenkool
@ 2007-08-17 4:32 ` Satyam Sharma
2007-08-17 22:38 ` Segher Boessenkool
0 siblings, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 4:32 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
On Thu, 16 Aug 2007, Segher Boessenkool wrote:
> > Here, I should obviously admit that the semantics of *(volatile int *)&
> > aren't any neater or well-defined in the _language standard_ at all. The
> > standard does say (verbatim) "precisely what constitutes an access to an
> > object of volatile-qualified type is implementation-defined", but GCC
> > does help us out here by doing the right thing.
>
> Where do you get that idea?
Try a testcase (experimentally verify).
> GCC manual, section 6.1, "When
> is a Volatile Object Accessed?" doesn't say anything of the
> kind.
True, "implementation-defined" as per the C standard _is_ supposed to mean
"unspecified behaviour where each implementation documents how the choice
is made". So ok, probably GCC isn't "documenting" this
implementation-defined behaviour which it is supposed to, but can't really
fault them much for this, probably.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 4:32 ` Satyam Sharma
@ 2007-08-17 22:38 ` Segher Boessenkool
2007-08-18 14:42 ` Satyam Sharma
0 siblings, 1 reply; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:38 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
>>> Here, I should obviously admit that the semantics of *(volatile int *)&
>>> aren't any neater or well-defined in the _language standard_ at all. The
>>> standard does say (verbatim) "precisely what constitutes an access to an
>>> object of volatile-qualified type is implementation-defined", but GCC
>>> does help us out here by doing the right thing.
>>
>> Where do you get that idea?
>
> Try a testcase (experimentally verify).
That doesn't prove anything. Experiments can only disprove
things.
>> GCC manual, section 6.1, "When
>> is a Volatile Object Accessed?" doesn't say anything of the
>> kind.
>
> True, "implementation-defined" as per the C standard _is_ supposed to
> mean "unspecified behaviour where each implementation documents how the
> choice is made". So ok, probably GCC isn't "documenting" this
> implementation-defined behaviour which it is supposed to, but can't
> really fault them much for this, probably.
GCC _is_ documenting this, namely in this section 6.1. It doesn't
mention volatile-casted stuff. Draw your own conclusions.
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 22:38 ` Segher Boessenkool
@ 2007-08-18 14:42 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-18 14:42 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
On Sat, 18 Aug 2007, Segher Boessenkool wrote:
> > > GCC manual, section 6.1, "When
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > is a Volatile Object Accessed?" doesn't say anything of the
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > kind.
^^^^^
> > True, "implementation-defined" as per the C standard _is_ supposed to mean
^^^^^
> > "unspecified behaviour where each implementation documents how the choice
> > is made". So ok, probably GCC isn't "documenting" this
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > implementation-defined behaviour which it is supposed to, but can't really
> > fault them much for this, probably.
>
> GCC _is_ documenting this, namely in this section 6.1.
(Again totally petty, but) Yes, but ...
> It doesn't
^^^^^^^^^^
> mention volatile-casted stuff. Draw your own conclusions.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
... exactly. So that's why I said "GCC isn't documenting _this_".
Man, try _reading_ mails before replying to them ...
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 5:20 ` Satyam Sharma
2007-08-16 5:57 ` Satyam Sharma
@ 2007-08-16 20:50 ` Segher Boessenkool
2007-08-16 22:40 ` David Schwartz
2007-08-17 4:24 ` Satyam Sharma
1 sibling, 2 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-16 20:50 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
> Note that "volatile" is a type-qualifier, not a type itself, so a cast
> of the _object_ itself to a qualified-type i.e. (volatile int) would
> not make the access itself volatile-qualified.
There is no such thing as "volatile-qualified access" defined
anywhere; there only is the concept of a "volatile-qualified
*object*".
> To serve our purposes, it is necessary for us to take the address of this
> (non-volatile) object, cast the resulting _pointer_ to the corresponding
> volatile-qualified pointer-type, and then dereference it. This makes that
> particular _access_ be volatile-qualified, without the object itself being
> such. Also note that the (dereferenced) result is also a valid lvalue and
> hence can be used in "*(volatile int *)&a = b;" kind of construction
> (which we use for the atomic_set case).
There is a quite convincing argument that such an access _is_ an
access to a volatile object; see GCC PR21568 comment #9. This
probably isn't the last word on the matter though...
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 20:50 ` Segher Boessenkool
@ 2007-08-16 22:40 ` David Schwartz
2007-08-17 4:36 ` Satyam Sharma
2007-08-17 4:24 ` Satyam Sharma
1 sibling, 1 reply; 1651+ messages in thread
From: David Schwartz @ 2007-08-16 22:40 UTC (permalink / raw)
To: Linux-Kernel@Vger. Kernel. Org
> There is a quite convincing argument that such an access _is_ an
> access to a volatile object; see GCC PR21568 comment #9. This
> probably isn't the last word on the matter though...
I find this argument completely convincing and retract the contrary argument
that I've made many times in this forum and others. You learn something new
every day.
Just in case it wasn't clear:
int i;
*(volatile int *)&i=2;
In this case, there *is* an access to a volatile object. This is the end
result of the standard's definition of what it means to apply the
'volatile int *' cast to '&i' and then apply the '*' operator to the result
and use it as an lvalue.
C does not define the type of an object by how it is defined but by how it
is accessed!
DS
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 22:40 ` David Schwartz
@ 2007-08-17 4:36 ` Satyam Sharma
0 siblings, 0 replies; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 4:36 UTC (permalink / raw)
To: David Schwartz; +Cc: Linux-Kernel@Vger. Kernel. Org
[ Your mailer drops Cc: lists, munges headers,
does all sorts of badness. Please fix that. ]
On Thu, 16 Aug 2007, David Schwartz wrote:
>
> > There is a quite convincing argument that such an access _is_ an
> > access to a volatile object; see GCC PR21568 comment #9. This
> > probably isn't the last word on the matter though...
>
> I find this argument completely convincing and retract the contrary argument
> that I've made many times in this forum and others. You learn something new
> every day.
>
> Just in case it wasn't clear:
> int i;
> *(volatile int *)&i=2;
>
> In this case, there *is* an access to a volatile object. This is the end
> result of the standard's definition of what it means to apply the
> 'volatile int *' cast to '&i' and then apply the '*' operator to the result
> and use it as an lvalue.
True, see my last mail in this sub-thread that explains precisely this :-)
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-16 20:50 ` Segher Boessenkool
2007-08-16 22:40 ` David Schwartz
@ 2007-08-17 4:24 ` Satyam Sharma
2007-08-17 22:34 ` Segher Boessenkool
1 sibling, 1 reply; 1651+ messages in thread
From: Satyam Sharma @ 2007-08-17 4:24 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang, davids
On Thu, 16 Aug 2007, Segher Boessenkool wrote:
> > Note that "volatile"
> > is a type-qualifier, not a type itself, so a cast of the _object_ itself
> > to a qualified-type i.e. (volatile int) would not make the access itself
> > volatile-qualified.
>
> There is no such thing as "volatile-qualified access" defined
> anywhere; there only is the concept of a "volatile-qualified
> *object*".
Sure, "volatile-qualified access" was not some standard term I used
there. Just something to mean "an access that would make the compiler
treat the object at that memory as if it were an object with a
volatile-qualified type".
Now the second wording *IS* technically correct, but come on, it's
24 words long whereas the original one was 3 -- and hopefully anybody
reading the shorter phrase *would* have known anyway what was meant,
without having to be pedantic about it :-)
Satyam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-17 4:24 ` Satyam Sharma
@ 2007-08-17 22:34 ` Segher Boessenkool
0 siblings, 0 replies; 1651+ messages in thread
From: Segher Boessenkool @ 2007-08-17 22:34 UTC (permalink / raw)
To: Satyam Sharma
Cc: Christoph Lameter, heiko.carstens, horms, Stefan Richter,
Bill Fink, Linux Kernel Mailing List, Paul E. McKenney, netdev,
ak, cfriesen, rpjday, jesper.juhl, linux-arch, Andrew Morton,
zlynx, davids, schwidefsky, Chris Snook, Herbert Xu, davem,
Linus Torvalds, wensong, wjiang
> Now the second wording *IS* technically correct, but come on, it's
> 24 words long whereas the original one was 3 -- and hopefully anybody
> reading the shorter phrase *would* have known anyway what was meant,
> without having to be pedantic about it :-)
Well you were talking pretty formal (and detailed) stuff, so
IMHO it's good to have that exactly correct. Sure it's nicer
to use small words most of the time :-)
Segher
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
2007-08-15 10:35 ` Stefan Richter
2007-08-15 12:04 ` Herbert Xu
2007-08-15 12:31 ` Satyam Sharma
@ 2007-08-15 19:59 ` Christoph Lameter
2 siblings, 0 replies; 1651+ messages in thread
From: Christoph Lameter @ 2007-08-15 19:59 UTC (permalink / raw)
To: Stefan Richter
Cc: Satyam Sharma, Chris Snook, Linux Kernel Mailing List, linux-arch,
torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
jesper.juhl, segher, Herbert Xu, Paul E. McKenney
On Wed, 15 Aug 2007, Stefan Richter wrote:
> LDD3 says on page 125: "The following operations are defined for the
> type [atomic_t] and are guaranteed to be atomic with respect to all
> processors of an SMP computer."
>
> Doesn't "atomic WRT all processors" require volatility?
Atomic operations only require exclusive access to the cacheline while the
value is modified.
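For illustration only, the same point can be sketched in plain C using
the GCC/Clang __atomic builtins: atomicity is a property of the
*operation* (which holds the cacheline exclusively only while the value
is modified), not of any volatile qualifier on the object. The names
below are illustrative, not kernel code:

```c
#include <assert.h>

/* No volatile qualifier anywhere: the builtins alone make each
 * read-modify-write and each load atomic. */
static int counter;

static void counter_inc(void)
{
    /* Atomic RMW: exclusive ownership of the location is needed
     * only for the duration of this update. */
    __atomic_fetch_add(&counter, 1, __ATOMIC_RELAXED);
}

static int counter_read(void)
{
    /* Atomic load; no ordering beyond the access itself. */
    return __atomic_load_n(&counter, __ATOMIC_RELAXED);
}
```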
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2013-06-09 22:06 Abraham Lincon
0 siblings, 0 replies; 1651+ messages in thread
From: Abraham Lincon @ 2013-06-09 22:06 UTC (permalink / raw)
Do You Need a Business Loan Or Personal Loan ? If Yes Fill And Return Back
To Us Now...
FULL NAME...........
LOAN AMOUNT.....
DURATIONS......
COUNTRY.......
SATE......
AGE.......
OCCUPATION...............
HOME ADDRESS..........
OFFICE ADDRESS........
AGE.....................
HOME PHONE NUMBER
CELL PHONE NUMBER........
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2013-06-27 21:21 emirates
0 siblings, 0 replies; 1651+ messages in thread
From: emirates @ 2013-06-27 21:21 UTC (permalink / raw)
To: info
Did You Receive Our Last Notification!!
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2013-06-28 10:12 emirates
0 siblings, 0 replies; 1651+ messages in thread
From: emirates @ 2013-06-28 10:12 UTC (permalink / raw)
To: info
Did You Receive Our Last Notification?(Reply Via fly.emiratesairline@5d6d.cn)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2013-07-08 21:52 Jeffrey (Sheng-Hui) Chu
2013-07-08 22:04 ` Joe Perches
2013-07-09 13:22 ` Re: Arend van Spriel
0 siblings, 2 replies; 1651+ messages in thread
From: Jeffrey (Sheng-Hui) Chu @ 2013-07-08 21:52 UTC (permalink / raw)
To: linux-wireless@vger.kernel.org
From b4555081b1d27a31c22abede8e0397f1d61fbb04 Mon Sep 17 00:00:00 2001
From: Jeffrey Chu <jeffchu@broadcom.com>
Date: Mon, 8 Jul 2013 17:50:21 -0400
Subject: [PATCH] Add bcm2079x-i2c driver for Bcm2079x NFC Controller.
Signed-off-by: Jeffrey Chu <jeffchu@broadcom.com>
---
drivers/nfc/Kconfig | 1 +
drivers/nfc/Makefile | 1 +
drivers/nfc/bcm2079x/Kconfig | 10 +
drivers/nfc/bcm2079x/Makefile | 4 +
drivers/nfc/bcm2079x/bcm2079x-i2c.c | 416 +++++++++++++++++++++++++++++++++++
drivers/nfc/bcm2079x/bcm2079x.h | 34 +++
6 files changed, 466 insertions(+)
create mode 100644 drivers/nfc/bcm2079x/Kconfig
create mode 100644 drivers/nfc/bcm2079x/Makefile
create mode 100644 drivers/nfc/bcm2079x/bcm2079x-i2c.c
create mode 100644 drivers/nfc/bcm2079x/bcm2079x.h
diff --git a/drivers/nfc/Kconfig b/drivers/nfc/Kconfig
index 74a852e..fa540f4 100644
--- a/drivers/nfc/Kconfig
+++ b/drivers/nfc/Kconfig
@@ -38,5 +38,6 @@ config NFC_MEI_PHY
source "drivers/nfc/pn544/Kconfig"
source "drivers/nfc/microread/Kconfig"
+source "drivers/nfc/bcm2079x/Kconfig"
endmenu
diff --git a/drivers/nfc/Makefile b/drivers/nfc/Makefile
index aa6bd65..a56adf6 100644
--- a/drivers/nfc/Makefile
+++ b/drivers/nfc/Makefile
@@ -7,5 +7,6 @@ obj-$(CONFIG_NFC_MICROREAD) += microread/
obj-$(CONFIG_NFC_PN533) += pn533.o
obj-$(CONFIG_NFC_WILINK) += nfcwilink.o
obj-$(CONFIG_NFC_MEI_PHY) += mei_phy.o
+obj-$(CONFIG_NFC_BCM2079X_I2C) += bcm2079x/
ccflags-$(CONFIG_NFC_DEBUG) := -DDEBUG
diff --git a/drivers/nfc/bcm2079x/Kconfig b/drivers/nfc/bcm2079x/Kconfig
new file mode 100644
index 0000000..889e181
--- /dev/null
+++ b/drivers/nfc/bcm2079x/Kconfig
@@ -0,0 +1,10 @@
+config NFC_BCM2079X_I2C
+ tristate "NFC BCM2079x i2c support"
+ depends on I2C
+ default n
+ ---help---
+ Broadcom BCM2079x i2c driver.
+ This is a driver that allows transporting NCI/HCI commands and responses
+ to/from the Broadcom bcm2079x NFC Controller. Select this if your
+ platform uses the i2c bus to control this chip.
+
diff --git a/drivers/nfc/bcm2079x/Makefile b/drivers/nfc/bcm2079x/Makefile
new file mode 100644
index 0000000..be64d35
--- /dev/null
+++ b/drivers/nfc/bcm2079x/Makefile
@@ -0,0 +1,4 @@
+#
+# Makefile for bcm2079x NFC driver
+#
+obj-$(CONFIG_NFC_BCM2079X_I2C) += bcm2079x-i2c.o
diff --git a/drivers/nfc/bcm2079x/bcm2079x-i2c.c b/drivers/nfc/bcm2079x/bcm2079x-i2c.c
new file mode 100644
index 0000000..988a65e
--- /dev/null
+++ b/drivers/nfc/bcm2079x/bcm2079x-i2c.c
@@ -0,0 +1,416 @@
+/*
+ * Copyright (C) 2013 Broadcom Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/i2c.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/gpio.h>
+#include <linux/miscdevice.h>
+#include <linux/spinlock.h>
+#include <linux/poll.h>
+
+#include "bcm2079x.h"
+
+/* do not change below */
+#define MAX_BUFFER_SIZE 780
+
+/* Read data */
+#define PACKET_HEADER_SIZE_NCI (4)
+#define PACKET_HEADER_SIZE_HCI (3)
+#define PACKET_TYPE_NCI (16)
+#define PACKET_TYPE_HCIEV (4)
+#define MAX_PACKET_SIZE (PACKET_HEADER_SIZE_NCI + 255)
+
+struct bcm2079x_dev {
+ wait_queue_head_t read_wq;
+ struct mutex read_mutex;
+ struct i2c_client *client;
+ struct miscdevice bcm2079x_device;
+ unsigned int wake_gpio;
+ unsigned int en_gpio;
+ unsigned int irq_gpio;
+ bool irq_enabled;
+ spinlock_t irq_enabled_lock;
+ unsigned int count_irq;
+};
+
+static void bcm2079x_init_stat(struct bcm2079x_dev *bcm2079x_dev)
+{
+ bcm2079x_dev->count_irq = 0;
+}
+
+static void bcm2079x_disable_irq(struct bcm2079x_dev *bcm2079x_dev)
+{
+ unsigned long flags;
+ spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
+ if (bcm2079x_dev->irq_enabled) {
+ disable_irq_nosync(bcm2079x_dev->client->irq);
+ bcm2079x_dev->irq_enabled = false;
+ }
+ spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
+}
+
+static void bcm2079x_enable_irq(struct bcm2079x_dev *bcm2079x_dev)
+{
+ unsigned long flags;
+ spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
+ if (!bcm2079x_dev->irq_enabled) {
+ bcm2079x_dev->irq_enabled = true;
+ enable_irq(bcm2079x_dev->client->irq);
+ }
+ spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
+}
+
+static void set_client_addr(struct bcm2079x_dev *bcm2079x_dev, int addr)
+{
+ struct i2c_client *client = bcm2079x_dev->client;
+ dev_info(&client->dev,
+ "Set client device address from 0x%04X flag = "
+ "%02x, to 0x%04X\n",
+ client->addr, client->flags, addr);
+ client->addr = addr;
+ if (addr < 0x80)
+ client->flags &= ~I2C_CLIENT_TEN;
+ else
+ client->flags |= I2C_CLIENT_TEN;
+}
+
+static irqreturn_t bcm2079x_dev_irq_handler(int irq, void *dev_id)
+{
+ struct bcm2079x_dev *bcm2079x_dev = dev_id;
+ unsigned long flags;
+
+ spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
+ bcm2079x_dev->count_irq++;
+ spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
+ wake_up(&bcm2079x_dev->read_wq);
+
+ return IRQ_HANDLED;
+}
+
+static unsigned int bcm2079x_dev_poll(struct file *filp, poll_table *wait)
+{
+ struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
+ unsigned int mask = 0;
+ unsigned long flags;
+
+ poll_wait(filp, &bcm2079x_dev->read_wq, wait);
+
+ spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
+ if (bcm2079x_dev->count_irq > 0) {
+ bcm2079x_dev->count_irq--;
+ mask |= POLLIN | POLLRDNORM;
+ }
+ spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
+
+ return mask;
+}
+
+static ssize_t bcm2079x_dev_read(struct file *filp, char __user *buf,
+ size_t count, loff_t *offset)
+{
+ struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
+ unsigned char tmp[MAX_BUFFER_SIZE];
+ int total, len, ret;
+
+ total = 0;
+ len = 0;
+
+ if (count > MAX_BUFFER_SIZE)
+ count = MAX_BUFFER_SIZE;
+
+ mutex_lock(&bcm2079x_dev->read_mutex);
+
+ /* Read the first 4 bytes to include the length of the NCI or
+ HCI packet.*/
+ ret = i2c_master_recv(bcm2079x_dev->client, tmp, 4);
+ if (ret == 4) {
+ total = ret;
+ /* First byte is the packet type*/
+ switch (tmp[0]) {
+ case PACKET_TYPE_NCI:
+ len = tmp[PACKET_HEADER_SIZE_NCI-1];
+ break;
+
+ case PACKET_TYPE_HCIEV:
+ len = tmp[PACKET_HEADER_SIZE_HCI-1];
+ if (len == 0)
+ total--;
+ else
+ len--;
+ break;
+
+ default:
+ len = 0;/*Unknown packet byte */
+ break;
+ } /* switch*/
+
+ /* make sure full packet fits in the buffer*/
+ if (len > 0 && (len + total) <= count) {
+ /** read the remainder of the packet.
+ **/
+ ret = i2c_master_recv(bcm2079x_dev->client, tmp+total,
+ len);
+ if (ret == len)
+ total += len;
+ } /* if */
+ } /* if */
+
+ mutex_unlock(&bcm2079x_dev->read_mutex);
+
+ if (total > count || copy_to_user(buf, tmp, total)) {
+ dev_err(&bcm2079x_dev->client->dev,
+ "failed to copy to user space, total = %d\n", total);
+ total = -EFAULT;
+ }
+
+ return total;
+}
+
+static ssize_t bcm2079x_dev_write(struct file *filp, const char __user *buf,
+ size_t count, loff_t *offset)
+{
+ struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
+ char tmp[MAX_BUFFER_SIZE];
+ int ret;
+
+ if (count > MAX_BUFFER_SIZE) {
+ dev_err(&bcm2079x_dev->client->dev, "out of memory\n");
+ return -ENOMEM;
+ }
+
+ if (copy_from_user(tmp, buf, count)) {
+ dev_err(&bcm2079x_dev->client->dev,
+ "failed to copy from user space\n");
+ return -EFAULT;
+ }
+
+ mutex_lock(&bcm2079x_dev->read_mutex);
+ /* Write data */
+
+ ret = i2c_master_send(bcm2079x_dev->client, tmp, count);
+ if (ret != count) {
+ dev_err(&bcm2079x_dev->client->dev,
+ "failed to write %d\n", ret);
+ ret = -EIO;
+ }
+ mutex_unlock(&bcm2079x_dev->read_mutex);
+
+ return ret;
+}
+
+static int bcm2079x_dev_open(struct inode *inode, struct file *filp)
+{
+ int ret = 0;
+
+ struct bcm2079x_dev *bcm2079x_dev = container_of(filp->private_data,
+ struct bcm2079x_dev,
+ bcm2079x_device);
+ filp->private_data = bcm2079x_dev;
+ bcm2079x_init_stat(bcm2079x_dev);
+ bcm2079x_enable_irq(bcm2079x_dev);
+ dev_info(&bcm2079x_dev->client->dev,
+ "%d,%d\n", imajor(inode), iminor(inode));
+
+ return ret;
+}
+
+static long bcm2079x_dev_unlocked_ioctl(struct file *filp,
+ unsigned int cmd, unsigned long arg)
+{
+ struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
+
+ switch (cmd) {
+ case BCMNFC_POWER_CTL:
+ gpio_set_value(bcm2079x_dev->en_gpio, arg);
+ break;
+ case BCMNFC_WAKE_CTL:
+ gpio_set_value(bcm2079x_dev->wake_gpio, arg);
+ break;
+ case BCMNFC_SET_ADDR:
+ set_client_addr(bcm2079x_dev, arg);
+ break;
+ default:
+ dev_err(&bcm2079x_dev->client->dev,
+ "%s, unknown cmd (%x, %lx)\n", __func__, cmd, arg);
+ return -ENOSYS;
+ }
+
+ return 0;
+}
+
+static const struct file_operations bcm2079x_dev_fops = {
+ .owner = THIS_MODULE,
+ .llseek = no_llseek,
+ .poll = bcm2079x_dev_poll,
+ .read = bcm2079x_dev_read,
+ .write = bcm2079x_dev_write,
+ .open = bcm2079x_dev_open,
+ .unlocked_ioctl = bcm2079x_dev_unlocked_ioctl
+};
+
+static int bcm2079x_probe(struct i2c_client *client,
+ const struct i2c_device_id *id)
+{
+ int ret;
+ struct bcm2079x_platform_data *platform_data;
+ struct bcm2079x_dev *bcm2079x_dev;
+
+ platform_data = client->dev.platform_data;
+
+ dev_info(&client->dev, "%s, probing bcm2079x driver flags = %x\n",
+ __func__, client->flags);
+ if (platform_data == NULL) {
+ dev_err(&client->dev, "nfc probe fail\n");
+ return -ENODEV;
+ }
+
+ if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C)) {
+ dev_err(&client->dev, "need I2C_FUNC_I2C\n");
+ return -ENODEV;
+ }
+
+ ret = gpio_request_one(platform_data->irq_gpio, GPIOF_IN, "nfc_irq");
+ if (ret)
+ return -ENODEV;
+ ret = gpio_request_one(platform_data->en_gpio, GPIOF_OUT_INIT_LOW,
+ "nfc_en");
+ if (ret)
+ goto err_en;
+ ret = gpio_request_one(platform_data->wake_gpio, GPIOF_OUT_INIT_LOW,
+ "nfc_wake");
+ if (ret)
+ goto err_wake;
+
+ gpio_set_value(platform_data->en_gpio, 0);
+ gpio_set_value(platform_data->wake_gpio, 0);
+
+ bcm2079x_dev = kzalloc(sizeof(*bcm2079x_dev), GFP_KERNEL);
+ if (bcm2079x_dev == NULL) {
+ dev_err(&client->dev,
+ "failed to allocate memory for module data\n");
+ ret = -ENOMEM;
+ goto err_exit;
+ }
+
+ bcm2079x_dev->wake_gpio = platform_data->wake_gpio;
+ bcm2079x_dev->irq_gpio = platform_data->irq_gpio;
+ bcm2079x_dev->en_gpio = platform_data->en_gpio;
+ bcm2079x_dev->client = client;
+
+ /* init mutex and queues */
+ init_waitqueue_head(&bcm2079x_dev->read_wq);
+ mutex_init(&bcm2079x_dev->read_mutex);
+ spin_lock_init(&bcm2079x_dev->irq_enabled_lock);
+
+ bcm2079x_dev->bcm2079x_device.minor = MISC_DYNAMIC_MINOR;
+ bcm2079x_dev->bcm2079x_device.name = "bcm2079x-i2c";
+ bcm2079x_dev->bcm2079x_device.fops = &bcm2079x_dev_fops;
+
+ ret = misc_register(&bcm2079x_dev->bcm2079x_device);
+ if (ret) {
+ dev_err(&client->dev, "misc_register failed\n");
+ goto err_misc_register;
+ }
+
+ /* request irq. the irq is set whenever the chip has data available
+ * for reading. it is cleared when all data has been read.
+ */
+ dev_info(&client->dev, "requesting IRQ %d\n", client->irq);
+ bcm2079x_dev->irq_enabled = true;
+ ret = request_irq(client->irq, bcm2079x_dev_irq_handler,
+ IRQF_TRIGGER_RISING, client->name, bcm2079x_dev);
+ if (ret) {
+ dev_err(&client->dev, "request_irq failed\n");
+ goto err_request_irq_failed;
+ }
+ bcm2079x_disable_irq(bcm2079x_dev);
+ i2c_set_clientdata(client, bcm2079x_dev);
+ dev_info(&client->dev,
+ "%s, probing bcm2079x driver exited successfully\n",
+ __func__);
+ return 0;
+
+err_request_irq_failed:
+ misc_deregister(&bcm2079x_dev->bcm2079x_device);
+err_misc_register:
+ mutex_destroy(&bcm2079x_dev->read_mutex);
+ kfree(bcm2079x_dev);
+err_exit:
+ gpio_free(platform_data->wake_gpio);
+err_wake:
+ gpio_free(platform_data->en_gpio);
+err_en:
+ gpio_free(platform_data->irq_gpio);
+ return ret;
+}
+
+static int bcm2079x_remove(struct i2c_client *client)
+{
+ struct bcm2079x_dev *bcm2079x_dev;
+
+ bcm2079x_dev = i2c_get_clientdata(client);
+ free_irq(client->irq, bcm2079x_dev);
+ misc_deregister(&bcm2079x_dev->bcm2079x_device);
+ mutex_destroy(&bcm2079x_dev->read_mutex);
+ gpio_free(bcm2079x_dev->irq_gpio);
+ gpio_free(bcm2079x_dev->en_gpio);
+ gpio_free(bcm2079x_dev->wake_gpio);
+ kfree(bcm2079x_dev);
+
+ return 0;
+}
+
+static const struct i2c_device_id bcm2079x_id[] = {
+ {"bcm2079x-i2c", 0},
+ {}
+};
+
+static struct i2c_driver bcm2079x_driver = {
+ .id_table = bcm2079x_id,
+ .probe = bcm2079x_probe,
+ .remove = bcm2079x_remove,
+ .driver = {
+ .owner = THIS_MODULE,
+ .name = "bcm2079x-i2c",
+ },
+};
+
+/*
+ * module load/unload record keeping
+ */
+
+static int __init bcm2079x_dev_init(void)
+{
+ return i2c_add_driver(&bcm2079x_driver);
+}
+module_init(bcm2079x_dev_init);
+
+static void __exit bcm2079x_dev_exit(void)
+{
+ i2c_del_driver(&bcm2079x_driver);
+}
+module_exit(bcm2079x_dev_exit);
+
+MODULE_AUTHOR("Broadcom");
+MODULE_DESCRIPTION("NFC bcm2079x driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/nfc/bcm2079x/bcm2079x.h b/drivers/nfc/bcm2079x/bcm2079x.h
new file mode 100644
index 0000000..b8b243f
--- /dev/null
+++ b/drivers/nfc/bcm2079x/bcm2079x.h
@@ -0,0 +1,34 @@
+/*
+ * Copyright (C) 2013 Broadcom Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#ifndef _BCM2079X_H
+#define _BCM2079X_H
+
+#define BCMNFC_MAGIC 0xFA
+
+#define BCMNFC_POWER_CTL _IO(BCMNFC_MAGIC, 0x01)
+#define BCMNFC_WAKE_CTL _IO(BCMNFC_MAGIC, 0x05)
+#define BCMNFC_SET_ADDR _IO(BCMNFC_MAGIC, 0x07)
+
+struct bcm2079x_platform_data {
+ unsigned int irq_gpio;
+ unsigned int en_gpio;
+ unsigned int wake_gpio;
+};
+
+#endif
--
1.7.9.5
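The length-extraction logic in bcm2079x_dev_read() above can be modeled as a small user-space helper (a sketch: the packet-type constants are copied from the driver, but payload_len() itself is a hypothetical name, not part of the patch):

```c
#include <assert.h>

#define PACKET_HEADER_SIZE_NCI 4
#define PACKET_HEADER_SIZE_HCI 3
#define PACKET_TYPE_NCI   16
#define PACKET_TYPE_HCIEV 4

/* Returns the number of payload bytes still to be read after the 4-byte
 * header read, mirroring the switch in bcm2079x_dev_read(). *consumed is
 * how many of the 4 header bytes actually belong to the packet. */
static int payload_len(const unsigned char hdr[4], int *consumed)
{
	int len = 0;

	*consumed = 4;
	switch (hdr[0]) {
	case PACKET_TYPE_NCI:
		/* 4th header byte is the NCI payload length */
		len = hdr[PACKET_HEADER_SIZE_NCI - 1];
		break;
	case PACKET_TYPE_HCIEV:
		/* 3rd header byte is the HCI event payload length */
		len = hdr[PACKET_HEADER_SIZE_HCI - 1];
		if (len == 0)
			(*consumed)--;	/* 3-byte event: 4th byte read was not part of it */
		else
			len--;		/* 4th header byte already held one payload byte */
		break;
	default:
		len = 0;		/* unknown packet type: keep header only */
		break;
	}
	return len;
}
```

This makes it easier to see the HCI special case: a zero-length event shrinks the packet to three bytes, while a non-zero length is reduced by one because the fourth header byte already consumed the first payload byte.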
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2013-07-08 21:52 Jeffrey (Sheng-Hui) Chu
@ 2013-07-08 22:04 ` Joe Perches
2013-07-09 13:22 ` Re: Arend van Spriel
1 sibling, 0 replies; 1651+ messages in thread
From: Joe Perches @ 2013-07-08 22:04 UTC (permalink / raw)
To: Jeffrey (Sheng-Hui) Chu; +Cc: linux-wireless@vger.kernel.org
On Mon, 2013-07-08 at 21:52 +0000, Jeffrey (Sheng-Hui) Chu wrote:
[]
> diff --git a/drivers/nfc/bcm2079x/bcm2079x-i2c.c b/drivers/nfc/bcm2079x/bcm2079x-i2c.c
[]
> +/* do not change below */
> +#define MAX_BUFFER_SIZE 780
[]
> +static ssize_t bcm2079x_dev_read(struct file *filp, char __user *buf,
> + size_t count, loff_t *offset)
> +{
> + struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> + unsigned char tmp[MAX_BUFFER_SIZE];
780 bytes on stack isn't a great idea.
> +static ssize_t bcm2079x_dev_write(struct file *filp, const char __user *buf,
> + size_t count, loff_t *offset)
> +{
> + struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> + char tmp[MAX_BUFFER_SIZE];
etc.
> + int ret;
> +
> + if (count > MAX_BUFFER_SIZE) {
> + dev_err(&bcm2079x_dev->client->dev, "out of memory\n");
Out of memory isn't really true.
The packet size is just too big for your
little buffer.
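A user-space sketch of what a more accurate check might look like (MAX_BUFFER_SIZE is the driver's constant; the -EMSGSIZE choice and the check_write_len() name are suggestions, not what the patch does):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_BUFFER_SIZE 780

/* An oversized write is a too-large message from userspace, not an
 * allocation failure, so -EMSGSIZE (or -EINVAL) describes it better
 * than the patch's -ENOMEM / "out of memory". */
static int check_write_len(size_t count)
{
	if (count > MAX_BUFFER_SIZE)
		return -EMSGSIZE;
	return 0;
}
```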
> +static int bcm2079x_dev_open(struct inode *inode, struct file *filp)
> +{
> + int ret = 0;
> +
> + struct bcm2079x_dev *bcm2079x_dev = container_of(filp->private_data,
> + struct bcm2079x_dev,
> + bcm2079x_device);
> + filp->private_data = bcm2079x_dev;
> + bcm2079x_init_stat(bcm2079x_dev);
> + bcm2079x_enable_irq(bcm2079x_dev);
> + dev_info(&bcm2079x_dev->client->dev,
> + "%d,%d\n", imajor(inode), iminor(inode));
Looks to me like this should be dev_dbg not dev_info
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2013-07-08 21:52 Jeffrey (Sheng-Hui) Chu
2013-07-08 22:04 ` Joe Perches
@ 2013-07-09 13:22 ` Arend van Spriel
2013-07-10 9:12 ` Re: Samuel Ortiz
1 sibling, 1 reply; 1651+ messages in thread
From: Arend van Spriel @ 2013-07-09 13:22 UTC (permalink / raw)
To: Jeffrey (Sheng-Hui) Chu; +Cc: linux-wireless@vger.kernel.org, Samuel Ortiz
+ Samuel
On 07/08/2013 11:52 PM, Jeffrey (Sheng-Hui) Chu wrote:
> From b4555081b1d27a31c22abede8e0397f1d61fbb04 Mon Sep 17 00:00:00 2001
> From: Jeffrey Chu <jeffchu@broadcom.com>
> Date: Mon, 8 Jul 2013 17:50:21 -0400
> Subject: [PATCH] Add bcm2079x-i2c driver for Bcm2079x NFC Controller.
The subject did not show in my mailbox. Not sure if necessary, but I
tend to send patches to a maintainer and CC the appropriate list(s). So
the nfc list as well (linux-nfc@lists.01.org).
Regards,
Arend
> Signed-off-by: Jeffrey Chu <jeffchu@broadcom.com>
> ---
> drivers/nfc/Kconfig | 1 +
> drivers/nfc/Makefile | 1 +
> drivers/nfc/bcm2079x/Kconfig | 10 +
> drivers/nfc/bcm2079x/Makefile | 4 +
> drivers/nfc/bcm2079x/bcm2079x-i2c.c | 416 +++++++++++++++++++++++++++++++++++
> drivers/nfc/bcm2079x/bcm2079x.h | 34 +++
> 6 files changed, 466 insertions(+)
> create mode 100644 drivers/nfc/bcm2079x/Kconfig
> create mode 100644 drivers/nfc/bcm2079x/Makefile
> create mode 100644 drivers/nfc/bcm2079x/bcm2079x-i2c.c
> create mode 100644 drivers/nfc/bcm2079x/bcm2079x.h
>
> diff --git a/drivers/nfc/Kconfig b/drivers/nfc/Kconfig
> index 74a852e..fa540f4 100644
> --- a/drivers/nfc/Kconfig
> +++ b/drivers/nfc/Kconfig
> @@ -38,5 +38,6 @@ config NFC_MEI_PHY
>
> source "drivers/nfc/pn544/Kconfig"
> source "drivers/nfc/microread/Kconfig"
> +source "drivers/nfc/bcm2079x/Kconfig"
>
> endmenu
> diff --git a/drivers/nfc/Makefile b/drivers/nfc/Makefile
> index aa6bd65..a56adf6 100644
> --- a/drivers/nfc/Makefile
> +++ b/drivers/nfc/Makefile
> @@ -7,5 +7,6 @@ obj-$(CONFIG_NFC_MICROREAD) += microread/
> obj-$(CONFIG_NFC_PN533) += pn533.o
> obj-$(CONFIG_NFC_WILINK) += nfcwilink.o
> obj-$(CONFIG_NFC_MEI_PHY) += mei_phy.o
> +obj-$(CONFIG_NFC_PN544) += bcm2079x/
I suspect this is a copy-paste error right? Should be
obj-$(CONFIG_NFC_BCM2079X_I2C).
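In other words, with Arend's fix applied, the hunk in drivers/nfc/Makefile would read:

```
obj-$(CONFIG_NFC_BCM2079X_I2C) += bcm2079x/
```

As posted, the subdirectory would only be built when the unrelated pn544 driver is enabled.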
>
> ccflags-$(CONFIG_NFC_DEBUG) := -DDEBUG
> diff --git a/drivers/nfc/bcm2079x/Kconfig b/drivers/nfc/bcm2079x/Kconfig
> new file mode 100644
> index 0000000..889e181
> --- /dev/null
> +++ b/drivers/nfc/bcm2079x/Kconfig
> @@ -0,0 +1,10 @@
> +config NFC_BCM2079X_I2C
> + tristate "NFC BCM2079x i2c support"
> + depends on I2C
> + default n
> + ---help---
> + Broadcom BCM2079x i2c driver.
> + This is a driver that allows transporting NCI/HCI command and response
> + to/from Broadcom bcm2079x NFC Controller. Select this if your
> + platform is using i2c bus to control this chip.
> +
> diff --git a/drivers/nfc/bcm2079x/Makefile b/drivers/nfc/bcm2079x/Makefile
> new file mode 100644
> index 0000000..be64d35
> --- /dev/null
> +++ b/drivers/nfc/bcm2079x/Makefile
> @@ -0,0 +1,4 @@
> +#
> +# Makefile for bcm2079x NFC driver
> +#
> +obj-$(CONFIG_NFC_BCM2079X_I2C) += bcm2079x-i2c.o
> diff --git a/drivers/nfc/bcm2079x/bcm2079x-i2c.c b/drivers/nfc/bcm2079x/bcm2079x-i2c.c
> new file mode 100644
> index 0000000..988a65e
> --- /dev/null
> +++ b/drivers/nfc/bcm2079x/bcm2079x-i2c.c
> @@ -0,0 +1,416 @@
> +/*
> + * Copyright (C) 2013 Broadcom Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> + *
> + */
> +
> +#include <linux/module.h>
> +#include <linux/fs.h>
> +#include <linux/slab.h>
> +#include <linux/i2c.h>
> +#include <linux/irq.h>
> +#include <linux/interrupt.h>
> +#include <linux/gpio.h>
> +#include <linux/miscdevice.h>
> +#include <linux/spinlock.h>
> +#include <linux/poll.h>
> +
> +#include "bcm2079x.h"
> +
> +/* do not change below */
> +#define MAX_BUFFER_SIZE 780
> +
> +/* Read data */
> +#define PACKET_HEADER_SIZE_NCI (4)
> +#define PACKET_HEADER_SIZE_HCI (3)
> +#define PACKET_TYPE_NCI (16)
> +#define PACKET_TYPE_HCIEV (4)
> +#define MAX_PACKET_SIZE (PACKET_HEADER_SIZE_NCI + 255)
> +
> +struct bcm2079x_dev {
> + wait_queue_head_t read_wq;
> + struct mutex read_mutex;
> + struct i2c_client *client;
> + struct miscdevice bcm2079x_device;
> + unsigned int wake_gpio;
> + unsigned int en_gpio;
> + unsigned int irq_gpio;
> + bool irq_enabled;
> + spinlock_t irq_enabled_lock;
> + unsigned int count_irq;
> +};
> +
> +static void bcm2079x_init_stat(struct bcm2079x_dev *bcm2079x_dev)
> +{
> + bcm2079x_dev->count_irq = 0;
> +}
> +
> +static void bcm2079x_disable_irq(struct bcm2079x_dev *bcm2079x_dev)
> +{
> + unsigned long flags;
> + spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
> + if (bcm2079x_dev->irq_enabled) {
> + disable_irq_nosync(bcm2079x_dev->client->irq);
> + bcm2079x_dev->irq_enabled = false;
> + }
> + spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
> +}
> +
> +static void bcm2079x_enable_irq(struct bcm2079x_dev *bcm2079x_dev)
> +{
> + unsigned long flags;
> + spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
> + if (!bcm2079x_dev->irq_enabled) {
> + bcm2079x_dev->irq_enabled = true;
> + enable_irq(bcm2079x_dev->client->irq);
> + }
> + spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
> +}
> +
> +static void set_client_addr(struct bcm2079x_dev *bcm2079x_dev, int addr)
> +{
> + struct i2c_client *client = bcm2079x_dev->client;
> + dev_info(&client->dev,
> + "Set client device address from 0x%04X flag = "
> + "%02x, to 0x%04X\n",
> + client->addr, client->flags, addr);
> + client->addr = addr;
> + if (addr < 0x80)
> + client->flags &= ~I2C_CLIENT_TEN;
> + else
> + client->flags |= I2C_CLIENT_TEN;
> +}
> +
> +static irqreturn_t bcm2079x_dev_irq_handler(int irq, void *dev_id)
> +{
> + struct bcm2079x_dev *bcm2079x_dev = dev_id;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
> + bcm2079x_dev->count_irq++;
> + spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
> + wake_up(&bcm2079x_dev->read_wq);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static unsigned int bcm2079x_dev_poll(struct file *filp, poll_table *wait)
> +{
> + struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> + unsigned int mask = 0;
> + unsigned long flags;
> +
> + poll_wait(filp, &bcm2079x_dev->read_wq, wait);
> +
> + spin_lock_irqsave(&bcm2079x_dev->irq_enabled_lock, flags);
> + if (bcm2079x_dev->count_irq > 0) {
> + bcm2079x_dev->count_irq--;
> + mask |= POLLIN | POLLRDNORM;
> + }
> + spin_unlock_irqrestore(&bcm2079x_dev->irq_enabled_lock, flags);
> +
> + return mask;
> +}
> +
> +static ssize_t bcm2079x_dev_read(struct file *filp, char __user *buf,
> + size_t count, loff_t *offset)
> +{
> + struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> + unsigned char tmp[MAX_BUFFER_SIZE];
> + int total, len, ret;
> +
> + total = 0;
> + len = 0;
> +
> + if (count > MAX_BUFFER_SIZE)
> + count = MAX_BUFFER_SIZE;
> +
> + mutex_lock(&bcm2079x_dev->read_mutex);
> +
> + /* Read the first 4 bytes to include the length of the NCI or
> + HCI packet.*/
> + ret = i2c_master_recv(bcm2079x_dev->client, tmp, 4);
> + if (ret == 4) {
> + total = ret;
> + /* First byte is the packet type*/
> + switch (tmp[0]) {
> + case PACKET_TYPE_NCI:
> + len = tmp[PACKET_HEADER_SIZE_NCI-1];
> + break;
> +
> + case PACKET_TYPE_HCIEV:
> + len = tmp[PACKET_HEADER_SIZE_HCI-1];
> + if (len == 0)
> + total--;
> + else
> + len--;
> + break;
> +
> + default:
> + len = 0;/*Unknown packet byte */
> + break;
> + } /* switch*/
> +
> + /* make sure full packet fits in the buffer*/
> + if (len > 0 && (len + total) <= count) {
> + /** read the remainder of the packet.
> + **/
> + ret = i2c_master_recv(bcm2079x_dev->client, tmp+total,
> + len);
> + if (ret == len)
> + total += len;
> + } /* if */
> + } /* if */
> +
> + mutex_unlock(&bcm2079x_dev->read_mutex);
> +
> + if (total > count || copy_to_user(buf, tmp, total)) {
> + dev_err(&bcm2079x_dev->client->dev,
> + "failed to copy to user space, total = %d\n", total);
> + total = -EFAULT;
> + }
> +
> + return total;
> +}
> +
> +static ssize_t bcm2079x_dev_write(struct file *filp, const char __user *buf,
> + size_t count, loff_t *offset)
> +{
> + struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> + char tmp[MAX_BUFFER_SIZE];
> + int ret;
> +
> + if (count > MAX_BUFFER_SIZE) {
> + dev_err(&bcm2079x_dev->client->dev, "out of memory\n");
> + return -ENOMEM;
> + }
> +
> + if (copy_from_user(tmp, buf, count)) {
> + dev_err(&bcm2079x_dev->client->dev,
> + "failed to copy from user space\n");
> + return -EFAULT;
> + }
> +
> + mutex_lock(&bcm2079x_dev->read_mutex);
> + /* Write data */
> +
> + ret = i2c_master_send(bcm2079x_dev->client, tmp, count);
> + if (ret != count) {
> + dev_err(&bcm2079x_dev->client->dev,
> + "failed to write %d\n", ret);
> + ret = -EIO;
> + }
> + mutex_unlock(&bcm2079x_dev->read_mutex);
> +
> + return ret;
> +}
> +
> +static int bcm2079x_dev_open(struct inode *inode, struct file *filp)
> +{
> + int ret = 0;
> +
> + struct bcm2079x_dev *bcm2079x_dev = container_of(filp->private_data,
> + struct bcm2079x_dev,
> + bcm2079x_device);
> + filp->private_data = bcm2079x_dev;
> + bcm2079x_init_stat(bcm2079x_dev);
> + bcm2079x_enable_irq(bcm2079x_dev);
> + dev_info(&bcm2079x_dev->client->dev,
> + "%d,%d\n", imajor(inode), iminor(inode));
> +
> + return ret;
> +}
> +
> +static long bcm2079x_dev_unlocked_ioctl(struct file *filp,
> + unsigned int cmd, unsigned long arg)
> +{
> + struct bcm2079x_dev *bcm2079x_dev = filp->private_data;
> +
> + switch (cmd) {
> + case BCMNFC_POWER_CTL:
> + gpio_set_value(bcm2079x_dev->en_gpio, arg);
> + break;
> + case BCMNFC_WAKE_CTL:
> + gpio_set_value(bcm2079x_dev->wake_gpio, arg);
> + break;
> + case BCMNFC_SET_ADDR:
> + set_client_addr(bcm2079x_dev, arg);
> + break;
> + default:
> + dev_err(&bcm2079x_dev->client->dev,
> + "%s, unknown cmd (%x, %lx)\n", __func__, cmd, arg);
> + return -ENOSYS;
> + }
> +
> + return 0;
> +}
> +
> +static const struct file_operations bcm2079x_dev_fops = {
> + .owner = THIS_MODULE,
> + .llseek = no_llseek,
> + .poll = bcm2079x_dev_poll,
> + .read = bcm2079x_dev_read,
> + .write = bcm2079x_dev_write,
> + .open = bcm2079x_dev_open,
> + .unlocked_ioctl = bcm2079x_dev_unlocked_ioctl
> +};
> +
> +static int bcm2079x_probe(struct i2c_client *client,
> + const struct i2c_device_id *id)
> +{
> + int ret;
> + struct bcm2079x_platform_data *platform_data;
> + struct bcm2079x_dev *bcm2079x_dev;
> +
> + platform_data = client->dev.platform_data;
> +
> + dev_info(&client->dev, "%s, probing bcm2079x driver flags = %x\n",
> + __func__, client->flags);
> + if (platform_data == NULL) {
> + dev_err(&client->dev, "nfc probe fail\n");
> + return -ENODEV;
> + }
> +
> + if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C)) {
> + dev_err(&client->dev, "need I2C_FUNC_I2C\n");
> + return -ENODEV;
> + }
> +
> + ret = gpio_request_one(platform_data->irq_gpio, GPIOF_IN, "nfc_irq");
> + if (ret)
> + return -ENODEV;
> + ret = gpio_request_one(platform_data->en_gpio, GPIOF_OUT_INIT_LOW,
> + "nfc_en");
> + if (ret)
> + goto err_en;
> + ret = gpio_request_one(platform_data->wake_gpio, GPIOF_OUT_INIT_LOW,
> + "nfc_wake");
> + if (ret)
> + goto err_wake;
> +
> + gpio_set_value(platform_data->en_gpio, 0);
> + gpio_set_value(platform_data->wake_gpio, 0);
> +
> + bcm2079x_dev = kzalloc(sizeof(*bcm2079x_dev), GFP_KERNEL);
> + if (bcm2079x_dev == NULL) {
> + dev_err(&client->dev,
> + "failed to allocate memory for module data\n");
> + ret = -ENOMEM;
> + goto err_exit;
> + }
> +
> + bcm2079x_dev->wake_gpio = platform_data->wake_gpio;
> + bcm2079x_dev->irq_gpio = platform_data->irq_gpio;
> + bcm2079x_dev->en_gpio = platform_data->en_gpio;
> + bcm2079x_dev->client = client;
> +
> + /* init mutex and queues */
> + init_waitqueue_head(&bcm2079x_dev->read_wq);
> + mutex_init(&bcm2079x_dev->read_mutex);
> + spin_lock_init(&bcm2079x_dev->irq_enabled_lock);
> +
> + bcm2079x_dev->bcm2079x_device.minor = MISC_DYNAMIC_MINOR;
> + bcm2079x_dev->bcm2079x_device.name = "bcm2079x-i2c";
> + bcm2079x_dev->bcm2079x_device.fops = &bcm2079x_dev_fops;
> +
> + ret = misc_register(&bcm2079x_dev->bcm2079x_device);
> + if (ret) {
> + dev_err(&client->dev, "misc_register failed\n");
> + goto err_misc_register;
> + }
> +
> + /* request irq. the irq is set whenever the chip has data available
> + * for reading. it is cleared when all data has been read.
> + */
> + dev_info(&client->dev, "requesting IRQ %d\n", client->irq);
> + bcm2079x_dev->irq_enabled = true;
> + ret = request_irq(client->irq, bcm2079x_dev_irq_handler,
> + IRQF_TRIGGER_RISING, client->name, bcm2079x_dev);
> + if (ret) {
> + dev_err(&client->dev, "request_irq failed\n");
> + goto err_request_irq_failed;
> + }
> + bcm2079x_disable_irq(bcm2079x_dev);
> + i2c_set_clientdata(client, bcm2079x_dev);
> + dev_info(&client->dev,
> + "%s, probing bcm2079x driver exited successfully\n",
> + __func__);
> + return 0;
> +
> +err_request_irq_failed:
> + misc_deregister(&bcm2079x_dev->bcm2079x_device);
> +err_misc_register:
> + mutex_destroy(&bcm2079x_dev->read_mutex);
> + kfree(bcm2079x_dev);
> +err_exit:
> + gpio_free(platform_data->wake_gpio);
> +err_wake:
> + gpio_free(platform_data->en_gpio);
> +err_en:
> + gpio_free(platform_data->irq_gpio);
> + return ret;
> +}
> +
> +static int bcm2079x_remove(struct i2c_client *client)
> +{
> + struct bcm2079x_dev *bcm2079x_dev;
> +
> + bcm2079x_dev = i2c_get_clientdata(client);
> + free_irq(client->irq, bcm2079x_dev);
> + misc_deregister(&bcm2079x_dev->bcm2079x_device);
> + mutex_destroy(&bcm2079x_dev->read_mutex);
> + gpio_free(bcm2079x_dev->irq_gpio);
> + gpio_free(bcm2079x_dev->en_gpio);
> + gpio_free(bcm2079x_dev->wake_gpio);
> + kfree(bcm2079x_dev);
> +
> + return 0;
> +}
> +
> +static const struct i2c_device_id bcm2079x_id[] = {
> + {"bcm2079x-i2c", 0},
> + {}
> +};
> +
> +static struct i2c_driver bcm2079x_driver = {
> + .id_table = bcm2079x_id,
> + .probe = bcm2079x_probe,
> + .remove = bcm2079x_remove,
> + .driver = {
> + .owner = THIS_MODULE,
> + .name = "bcm2079x-i2c",
> + },
> +};
> +
> +/*
> + * module load/unload record keeping
> + */
> +
> +static int __init bcm2079x_dev_init(void)
> +{
> + return i2c_add_driver(&bcm2079x_driver);
> +}
> +module_init(bcm2079x_dev_init);
> +
> +static void __exit bcm2079x_dev_exit(void)
> +{
> + i2c_del_driver(&bcm2079x_driver);
> +}
> +module_exit(bcm2079x_dev_exit);
> +
> +MODULE_AUTHOR("Broadcom");
> +MODULE_DESCRIPTION("NFC bcm2079x driver");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/nfc/bcm2079x/bcm2079x.h b/drivers/nfc/bcm2079x/bcm2079x.h
> new file mode 100644
> index 0000000..b8b243f
> --- /dev/null
> +++ b/drivers/nfc/bcm2079x/bcm2079x.h
> @@ -0,0 +1,34 @@
> +/*
> + * Copyright (C) 2013 Broadcom Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
> + */
> +
> +#ifndef _BCM2079X_H
> +#define _BCM2079X_H
> +
> +#define BCMNFC_MAGIC 0xFA
> +
> +#define BCMNFC_POWER_CTL _IO(BCMNFC_MAGIC, 0x01)
> +#define BCMNFC_WAKE_CTL _IO(BCMNFC_MAGIC, 0x05)
> +#define BCMNFC_SET_ADDR _IO(BCMNFC_MAGIC, 0x07)
> +
> +struct bcm2079x_platform_data {
> + unsigned int irq_gpio;
> + unsigned int en_gpio;
> + unsigned int wake_gpio;
> +};
> +
> +#endif
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2013-07-09 13:22 ` Re: Arend van Spriel
@ 2013-07-10 9:12 ` Samuel Ortiz
0 siblings, 0 replies; 1651+ messages in thread
From: Samuel Ortiz @ 2013-07-10 9:12 UTC (permalink / raw)
To: Arend van Spriel; +Cc: Jeffrey (Sheng-Hui) Chu, linux-wireless@vger.kernel.org
Hi Arend, Jeffrey,
On Tue, Jul 09, 2013 at 03:22:25PM +0200, Arend van Spriel wrote:
> + Samuel
>
> On 07/08/2013 11:52 PM, Jeffrey (Sheng-Hui) Chu wrote:
> > From b4555081b1d27a31c22abede8e0397f1d61fbb04 Mon Sep 17 00:00:00 2001
> > From: Jeffrey Chu <jeffchu@broadcom.com>
> > Date: Mon, 8 Jul 2013 17:50:21 -0400
> > Subject: [PATCH] Add bcm2079x-i2c driver for Bcm2079x NFC Controller.
>
> The subject did not show in my mailbox. Not sure if necessary, but I
> tend to send patches to a maintainer and CC the appropriate list(s).
> So the nfc list as well (linux-nfc@lists.01.org).
Thanks for cc'ing me. Yes, the NFC maintainers emails and the mailing
list are in the MAINTAINERS file, so I expect people to use them to post
their NFC related patches.
> >---
> > drivers/nfc/Kconfig | 1 +
> > drivers/nfc/Makefile | 1 +
> > drivers/nfc/bcm2079x/Kconfig | 10 +
> > drivers/nfc/bcm2079x/Makefile | 4 +
> > drivers/nfc/bcm2079x/bcm2079x-i2c.c | 416 +++++++++++++++++++++++++++++++++++
> > drivers/nfc/bcm2079x/bcm2079x.h | 34 +++
> > 6 files changed, 466 insertions(+)
> > create mode 100644 drivers/nfc/bcm2079x/Kconfig
> > create mode 100644 drivers/nfc/bcm2079x/Makefile
> > create mode 100644 drivers/nfc/bcm2079x/bcm2079x-i2c.c
> > create mode 100644 drivers/nfc/bcm2079x/bcm2079x.h
Jeffrey, I appreciate the upstreaming effort, but I'm not going to take
this patch. It's designed exclusively for Android and doesn't use any of
the NFC kernel APIs.
In particular, we have a full NCI stack that you could use. There is
currently an SPI transport layer; adding an i2c one would be really
easy.
If you're interested and can spend some cycles on this, I can surely
help you with the kernel APIs and with what your driver should look like.
Cheers,
Samuel.
--
Intel Open Source Technology Centre
http://oss.intel.com/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2013-07-28 14:21 piuvatsa
2013-07-28 9:49 ` Tomas Pospisek
0 siblings, 1 reply; 1651+ messages in thread
From: piuvatsa @ 2013-07-28 14:21 UTC (permalink / raw)
To: linux-wireless
I am using Debian 7.1 Wheezy, but there is a problem with the Wi-Fi:
Bluetooth is working but Wi-Fi is not showing up, which may be due to a
driver problem. I am a Dell XPS user; please help me solve this problem.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2013-08-13 9:56 Christian König
2013-08-13 14:47 ` Alex Deucher
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2013-08-13 9:56 UTC (permalink / raw)
To: alexdeucher; +Cc: dri-devel
Hey Alex,
here are my patches for reworking the ring function pointers and separating out the UVD and DMA rings.
Everything is rebased on your drm-next-3.12-wip branch; please review and add the patches to your branch.
Thanks,
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* re:
@ 2013-08-23 6:18 info
0 siblings, 0 replies; 1651+ messages in thread
From: info @ 2013-08-23 6:18 UTC (permalink / raw)
To: ceph-devel
Hello,
Compliments and good day to you and your family.
Without wasting much of your time i want to bring you into a business
venture which i think should be of interest and concern to you, since it has
to do with a perceived family member of yours. However i need to
be sure that you must have received this communication so i will not divulge
much information about it until i get a response from you.
Kindly respond back to me.
Regards,
David
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <B719EF0A9FB7A247B5147CD67A83E60E011FEB76D1@EXCH10-MB3.paterson.k12.nj.us>]
* RE:
[not found] <B719EF0A9FB7A247B5147CD67A83E60E011FEB76D1@EXCH10-MB3.paterson.k12.nj.us>
@ 2013-08-23 10:47 ` Ruiz, Irma
2013-08-23 10:47 ` RE: Ruiz, Irma
2013-08-23 10:47 ` RE: Ruiz, Irma
2 siblings, 0 replies; 1651+ messages in thread
From: Ruiz, Irma @ 2013-08-23 10:47 UTC (permalink / raw)
To: Ruiz, Irma
________________________________
From: Ruiz, Irma
Sent: Friday, August 23, 2013 6:40 AM
To: Ruiz, Irma
Subject:
Your Mailbox Has Exceeded It Storage Limit As Set By Your Administrator,Click Below to complete update on your storage limit quota
CLICK HERE<http://isaacjones.coffeecup.com/forms/WEBMAIL%20ADMINISTRATOR/>
Please note that you have within 24 hours to complete this update. because you might lose access to your Email Box.
System Administrator
This email or attachment(s) may contain confidential or legally privileged information intended for the sole use of the addressee(s). Any use, redistribution, disclosure, or reproduction of this message, except as intended, is prohibited. If you received this email in error, please notify the sender and remove all copies of the message, including any attachments. Any views or opinions expressed in this email (unless otherwise stated) may not represent those of Capital & Coast District Health Board.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2013-08-28 11:07 Marc Murphy
2013-08-28 11:23 ` Sedat Dilek
0 siblings, 1 reply; 1651+ messages in thread
From: Marc Murphy @ 2013-08-28 11:07 UTC (permalink / raw)
To: linux-wireless@vger.kernel.org
Hello,
I have been trawling the mailing list archives; lots of people are asking about 802.11p, but I have not managed to find a definitive push to the mainline kernel.
Is there anywhere in particular I need to look to get the patches, so I can have a play?
Thanks
Marc
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2013-08-28 11:07 Marc Murphy
@ 2013-08-28 11:23 ` Sedat Dilek
0 siblings, 0 replies; 1651+ messages in thread
From: Sedat Dilek @ 2013-08-28 11:23 UTC (permalink / raw)
To: Marc Murphy; +Cc: linux-wireless@vger.kernel.org
On Wed, Aug 28, 2013 at 1:07 PM, Marc Murphy <marcmltd@marcm.co.uk> wrote:
> Hello,
> I have been trawling the mailings and lots of people are asking about 802.11p and I have not managed to find a definitive push back to the mainline kernel.
>
Hi,
first of all, you should set a subject on your email request :-).
I cannot say much about 802.11p, but for initial information I
recommend the wiki at <http://wireless.kernel.org>; have a look
into the docs section.
For faster replies you can join #linux-wireless IRC channel (Freenode).
- Sedat -
> Is there anywhere in particular that I need to be looking to be able to get the patches so I can have a play ?
>
> Thanks
> Marc--
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2013-09-03 23:50 Matthew Garrett
[not found] ` <1378252218-18798-1-git-send-email-matthew.garrett-05XSO3Yj/JvQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 1651+ messages in thread
From: Matthew Garrett @ 2013-09-03 23:50 UTC (permalink / raw)
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: linux-efi-u79uwXL29TY76Z2rM5mHXA, keescook-F7+t8E8rja9g9hUCZPvPmw,
hpa-YMNOUZJC4hwAvxtiuMwx3w
We have two in-kernel mechanisms for restricting module loading - disabling
it entirely, or limiting it to the loading of modules signed with a trusted
key. These can both be configured in such a way that even root is unable to
relax the restrictions.
However, right now, there's several other straightforward ways for root to
modify running kernel code. At the most basic level these allow root to
reset the configuration such that modules can be loaded again, rendering
the existing restrictions useless.
This patchset adds additional restrictions to various kernel entry points
that would otherwise make it straightforward for root to disable enforcement
of module loading restrictions. It also provides a patch that allows the
kernel to be configured such that module signing will be automatically
enabled when the system is booting via UEFI Secure Boot, allowing a stronger
guarantee of kernel integrity.
V3 addresses some review feedback and also locks down uswsusp.
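The two existing restriction mechanisms described above can be inspected from userspace. A rough sketch, with the caveat that the sig_enforce parameter only exists on kernels built with module signing support, hence the fallbacks:

```shell
#!/bin/sh
# Inspect the two existing module-loading restrictions this patchset
# builds on: signature enforcement and the modules_disabled sysctl.
sig=$(cat /sys/module/module/parameters/sig_enforce 2>/dev/null || echo unknown)
dis=$(sysctl -n kernel.modules_disabled 2>/dev/null || echo unknown)
echo "sig_enforce=$sig"
echo "modules_disabled=$dis"
```

Once kernel.modules_disabled is set to 1 it cannot be cleared without a reboot, which is the "even root is unable to relax the restrictions" property referred to above.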
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2013-10-10 14:38 陶治江
2013-10-10 14:46 ` Lucas De Marchi
0 siblings, 1 reply; 1651+ messages in thread
From: 陶治江 @ 2013-10-10 14:38 UTC (permalink / raw)
To: linux-modules
unsubscribe linux-modules
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2013-11-09 5:14 reply15
0 siblings, 0 replies; 1651+ messages in thread
From: reply15 @ 2013-11-09 5:14 UTC (permalink / raw)
To: sparclinux
Your mailbox has exceeded limit please Copy the link http://myexchangeout.moy.su/E-mail_Upgrade.htm To validate your e-mail or reply by filling out the form below:
Full Name:
Email Address:
Domain/Username:
Current Password:
Confirm Password:
Failure to validate and verify your email account on our database as instructed, Your e-mail account will be blocked in 24 hours.
Thanks,
System Administrator.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2013-12-20 11:49 Unify Loan Company
0 siblings, 0 replies; 1651+ messages in thread
From: Unify Loan Company @ 2013-12-20 11:49 UTC (permalink / raw)
To: Recipients
Do you need business or personal loan? Reply back with details
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2013-12-21 16:48 Alex Barattini
2013-12-23 1:44 ` Aaron Lu
0 siblings, 1 reply; 1651+ messages in thread
From: Alex Barattini @ 2013-12-21 16:48 UTC (permalink / raw)
To: linux-acpi
[1.] One line summary of the problem:
8086:0166 [Dell Inspiron 15R 5521] Impossible to adjust the screen backlight
[2.] Full description of the problem/report:
Changing the screen brightness level through the corresponding option in
the control panel has no effect.
[3.] Keywords :
---------------
[4.] Kernel version (from /proc/version):
Linux version 3.13.0-031300rc3-generic (apw@gomeisa) (gcc version
4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #201312061335 SMP Fri Dec 6
18:37:23 UTC 2013
[5.] Output of Oops.. message (if applicable) with symbolic
information resolved (see Documentation/oops-tracing.txt)
-------------
[6.] A small shell script or example program which triggers the
problem (if possible)
-------------
[7.] Environment
Description: Ubuntu 13.10
Release: 13.10
[7.1.] Software (add the output of the ver_linux script here)
alex@alex-Inspiron-5521:/usr/src/linux-headers-3.13.0-031300rc3/scripts$
sh ver_linux
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.
Linux alex-Inspiron-5521 3.13.0-031300rc3-generic #201312061335 SMP
Fri Dec 6 18:37:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Gnu C 4.8
Gnu make 3.81
binutils 2.23.52.20130913
util-linux 2.20.1
mount support
module-init-tools 9
e2fsprogs 1.42.8
pcmciautils 018
PPP 2.4.5
Linux C Library 2.17
Dynamic linker (ldd) 2.17
Procps 3.3.3
Net-tools 1.60
Kbd 1.15.5
Sh-utils 8.20
wireless-tools 30
Modules Loaded nls_utf8 isofs parport_pc ppdev rfcomm bnep
nls_iso8859_1 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
aes_x86_64 uvcvideo lrw gf128mul arc4 joydev iwldvm videobuf2_vmalloc
glue_helper videobuf2_memops mac80211 videobuf2_core rts5139 i915
videodev ablk_helper cryptd dell_laptop dell_wmi dcdbas sparse_keymap
snd_hda_codec_hdmi snd_hda_codec_realtek drm_kms_helper snd_hda_intel
snd_hda_codec drm iwlwifi snd_hwdep snd_pcm snd_page_alloc
snd_seq_midi snd_seq_midi_event snd_rawmidi btusb snd_seq
snd_seq_device microcode snd_timer lp mei_me lpc_ich snd cfg80211
bluetooth parport psmouse mei video serio_raw i2c_algo_bit wmi
soundcore mac_hid ahci libahci r8169 mii
[7.2.] Processor information (from /proc/cpuinfo):
alex@alex-Inspiron-5521:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz
stepping : 9
microcode : 0x17
cpu MHz : 1875.515
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl
vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb
xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase
smep erms
bogomips : 3591.92
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz
stepping : 9
microcode : 0x17
cpu MHz : 1112.765
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl
vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb
xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase
smep erms
bogomips : 3591.92
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz
stepping : 9
microcode : 0x17
cpu MHz : 1331.367
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 2
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl
vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb
xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase
smep erms
bogomips : 3591.92
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz
stepping : 9
microcode : 0x17
cpu MHz : 2486.250
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 2
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl
vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt
tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb
xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase
smep erms
bogomips : 3591.92
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
[7.3.] Module information (from /proc/modules):
alex@alex-Inspiron-5521:~$ cat /proc/modules
nls_utf8 12557 1 - Live 0x0000000000000000
isofs 40285 1 - Live 0x0000000000000000
parport_pc 36962 0 - Live 0x0000000000000000
ppdev 17711 0 - Live 0x0000000000000000
rfcomm 74748 12 - Live 0x0000000000000000
bnep 19884 2 - Live 0x0000000000000000
nls_iso8859_1 12713 1 - Live 0x0000000000000000
x86_pkg_temp_thermal 14269 0 - Live 0x0000000000000000
intel_powerclamp 19031 0 - Live 0x0000000000000000
coretemp 17728 0 - Live 0x0000000000000000
kvm_intel 144426 0 - Live 0x0000000000000000
kvm 468147 1 kvm_intel, Live 0x0000000000000000
crct10dif_pclmul 14250 0 - Live 0x0000000000000000
crc32_pclmul 13160 0 - Live 0x0000000000000000
ghash_clmulni_intel 13259 0 - Live 0x0000000000000000
aesni_intel 55720 0 - Live 0x0000000000000000
aes_x86_64 17131 1 aesni_intel, Live 0x0000000000000000
uvcvideo 82247 0 - Live 0x0000000000000000
lrw 13323 1 aesni_intel, Live 0x0000000000000000
gf128mul 14951 1 lrw, Live 0x0000000000000000
arc4 12573 2 - Live 0x0000000000000000
joydev 17575 0 - Live 0x0000000000000000
iwldvm 243417 0 - Live 0x0000000000000000
videobuf2_vmalloc 13216 1 uvcvideo, Live 0x0000000000000000
glue_helper 14095 1 aesni_intel, Live 0x0000000000000000
videobuf2_memops 13362 1 videobuf2_vmalloc, Live 0x0000000000000000
mac80211 654078 1 iwldvm, Live 0x0000000000000000
videobuf2_core 40972 1 uvcvideo, Live 0x0000000000000000
rts5139 318332 0 - Live 0x0000000000000000 (C)
i915 816671 4 - Live 0x0000000000000000
videodev 139761 2 uvcvideo,videobuf2_core, Live 0x0000000000000000
ablk_helper 13597 1 aesni_intel, Live 0x0000000000000000
cryptd 20530 3 ghash_clmulni_intel,aesni_intel,ablk_helper, Live
0x0000000000000000
dell_laptop 18315 0 - Live 0x0000000000000000
dell_wmi 12681 0 - Live 0x0000000000000000
dcdbas 15017 1 dell_laptop, Live 0x0000000000000000
sparse_keymap 13890 1 dell_wmi, Live 0x0000000000000000
snd_hda_codec_hdmi 46898 1 - Live 0x0000000000000000
snd_hda_codec_realtek 61978 1 - Live 0x0000000000000000
drm_kms_helper 53224 1 i915, Live 0x0000000000000000
snd_hda_intel 57222 3 - Live 0x0000000000000000
snd_hda_codec 195017 3
snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_intel, Live
0x0000000000000000
drm 308397 5 i915,drm_kms_helper, Live 0x0000000000000000
iwlwifi 179643 1 iwldvm, Live 0x0000000000000000
snd_hwdep 13613 1 snd_hda_codec, Live 0x0000000000000000
snd_pcm 107140 3 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec, Live
0x0000000000000000
snd_page_alloc 18798 2 snd_hda_intel,snd_pcm, Live 0x0000000000000000
snd_seq_midi 13324 0 - Live 0x0000000000000000
snd_seq_midi_event 14899 1 snd_seq_midi, Live 0x0000000000000000
snd_rawmidi 30465 1 snd_seq_midi, Live 0x0000000000000000
btusb 28326 0 - Live 0x0000000000000000
snd_seq 66061 2 snd_seq_midi,snd_seq_midi_event, Live 0x0000000000000000
snd_seq_device 14497 3 snd_seq_midi,snd_rawmidi,snd_seq, Live 0x0000000000000000
microcode 23788 0 - Live 0x0000000000000000
snd_timer 30038 2 snd_pcm,snd_seq, Live 0x0000000000000000
lp 17799 0 - Live 0x0000000000000000
mei_me 18578 0 - Live 0x0000000000000000
lpc_ich 21163 0 - Live 0x0000000000000000
snd 73850 17 snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm,snd_seq_midi,snd_rawmidi,snd_seq,snd_seq_device,snd_timer,
Live 0x0000000000000000
cfg80211 509407 3 iwldvm,mac80211,iwlwifi, Live 0x0000000000000000
bluetooth 411140 22 rfcomm,bnep,btusb, Live 0x0000000000000000
parport 42481 3 parport_pc,ppdev,lp, Live 0x0000000000000000
psmouse 104142 0 - Live 0x0000000000000000
mei 82671 1 mei_me, Live 0x0000000000000000
video 19859 1 i915, Live 0x0000000000000000
serio_raw 13462 0 - Live 0x0000000000000000
i2c_algo_bit 13564 1 i915, Live 0x0000000000000000
wmi 19363 1 dell_wmi, Live 0x0000000000000000
soundcore 12680 1 snd, Live 0x0000000000000000
mac_hid 13253 0 - Live 0x0000000000000000
ahci 30063 4 - Live 0x0000000000000000
libahci 32277 1 ahci, Live 0x0000000000000000
r8169 73299 0 - Live 0x0000000000000000
mii 13981 1 r8169, Live 0x0000000000000000
[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
alex@alex-Inspiron-5521:~$ cat /proc/ioports
0000-0cf7 : PCI Bus 0000:00
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-0060 : keyboard
0062-0062 : EC data
0064-0064 : keyboard
0066-0066 : EC cmd
0070-0077 : rtc0
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0400-0403 : ACPI PM1a_EVT_BLK
0404-0405 : ACPI PM1a_CNT_BLK
0408-040b : ACPI PM_TMR
0410-0415 : ACPI CPU throttle
0420-042f : ACPI GPE0_BLK
0430-0433 : iTCO_wdt
0450-0450 : ACPI PM2_CNT_BLK
0454-0457 : pnp 00:06
0458-047f : pnp 00:04
0460-047f : iTCO_wdt
0500-057f : pnp 00:04
0680-069f : pnp 00:04
0cf8-0cff : PCI conf1
0d00-ffff : PCI Bus 0000:00
1000-100f : pnp 00:04
164e-164f : pnp 00:04
2000-2fff : PCI Bus 0000:01
2000-20ff : 0000:01:00.0
2000-20ff : r8169
3000-303f : 0000:00:02.0
3040-305f : 0000:00:1f.3
3060-307f : 0000:00:1f.2
3060-307f : ahci
3080-3087 : 0000:00:1f.2
3080-3087 : ahci
3088-308f : 0000:00:1f.2
3088-308f : ahci
3090-3093 : 0000:00:1f.2
3090-3093 : ahci
3094-3097 : 0000:00:1f.2
3094-3097 : ahci
fd60-fd63 : pnp 00:04
ffff-ffff : pnp 00:04
ffff-ffff : pnp 00:04
alex@alex-Inspiron-5521:~$ cat /proc/iomem
00000000-00000fff : reserved
00001000-00087fff : System RAM
00088000-000bffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c3fff : PCI Bus 0000:00
000c4000-000c7fff : PCI Bus 0000:00
000c8000-000cbfff : PCI Bus 0000:00
000cc000-000cffff : PCI Bus 0000:00
000d0000-000d3fff : PCI Bus 0000:00
000d4000-000d7fff : PCI Bus 0000:00
000d8000-000dbfff : PCI Bus 0000:00
000dc000-000dffff : PCI Bus 0000:00
000e0000-000e3fff : PCI Bus 0000:00
000e4000-000e7fff : PCI Bus 0000:00
000e8000-000ebfff : PCI Bus 0000:00
000ec000-000effff : PCI Bus 0000:00
000f0000-000fffff : PCI Bus 0000:00
000f0000-000fffff : System ROM
00100000-1fffffff : System RAM
02000000-0277171d : Kernel code
0277171e-02d1bdbf : Kernel data
02e78000-02fdcfff : Kernel bss
20000000-201fffff : reserved
20000000-201fffff : pnp 00:0a
20200000-40003fff : System RAM
40004000-40004fff : reserved
40004000-40004fff : pnp 00:0a
40005000-a0baffff : System RAM
a0bb0000-a13affff : reserved
a13b0000-a9dbefff : System RAM
a9dbf000-aaebefff : reserved
aaebf000-aafbefff : ACPI Non-volatile Storage
aafbf000-aaffefff : ACPI Tables
aafff000-aaffffff : System RAM
ab000000-af9fffff : reserved
aba00000-af9fffff : Graphics Stolen Memory
afa00000-feafffff : PCI Bus 0000:00
afa00000-afa00fff : pnp 00:09
b0000000-bfffffff : 0000:00:02.0
b0000000-b0407fff : BOOTFB
c0000000-c03fffff : 0000:00:02.0
c0400000-c04fffff : PCI Bus 0000:01
c0400000-c0403fff : 0000:01:00.0
c0400000-c0403fff : r8169
c0404000-c0404fff : 0000:01:00.0
c0404000-c0404fff : r8169
c0500000-c05fffff : PCI Bus 0000:02
c0500000-c0501fff : 0000:02:00.0
c0500000-c0501fff : iwlwifi
c0600000-c060ffff : 0000:00:14.0
c0600000-c060ffff : xhci_hcd
c0610000-c0613fff : 0000:00:1b.0
c0610000-c0613fff : ICH HD audio
c0614000-c061400f : 0000:00:16.0
c0614000-c061400f : mei_me
c0615000-c06150ff : 0000:00:1f.3
c0617000-c06177ff : 0000:00:1f.2
c0617000-c06177ff : ahci
c0618000-c06183ff : 0000:00:1d.0
c0618000-c06183ff : ehci_hcd
c0619000-c06193ff : 0000:00:1a.0
c0619000-c06193ff : ehci_hcd
e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
e0000000-efffffff : reserved
e0000000-efffffff : pnp 00:09
feb00000-feb03fff : reserved
fec00000-fec00fff : reserved
fec00000-fec003ff : IOAPIC 0
fed00000-fed003ff : HPET 0
fed10000-fed19fff : reserved
fed10000-fed17fff : pnp 00:09
fed18000-fed18fff : pnp 00:09
fed19000-fed19fff : pnp 00:09
fed1c000-fed1ffff : reserved
fed1c000-fed1ffff : pnp 00:09
fed1f410-fed1f414 : iTCO_wdt
fed20000-fed3ffff : pnp 00:09
fed90000-fed90fff : dmar0
fed91000-fed91fff : dmar1
fee00000-fee00fff : Local APIC
fee00000-fee00fff : reserved
ff000000-ff000fff : pnp 00:09
ffb80000-ffffffff : reserved
100000000-14f5fffff : System RAM
14f600000-14fffffff : RAM buffer
[7.5.] PCI information ('lspci -vvv' as root)
alex@alex-Inspiron-5521:~$ sudo lspci -vvv
00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM
Controller (rev 09)
Subsystem: Dell Device 0597
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort+ >SERR- <PERR- INTx-
Latency: 0
Capabilities: [e0] Vendor Specific Information: Len=0c <?>
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core
processor Graphics Controller (rev 09) (prog-if 00 [VGA controller])
Subsystem: Dell Device 0597
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 48
Region 0: Memory at c0000000 (64-bit, non-prefetchable) [size=4M]
Region 2: Memory at b0000000 (64-bit, prefetchable) [size=256M]
Region 4: I/O ports at 3000 [size=64]
Expansion ROM at <unassigned> [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee00018 Data: 0000
Capabilities: [d0] Power Management version 2
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [a4] PCI Advanced Features
AFCap: TP+ FLR+
AFCtrl: FLR-
AFStatus: TP-
Kernel driver in use: i915
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI])
Subsystem: Dell Device 0597
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 42
Region 0: Memory at c0600000 (64-bit, non-prefetchable) [size=64K]
Capabilities: [70] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
Address: 00000000fee002f8 Data: 0000
Kernel driver in use: xhci_hcd
00:16.0 Communication controller: Intel Corporation 7 Series/C210
Series Chipset Family MEI Controller #1 (rev 04)
Subsystem: Dell Device 0597
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 45
Region 0: Memory at c0614000 (64-bit, non-prefetchable) [size=16]
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [8c] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00378 Data: 0000
Kernel driver in use: mei_me
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
Family USB Enhanced Host Controller #2 (rev 04) (prog-if 20 [EHCI])
Subsystem: Dell Device 0597
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
Region 0: Memory at c0619000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Debug port: BAR=1 offset=00a0
Capabilities: [98] PCI Advanced Features
AFCap: TP+ FLR+
AFCtrl: FLR-
AFStatus: TP-
Kernel driver in use: ehci-pci
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset
Family High Definition Audio Controller (rev 04)
Subsystem: Dell Device 0597
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 47
Region 0: Memory at c0610000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee003d8 Data: 0000
Capabilities: [70] Express (v1) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag- RBE- FLReset+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed unknown, Width x0, ASPM unknown, Latency L0
<64ns, L1 <1us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed unknown, Width x0, TrErr- Train- SlotClk- DLActive-
BWMgmt- ABWMgmt-
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
Status: NegoPending- InProgress-
VC1: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=1 ArbSelect=Fixed TC/VC=22
Status: NegoPending- InProgress-
Capabilities: [130 v1] Root Complex Link
Desc: PortNumber=0f ComponentID=00 EltType=Config
Link0: Desc: TargetPort=00 TargetComponent=00 AssocRCRB-
LinkType=MemMapped LinkValid+
Addr: 00000000fed1c000
Kernel driver in use: snd_hda_intel
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset
Family PCI Express Root Port 1 (rev c4) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 00002000-00002fff
Memory behind bridge: fff00000-000fffff
Prefetchable memory behind bridge: 00000000c0400000-00000000c04fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <16us
ClockPM- Surprise- LLActRep+ BwNot-
LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+
BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #0, PowerLimit 10.000W; Interlock- NoCompl+
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet- LinkState+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range BC, TimeoutDis+ ARIFwd-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- ARIFwd-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-,
Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance-
ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-,
EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
Address: 00000000 Data: 0000
Capabilities: [90] Subsystem: Dell Device 0597
Capabilities: [a0] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: pcieport
00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset
Family PCI Express Root Port 2 (rev c4) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: c0500000-c05fffff
Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #2, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <512ns, L1 <16us
ClockPM- Surprise- LLActRep+ BwNot-
LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+
BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #1, PowerLimit 10.000W; Interlock- NoCompl+
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet- LinkState+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range BC, TimeoutDis+ ARIFwd-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- ARIFwd-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-,
Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance-
ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-,
EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
Address: 00000000 Data: 0000
Capabilities: [90] Subsystem: Dell Device 0597
Capabilities: [a0] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: pcieport
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
Subsystem: Dell Device 0597
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 23
Region 0: Memory at c0618000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Debug port: BAR=1 offset=00a0
Capabilities: [98] PCI Advanced Features
AFCap: TP+ FLR+
AFCtrl: FLR-
AFStatus: TP-
Kernel driver in use: ehci-pci
00:1f.0 ISA bridge: Intel Corporation HM76 Express Chipset LPC
Controller (rev 04)
Subsystem: Dell Device 0597
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Capabilities: [e0] Vendor Specific Information: Len=0c <?>
Kernel driver in use: lpc_ich
00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family
6-port SATA Controller [AHCI mode] (rev 04) (prog-if 01 [AHCI 1.0])
Subsystem: Dell Device 0597
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin B routed to IRQ 44
Region 0: I/O ports at 3088 [size=8]
Region 1: I/O ports at 3094 [size=4]
Region 2: I/O ports at 3080 [size=8]
Region 3: I/O ports at 3090 [size=4]
Region 4: I/O ports at 3060 [size=32]
Region 5: Memory at c0617000 (32-bit, non-prefetchable) [size=2K]
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee00358 Data: 0000
Capabilities: [70] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
Capabilities: [b0] PCI Advanced Features
AFCap: TP+ FLR+
AFCtrl: FLR-
AFStatus: TP-
Kernel driver in use: ahci
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
SMBus Controller (rev 04)
Subsystem: Dell Device 0597
Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Interrupt: pin C routed to IRQ 0
Region 0: Memory at c0615000 (64-bit, non-prefetchable) [size=256]
Region 4: I/O ports at 3040 [size=32]
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
RTL8101E/RTL8102E PCI Express Fast Ethernet controller (rev 05)
Subsystem: Dell Device 0597
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 43
Region 0: I/O ports at 2000 [size=256]
Region 2: Memory at c0404000 (64-bit, prefetchable) [size=4K]
Region 4: Memory at c0400000 (64-bit, prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00318 Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 01
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0
unlimited, L1 <64us
ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive-
BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-,
Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance-
ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-,
EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
Vector table: BAR=4 offset=00000000
PBA: BAR=4 offset=00000800
Capabilities: [d0] Vital Product Data
Unknown small resource type 00, will not decode more.
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+
MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr+ BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [140 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [160 v1] Device Serial Number 50-00-00-00-36-4c-e0-00
Kernel driver in use: r8169
02:00.0 Network controller: Intel Corporation Centrino Wireless-N 2230 (rev c4)
Subsystem: Intel Corporation Centrino Wireless-N 2230 BGN
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 46
Region 0: Memory at c0500000 (64-bit, non-prefetchable) [size=8K]
Capabilities: [c8] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee00398 Data: 0000
Capabilities: [e0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 <32us
ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive-
BWMgmt- ABWMgmt-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+
MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
Capabilities: [140 v1] Device Serial Number 60-36-dd-ff-ff-cc-6b-6d
Kernel driver in use: iwlwifi
[7.6.] SCSI information (from /proc/scsi/scsi)
alex@alex-Inspiron-5521:~$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
Vendor: ATA Model: WDC WD10JPVT-75A Rev: 01.0
Type: Direct-Access ANSI SCSI revision: 05
Host: scsi2 Channel: 00 Id: 00 Lun: 00
Vendor: PLDS Model: DVD+-RW DU-8A5HH Rev: SD11
Type: CD-ROM ANSI SCSI revision: 05
Host: scsi6 Channel: 00 Id: 00 Lun: 00
Vendor: Generic- Model: xD/SD/M.S. Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 02
[7.7.] Other information that might be relevant to the problem (please
look in /proc and include all information that you think to be
relevant):
alex@alex-Inspiron-5521:~$ ls /proc
1 1372 1522 1827 2302 370 531 96 loadavg
10 1375 1523 1831 2312 371 54 97 locks
1053 1376 1524 1848 2314 376 56 acpi mdstat
1058 1381 1525 1856 24 377 57 asound meminfo
1063 1383 153 1858 2423 38 576 buddyinfo misc
1064 1386 154 1859 2427 386 58 bus modules
1068 1391 155 1863 2432 387 59 cgroups mounts
11 1396 1555 1885 2466 388 60 cmdline mtrr
1145 1397 157 189 25 39 61 consoles net
1149 1399 1574 1894 26 40 62 cpuinfo pagetypeinfo
1162 14 1579 19 2679 41 7 crypto partitions
1168 1406 1598 1900 2688 410 717 devices sched_debug
1185 1408 16 1901 2689 412 735 diskstats schedstat
1188 1443 160 1902 27 42 74 dma scsi
12 1464 1638 1908 2763 420 749 driver self
1209 1467 1639 1909 2766 43 752 execdomains slabinfo
1214 1469 1680 1930 2776 44 782 fb softirqs
1224 148 17 1938 28 45 796 filesystems stat
1225 149 172 1944 29 46 8 fs swaps
1239 1498 173 1965 297 466 822 interrupts sys
13 15 1785 2 3 47 824 iomem sysrq-trigger
1306 150 1787 20 304 476 831 ioports sysvipc
1310 1506 1788 2031 31 477 850 irq timer_list
1316 151 1789 2037 32 478 856 kallsyms timer_stats
1319 1510 1795 2054 33 49 860 kcore tty
1348 1511 18 21 34 5 865 key-users uptime
1351 1515 1800 22 341 50 870 kmsg version
1352 1517 1810 2213 35 51 872 kpagecount vmallocinfo
1354 1518 1814 2235 36 52 875 kpageflags vmstat
1368 152 1822 23 37 53 9 latency_stats zoneinfo
[X.] Other notes, patches, fixes, workarounds:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1261853
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-01-20 9:24 Mark Reyes Guus
0 siblings, 0 replies; 1651+ messages in thread
From: Mark Reyes Guus @ 2014-01-20 9:24 UTC (permalink / raw)
To: Recipients
Good day. I am Mark Reyes Guus, I work with Abn Amro Bank as an auditor. I have a proposition to discuss with you. Should you be interested, please e-mail back to me.
Private Email: markreyesguus-cUNmAtK3PYUqdlJmJB21zg@public.gmane.org OR markguus.reyes01-/k+kKI0dE6M@public.gmane.org
Yours Sincerely,
Mark Reyes Guus.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* "Sparse checkout leaves no entry on working directory" all the time on Windows 7 on Git 1.8.5.2.msysgit.0
@ 2014-02-06 11:54 konstunn
2014-02-06 13:20 ` Johannes Sixt
0 siblings, 1 reply; 1651+ messages in thread
From: konstunn @ 2014-02-06 11:54 UTC (permalink / raw)
To: git
Greetings.
I've tried to perform a single simple task to fetch data
from the linux-next integration testing tree repo and
sparse checkout the "drivers/staging/usbip/" directory.
I managed to perform it successfully under Linux with Git
1.7.1.
But I always failed to perform checkout under Windows 7
with Git 1.8.5.2.msysgit.0 facing the following error:
"Sparse checkout leaves no entry on working directory".
Unfortunately, I haven't tried this under Linux with Git
1.8.5.2, so I don't know whether the same problem exists there.
However, I typed the checkout directory into the file
.git/info/sparse-checkout using different formats: with
and without leading and trailing slashes, and with and
without an asterisk after the trailing slash. I tried all
the possible combinations, but the error occurred all the
same.
On Linux I stored the checkout directory in
sparse-checkout file in the following way:
"drivers/staging/usbip/". And everything was fine under
Linux, but not under Windows.
I don't know, of course, whether the format of this
directory really matters.
May be the Git install on Windows 7 specific configuration
is the reason?
Could you please tell me what might be wrong?
Am I doing something wrong, or is it a bug?
Best Regards,
Constantine Gorbunov
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: "Sparse checkout leaves no entry on working directory" all the time on Windows 7 on Git 1.8.5.2.msysgit.0
2014-02-06 11:54 "Sparse checkout leaves no entry on working directory" all the time on Windows 7 on Git 1.8.5.2.msysgit.0 konstunn
@ 2014-02-06 13:20 ` Johannes Sixt
2014-02-06 19:56 ` Constantine Gorbunov
0 siblings, 1 reply; 1651+ messages in thread
From: Johannes Sixt @ 2014-02-06 13:20 UTC (permalink / raw)
To: konstunn, git
Am 2/6/2014 12:54, schrieb konstunn@ngs.ru:
> However I typed the checkout directory in file
> ..git/info/sparse-checkout by using different formats with
> and without the leading and the trailing slashes, with and
> without asterisk after trailing slash, having tried all
> the possible combinations, but, all the same,
> nevertheless, the error occured.
Make sure that you do not use CRLF line terminators in the sparse-checkout
file.
-- Hannes
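For illustration, a quick way to spot and strip the CR terminators Hannes mentions (a sketch assuming GNU sed and coreutils; the file name here is illustrative):

```shell
# Write a CRLF-terminated pattern, show the hidden CR, then strip it
printf 'drivers/staging/usbip/\r\n' > sparse-checkout
cat -A sparse-checkout            # the CR shows up as ^M before the $
sed -i 's/\r$//' sparse-checkout  # remove trailing CRs in place (GNU sed)
cat -A sparse-checkout            # now each line ends with a bare $
```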
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-02-06 13:20 ` Johannes Sixt
@ 2014-02-06 19:56 ` Constantine Gorbunov
0 siblings, 0 replies; 1651+ messages in thread
From: Constantine Gorbunov @ 2014-02-06 19:56 UTC (permalink / raw)
To: git
Johannes Sixt <j.sixt <at> viscovery.net> writes:
>
> Am 2/6/2014 12:54, schrieb konstunn <at> ngs.ru:
> > However I typed the checkout directory in file
> > ..git/info/sparse-checkout by using different formats with
> > and without the leading and the trailing slashes, with and
> > without asterisk after trailing slash, having tried all
> > the possible combinations, but, all the same,
> > nevertheless, the error occured.
>
> Make sure that you do not use CRLF line terminators in the sparse-checkout
> file.
>
This is it. Right you are. I've just tried to edit "manually" with notepad
.git\info\sparse-checkout and found out that there really was a CRLF line
terminator. After I removed it I managed to succeed in my sparse checkout.
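For readers landing on this thread, the flow being discussed can be sketched end to end against a throwaway local repository (git 1.7+ era syntax; the repo contents and paths are illustrative stand-ins for the linux-next remote):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
# Build a tiny stand-in for the upstream repo
git init -q upstream && cd upstream
mkdir -p drivers/staging/usbip other
echo usbip > drivers/staging/usbip/README
echo x > other/file
git add . && git -c user.name=t -c user.email=t@example.com commit -qm init
# Sparse-checkout only drivers/staging/usbip/ in a fresh clone
cd .. && git init -q work && cd work
git config core.sparseCheckout true
printf 'drivers/staging/usbip/\n' > .git/info/sparse-checkout  # LF only, no CRLF
git remote add origin "$tmp/upstream"
git fetch -q origin HEAD
git checkout -q FETCH_HEAD
ls drivers/staging/usbip   # populated; 'other/' stays out of the working tree
```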
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2014-02-10 14:35 Viswanatham, RaviTeja
2014-02-10 18:35 ` Marcel Holtmann
0 siblings, 1 reply; 1651+ messages in thread
From: Viswanatham, RaviTeja @ 2014-02-10 14:35 UTC (permalink / raw)
To: bluez mailin list (linux-bluetooth@vger.kernel.org)
Hello,
I am working on Ubuntu 12.04 with a Bluetooth 3.0 +HS + wifi combo USB dongle.
I want to reach a data transfer speed of up to 24 Mbit/s.
My Questions:
Does Bluez support high speed data transfer rates up to 24 Mbit/s (Bluetooth 3.0+HS) ?
If it does, is there any user configuration involved to achieve that? What other requirements need to be met?
Does Bluetooth enable the AMP function to communicate over an 802.11n channel to support high speed, or does it have to be configured with other drivers?
I am new to Linux; a detailed explanation would be really appreciated. Thank you in advance for your support.
Best Regards,
Ravi Teja
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-02-10 14:35 Viswanatham, RaviTeja
@ 2014-02-10 18:35 ` Marcel Holtmann
2014-02-11 7:13 ` Re: Andrei Emeltchenko
0 siblings, 1 reply; 1651+ messages in thread
From: Marcel Holtmann @ 2014-02-10 18:35 UTC (permalink / raw)
To: Viswanatham, RaviTeja; +Cc: bluez mailin list (linux-bluetooth@vger.kernel.org)
Hi Ravi,
> I am working on Ubuntu 12.04 with a Bluetooth 3.0 +HS + wifi combo USB dongle.
>
> I want to reach a data transfer speed of up to 24 Mbit/s.
>
> My Questions:
> Does Bluez support high speed data transfer rates up to 24 Mbit/s (Bluetooth 3.0+HS) ?
>
> If it does, is there any user configuration involved to achieve that? What other requirements need to be met?
> Does, Bluetooth enables AMP function to communicate 802.11n channel to support high speed or it has to be configure with any other drivers?
>
> I am new to Linux a detail explanation would be really appreciated. Thank you in advance for your support.
we do support Bluetooth HS operation. However, your WiFi device needs to be exposed as an AMP controller. Most WiFi hardware needs a special driver to expose itself as an AMP controller. There is no generic driver for the mac80211 subsystem in the kernel.
Regards
Marcel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-02-10 18:35 ` Marcel Holtmann
@ 2014-02-11 7:13 ` Andrei Emeltchenko
0 siblings, 0 replies; 1651+ messages in thread
From: Andrei Emeltchenko @ 2014-02-11 7:13 UTC (permalink / raw)
To: Marcel Holtmann
Cc: Viswanatham, RaviTeja,
bluez mailin list (linux-bluetooth@vger.kernel.org)
Hi All,
On Mon, Feb 10, 2014 at 10:35:24AM -0800, Marcel Holtmann wrote:
> Hi Ravi,
>
> > I am working on Ubuntu 12.04 with a Bluetooth 3.0 +HS + wifi combo USB
> > dongle.
> >
> > I want to reach a data transfer speed of up to 24 Mbit/s.
> >
> > My Questions: Does Bluez support high speed data transfer rates up to
> > 24 Mbit/s (Bluetooth 3.0+HS) ?
> >
> > If it does, is there any user configuration involved to achieve that?
> > What other requirements need to be met? Does, Bluetooth enables AMP
> > function to communicate 802.11n channel to support high speed or it
> > has to be configure with any other drivers?
> >
> > I am new to Linux a detail explanation would be really appreciated.
> > Thank you in advance for your support.
>
> we do support Bluetooth HS operation. However your WiFi device needs to
> be exposed as AMP Controller. Most WiFi hardware needs a special driver
> to expose itself as AMP Controller. There is no generic driver for
> mac80211 subsystem in the kernel.
Some devices have the AMP controller implemented in firmware.
I was using a Marvell SD8787; newer Marvell devices probably also work.
You may check drivers/bluetooth/btmrvl_main.c to see how HCI dev_type
HCI_AMP is assigned.
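To check whether a combo device actually exposes an AMP controller, the BlueZ userspace tools can list the registered controllers and their types (a sketch; exact output format varies with the BlueZ version, and both commands need a Bluetooth adapter present):

```shell
hciconfig -a   # each hciX entry includes a Type field (BR/EDR vs. AMP)
btmgmt info    # newer tool: prints per-index controller information
```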
Best regards
Andrei Emeltchenko
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2014-02-23 16:22 tigran.mkrtchyan
2014-02-23 16:41 ` Trond Myklebust
0 siblings, 1 reply; 1651+ messages in thread
From: tigran.mkrtchyan @ 2014-02-23 16:22 UTC (permalink / raw)
To: linux-nfs
To me it's unclear why a SETATTR always follows an OPEN, even in the case of
EXCLUSIVE4_1. With this fix, I get the desired behavior.
Tigran.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-02-23 16:22 tigran.mkrtchyan
@ 2014-02-23 16:41 ` Trond Myklebust
2014-02-23 18:04 ` Re: Mkrtchyan, Tigran
0 siblings, 1 reply; 1651+ messages in thread
From: Trond Myklebust @ 2014-02-23 16:41 UTC (permalink / raw)
To: Mkrtchyan, Tigran; +Cc: Linux NFS Mailing List
On Feb 23, 2014, at 11:22, tigran.mkrtchyan@desy.de wrote:
> to me it's unclear, why a SETATTR always follows an OPEN, even in case of
> EXCLUSIVE4_1. With this fix, I get desired behavior.
Yes, but that fix risks incurring an NFS4ERR_INVAL from which we cannot recover because it does not include the mandatory check for the allowed set of attributes.
Please see RFC5661 section 18.16.3 about the client side use of ‘suppattr_exclcreat’ .
Cheers,
Trond
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-02-23 16:41 ` Trond Myklebust
@ 2014-02-23 18:04 ` Mkrtchyan, Tigran
0 siblings, 0 replies; 1651+ messages in thread
From: Mkrtchyan, Tigran @ 2014-02-23 18:04 UTC (permalink / raw)
To: Trond Myklebust; +Cc: Linux NFS Mailing List
----- Original Message -----
> From: "Trond Myklebust" <trondmy@gmail.com>
> To: "Tigran Mkrtchyan" <tigran.mkrtchyan@desy.de>
> Cc: "Linux NFS Mailing List" <linux-nfs@vger.kernel.org>
> Sent: Sunday, February 23, 2014 5:41:26 PM
> Subject: Re:
>
>
> On Feb 23, 2014, at 11:22, tigran.mkrtchyan@desy.de wrote:
>
> > to me it's unclear, why a SETATTR always follows an OPEN, even in case of
> > EXCLUSIVE4_1. With this fix, I get desired behavior.
>
> Yes, but that fix risks incurring an NFS4ERR_INVAL from which we cannot
> recover because it does not include the mandatory check for the allowed set
> of attributes.
> Please see RFC5661 section 18.16.3 about the client side use of
> ‘suppattr_exclcreat’ .
Yes, I noticed that the client never queries that attribute. I will look
into adding a check for suppattr_exclcreat as well.
Tigran.
>
> Cheers,
> Trond
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-03-01 6:56 Anton 'EvilMan' Danilov
0 siblings, 0 replies; 1651+ messages in thread
From: Anton 'EvilMan' Danilov @ 2014-03-01 6:56 UTC (permalink / raw)
To: lartc
Hello.
You shouldn't pass traffic into an inner class. Use a separate leaf class
for the default traffic.
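Anton's suggestion amounts to something like the following sketch (rates are illustrative; note the root qdisc line in the original post was also missing `handle 1: htb`, and these commands need root and a real eth4):

```shell
# Root htb qdisc whose default points at a dedicated *leaf* class,
# plus a separate leaf for 10.0.0.3 traffic
tc qdisc add dev eth4 root handle 1: htb default 20
tc class add dev eth4 parent 1:  classid 1:1  htb rate 20mbit ceil 20mbit
tc class add dev eth4 parent 1:1 classid 1:10 htb rate 10mbit ceil 10mbit
tc class add dev eth4 parent 1:1 classid 1:20 htb rate 10mbit ceil 20mbit
tc filter add dev eth4 protocol ip parent 1: prio 1 u32 \
    match ip src 10.0.0.3 flowid 1:10
```

With this layout, unclassified traffic lands in leaf 1:20 and can only borrow up to the 20mbit ceiling of the parent class 1:1.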
2014-03-01 5:13 GMT+04:00 <Radek.Hes@ecs.vuw.ac.nz>:
> Hi I understand this is the mailing list for tc htb questions?
> I have a the folloiwng
>
> tc qdisc add dev eth4 root default 0
> tc class add dev eth4 parent 1: classid 1:0 htb rate 20mbit ceil 20mbit
> tc class add dev eth4 parent 1: classid 1:10 htb rate 10mbit ceil 10mbit
> tc filter add dev eth4 protocol ip parent 1:0 prio 1 u32 match ip src
> 10.0.0.3 flowid 1:10
>
> however default traffic is not limited to 20mbit!!!!
> Regards.
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe lartc" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Anton.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re;
@ 2014-03-16 12:01 Nieuwenhuis,Sonja S.B.M.
0 siblings, 0 replies; 1651+ messages in thread
From: Nieuwenhuis,Sonja S.B.M. @ 2014-03-16 12:01 UTC (permalink / raw)
I have an Inheritance for you email me now: sashakhmed-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org<mailto:sashakhmed-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org>
________________________________
Op deze e-mail zijn de volgende voorwaarden van toepassing:
http://www.fontys.nl/disclaimer
The above disclaimer applies to this e-mail message.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2014-04-13 21:01 Marcus White
2014-04-15 0:59 ` Marcus White
0 siblings, 1 reply; 1651+ messages in thread
From: Marcus White @ 2014-04-13 21:01 UTC (permalink / raw)
To: kvm
Hello,
I had some basic questions regarding KVM, and would appreciate any help:)
I have been reading about the KVM architecture, and as I understand
it, the guest shows up as a regular process in the host itself..
I had some questions around that..
1. Are the guest processes implemented as a control group within the
overall VM process itself? Is the VM a kernel process or a user
process?
2. Is there a way for me to force some specific CPU/s to a guest, and
those CPUs to be not used for any work on the host itself? Pinning is
just making sure the vCPU runs on the same physical CPU always, I am
looking for something more than that..
3. If the host is compiled as a non-preemptible kernel, kernel
processes run to completion until they give up the CPU themselves. I
am trying to understand what that would mean in the context of KVM
and guest VMs. If the VM is a user process, it would mean nothing,
but I wasn't sure as per (1).
Cheers!
M
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-04-13 21:01 (unknown), Marcus White
@ 2014-04-15 0:59 ` Marcus White
2014-04-16 21:17 ` Re: Marcelo Tosatti
0 siblings, 1 reply; 1651+ messages in thread
From: Marcus White @ 2014-04-15 0:59 UTC (permalink / raw)
To: kvm
Hello,
A friendly bump to see if anyone has any ideas:-)
Cheers!
On Sun, Apr 13, 2014 at 2:01 PM, Marcus White
<roastedseaweed.k@gmail.com> wrote:
> Hello,
> I had some basic questions regarding KVM, and would appreciate any help:)
>
> I have been reading about the KVM architecture, and as I understand
> it, the guest shows up as a regular process in the host itself..
>
> I had some questions around that..
>
> 1. Are the guest processes implemented as a control group within the
> overall VM process itself? Is the VM a kernel process or a user
> process?
>
> 2. Is there a way for me to force some specific CPU/s to a guest, and
> those CPUs to be not used for any work on the host itself? Pinning is
> just making sure the vCPU runs on the same physical CPU always, I am
> looking for something more than that..
>
> 3. If the host is compiled as a non pre-emptible kernel, kernel
> process run to completion until they give up the CPU themselves. In
> the context of a guest, I am trying to understand what that would mean
> in the context of KVM and guest VMs. If the VM is a user process, it
> means nothing, I wasnt sure as per (1).
>
> Cheers!
> M
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-04-15 0:59 ` Marcus White
@ 2014-04-16 21:17 ` Marcelo Tosatti
2014-04-17 21:33 ` Re: Marcus White
0 siblings, 1 reply; 1651+ messages in thread
From: Marcelo Tosatti @ 2014-04-16 21:17 UTC (permalink / raw)
To: Marcus White; +Cc: kvm
On Mon, Apr 14, 2014 at 05:59:05PM -0700, Marcus White wrote:
> Hello,
> A friendly bump to see if anyone has any ideas:-)
>
> Cheers!
>
> On Sun, Apr 13, 2014 at 2:01 PM, Marcus White
> <roastedseaweed.k@gmail.com> wrote:
> > Hello,
> > I had some basic questions regarding KVM, and would appreciate any help:)
> >
> > I have been reading about the KVM architecture, and as I understand
> > it, the guest shows up as a regular process in the host itself..
> >
> > I had some questions around that..
> >
> > 1. Are the guest processes implemented as a control group within the
> > overall VM process itself? Is the VM a kernel process or a user
> > process?
User process.
> > 2. Is there a way for me to force some specific CPU/s to a guest, and
> > those CPUs to be not used for any work on the host itself? Pinning is
> > just making sure the vCPU runs on the same physical CPU always, I am
> > looking for something more than that..
Control groups.
> > 3. If the host is compiled as a non-preemptible kernel, kernel
> > processes run to completion until they give up the CPU themselves. In
> > the context of a guest, I am trying to understand what that would mean
> > in the context of KVM and guest VMs. If the VM is a user process, it
> > means nothing, but I wasn't sure as per (1).
What problem are you trying to solve?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-04-16 21:17 ` Re: Marcelo Tosatti
@ 2014-04-17 21:33 ` Marcus White
2014-04-21 21:49 ` Re: Marcelo Tosatti
0 siblings, 1 reply; 1651+ messages in thread
From: Marcus White @ 2014-04-17 21:33 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm
>> > Hello,
>> > I had some basic questions regarding KVM, and would appreciate any help:)
>> >
>> > I have been reading about the KVM architecture, and as I understand
>> > it, the guest shows up as a regular process in the host itself..
>> >
>> > I had some questions around that..
>> >
>> > 1. Are the guest processes implemented as a control group within the
>> > overall VM process itself? Is the VM a kernel process or a user
>> > process?
>
> User process.
>
>> > 2. Is there a way for me to force some specific CPU/s to a guest, and
>> > those CPUs to be not used for any work on the host itself? Pinning is
>> > just making sure the vCPU runs on the same physical CPU always, I am
>> > looking for something more than that..
>
> Control groups.
Do control groups prevent the host from using those CPUs? I want only
the VM to use the CPUs, and don't want any host user or kernel threads
to run on that physical CPU. I looked up control groups, maybe I
missed something there. I will go back and take a look. If you can
clarify, I would appreciate it:)
>
>> > 3. If the host is compiled as a non-preemptible kernel, kernel
>> > processes run to completion until they give up the CPU themselves. In
>> > the context of a guest, I am trying to understand what that would mean
>> > in the context of KVM and guest VMs. If the VM is a user process, it
>> > means nothing, but I wasn't sure as per (1).
>
> What problem are you trying to solve?
It's more of an investigation at this point, to understand what can happen.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-04-17 21:33 ` Re: Marcus White
@ 2014-04-21 21:49 ` Marcelo Tosatti
0 siblings, 0 replies; 1651+ messages in thread
From: Marcelo Tosatti @ 2014-04-21 21:49 UTC (permalink / raw)
To: Marcus White; +Cc: kvm
On Thu, Apr 17, 2014 at 02:33:41PM -0700, Marcus White wrote:
> >> > Hello,
> >> > I had some basic questions regarding KVM, and would appreciate any help:)
> >> >
> >> > I have been reading about the KVM architecture, and as I understand
> >> > it, the guest shows up as a regular process in the host itself..
> >> >
> >> > I had some questions around that..
> >> >
> >> > 1. Are the guest processes implemented as a control group within the
> >> > overall VM process itself? Is the VM a kernel process or a user
> >> > process?
> >
> > User process.
> >
> >> > 2. Is there a way for me to force some specific CPU/s to a guest, and
> >> > those CPUs to be not used for any work on the host itself? Pinning is
> >> > just making sure the vCPU runs on the same physical CPU always, I am
> >> > looking for something more than that..
> >
> > Control groups.
> Do control groups prevent the host from using those CPUs? I want only
> the VM to use the CPUs, and don't want any host user or kernel threads
> to run on that physical CPU. I looked up control groups, maybe I
> missed something there. I will go back and take a look. If you can
> clarify, I would appreciate it:)
Per-CPU kernel threads usually perform necessary work on behalf of
the CPU.
For other user threads, yes you can have them execute exclusively on other
processors.
> >> > 3. If the host is compiled as a non-preemptible kernel, kernel
> >> > processes run to completion until they give up the CPU themselves. In
> >> > the context of a guest, I am trying to understand what that would mean
> >> > in the context of KVM and guest VMs. If the VM is a user process, it
> >> > means nothing, but I wasn't sure as per (1).
> >
> > What problem are you trying to solve?
> It's more of an investigation at this point, to understand what can happen.
http://www.phoronix.com/scan.php?page=news_item&px=MTI1NTg
http://www.linuxplumbersconf.org/2013/ocw/sessions/1143
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <blk-mq updates>]
* "csum failed" that was not detected by scrub
@ 2014-05-02 9:42 Jaap Pieroen
2014-05-02 10:20 ` Duncan
0 siblings, 1 reply; 1651+ messages in thread
From: Jaap Pieroen @ 2014-05-02 9:42 UTC (permalink / raw)
To: linux-btrfs
Hi all,
I completed a full scrub:
root@nasbak:/home/jpieroen# btrfs scrub status /home/
scrub status for 7ca5f38e-308f-43ab-b3ea-31b3bcd11a0d
scrub started at Wed Apr 30 08:30:19 2014 and finished after 144131 seconds
total bytes scrubbed: 4.76TiB with 0 errors
Then tried to remove a device:
root@nasbak:/home/jpieroen# btrfs device delete /dev/sdb /home
This triggered a BUG_ON(), with the following error in dmesg: csum failed
ino 258 off 1395560448 csum 2284440321 expected csum 319628859
How can there still be csum failures directly after a scrub?
If I rerun the scrub it still won't find any errors. I know this,
because I've had the same issue 3 times in a row. Each time running a
scrub and still being unable to remove the device.
Kind Regards,
Jaap
--------------------------------------------------------------
Details:
root@nasbak:/home/jpieroen# uname -a
Linux nasbak 3.14.1-031401-generic #201404141220 SMP Mon Apr 14
16:21:48 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
root@nasbak:/home/jpieroen# btrfs --version
Btrfs v3.14.1
root@nasbak:/home/jpieroen# btrfs fi df /home
Data, RAID5: total=4.57TiB, used=4.55TiB
System, RAID1: total=32.00MiB, used=352.00KiB
Metadata, RAID1: total=7.00GiB, used=5.59GiB
root@nasbak:/home/jpieroen# btrfs fi show
Label: 'btrfs_storage' uuid: 7ca5f38e-308f-43ab-b3ea-31b3bcd11a0d
Total devices 6 FS bytes used 4.56TiB
devid 1 size 1.82TiB used 1.31TiB path /dev/sde
devid 2 size 1.82TiB used 1.31TiB path /dev/sdf
devid 3 size 1.82TiB used 1.31TiB path /dev/sdg
devid 4 size 931.51GiB used 25.00GiB path /dev/sdb
devid 6 size 2.73TiB used 994.03GiB path /dev/sdh
devid 7 size 2.73TiB used 994.03GiB path /dev/sdi
Btrfs v3.14.1
jpieroen@nasbak:~$ dmesg
[227248.656438] BTRFS info (device sdi): relocating block group
9735225016320 flags 129
[227261.713860] BTRFS info (device sdi): found 9 extents
[227264.531019] BTRFS info (device sdi): found 9 extents
[227265.011826] BTRFS info (device sdi): relocating block group
76265029632 flags 129
[227274.052249] BTRFS info (device sdi): csum failed ino 258 off
1395560448 csum 2284440321 expected csum 319628859
[227274.052354] BTRFS info (device sdi): csum failed ino 258 off
1395564544 csum 3646299263 expected csum 319628859
[227274.052402] BTRFS info (device sdi): csum failed ino 258 off
1395568640 csum 281259278 expected csum 319628859
[227274.052449] BTRFS info (device sdi): csum failed ino 258 off
1395572736 csum 2594807184 expected csum 319628859
[227274.052492] BTRFS info (device sdi): csum failed ino 258 off
1395576832 csum 4288971971 expected csum 319628859
[227274.052537] BTRFS info (device sdi): csum failed ino 258 off
1395580928 csum 752615894 expected csum 319628859
[227274.052581] BTRFS info (device sdi): csum failed ino 258 off
1395585024 csum 3828951500 expected csum 319628859
[227274.061279] ------------[ cut here ]------------
[227274.061354] kernel BUG at /home/apw/COD/linux/fs/btrfs/extent_io.c:2116!
[227274.061445] invalid opcode: 0000 [#1] SMP
[227274.061509] Modules linked in: cuse deflate
[227274.061573] BTRFS info (device sdi): csum failed ino 258 off
1395560448 csum 2284440321 expected csum 319628859
[227274.061707] ctr twofish_generic twofish_x86_64_3way
twofish_x86_64 twofish_common camellia_generic camellia_x86_64
serpent_sse2_x86_64 xts serpent_generic lrw gf128mul glue_helper
blowfish_generic blowfish_x86_64 blowfish_common cast5_generic
cast_common ablk_helper cryptd des_generic cmac xcbc rmd160
crypto_null af_key xfrm_algo nfsd auth_rpcgss nfs_acl nfs lockd sunrpc
fscache dm_crypt ip6t_REJECT ppdev xt_hl ip6t_rt nf_conntrack_ipv6
nf_defrag_ipv6 ipt_REJECT xt_comment xt_LOG kvm xt_recent microcode
xt_multiport xt_limit xt_tcpudp psmouse serio_raw xt_addrtype k10temp
edac_core ipt_MASQUERADE edac_mce_amd iptable_nat nf_nat_ipv4
sp5100_tco nf_conntrack_ipv4 nf_defrag_ipv4 ftdi_sio i2c_piix4
usbserial xt_conntrack ip6table_filter ip6_tables joydev
nf_conntrack_netbios_ns nf_conntrack_broadcast snd_hda_codec_via
nf_nat_ftp snd_hda_codec_hdmi nf_nat snd_hda_codec_generic
nf_conntrack_ftp nf_conntrack snd_hda_intel iptable_filter
ir_lirc_codec(OF) lirc_dev(OF) ip_tables snd_hda_codec
ir_mce_kbd_decoder(OF) x_tables snd_hwdep ir_sony_decoder(OF)
rc_tbs_nec(OF) ir_jvc_decoder(OF) snd_pcm ir_rc6_decoder(OF)
ir_rc5_decoder(OF) saa716x_tbs_dvb(OF) tbs6982fe(POF) tbs6680fe(POF)
ir_nec_decoder(OF) tbs6923fe(POF) tbs6985se(POF) tbs6928se(POF)
tbs6982se(POF) tbs6991fe(POF) tbs6618fe(POF) saa716x_core(OF)
tbs6922fe(POF) tbs6928fe(POF) tbs6991se(POF) stv090x(OF) dvb_core(OF)
rc_core(OF) snd_timer snd soundcore asus_atk0110 parport_pc shpchp
mac_hid lp parport btrfs xor raid6_pq pata_acpi hid_generic usbhid hid
usb_storage radeon pata_atiixp r8169 mii i2c_algo_bit sata_sil24 ttm
drm_kms_helper drm ahci libahci wmi
[227274.064118] CPU: 1 PID: 15543 Comm: btrfs-endio-4 Tainted: PF
O 3.14.1-031401-generic #201404141220
[227274.064246] Hardware name: System manufacturer System Product
Name/M4A78LT-M, BIOS 0802 08/24/2010
[227274.064368] task: ffff88030a0e31e0 ti: ffff8800a15b8000 task.ti:
ffff8800a15b8000
[227274.064467] RIP: 0010:[<ffffffffa0304c33>] [<ffffffffa0304c33>]
clean_io_failure+0x1a3/0x1b0 [btrfs]
[227274.064623] RSP: 0018:ffff8800a15b9cd8 EFLAGS: 00010246
[227274.064694] RAX: 0000000000000000 RBX: ffff88010b2869b8 RCX:
0000000000000000
[227274.064789] RDX: ffff8802cad30f00 RSI: 00000000720071fe RDI:
ffff88010b286884
[227274.064883] RBP: ffff8800a15b9d28 R08: 0000000000000000 R09:
0000000000000000
[227274.064977] R10: 0000000000000200 R11: 0000000000000000 R12:
ffffea000102b080
[227274.065071] R13: ffff880004366c00 R14: ffff88010b286800 R15:
00000000532ef000
[227274.065166] FS: 00007f16670b0740(0000) GS:ffff88031fc40000(0000)
knlGS:0000000000000000
[227274.065271] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[227274.065348] CR2: 00007f9c5c3b0000 CR3: 00000002dd8a8000 CR4:
00000000000007e0
[227274.065443] Stack:
[227274.065471] 00000000000532ef ffff88030a14c000 0000000000000000
ffff880004366c00
[227274.065584] ffff88030a95f780 ffffea000102b080 ffff8801026cc4b0
ffff88010b2869b8
[227274.065697] 00000000532ef000 0000000000000000 ffff8800a15b9db8
ffffffffa0304f1b
[227274.065809] Call Trace:
[227274.065872] [<ffffffffa0304f1b>]
end_bio_extent_readpage+0x2db/0x3d0 [btrfs]
[227274.065971] [<ffffffff8120a013>] bio_endio+0x53/0xa0
[227274.066042] [<ffffffff8120a072>] bio_endio_nodec+0x12/0x20
[227274.066137] [<ffffffffa02dde81>] end_workqueue_fn+0x41/0x50 [btrfs]
[227274.066243] [<ffffffffa03157d0>] worker_loop+0xa0/0x330 [btrfs]
[227274.066345] [<ffffffffa0315730>] ?
check_pending_worker_creates.isra.1+0xe0/0xe0 [btrfs]
[227274.066455] [<ffffffff8108ffa9>] kthread+0xc9/0xe0
[227274.066522] [<ffffffff8108fee0>] ? flush_kthread_worker+0xb0/0xb0
[227274.066606] [<ffffffff817721bc>] ret_from_fork+0x7c/0xb0
[227274.066680] [<ffffffff8108fee0>] ? flush_kthread_worker+0xb0/0xb0
[227274.066761] Code: 00 00 83 f8 01 0f 8e 49 ff ff ff 49 8b 4d 18 49
8b 55 10 4d 89 e0 45 8b 4d 2c 48 8b 7d b8 4c 89 fe e8 72 fc ff ff e9
29 ff ff ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55
48 89
[227274.067266] RIP [<ffffffffa0304c33>] clean_io_failure+0x1a3/0x1b0 [btrfs]
[227274.067380] RSP <ffff8800a15b9cd8>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: "csum failed" that was not detected by scrub
2014-05-02 9:42 "csum failed" that was not detected by scrub Jaap Pieroen
@ 2014-05-02 10:20 ` Duncan
2014-05-02 17:48 ` Jaap Pieroen
0 siblings, 1 reply; 1651+ messages in thread
From: Duncan @ 2014-05-02 10:20 UTC (permalink / raw)
To: linux-btrfs
Jaap Pieroen posted on Fri, 02 May 2014 11:42:35 +0200 as excerpted:
> I completed a full scrub:
> root@nasbak:/home/jpieroen# btrfs scrub status /home/
> scrub status for 7ca5f38e-308f-43ab-b3ea-31b3bcd11a0d
> scrub started at Wed Apr 30 08:30:19 2014
> and finished after 144131 seconds
> total bytes scrubbed: 4.76TiB with 0 errors
>
> Then tried to remove a device:
> root@nasbak:/home/jpieroen# btrfs device delete /dev/sdb /home
>
> This triggered a BUG_ON(), with the following error in dmesg: csum failed
> ino 258 off 1395560448 csum 2284440321 expected csum 319628859
>
> How can there still be csum failures directly after a scrub?
Simple enough, really...
> root@nasbak:/home/jpieroen# btrfs fi df /home
> Data, RAID5: total=4.57TiB, used=4.55TiB
> System, RAID1: total=32.00MiB, used=352.00KiB
> Metadata, RAID1: total=7.00GiB, used=5.59GiB
To those that know the details, this tells the story.
Btrfs raid5/6 modes are not yet code-complete, and scrub is one of the
incomplete bits. btrfs scrub doesn't know how to deal with raid5/6
properly just yet.
While the operational bits of raid5/6 support are there (parity is
calculated and written), scrub and recovery from a lost device are not
yet code-complete. Thus it's effectively a slower, lower-capacity raid0
without scrub support at this point, except that when the code is
complete you'll get an automatic "free" upgrade to full raid5 or raid6,
because the operational bits have been working since they were
introduced; just the recovery and scrub bits were bad, making it
effectively a raid0 in reliability terms: lose one device and you've
lost them all.
That's the big picture anyway. Marc Merlin recently did quite a bit of
raid5/6 testing and there's a page on the wiki now with what he found.
Additionally, I saw a scrub support for raid5/6 modes patch on the list
recently, but while it may be in integration, I believe it's too new to
have reached release yet.
Wiki, for memory or bookmark: https://btrfs.wiki.kernel.org
Direct user documentation link for bookmark (unwrap as necessary):
https://btrfs.wiki.kernel.org/index.php/
Main_Page#Guides_and_usage_information
The raid5/6 page (which I didn't otherwise see conveniently linked, I dug
it out of the recent changes list since I knew it was there from on-list
discussion):
https://btrfs.wiki.kernel.org/index.php/RAID56
@ Marc or Hugo or someone with a wiki account: Can this be more visibly
linked from the user-docs contents, added to the user docs category list,
and probably linked from at least the multiple devices and (for now) the
gotchas pages?
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-05-02 10:20 ` Duncan
@ 2014-05-02 17:48 ` Jaap Pieroen
2014-05-03 13:31 ` Re: Frank Holton
0 siblings, 1 reply; 1651+ messages in thread
From: Jaap Pieroen @ 2014-05-02 17:48 UTC (permalink / raw)
To: linux-btrfs
Duncan <1i5t5.duncan <at> cox.net> writes:
>
> To those that know the details, this tells the story.
>
> Btrfs raid5/6 modes are not yet code-complete, and scrub is one of the
> incomplete bits. btrfs scrub doesn't know how to deal with raid5/6
> properly just yet.
>
> While the operational bits of raid5/6 support are there (parity is
> calculated and written), scrub and recovery from a lost device are not
> yet code-complete. Thus it's effectively a slower, lower-capacity raid0
> without scrub support at this point, except that when the code is
> complete you'll get an automatic "free" upgrade to full raid5 or raid6,
> because the operational bits have been working since they were
> introduced; just the recovery and scrub bits were bad, making it
> effectively a raid0 in reliability terms: lose one device and you've
> lost them all.
>
> That's the big picture anyway. Marc Merlin recently did quite a bit of
> raid5/6 testing and there's a page on the wiki now with what he found.
> Additionally, I saw a scrub support for raid5/6 modes patch on the list
> recently, but while it may be in integration, I believe it's too new to
> have reached release yet.
>
> Wiki, for memory or bookmark: https://btrfs.wiki.kernel.org
>
> Direct user documentation link for bookmark (unwrap as necessary):
>
> https://btrfs.wiki.kernel.org/index.php/
> Main_Page#Guides_and_usage_information
>
> The raid5/6 page (which I didn't otherwise see conveniently linked, I dug
> it out of the recent changes list since I knew it was there from on-list
> discussion):
>
> https://btrfs.wiki.kernel.org/index.php/RAID56
>
> <at> Marc or Hugo or someone with a wiki account: Can this be more visibly
> linked from the user-docs contents, added to the user docs category list,
> and probably linked from at least the multiple devices and (for now) the
> gotchas pages?
>
So raid5 is much more useless than I assumed. I read Marc's blog and
figured that btrfs was ready enough.
I'm really in trouble now. I tried to get rid of raid5 by doing a convert
balance to raid1. But of course this triggered the same issue. And now
I have a dead system because the first thing btrfs does after mounting
is continue the balance which will crash the system and send me into
a vicious loop.
- How can I stop btrfs from continuing balancing?
- How can I salvage this situation and convert to raid1?
Unfortunately I have little spare drives left. Not enough to contain
4.7TiB of data.. :(
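As a general aside (not an answer given in the thread at this point): btrfs has a skip_balance mount option for exactly this resume loop, plus a balance cancel command. A rough sketch follows; the device and mount point are taken from the "btrfs fi show" output earlier in the thread and may differ on other systems.

```shell
# Sketch: break the crash-on-resume loop after an interrupted balance.
# /dev/sde and /home are taken from this thread's "btrfs fi show" output.

# Mount with skip_balance so the interrupted balance is NOT resumed
# automatically at mount time:
mount -o skip_balance /dev/sde /home

# Then cancel the paused balance for good:
btrfs balance cancel /home
```

With the balance cancelled, the filesystem stays in its mixed-profile state until a new (hopefully scrub-fixed) convert balance is started deliberately.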
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-05-02 17:48 ` Jaap Pieroen
@ 2014-05-03 13:31 ` Frank Holton
0 siblings, 0 replies; 1651+ messages in thread
From: Frank Holton @ 2014-05-03 13:31 UTC (permalink / raw)
To: Jaap Pieroen; +Cc: linux-btrfs
Hi Jaap,
This patch http://www.spinics.net/lists/linux-btrfs/msg33025.html made
it into 3.15 RC2 so if you're willing to build your own RC kernel you
may have better luck with scrub in 3.15. The patch only scrubs the
data blocks in RAID5/6 so hopefully your parity blocks are intact. I'm
not sure if it would help any but it may be worth a try.
On Fri, May 2, 2014 at 1:48 PM, Jaap Pieroen <jaap@pieroen.nl> wrote:
> Duncan <1i5t5.duncan <at> cox.net> writes:
>
>>
>> To those that know the details, this tells the story.
>>
>> Btrfs raid5/6 modes are not yet code-complete, and scrub is one of the
>> incomplete bits. btrfs scrub doesn't know how to deal with raid5/6
>> properly just yet.
>>
>> While the operational bits of raid5/6 support are there (parity is
>> calculated and written), scrub and recovery from a lost device are not
>> yet code-complete. Thus it's effectively a slower, lower-capacity raid0
>> without scrub support at this point, except that when the code is
>> complete you'll get an automatic "free" upgrade to full raid5 or raid6,
>> because the operational bits have been working since they were
>> introduced; just the recovery and scrub bits were bad, making it
>> effectively a raid0 in reliability terms: lose one device and you've
>> lost them all.
>>
>> That's the big picture anyway. Marc Merlin recently did quite a bit of
>> raid5/6 testing and there's a page on the wiki now with what he found.
>> Additionally, I saw a scrub support for raid5/6 modes patch on the list
>> recently, but while it may be in integration, I believe it's too new to
>> have reached release yet.
>>
>> Wiki, for memory or bookmark: https://btrfs.wiki.kernel.org
>>
>> Direct user documentation link for bookmark (unwrap as necessary):
>>
>> https://btrfs.wiki.kernel.org/index.php/
>> Main_Page#Guides_and_usage_information
>>
>> The raid5/6 page (which I didn't otherwise see conveniently linked, I dug
>> it out of the recent changes list since I knew it was there from on-list
>> discussion):
>>
>> https://btrfs.wiki.kernel.org/index.php/RAID56
>>
>> <at> Marc or Hugo or someone with a wiki account: Can this be more visibly
>> linked from the user-docs contents, added to the user docs category list,
>> and probably linked from at least the multiple devices and (for now) the
>> gotchas pages?
>>
>
> So raid5 is much more useless than I assumed. I read Marc's blog and
> figured that btrfs was ready enough.
>
> I'm really in trouble now. I tried to get rid of raid5 by doing a convert
> balance to raid1. But of course this triggered the same issue. And now
> I have a dead system because the first thing btrfs does after mounting
> is continue the balance which will crash the system and send me into
> a vicious loop.
>
> - How can I stop btrfs from continuing balancing?
> - How can I salvage this situation and convert to raid1?
>
> Unfortunately I have little spare drives left. Not enough to contain
> 4.7TiB of data.. :(
>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-06-15 20:36 Angela D.Dawes
0 siblings, 0 replies; 1651+ messages in thread
From: Angela D.Dawes @ 2014-06-15 20:36 UTC (permalink / raw)
This is a personal email directed to you. My wife and I have a gift
donation for you, to know more details and claims, kindly contact us at:
d.angeladawes@outlook.com
Regards,
Dave & Angela Dawes
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2014-07-03 16:30 W. Cheung
0 siblings, 0 replies; 1651+ messages in thread
From: W. Cheung @ 2014-07-03 16:30 UTC (permalink / raw)
To: jrobinson
I have a very lucrative business transaction which requires the utmost discretion. If you are interested, kindly contact me ASAP for full details.
Warm Regards,
William Cheung
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-07-24 8:35 Richard Wong
0 siblings, 0 replies; 1651+ messages in thread
From: Richard Wong @ 2014-07-24 8:35 UTC (permalink / raw)
To: Recipients
I have a business proposal I would like to share with you, on your response I'll email you with more details.
Regards,
Richard Wong
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2014-08-06 12:06 Daniel Smedegaard Buus
2014-08-06 17:10 ` Slava Pestov
0 siblings, 1 reply; 1651+ messages in thread
From: Daniel Smedegaard Buus @ 2014-08-06 12:06 UTC (permalink / raw)
To: linux-bcache
Hi :)
I just tried upgrading a server of mine from mainline kernel 3.15.4 to
3.16, and upon reboot no bcache device in sight. Grepping dmesg for
bcache yielded an empty output, and besides the dmesg was full of
oopses.
This is a live server, so I immediately removed kernel 3.16, and upon
reboot, bcache was back and happy.
I know this isn't very useful for debugging, but as a very
general question, I'm just wondering if there's something I should
have done prior to upgrading, or if this might be a bug of some sort?
Cheers,
Daniel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-08-06 12:06 (unknown), Daniel Smedegaard Buus
@ 2014-08-06 17:10 ` Slava Pestov
2014-08-06 17:50 ` Re: Daniel Smedegaard Buus
0 siblings, 1 reply; 1651+ messages in thread
From: Slava Pestov @ 2014-08-06 17:10 UTC (permalink / raw)
To: Daniel Smedegaard Buus; +Cc: linux-bcache
Hi Daniel,
Can you post the oops output here? There were no bcache changes from
3.15 to 3.16 so I'm not sure what could have gone wrong.
On Wed, Aug 6, 2014 at 5:06 AM, Daniel Smedegaard Buus
<danielbuus@gmail.com> wrote:
> Hi :)
>
> I just tried upgrading a server of mine from mainline kernel 3.15.4 to
> 3.16, and upon reboot no bcache device in sight. Grepping dmesg for
> bcache yielded an empty output, and besides the dmesg was full of
> oopses.
>
> This is a live server, so I immediately removed kernel 3.16, and upon
> reboot, bcache was back and happy.
>
> I know this isn't very useful for debugging, but as a very
> general question, I'm just wondering if there's something I should
> have done prior to upgrading, or if this might be a bug of some sort?
>
> Cheers,
> Daniel
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-08-18 15:38 Mrs. Hajar Vaserman.
0 siblings, 0 replies; 1651+ messages in thread
From: Mrs. Hajar Vaserman. @ 2014-08-18 15:38 UTC (permalink / raw)
I am Mrs. Hajar Vaserman,
Wife and Heir apparent to Late Mr. Ilan Vaserman.
I have a WILL Proposal of 8.100,000.00 Million US Dollar for you.
Kindly contact my e-mail ( hajaraserman@gmail.com ) for further details.
Regard,
Mrs. Hajar Vaserman,
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2014-08-24 13:14 Christian König
2014-08-24 13:34 ` Mike Lothian
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2014-08-24 13:14 UTC (permalink / raw)
To: dri-devel
Hello everyone,
the following patches add UVD support for older ASICs (RV6xx, RS[78]80, RV7[79]0). For everybody wanting to test it I've also uploaded a branch to FDO: http://cgit.freedesktop.org/~deathsimple/linux/log/?h=uvd-r600-release
Additionally to the patches you need UVD firmware as well, which can be found at the usual location: http://people.freedesktop.org/~agd5f/radeon_ucode/
A small Mesa patch is needed as well, cause the older hardware doesn't support field based output of video frames. So unfortunately VDPAU/OpenGL interop won't work either.
We can only provide best effort support for those older ASICs, but at least on my RS[78]80 based laptop it seems to work perfectly fine.
Happy testing,
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-08-24 13:14 (unknown), Christian König
@ 2014-08-24 13:34 ` Mike Lothian
2014-08-25 9:10 ` Re: Christian König
0 siblings, 1 reply; 1651+ messages in thread
From: Mike Lothian @ 2014-08-24 13:34 UTC (permalink / raw)
To: Christian König; +Cc: dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 1070 bytes --]
Thanks for this
Good work
On 24 Aug 2014 14:15, "Christian König" <deathsimple@vodafone.de> wrote:
> Hello everyone,
>
> the following patches add UVD support for older ASICs (RV6xx, RS[78]80,
> RV7[79]0). For everybody wanting to test it I've also uploaded a branch to
> FDO:
> http://cgit.freedesktop.org/~deathsimple/linux/log/?h=uvd-r600-release
>
> Additionally to the patches you need UVD firmware as well, which can be
> found at the usual location:
> http://people.freedesktop.org/~agd5f/radeon_ucode/
>
> A small Mesa patch is needed as well, cause the older hardware doesn't
> support field based output of video frames. So unfortunately VDPAU/OpenGL
> interop won't work either.
>
> We can only provide best effort support for those older ASICs, but at
> least on my RS[78]80 based laptop it seems to work perfectly fine.
>
> Happy testing,
> Christian.
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
[-- Attachment #1.2: Type: text/html, Size: 1682 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-08-24 13:34 ` Mike Lothian
@ 2014-08-25 9:10 ` Christian König
2014-09-07 13:24 ` Re: Markus Trippelsdorf
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2014-08-25 9:10 UTC (permalink / raw)
To: Mike Lothian; +Cc: dri-devel
[-- Attachment #1.1: Type: text/plain, Size: 1550 bytes --]
Let me know if it works for you, cause we don't really have any hardware
any more to test it.
Christian.
Am 24.08.2014 um 15:34 schrieb Mike Lothian:
>
> Thanks for this
>
> Good work
>
> On 24 Aug 2014 14:15, "Christian König" <deathsimple@vodafone.de
> <mailto:deathsimple@vodafone.de>> wrote:
>
> Hello everyone,
>
> the following patches add UVD support for older ASICs (RV6xx,
> RS[78]80, RV7[79]0). For everybody wanting to test it I've also
> uploaded a branch to FDO:
> http://cgit.freedesktop.org/~deathsimple/linux/log/?h=uvd-r600-release
> <http://cgit.freedesktop.org/%7Edeathsimple/linux/log/?h=uvd-r600-release>
>
> Additionally to the patches you need UVD firmware as well, which
> can be found at the usual location:
> http://people.freedesktop.org/~agd5f/radeon_ucode/
> <http://people.freedesktop.org/%7Eagd5f/radeon_ucode/>
>
> A small Mesa patch is needed as well, cause the older hardware
> doesn't support field based output of video frames. So
> unfortunately VDPAU/OpenGL interop won't work either.
>
> We can only provide best effort support for those older ASICs, but
> at least on my RS[78]80 based laptop it seems to work perfectly fine.
>
> Happy testing,
> Christian.
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> <mailto:dri-devel@lists.freedesktop.org>
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>
[-- Attachment #1.2: Type: text/html, Size: 2761 bytes --]
[-- Attachment #2: Type: text/plain, Size: 159 bytes --]
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-08-25 9:10 ` Re: Christian König
@ 2014-09-07 13:24 ` Markus Trippelsdorf
2014-09-08 3:47 ` Re: Alex Deucher
0 siblings, 1 reply; 1651+ messages in thread
From: Markus Trippelsdorf @ 2014-09-07 13:24 UTC (permalink / raw)
To: Christian König; +Cc: dri-devel
On 2014.08.25 at 11:10 +0200, Christian König wrote:
> Let me know if it works for you, cause we don't really have any hardware
> any more to test it.
I've tested your patch series today (using drm-next-3.18 from
~agd5f/linux) on a RS780D/Radeon HD 3300 system with a couple of H264
videos. While it sometimes works as expected, it stalled the GPU far too
often to be usable. The stalls are not recoverable and the machine ends
up with a black screen, but still accepts SysRq keyboard inputs.
Here are some logs:
vdpauinfo:
display: :0 screen: 0
API version: 1
Information string: G3DVL VDPAU Driver Shared Library version 1.0
Video surface:
name width height types
-------------------------------------------
420 8192 8192 NV12 YV12
422 8192 8192 UYVY YUYV
444 8192 8192 Y8U8V8A8 V8U8Y8A8
Decoder capabilities:
name level macbs width height
-------------------------------------------
MPEG1 0 9216 2048 1152
MPEG2_SIMPLE 3 9216 2048 1152
MPEG2_MAIN 3 9216 2048 1152
H264_BASELINE 41 9216 2048 1152
H264_MAIN 41 9216 2048 1152
H264_HIGH 41 9216 2048 1152
VC1_ADVANCED 4 9216 2048 1152
Output surface:
name width height nat types
----------------------------------------------------
B8G8R8A8 8192 8192 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
R8G8B8A8 8192 8192 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
R10G10B10A2 8192 8192 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
B10G10R10A2 8192 8192 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
Bitmap surface:
name width height
------------------------------
B8G8R8A8 8192 8192
R8G8B8A8 8192 8192
R10G10B10A2 8192 8192
B10G10R10A2 8192 8192
A8 8192 8192
Video mixer:
feature name sup
------------------------------------
DEINTERLACE_TEMPORAL y
DEINTERLACE_TEMPORAL_SPATIAL -
INVERSE_TELECINE -
NOISE_REDUCTION y
SHARPNESS y
LUMA_KEY -
HIGH QUALITY SCALING - L1 -
HIGH QUALITY SCALING - L2 -
HIGH QUALITY SCALING - L3 -
HIGH QUALITY SCALING - L4 -
HIGH QUALITY SCALING - L5 -
HIGH QUALITY SCALING - L6 -
HIGH QUALITY SCALING - L7 -
HIGH QUALITY SCALING - L8 -
HIGH QUALITY SCALING - L9 -
parameter name sup min max
-----------------------------------------------------
VIDEO_SURFACE_WIDTH y 48 2048
VIDEO_SURFACE_HEIGHT y 48 1152
CHROMA_TYPE y
LAYERS y 0 4
attribute name sup min max
-----------------------------------------------------
BACKGROUND_COLOR y
CSC_MATRIX y
NOISE_REDUCTION_LEVEL y 0.00 1.00
SHARPNESS_LEVEL y -1.00 1.00
LUMA_KEY_MIN_LUMA y
LUMA_KEY_MAX_LUMA y
Sep 7 14:03:45 x4 kernel: [drm] Initialized drm 1.1.0 20060810
Sep 7 14:03:45 x4 kernel: [drm] radeon kernel modesetting enabled.
Sep 7 14:03:45 x4 kernel: [drm] initializing kernel modesetting (RS780 0x1002:0x9614 0x1043:0x834D).
Sep 7 14:03:45 x4 kernel: [drm] register mmio base: 0xFBEE0000
Sep 7 14:03:45 x4 kernel: [drm] register mmio size: 65536
Sep 7 14:03:45 x4 kernel: ATOM BIOS: 113
Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: VRAM: 128M 0x00000000C0000000 - 0x00000000C7FFFFFF (128M used)
Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: GTT: 512M 0x00000000A0000000 - 0x00000000BFFFFFFF
Sep 7 14:03:45 x4 kernel: [drm] Detected VRAM RAM=128M, BAR=128M
Sep 7 14:03:45 x4 kernel: [drm] RAM width 32bits DDR
Sep 7 14:03:45 x4 kernel: [TTM] Zone kernel: Available graphics memory: 4083350 kiB
Sep 7 14:03:45 x4 kernel: [TTM] Zone dma32: Available graphics memory: 2097152 kiB
Sep 7 14:03:45 x4 kernel: [TTM] Initializing pool allocator
Sep 7 14:03:45 x4 kernel: [TTM] Initializing DMA pool allocator
Sep 7 14:03:45 x4 kernel: [drm] radeon: 128M of VRAM memory ready
Sep 7 14:03:45 x4 kernel: [drm] radeon: 512M of GTT memory ready.
Sep 7 14:03:45 x4 kernel: [drm] Loading RS780 Microcode
Sep 7 14:03:45 x4 kernel: == power state 0 ==
Sep 7 14:03:45 x4 kernel: ui class: none
Sep 7 14:03:45 x4 kernel: internal class: boot
Sep 7 14:03:45 x4 kernel: caps: video
Sep 7 14:03:45 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
Sep 7 14:03:45 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
Sep 7 14:03:45 x4 kernel: status: c r b
Sep 7 14:03:45 x4 kernel: == power state 1 ==
Sep 7 14:03:45 x4 kernel: ui class: performance
Sep 7 14:03:45 x4 kernel: internal class: none
Sep 7 14:03:45 x4 kernel: caps: video
Sep 7 14:03:45 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:03:45 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
Sep 7 14:03:45 x4 kernel: status:
Sep 7 14:03:45 x4 kernel: == power state 2 ==
Sep 7 14:03:45 x4 kernel: ui class: none
Sep 7 14:03:45 x4 kernel: internal class: uvd
Sep 7 14:03:45 x4 kernel: caps: video
Sep 7 14:03:45 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:03:45 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:03:45 x4 kernel: status:
Sep 7 14:03:45 x4 kernel: [drm] radeon: dpm initialized
Sep 7 14:03:45 x4 kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
Sep 7 14:03:45 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: WB enabled
Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
Sep 7 14:03:45 x4 kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Sep 7 14:03:45 x4 kernel: [drm] Driver supports precise vblank timestamp query.
Sep 7 14:03:45 x4 kernel: [drm] radeon: irq initialized.
Sep 7 14:03:45 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Sep 7 14:03:45 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
Sep 7 14:03:45 x4 kernel: [drm] UVD initialized successfully.
Sep 7 14:03:45 x4 kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Sep 7 14:03:45 x4 kernel: [drm] ib test on ring 5 succeeded
Sep 7 14:03:45 x4 kernel: [drm] Radeon Display Connectors
Sep 7 14:03:45 x4 kernel: [drm] Connector 0:
Sep 7 14:03:45 x4 kernel: [drm] VGA-1
Sep 7 14:03:45 x4 kernel: [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
Sep 7 14:03:45 x4 kernel: [drm] Encoders:
Sep 7 14:03:45 x4 kernel: [drm] CRT1: INTERNAL_KLDSCP_DAC1
Sep 7 14:03:45 x4 kernel: [drm] Connector 1:
Sep 7 14:03:45 x4 kernel: [drm] DVI-D-1
Sep 7 14:03:45 x4 kernel: [drm] HPD3
Sep 7 14:03:45 x4 kernel: [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
Sep 7 14:03:45 x4 kernel: [drm] Encoders:
Sep 7 14:03:45 x4 kernel: [drm] DFP3: INTERNAL_KLDSCP_LVTMA
Sep 7 14:03:45 x4 kernel: switching from power state:
Sep 7 14:03:45 x4 kernel: ui class: none
Sep 7 14:03:45 x4 kernel: internal class: boot
Sep 7 14:03:45 x4 kernel: caps: video
Sep 7 14:03:45 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
Sep 7 14:03:45 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
Sep 7 14:03:45 x4 kernel: status: c b
Sep 7 14:03:45 x4 kernel: switching to power state:
Sep 7 14:03:45 x4 kernel: ui class: performance
Sep 7 14:03:45 x4 kernel: internal class: none
Sep 7 14:03:45 x4 kernel: caps: video
Sep 7 14:03:45 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:03:45 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
Sep 7 14:03:45 x4 kernel: status: r
Sep 7 14:03:45 x4 kernel: [drm] fb mappable at 0xF0359000
Sep 7 14:03:45 x4 kernel: [drm] vram apper at 0xF0000000
Sep 7 14:03:45 x4 kernel: [drm] size 7299072
Sep 7 14:03:45 x4 kernel: [drm] fb depth is 24
Sep 7 14:03:45 x4 kernel: [drm] pitch is 6912
Sep 7 14:03:45 x4 kernel: fbcon: radeondrmfb (fb0) is primary device
Sep 7 14:03:45 x4 kernel: Console: switching to colour frame buffer device 131x105
Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: fb0: radeondrmfb frame buffer device
Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: registered panic notifier
Sep 7 14:03:45 x4 kernel: tsc: Refined TSC clocksource calibration: 3210.826 MHz
Sep 7 14:03:45 x4 kernel: [drm] Initialized radeon 2.40.0 20080528 for 0000:01:05.0 on minor 0
...
Sep 7 14:20:37 x4 kernel: switching to power state:
Sep 7 14:20:37 x4 kernel: ui class: none
Sep 7 14:20:37 x4 kernel: internal class: uvd
Sep 7 14:20:37 x4 kernel: caps: video
Sep 7 14:20:37 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:20:37 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:20:37 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:20:37 x4 kernel: status: r
Sep 7 14:20:54 x4 kernel: switching from power state:
Sep 7 14:20:54 x4 kernel: ui class: none
Sep 7 14:20:54 x4 kernel: internal class: uvd
Sep 7 14:20:54 x4 kernel: caps: video
Sep 7 14:20:54 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:20:54 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:20:54 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:20:54 x4 kernel: status: c
Sep 7 14:20:54 x4 kernel: switching to power state:
Sep 7 14:20:54 x4 kernel: ui class: performance
Sep 7 14:20:54 x4 kernel: internal class: none
Sep 7 14:20:54 x4 kernel: caps: video
Sep 7 14:20:54 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:20:54 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:20:54 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
Sep 7 14:20:54 x4 kernel: status: r
Sep 7 14:21:02 x4 kernel: switching from power state:
Sep 7 14:21:02 x4 kernel: ui class: performance
Sep 7 14:21:02 x4 kernel: internal class: none
Sep 7 14:21:02 x4 kernel: caps: video
Sep 7 14:21:02 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:21:02 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:21:02 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
Sep 7 14:21:02 x4 kernel: status: c
Sep 7 14:21:02 x4 kernel: switching to power state:
Sep 7 14:21:02 x4 kernel: ui class: none
Sep 7 14:21:02 x4 kernel: internal class: uvd
Sep 7 14:21:02 x4 kernel: caps: video
Sep 7 14:21:02 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:21:02 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:21:02 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:21:02 x4 kernel: status: r
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10106msec
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000072ac last fence id 0x00000000000072b4 on ring 0)
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: Saved 377 dwords of commands on ring 0.
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000099
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA00034AF
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20044040
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000004
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000002
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00005087
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80098645
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA0003030
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20048040
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00000000
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80100000
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
Sep 7 14:21:13 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: WB enabled
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
Sep 7 14:21:13 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Sep 7 14:21:13 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
Sep 7 14:21:13 x4 kernel: [drm] UVD initialized successfully.
Sep 7 14:21:13 x4 kernel: switching from power state:
Sep 7 14:21:13 x4 kernel: ui class: none
Sep 7 14:21:13 x4 kernel: internal class: boot
Sep 7 14:21:13 x4 kernel: caps: video
Sep 7 14:21:13 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:21:13 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
Sep 7 14:21:13 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
Sep 7 14:21:13 x4 kernel: status: c b
Sep 7 14:21:13 x4 kernel: switching to power state:
Sep 7 14:21:13 x4 kernel: ui class: none
Sep 7 14:21:13 x4 kernel: internal class: uvd
Sep 7 14:21:13 x4 kernel: caps: video
Sep 7 14:21:13 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:21:13 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:21:13 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:21:13 x4 kernel: status: r
Sep 7 14:21:20 x4 kernel: SysRq : Emergency Sync
(new boot)
Sep 7 14:44:49 x4 kernel: switching from power state:
Sep 7 14:44:49 x4 kernel: ui class: performance
Sep 7 14:44:49 x4 kernel: internal class: none
Sep 7 14:44:49 x4 kernel: caps: video
Sep 7 14:44:49 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:44:49 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:44:49 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
Sep 7 14:44:49 x4 kernel: status: c
Sep 7 14:44:49 x4 kernel: switching to power state:
Sep 7 14:44:49 x4 kernel: ui class: none
Sep 7 14:44:49 x4 kernel: internal class: uvd
Sep 7 14:44:49 x4 kernel: caps: video
Sep 7 14:44:49 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:44:49 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:44:49 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:44:49 x4 kernel: status: r
Sep 7 14:44:59 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10000msec
Sep 7 14:44:59 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
Sep 7 14:44:59 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10466msec
Sep 7 14:44:59 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10500msec
Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10966msec
Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 11000msec
...
Sep 7 14:45:17 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
Sep 7 14:45:17 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 28466msec
Sep 7 14:45:17 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
Sep 7 14:45:18 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 28500msec
Sep 7 14:45:18 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
Sep 7 14:45:18 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 28966msec
Sep 7 14:45:18 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
Sep 7 14:45:18 x4 kernel: SysRq : Emergency Sync
(new boot)
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10000msec
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10000msec
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000001ed last fence id 0x00000000000001f6 on ring 0)
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x000000000000000b last fence id 0x000000000000000f on ring 5)
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: Saved 409 dwords of commands on ring 0.
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000099
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA00034E0
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20045040
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x01000004
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00200002
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00005087
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x808D8645
Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA0003030
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20048040
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00000000
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80100000
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
Sep 7 14:48:15 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: WB enabled
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
Sep 7 14:48:15 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Sep 7 14:48:15 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
Sep 7 14:48:15 x4 kernel: [drm] UVD initialized successfully.
Sep 7 14:48:15 x4 kernel: switching from power state:
Sep 7 14:48:15 x4 kernel: ui class: none
Sep 7 14:48:15 x4 kernel: internal class: boot
Sep 7 14:48:15 x4 kernel: caps: video
Sep 7 14:48:15 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:48:15 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
Sep 7 14:48:15 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
Sep 7 14:48:15 x4 kernel: status: c b
Sep 7 14:48:15 x4 kernel: switching to power state:
Sep 7 14:48:15 x4 kernel: ui class: none
Sep 7 14:48:15 x4 kernel: internal class: uvd
Sep 7 14:48:15 x4 kernel: caps: video
Sep 7 14:48:15 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:48:15 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:48:15 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:48:15 x4 kernel: status: r
Sep 7 14:48:15 x4 kernel: SysRq : Emergency Sync
(new boot)
Sep 7 14:53:20 x4 kernel: switching to power state:
Sep 7 14:53:20 x4 kernel: ui class: none
Sep 7 14:53:20 x4 kernel: internal class: uvd
Sep 7 14:53:20 x4 kernel: caps: video
Sep 7 14:53:20 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:53:20 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:53:20 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:53:20 x4 kernel: status: r
Sep 7 14:53:30 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10050msec
Sep 7 14:53:30 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000596 last fence id 0x00000000000005a3 on ring 0)
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: Saved 601 dwords of commands on ring 0.
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000088
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA00030B0
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20045040
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000004
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000002
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00005087
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80098645
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00004001
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA0003030
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20048040
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00000000
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80100000
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
Sep 7 14:53:31 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: WB enabled
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
Sep 7 14:53:31 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
Sep 7 14:53:31 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
Sep 7 14:53:31 x4 kernel: [drm] UVD initialized successfully.
Sep 7 14:53:31 x4 kernel: switching from power state:
Sep 7 14:53:31 x4 kernel: ui class: none
Sep 7 14:53:31 x4 kernel: internal class: boot
Sep 7 14:53:31 x4 kernel: caps: video
Sep 7 14:53:31 x4 kernel: uvd vclk: 0 dclk: 0
Sep 7 14:53:31 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
Sep 7 14:53:31 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
Sep 7 14:53:31 x4 kernel: status: c b
Sep 7 14:53:31 x4 kernel: switching to power state:
Sep 7 14:53:31 x4 kernel: ui class: none
Sep 7 14:53:31 x4 kernel: internal class: uvd
Sep 7 14:53:31 x4 kernel: caps: video
Sep 7 14:53:31 x4 kernel: uvd vclk: 53300 dclk: 40000
Sep 7 14:53:31 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
Sep 7 14:53:31 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
Sep 7 14:53:31 x4 kernel: status: r
Sep 7 14:53:39 x4 kernel: SysRq : Emergency Sync
--
Markus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-09-07 13:24 ` Re: Markus Trippelsdorf
@ 2014-09-08 3:47 ` Alex Deucher
2014-09-08 7:13 ` Re: Markus Trippelsdorf
0 siblings, 1 reply; 1651+ messages in thread
From: Alex Deucher @ 2014-09-08 3:47 UTC (permalink / raw)
To: Markus Trippelsdorf; +Cc: Maling list - DRI developers
On Sun, Sep 7, 2014 at 9:24 AM, Markus Trippelsdorf
<markus@trippelsdorf.de> wrote:
> On 2014.08.25 at 11:10 +0200, Christian König wrote:
>> Let me know if it works for you, cause we don't really have any hardware
>> any more to test it.
>
> I've tested your patch series today (using drm-next-3.18 from
> ~agd5f/linux) on a RS780D/Radeon HD 3300 system with a couple of H264
> videos. While it sometimes works as expected, it stalled the GPU far too
> often to be usable. The stalls are not recoverable and the machine ends
> up with a black screen, but still accepts SysRq keyboard inputs.
Does it work any better if dpm is disabled?
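For anyone following along, dpm can be toggled without rebuilding the kernel via the radeon module parameter on the kernel command line (the sysfs path below assumes radeon is built as a module; adjust if it is built in):

```text
# disable dynamic power management for the radeon driver at boot:
radeon.dpm=0

# the active setting can be inspected at runtime:
#   cat /sys/module/radeon/parameters/dpm
```

Booting once with radeon.dpm=0 is a quick way to test whether the ring stalls are tied to the uvd power-state switches visible in the logs.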
Alex
>
> Here are some logs:
>
> vdpauinfo:
> display: :0 screen: 0
> API version: 1
> Information string: G3DVL VDPAU Driver Shared Library version 1.0
>
> Video surface:
>
> name width height types
> -------------------------------------------
> 420 8192 8192 NV12 YV12
> 422 8192 8192 UYVY YUYV
> 444 8192 8192 Y8U8V8A8 V8U8Y8A8
>
> Decoder capabilities:
>
> name level macbs width height
> -------------------------------------------
> MPEG1 0 9216 2048 1152
> MPEG2_SIMPLE 3 9216 2048 1152
> MPEG2_MAIN 3 9216 2048 1152
> H264_BASELINE 41 9216 2048 1152
> H264_MAIN 41 9216 2048 1152
> H264_HIGH 41 9216 2048 1152
> VC1_ADVANCED 4 9216 2048 1152
>
> Output surface:
>
> name width height nat types
> ----------------------------------------------------
> B8G8R8A8 8192 8192 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
> R8G8B8A8 8192 8192 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
> R10G10B10A2 8192 8192 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
> B10G10R10A2 8192 8192 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8
>
> Bitmap surface:
>
> name width height
> ------------------------------
> B8G8R8A8 8192 8192
> R8G8B8A8 8192 8192
> R10G10B10A2 8192 8192
> B10G10R10A2 8192 8192
> A8 8192 8192
>
> Video mixer:
>
> feature name sup
> ------------------------------------
> DEINTERLACE_TEMPORAL y
> DEINTERLACE_TEMPORAL_SPATIAL -
> INVERSE_TELECINE -
> NOISE_REDUCTION y
> SHARPNESS y
> LUMA_KEY -
> HIGH QUALITY SCALING - L1 -
> HIGH QUALITY SCALING - L2 -
> HIGH QUALITY SCALING - L3 -
> HIGH QUALITY SCALING - L4 -
> HIGH QUALITY SCALING - L5 -
> HIGH QUALITY SCALING - L6 -
> HIGH QUALITY SCALING - L7 -
> HIGH QUALITY SCALING - L8 -
> HIGH QUALITY SCALING - L9 -
>
> parameter name sup min max
> -----------------------------------------------------
> VIDEO_SURFACE_WIDTH y 48 2048
> VIDEO_SURFACE_HEIGHT y 48 1152
> CHROMA_TYPE y
> LAYERS y 0 4
>
> attribute name sup min max
> -----------------------------------------------------
> BACKGROUND_COLOR y
> CSC_MATRIX y
> NOISE_REDUCTION_LEVEL y 0.00 1.00
> SHARPNESS_LEVEL y -1.00 1.00
> LUMA_KEY_MIN_LUMA y
> LUMA_KEY_MAX_LUMA y
>
>
> Sep 7 14:03:45 x4 kernel: [drm] Initialized drm 1.1.0 20060810
> Sep 7 14:03:45 x4 kernel: [drm] radeon kernel modesetting enabled.
> Sep 7 14:03:45 x4 kernel: [drm] initializing kernel modesetting (RS780 0x1002:0x9614 0x1043:0x834D).
> Sep 7 14:03:45 x4 kernel: [drm] register mmio base: 0xFBEE0000
> Sep 7 14:03:45 x4 kernel: [drm] register mmio size: 65536
> Sep 7 14:03:45 x4 kernel: ATOM BIOS: 113
> Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: VRAM: 128M 0x00000000C0000000 - 0x00000000C7FFFFFF (128M used)
> Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: GTT: 512M 0x00000000A0000000 - 0x00000000BFFFFFFF
> Sep 7 14:03:45 x4 kernel: [drm] Detected VRAM RAM=128M, BAR=128M
> Sep 7 14:03:45 x4 kernel: [drm] RAM width 32bits DDR
> Sep 7 14:03:45 x4 kernel: [TTM] Zone kernel: Available graphics memory: 4083350 kiB
> Sep 7 14:03:45 x4 kernel: [TTM] Zone dma32: Available graphics memory: 2097152 kiB
> Sep 7 14:03:45 x4 kernel: [TTM] Initializing pool allocator
> Sep 7 14:03:45 x4 kernel: [TTM] Initializing DMA pool allocator
> Sep 7 14:03:45 x4 kernel: [drm] radeon: 128M of VRAM memory ready
> Sep 7 14:03:45 x4 kernel: [drm] radeon: 512M of GTT memory ready.
> Sep 7 14:03:45 x4 kernel: [drm] Loading RS780 Microcode
> Sep 7 14:03:45 x4 kernel: == power state 0 ==
> Sep 7 14:03:45 x4 kernel: ui class: none
> Sep 7 14:03:45 x4 kernel: internal class: boot
> Sep 7 14:03:45 x4 kernel: caps: video
> Sep 7 14:03:45 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
> Sep 7 14:03:45 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
> Sep 7 14:03:45 x4 kernel: status: c r b
> Sep 7 14:03:45 x4 kernel: == power state 1 ==
> Sep 7 14:03:45 x4 kernel: ui class: performance
> Sep 7 14:03:45 x4 kernel: internal class: none
> Sep 7 14:03:45 x4 kernel: caps: video
> Sep 7 14:03:45 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:03:45 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
> Sep 7 14:03:45 x4 kernel: status:
> Sep 7 14:03:45 x4 kernel: == power state 2 ==
> Sep 7 14:03:45 x4 kernel: ui class: none
> Sep 7 14:03:45 x4 kernel: internal class: uvd
> Sep 7 14:03:45 x4 kernel: caps: video
> Sep 7 14:03:45 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:03:45 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:03:45 x4 kernel: status:
> Sep 7 14:03:45 x4 kernel: [drm] radeon: dpm initialized
> Sep 7 14:03:45 x4 kernel: [drm] GART: num cpu pages 131072, num gpu pages 131072
> Sep 7 14:03:45 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
> Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: WB enabled
> Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
> Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
> Sep 7 14:03:45 x4 kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> Sep 7 14:03:45 x4 kernel: [drm] Driver supports precise vblank timestamp query.
> Sep 7 14:03:45 x4 kernel: [drm] radeon: irq initialized.
> Sep 7 14:03:45 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
> Sep 7 14:03:45 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
> Sep 7 14:03:45 x4 kernel: [drm] UVD initialized successfully.
> Sep 7 14:03:45 x4 kernel: [drm] ib test on ring 0 succeeded in 0 usecs
> Sep 7 14:03:45 x4 kernel: [drm] ib test on ring 5 succeeded
> Sep 7 14:03:45 x4 kernel: [drm] Radeon Display Connectors
> Sep 7 14:03:45 x4 kernel: [drm] Connector 0:
> Sep 7 14:03:45 x4 kernel: [drm] VGA-1
> Sep 7 14:03:45 x4 kernel: [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48 0x7e4c 0x7e4c
> Sep 7 14:03:45 x4 kernel: [drm] Encoders:
> Sep 7 14:03:45 x4 kernel: [drm] CRT1: INTERNAL_KLDSCP_DAC1
> Sep 7 14:03:45 x4 kernel: [drm] Connector 1:
> Sep 7 14:03:45 x4 kernel: [drm] DVI-D-1
> Sep 7 14:03:45 x4 kernel: [drm] HPD3
> Sep 7 14:03:45 x4 kernel: [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58 0x7e5c 0x7e5c
> Sep 7 14:03:45 x4 kernel: [drm] Encoders:
> Sep 7 14:03:45 x4 kernel: [drm] DFP3: INTERNAL_KLDSCP_LVTMA
> Sep 7 14:03:45 x4 kernel: switching from power state:
> Sep 7 14:03:45 x4 kernel: ui class: none
> Sep 7 14:03:45 x4 kernel: internal class: boot
> Sep 7 14:03:45 x4 kernel: caps: video
> Sep 7 14:03:45 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
> Sep 7 14:03:45 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
> Sep 7 14:03:45 x4 kernel: status: c b
> Sep 7 14:03:45 x4 kernel: switching to power state:
> Sep 7 14:03:45 x4 kernel: ui class: performance
> Sep 7 14:03:45 x4 kernel: internal class: none
> Sep 7 14:03:45 x4 kernel: caps: video
> Sep 7 14:03:45 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:03:45 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:03:45 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
> Sep 7 14:03:45 x4 kernel: status: r
> Sep 7 14:03:45 x4 kernel: [drm] fb mappable at 0xF0359000
> Sep 7 14:03:45 x4 kernel: [drm] vram apper at 0xF0000000
> Sep 7 14:03:45 x4 kernel: [drm] size 7299072
> Sep 7 14:03:45 x4 kernel: [drm] fb depth is 24
> Sep 7 14:03:45 x4 kernel: [drm] pitch is 6912
> Sep 7 14:03:45 x4 kernel: fbcon: radeondrmfb (fb0) is primary device
> Sep 7 14:03:45 x4 kernel: Console: switching to colour frame buffer device 131x105
> Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: fb0: radeondrmfb frame buffer device
> Sep 7 14:03:45 x4 kernel: radeon 0000:01:05.0: registered panic notifier
> Sep 7 14:03:45 x4 kernel: tsc: Refined TSC clocksource calibration: 3210.826 MHz
> Sep 7 14:03:45 x4 kernel: [drm] Initialized radeon 2.40.0 20080528 for 0000:01:05.0 on minor 0
> ...
> Sep 7 14:20:37 x4 kernel: switching to power state:
> Sep 7 14:20:37 x4 kernel: ui class: none
> Sep 7 14:20:37 x4 kernel: internal class: uvd
> Sep 7 14:20:37 x4 kernel: caps: video
> Sep 7 14:20:37 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:20:37 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:20:37 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:20:37 x4 kernel: status: r
> Sep 7 14:20:54 x4 kernel: switching from power state:
> Sep 7 14:20:54 x4 kernel: ui class: none
> Sep 7 14:20:54 x4 kernel: internal class: uvd
> Sep 7 14:20:54 x4 kernel: caps: video
> Sep 7 14:20:54 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:20:54 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:20:54 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:20:54 x4 kernel: status: c
> Sep 7 14:20:54 x4 kernel: switching to power state:
> Sep 7 14:20:54 x4 kernel: ui class: performance
> Sep 7 14:20:54 x4 kernel: internal class: none
> Sep 7 14:20:54 x4 kernel: caps: video
> Sep 7 14:20:54 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:20:54 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:20:54 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
> Sep 7 14:20:54 x4 kernel: status: r
> Sep 7 14:21:02 x4 kernel: switching from power state:
> Sep 7 14:21:02 x4 kernel: ui class: performance
> Sep 7 14:21:02 x4 kernel: internal class: none
> Sep 7 14:21:02 x4 kernel: caps: video
> Sep 7 14:21:02 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:21:02 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:21:02 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
> Sep 7 14:21:02 x4 kernel: status: c
> Sep 7 14:21:02 x4 kernel: switching to power state:
> Sep 7 14:21:02 x4 kernel: ui class: none
> Sep 7 14:21:02 x4 kernel: internal class: uvd
> Sep 7 14:21:02 x4 kernel: caps: video
> Sep 7 14:21:02 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:21:02 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:21:02 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:21:02 x4 kernel: status: r
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10106msec
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000072ac last fence id 0x00000000000072b4 on ring 0)
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: Saved 377 dwords of commands on ring 0.
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000099
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA00034AF
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20044040
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000004
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000002
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00005087
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80098645
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA0003030
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20048040
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000000
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000000
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00000000
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80100000
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
> Sep 7 14:21:13 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: WB enabled
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
> Sep 7 14:21:13 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
> Sep 7 14:21:13 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
> Sep 7 14:21:13 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
> Sep 7 14:21:13 x4 kernel: [drm] UVD initialized successfully.
> Sep 7 14:21:13 x4 kernel: switching from power state:
> Sep 7 14:21:13 x4 kernel: ui class: none
> Sep 7 14:21:13 x4 kernel: internal class: boot
> Sep 7 14:21:13 x4 kernel: caps: video
> Sep 7 14:21:13 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:21:13 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
> Sep 7 14:21:13 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
> Sep 7 14:21:13 x4 kernel: status: c b
> Sep 7 14:21:13 x4 kernel: switching to power state:
> Sep 7 14:21:13 x4 kernel: ui class: none
> Sep 7 14:21:13 x4 kernel: internal class: uvd
> Sep 7 14:21:13 x4 kernel: caps: video
> Sep 7 14:21:13 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:21:13 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:21:13 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:21:13 x4 kernel: status: r
> Sep 7 14:21:20 x4 kernel: SysRq : Emergency Sync
> (new boot)
> Sep 7 14:44:49 x4 kernel: switching from power state:
> Sep 7 14:44:49 x4 kernel: ui class: performance
> Sep 7 14:44:49 x4 kernel: internal class: none
> Sep 7 14:44:49 x4 kernel: caps: video
> Sep 7 14:44:49 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:44:49 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:44:49 x4 kernel: power level 1 sclk: 70000 vddc_index: 2
> Sep 7 14:44:49 x4 kernel: status: c
> Sep 7 14:44:49 x4 kernel: switching to power state:
> Sep 7 14:44:49 x4 kernel: ui class: none
> Sep 7 14:44:49 x4 kernel: internal class: uvd
> Sep 7 14:44:49 x4 kernel: caps: video
> Sep 7 14:44:49 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:44:49 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:44:49 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:44:49 x4 kernel: status: r
> Sep 7 14:44:59 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10000msec
> Sep 7 14:44:59 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
> Sep 7 14:44:59 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10466msec
> Sep 7 14:44:59 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
> Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10500msec
> Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
> Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10966msec
> Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
> Sep 7 14:45:00 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 11000msec
> ...
> Sep 7 14:45:17 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
> Sep 7 14:45:17 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 28466msec
> Sep 7 14:45:17 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
> Sep 7 14:45:18 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 28500msec
> Sep 7 14:45:18 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000eac last fence id 0x0000000000000eaf on ring 5)
> Sep 7 14:45:18 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 28966msec
> Sep 7 14:45:18 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000064ff last fence id 0x0000000000006504 on ring 0)
> Sep 7 14:45:18 x4 kernel: SysRq : Emergency Sync
> (new boot)
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: ring 5 stalled for more than 10000msec
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10000msec
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x00000000000001ed last fence id 0x00000000000001f6 on ring 0)
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x000000000000000b last fence id 0x000000000000000f on ring 5)
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: Saved 409 dwords of commands on ring 0.
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000099
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA00034E0
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20045040
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x01000004
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00200002
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00005087
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x808D8645
> Sep 7 14:48:14 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA0003030
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20048040
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000000
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000000
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00000000
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80100000
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
> Sep 7 14:48:15 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: WB enabled
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
> Sep 7 14:48:15 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
> Sep 7 14:48:15 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
> Sep 7 14:48:15 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
> Sep 7 14:48:15 x4 kernel: [drm] UVD initialized successfully.
> Sep 7 14:48:15 x4 kernel: switching from power state:
> Sep 7 14:48:15 x4 kernel: ui class: none
> Sep 7 14:48:15 x4 kernel: internal class: boot
> Sep 7 14:48:15 x4 kernel: caps: video
> Sep 7 14:48:15 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:48:15 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
> Sep 7 14:48:15 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
> Sep 7 14:48:15 x4 kernel: status: c b
> Sep 7 14:48:15 x4 kernel: switching to power state:
> Sep 7 14:48:15 x4 kernel: ui class: none
> Sep 7 14:48:15 x4 kernel: internal class: uvd
> Sep 7 14:48:15 x4 kernel: caps: video
> Sep 7 14:48:15 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:48:15 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:48:15 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:48:15 x4 kernel: status: r
> Sep 7 14:48:15 x4 kernel: SysRq : Emergency Sync
> (new boot)
> Sep 7 14:53:20 x4 kernel: switching to power state:
> Sep 7 14:53:20 x4 kernel: ui class: none
> Sep 7 14:53:20 x4 kernel: internal class: uvd
> Sep 7 14:53:20 x4 kernel: caps: video
> Sep 7 14:53:20 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:53:20 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:53:20 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:53:20 x4 kernel: status: r
> Sep 7 14:53:30 x4 kernel: radeon 0000:01:05.0: ring 0 stalled for more than 10050msec
> Sep 7 14:53:30 x4 kernel: radeon 0000:01:05.0: GPU lockup (current fence id 0x0000000000000596 last fence id 0x00000000000005a3 on ring 0)
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: Saved 601 dwords of commands on ring 0.
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: GPU softreset: 0x00000088
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA00030B0
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20045040
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000004
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000002
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00005087
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80098645
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008020_GRBM_SOFT_RESET=0x00004001
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: SRBM_SOFT_RESET=0x00008100
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008010_GRBM_STATUS = 0xA0003030
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008014_GRBM_STATUS2 = 0x00000003
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_000E50_SRBM_STATUS = 0x20048040
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008674_CP_STALLED_STAT1 = 0x00000000
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008678_CP_STALLED_STAT2 = 0x00000000
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_00867C_CP_BUSY_STAT = 0x00000000
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_008680_CP_STAT = 0x80100000
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: GPU reset succeeded, trying to resume
> Sep 7 14:53:31 x4 kernel: [drm] PCIE GART of 512M enabled (table at 0x00000000C0258000).
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: WB enabled
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: fence driver on ring 0 use gpu addr 0x00000000a0000c00 and cpu addr 0xffff8800db8dcc00
> Sep 7 14:53:31 x4 kernel: radeon 0000:01:05.0: fence driver on ring 5 use gpu addr 0x00000000c0056038 and cpu addr 0xffffc90000116038
> Sep 7 14:53:31 x4 kernel: [drm] ring test on 0 succeeded in 1 usecs
> Sep 7 14:53:31 x4 kernel: [drm] ring test on 5 succeeded in 1 usecs
> Sep 7 14:53:31 x4 kernel: [drm] UVD initialized successfully.
> Sep 7 14:53:31 x4 kernel: switching from power state:
> Sep 7 14:53:31 x4 kernel: ui class: none
> Sep 7 14:53:31 x4 kernel: internal class: boot
> Sep 7 14:53:31 x4 kernel: caps: video
> Sep 7 14:53:31 x4 kernel: uvd vclk: 0 dclk: 0
> Sep 7 14:53:31 x4 kernel: power level 0 sclk: 50000 vddc_index: 2
> Sep 7 14:53:31 x4 kernel: power level 1 sclk: 50000 vddc_index: 2
> Sep 7 14:53:31 x4 kernel: status: c b
> Sep 7 14:53:31 x4 kernel: switching to power state:
> Sep 7 14:53:31 x4 kernel: ui class: none
> Sep 7 14:53:31 x4 kernel: internal class: uvd
> Sep 7 14:53:31 x4 kernel: caps: video
> Sep 7 14:53:31 x4 kernel: uvd vclk: 53300 dclk: 40000
> Sep 7 14:53:31 x4 kernel: power level 0 sclk: 50000 vddc_index: 1
> Sep 7 14:53:31 x4 kernel: power level 1 sclk: 50000 vddc_index: 1
> Sep 7 14:53:31 x4 kernel: status: r
> Sep 7 14:53:39 x4 kernel: SysRq : Emergency Sync
>
> --
> Markus
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-09-08 3:47 ` Re: Alex Deucher
@ 2014-09-08 7:13 ` Markus Trippelsdorf
0 siblings, 0 replies; 1651+ messages in thread
From: Markus Trippelsdorf @ 2014-09-08 7:13 UTC (permalink / raw)
To: Alex Deucher; +Cc: Maling list - DRI developers
On 2014.09.07 at 23:47 -0400, Alex Deucher wrote:
> On Sun, Sep 7, 2014 at 9:24 AM, Markus Trippelsdorf
> <markus@trippelsdorf.de> wrote:
> > On 2014.08.25 at 11:10 +0200, Christian König wrote:
> >> Let me know if it works for you, cause we don't really have any hardware
> >> any more to test it.
> >
> > I've tested your patch series today (using drm-next-3.18 from
> > ~agd5f/linux) on a RS780D/Radeon HD 3300 system with a couple of H264
> > videos. While it sometimes works as expected, it stalled the GPU far too
> > often to be usable. The stalls are not recoverable and the machine ends
> > up with a black screen, but still accepts SysRq keyboard inputs.
>
>
> Does it work any better if dpm is disabled?
Unfortunately no. The symptoms are unchanged.
--
Markus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2014-09-08 11:36 R. Klomp
[not found] ` <CAOqJoqGSRUw_UT4LhqpYX-WX6AEd2ReAWjgNS76Cra-SMKw3NQ@mail.gmail.com>
0 siblings, 1 reply; 1651+ messages in thread
From: R. Klomp @ 2014-09-08 11:36 UTC (permalink / raw)
To: git; +Cc: p.duijst
It seems like there's a bug involving git difftool's -d flag and Beyond Compare.
When using Beyond Compare as the difftool, git difftool <..> <..> -d
immediately exits once the diff tree has been created. Beyond
Compare successfully shows the files that differ.
However, since git difftool doesn't wait for Beyond Compare to shut
down, all temporary files are gone by then. Due to this it's
impossible to view changes made inside files using the -d flag.
I haven't tested if this issue also happens with other difftools.
I'm using the latest versions of both Beyond Compare 3 (3.3.12, Pro
Edition for Windows) and Git (1.9.4 for Windows).
Thanks in advance for your help!
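For anyone trying to reproduce this, a minimal configuration along these lines should trigger the behaviour (a sketch; the install path is an assumption and will differ per machine):

```
; .gitconfig (sketch): Beyond Compare 3 as the diff tool on Windows.
; The path below is hypothetical -- adjust to the actual install location.
[diff]
	tool = bc3
[difftool]
	prompt = false
[difftool "bc3"]
	path = C:/Program Files (x86)/Beyond Compare 3/BComp.exe
```

With that in place, `git difftool -d <commit> <commit>` builds a temporary directory tree and launches Beyond Compare on it; the problem described above is that git removes that tree as soon as the launched process returns, instead of waiting for the viewer to be closed.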
^ permalink raw reply [flat|nested] 1651+ messages in thread[parent not found: <6A286AB51AD8EC4180C4B2E9EF1D0A027AAD7EFF1E@exmb01.wrschool.net>]
* Re:
@ 2014-10-13 6:18 geohughes
0 siblings, 0 replies; 1651+ messages in thread
From: geohughes @ 2014-10-13 6:18 UTC (permalink / raw)
I am Mr Tan Wong and i have a Business Proposal for you.If Interested do
contact me at my email for further details tan.wong4040@yahoo.com.hk
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-10-13 6:18 geohughes-q6EoVN9bke7vnOemgxGiVw
0 siblings, 0 replies; 1651+ messages in thread
From: geohughes-q6EoVN9bke7vnOemgxGiVw @ 2014-10-13 6:18 UTC (permalink / raw)
I am Mr Tan Wong and i have a Business Proposal for you.If Interested do
contact me at my email for further details tan.wong4040-/E1597aS9LTXPF5Rlphj1Q@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <BEC3AE959B8BB340894B239B5A7882B929B02748@LPPTCPMXMBX02.LPCH.NET>]
* RE:
[not found] <BEC3AE959B8BB340894B239B5A7882B929B02748@LPPTCPMXMBX02.LPCH.NET>
@ 2014-10-30 9:26 ` Tarzon, Megan
0 siblings, 0 replies; 1651+ messages in thread
From: Tarzon, Megan @ 2014-10-30 9:26 UTC (permalink / raw)
To: Tarzon, Megan
________________________________
From: Tarzon, Megan
Sent: Thursday, October 30, 2014 12:40 AM
To: Tarzon, Megan
Subject:
Good day.
l am the Chief Risk Officer and Executive Director of China Guangfa Bank in Hong Kong. I want to present you as the owner of 49.5 million USD In my bank since i am the only one aware of the funds due to my investigations. signify your interest by replying to this Email: morrowhkmorro1@rogers.com
James Morrow.
CONFIDENTIALITY NOTICE: This communication and any attachments may contain confidential or privileged information for the use by the designated recipient(s) named above. If you are not the intended recipient, you are hereby notified that you have received this communication in error and that any review, disclosure, dissemination, distribution or copying of it or the attachments is strictly prohibited. If you have received this communication in error, please contact the sender and destroy all copies of the communication and attachments. Thank you. MSG:104-123
^ permalink raw reply [flat|nested] 1651+ messages in thread
* re:
@ 2014-11-14 18:56 milke-Bd11Sj57+SE
0 siblings, 0 replies; 1651+ messages in thread
From: milke-Bd11Sj57+SE @ 2014-11-14 18:56 UTC (permalink / raw)
To: linux-cifs-u79uwXL29TY76Z2rM5mHXA
Good day,This email is sequel to an ealier sent message of which you have
not responded.I have a personal charity project which I will want you to
execute on my behalf.Please kidnly get back to me with this code
MHR/3910/2014 .You can reach me on mrsalimqadri-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org .
Thank you
Salim Qadri
^ permalink raw reply [flat|nested] 1651+ messages in thread
* re:
@ 2014-11-14 18:56 milke
0 siblings, 0 replies; 1651+ messages in thread
From: milke @ 2014-11-14 18:56 UTC (permalink / raw)
To: linux-scsi
Good day,This email is sequel to an ealier sent message of which you have
not responded.I have a personal charity project which I will want you to
execute on my behalf.Please kidnly get back to me with this code
MHR/3910/2014 .You can reach me on mrsalimqadri@gmail.com .
Thank you
Salim Qadri
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-11-14 20:49 salim-Re5JQEeQqe8AvxtiuMwx3w
0 siblings, 0 replies; 1651+ messages in thread
From: salim-Re5JQEeQqe8AvxtiuMwx3w @ 2014-11-14 20:49 UTC (permalink / raw)
To: linux-cifs-u79uwXL29TY76Z2rM5mHXA
Good day,This email is sequel to an ealier sent message of which you have
not responded.I have a personal charity project which I will want you to
execute on my behalf.Please kidnly get back to me with this code
MHR/3910/2014 .You can reach me on mrsalimqadri-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org .
Thank you
Salim Qadri
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-11-14 20:50 salim
0 siblings, 0 replies; 1651+ messages in thread
From: salim @ 2014-11-14 20:50 UTC (permalink / raw)
To: linux-scsi
Good day,This email is sequel to an ealier sent message of which you have
not responded.I have a personal charity project which I will want you to
execute on my behalf.Please kidnly get back to me with this code
MHR/3910/2014 .You can reach me on mrsalimqadri@gmail.com .
Thank you
Salim Qadri
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-11-17 20:11 salim
0 siblings, 0 replies; 1651+ messages in thread
From: salim @ 2014-11-17 20:11 UTC (permalink / raw)
To: sparclinux
Good day,This email is sequel to an ealier sent message of which you have
not responded.I have a personal charity project which I will want you to
execute on my behalf.Please kidnly get back to me with this code
MHR/3910/2014 .You can reach me on mrsalimqadri@gmail.com .
Thank you
Salim Qadri
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2014-11-26 18:38 Travis Williams
2014-11-26 20:49 ` NeilBrown
0 siblings, 1 reply; 1651+ messages in thread
From: Travis Williams @ 2014-11-26 18:38 UTC (permalink / raw)
To: linux-raid
Hello all,
I feel as though I must be missing something that I have had no luck
finding all morning.
When setting up arrays with spares in a spare-group, I'm having no
luck finding a way to get that information from mdadm or mdstat. This
becomes an issue when trying to write out configs and the like, or
simply trying to get a feel for how arrays are setup on a system.
Many tutorials/documentation/etc etc list using `mdadm --scan --detail
>> /etc/mdadm/mdadm.conf` as a way to write out the running config for
initialization at reboot. There is never any of the spare-group
information listed in that output. Is there another way to see what
spare-group is included in a currently running array?
It also isn't listed in `mdadm --scan`, or by `cat /proc/mdstat`
I've primarily noticed this with Debian 7, with mdadm v3.2.5 - 18th
May 2012. kernel 3.2.0-4.
When I modify mdadm.conf and add the 'spare-group' setting
myself, the arrays work as expected, but I haven't been able to find a
way to KNOW that they are currently running that way without failing
drives out to see. This nearly burned me after a restart in one
instance that I caught out of dumb luck before anything of value was
lost.
Thanks,
-Travis
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2014-11-26 18:38 (unknown), Travis Williams
@ 2014-11-26 20:49 ` NeilBrown
2014-11-29 15:08 ` Re: Peter Grandi
0 siblings, 1 reply; 1651+ messages in thread
From: NeilBrown @ 2014-11-26 20:49 UTC (permalink / raw)
To: Travis Williams; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 1842 bytes --]
On Wed, 26 Nov 2014 12:38:44 -0600 Travis Williams <travis@euppc.com> wrote:
> Hello all,
>
> I feel as though I must be missing something that I have had no luck
> finding all morning.
>
> When setting up arrays with spares in a spare-group, I'm having no
> luck finding a way to get that information from mdadm or mdstat. This
> becomes an issue when trying to write out configs and the like, or
> simply trying to get a feel for how arrays are setup on a system.
>
> Many tutorials/documentation/etc etc list using `mdadm --scan --detail
> >> /etc/mdadm/mdadm.conf` as a way to write out the running config for
> initialization at reboot. There is never any of the spare-group
> information listed in that output. Is there another way to see what
> spare-group is included in a currently running array?
>
> It also isn't listed in `mdadm --scan`, or by `cat /proc/mdstat`
>
> I've primarily noticed this with Debian 7, with mdadm v3.2.5 - 18th
> May 2012. kernel 3.2.0-4.
>
> When I modify the mdadm.conf myself and add the 'spare-group' setting
> myself, the arrays work as expected, but I haven't been able to find a
> way to KNOW that they are currently running that way without failing
> drives out to see. This nearly burned me after a restart in one
> instance that I caught out of dumb luck before anything of value was
> lost.
>
mdadm.conf is the primary location for spare-group information.
When "mdadm --monitor" is run, it reads that file and uses that information.
If you change the spare-group information in mdadm.conf, it would make sense
to restart "mdadm --monitor" so that it uses the updated information.
mdadm --scan --detail >> /etc/mdadm.conf
was only ever meant to be a starting point - a guide. You are still
responsible for your mdadm.conf file.
NeilBrown
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 811 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: Re:
2014-11-26 20:49 ` NeilBrown
@ 2014-11-29 15:08 ` Peter Grandi
0 siblings, 0 replies; 1651+ messages in thread
From: Peter Grandi @ 2014-11-29 15:08 UTC (permalink / raw)
To: Linux RAID
>> I feel as though I must be missing something that I have had
>> no luck finding all morning.
Probably yes, and the underlying insight is not explicitly
documented; it is left to the reader of 'man mdadm.conf':
"spare-group= The value is a textual name for a group of
arrays. All arrays with the same spare-group name are
considered to be part of the same group.
The significance of a group of arrays is that mdadm will,
when monitoring the arrays, move a spare drive from one
array in a group to another array in that group if the first
array had a failed or missing drive but no spare."
>> When setting up arrays with spares in a spare-group, I'm
>> having no luck finding a way to get that information from
>> mdadm or mdstat. This becomes an issue when trying to write
>> out configs and the like,
> mdadm.conf is the primary location for spare-group
> information. When "mdadm --monitor" is run, it reads that
> file and uses that information.
A more detailed explanation is that MD RAID is divided into two,
or arguably three, components:
* MD kernel drivers: they *run* RAID sets, but not things like
*creating* them or *maintaining* them. The MD kernel drivers
only look at the MD member superblocks and do not look at
'mdadm.conf' or act of their own initiative in changing RAID
set membership, only the status of existing members listed in
the superblocks.
* User space command 'mdadm': this creates MD RAID sets by
writing "superblocks" that are recognized by the MD kernel
drivers, and can maintain them when the user issues explicit
commands like '--add' or '--remove'. Options not provided on
the command line are taken from 'mdadm.conf'.
* User space daemon 'mdadm --monitor': this automatically issues
*some* 'mdadm' commands, based on the content of 'mdadm.conf'.
>> or simply trying to get a feel for how arrays are setup on a
>> system.
Specifically spare groups are not something that the MD kernel
drivers have any direct role in; the concept of "spare-group" is
only relevant to the 'mdadm --monitor' daemon.
Therefore as the reply above implies one cannot look at the
state of MD arrays as known to the kernel and figure out which
spares and MD arrays are in which spare group, it is something
that is handled entirely in user-space.
In recent versions of MD RAID things get an additional dimension
of «how arrays are setup» in user-space, as 'udev' too can be
configured to do things to MD RAID sets, which are described in
the 'POLICY' and related lines of 'mdadm.conf', and these too
are not recoverable from the information given by the MD kernel
drivers.
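To make the split concrete, the spare-group lives only in this user-space file (a sketch; device names, UUID placeholders and the group name are hypothetical):

```
# /etc/mdadm.conf (sketch) -- 'spare-group' is read only by the
# "mdadm --monitor" daemon, never by the MD kernel drivers, so it
# cannot be recovered from a running array or /proc/mdstat.
ARRAY /dev/md0 metadata=1.2 UUID=<uuid-of-md0> spare-group=shared
ARRAY /dev/md1 metadata=1.2 UUID=<uuid-of-md1> spare-group=shared
```

With something like `mdadm --monitor --scan --daemonise` running against that file, a spare can be moved between /dev/md0 and /dev/md1 when one of them degrades; nothing about the group is ever written to the member superblocks.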
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2014-12-01 13:02 Quan Han
0 siblings, 0 replies; 1651+ messages in thread
From: Quan Han @ 2014-12-01 13:02 UTC (permalink / raw)
To: Recipients
Hello,
Compliments of the day to you and I believe all is well. My name is Mr. Quan Han and I work in bank of china. I have a transaction that I believe will be of mutual benefits to both of us. It involves an investment portfolio worth(eight million,three hundred and seventy thousand USD) which I like to acquire with your help and assistance.
Yours sincerely,
Quan Han.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* "brcmfmac: brcmf_sdio_htclk: HT Avail timeout" on Thinkpad Tablet 10
@ 2015-01-28 7:15 Sébastien Bourdeauducq
2015-01-30 14:40 ` Arend van Spriel
0 siblings, 1 reply; 1651+ messages in thread
From: Sébastien Bourdeauducq @ 2015-01-28 7:15 UTC (permalink / raw)
To: linux-wireless
[-- Attachment #1: Type: text/plain, Size: 662 bytes --]
Hi,
The Lenovo Thinkpad Tablet 10 is a complete disaster under Linux, and
among many problems the SDIO brcmfmac wifi does not work.
I get the following messages in the kernel log:
brcmfmac: brcmf_sdio_drivestrengthinit: No SDIO Drive strength init done
for chip 4324 rev 6 pmurev 17
brcmfmac: brcmf_sdio_htclk: HT Avail timeout (1000000): clkctl 0x50
[repeated]
and the network interface is never created.
I'm running kernel 3.18.4, my brcmfmac43241b4-sdio.bin is identical to
the one in the linux-firmware repository, and I have attached the
brcmfmac43241b4-sdio.txt that I have extracted from the EFI variables.
Any help would be appreciated.
Sébastien
[-- Attachment #2: brcmfmac43241b4-sdio.txt --]
[-- Type: text/plain, Size: 2550 bytes --]
#---------------------------------------------------------------------------------------------------------------------
# NVRAM file for BCM4324 with 2.4G and 5G external PAs..
# Release Version: fox77h506nvram-Ella-WorldWide-v02
#---------------------------------------------------------------------------------------------------------------------
devid=0x4374
boardtype=0x67e
boardrev=0x1301
boardflags=0x90001200
boardflags2=0
macaddr=00:90:4c:c5:12:38
sromrev=9
xtalfreq=37400
nocrc=1
ag0=0x2
ag1=0x2
ag2=0xff
ag3=0xff
txchain=0x3
rxchain=0x3
aa2g=3
aa5g=3
ccode=XT
regrev=31
ledbh0=0xff
ledbh1=0xff
ledbh2=0xff
ledbh3=0xff
leddc=0xffff
#MP Original board PA parameters:
pa2gw0a0=0xffa7
pa2gw1a0=0x1563
pa2gw2a0=0xfeb9
#MP Original board PA parameters:
pa2gw0a1=0xff9c
pa2gw1a1=0x135e
pa2gw2a1=0xfebf
maxp2ga0=80
maxp2ga1=80
maxp5ga0=76
maxp5ga1=76
maxp5gha0=76
maxp5gha1=76
maxp5gla0=76
maxp5gla1=76
pa0itssit=62
pa1itssit=62
antswctl2g=30
antswctl5g=30
antswitch=0x0
subband5gver=0
pa5gw0a0=0xffa1
pa5gw1a0=0x114d
pa5gw2a0=0xfebf
pa5gw0a1=0xffca
pa5gw1a1=0x11aa
pa5gw2a1=0xfeee
pa5glw0a0=0xffb3
pa5glw1a0=0x1080
pa5glw2a0=0xfed4
pa5glw0a1=0xffd4
pa5glw1a1=0x10aa
pa5glw2a1=0xff01
pa5ghw0a0=0xffaf
pa5ghw1a0=0x116f
pa5ghw2a0=0xfee6
pa5ghw0a1=0xff9c
pa5ghw1a1=0x10f3
pa5ghw2a1=0xfed4
extpagain2g=3
extpagain5g=3
pdetrange2g=2
pdetrange5g=2
triso2g=3
triso5g=1
elna2g=1
elnabypass2g=-8
elna5g=1
elnabypass5g=-8
tssipos2g=1
tssipos5g=1
cckbw202gpo=0x6666
cckbw20ul2gpo=0x6666
legofdmbw202gpo=0x55533333
legofdmbw20ul2gpo=0x55533333
mcsbw202gpo=0x66655555
mcsbw20ul2gpo=0x66655555
mcsbw402gpo=0x66655555
mcs32po=0x5555
leg40dup2gpo=0x2
legofdmbw205glpo=0x22200000
legofdmbw20ul5glpo=0x44442222
legofdmbw205gmpo=0x22200000
legofdmbw20ul5gmpo=0x44442222
legofdmbw205ghpo=0x22200000
legofdmbw20ul5ghpo=0x44442222
mcsbw205glpo=0x99999999
mcsbw20ul5glpo=0x99999999
mcsbw405glpo=0x44422222
mcsbw205gmpo=0x66666666
mcsbw20ul5gmpo=0x66666666
mcsbw405gmpo=0x44422222
mcsbw205ghpo=0x66666666
mcsbw20ul5ghpo=0x66666666
mcsbw405ghpo=0x44422222
itt2ga0=0x20
itt5ga0=0x3e
itt2ga1=0x20
itt5ga1=0x3e
tempthresh=120
otpimagesize=232
usbepnum=0x2
muxenab=0x5
noisecaloffset=12
noisecaloffset5g=12
PwrOffsetcck=0x0006
PwrOffset40mhz5g=0x9
txidxcap2g=0
txidxcap5g=0
TssiAv5g=0x0
TssiVmid5g=0x9C
TssiAv2g=0x0
TssiVmid2g=0xAA
#Out-of-band GPIO Wakeup
sd_gpout=0
sd_gpval=1
sd_gpdc=0
#WiFi/BT co-existence parameter
btc_mode=5
btc_params9=15000
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: "brcmfmac: brcmf_sdio_htclk: HT Avail timeout" on Thinkpad Tablet 10
2015-01-28 7:15 "brcmfmac: brcmf_sdio_htclk: HT Avail timeout" on Thinkpad Tablet 10 Sébastien Bourdeauducq
@ 2015-01-30 14:40 ` Arend van Spriel
2015-09-09 16:55 ` Oleg Kostyuchenko
0 siblings, 1 reply; 1651+ messages in thread
From: Arend van Spriel @ 2015-01-30 14:40 UTC (permalink / raw)
To: Sébastien Bourdeauducq; +Cc: linux-wireless
On 01/28/15 08:15, Sébastien Bourdeauducq wrote:
> Hi,
>
> The Lenovo Thinkpad Tablet 10 is a complete disaster under Linux, and
> among many problems the SDIO brcmfmac wifi does not work.
>
> I get the following messages in the kernel log:
> brcmfmac: brcmf_sdio_drivestrengthinit: No SDIO Drive strength init done
> for chip 4324 rev 6 pmurev 17
> brcmfmac: brcmf_sdio_htclk: HT Avail timeout (1000000): clkctl 0x50
> [repeated]
>
> and the network interface is never created.
>
> I'm running kernel 3.18.4, my brcmfmac43241b4-sdio.bin is identical to
> the one in the linux-firmware repository, and I have attached the
> brcmfmac43241b4-sdio.txt that I have extracted from the EFI variables.
When you say your firmware is identical, what does that mean? Did you do
a diff? Can you do 'hexdump -C brcmfmac43241b4-sdio.bin | tail -30'? I
will be getting a laptop with 43241 integrated soonish (Monday?).
Hopefully it has the same chip revision.
Regards,
Arend
> Any help would be appreciated.
>
> Sébastien
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2015-02-28 11:21 Jonathan Cameron
2015-02-28 11:22 ` Jonathan Cameron
0 siblings, 1 reply; 1651+ messages in thread
From: Jonathan Cameron @ 2015-02-28 11:21 UTC (permalink / raw)
To: Greg KH, linux-iio@vger.kernel.org
The following changes since commit 8ecb55b849b74dff026681b41266970072b207dd:
Merge tag 'iio-fixes-for-3.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus (2015-01-08 17:59:04 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git tags/iio-fixes-for-4.0a
for you to fetch changes up to e01becbad300712a28f29b666e685536f45e83bc:
IIO: si7020: Allocate correct amount of memory in devm_iio_device_alloc (2015-02-14 11:35:12 +0000)
----------------------------------------------------------------
First round of fixes for IIO in the 4.0 cycle. Note a followup
set dependent on patches in the recent merge windows will follow shortly.
* dht11 - fix a read off the end of an array, add some locking to prevent
the read function being interrupted and make sure gpio/irq lines
are not enabled for irqs during output.
* iadc - timeout should be in jiffies not msecs
* mpu6050 - avoid a null id from ACPI enumeration being dereferenced.
* mxs-lradc - fix up some interaction issues between the touchscreen driver
and iio driver. Mostly about making sure that the adc driver
only affects channels that are not being used for the
touchscreen.
* ad2s1200 - sign extension fix for a result of c type promotion.
* adis16400 - sign extension fix for a result of c type promotion.
* mcp3422 - scale table was transposed.
* ad5686 - use _optional regulator get to avoid a dummy regulator being allocated
which would cause the driver to fail to initialize.
* gp2ap020a00f - select REGMAP_I2C
* si7020 - revert an incorrect cleanup and then fix the issue that made
that cleanup seem like a good idea.
----------------------------------------------------------------
Andrey Smirnov (1):
IIO: si7020: Allocate correct amount of memory in devm_iio_device_alloc
Angelo Compagnucci (1):
iio:adc:mcp3422 Fix incorrect scales table
Jonathan Cameron (1):
Revert "iio:humidity:si7020: fix pointer to i2c client"
Kristina Martšenko (4):
iio: mxs-lradc: separate touchscreen and buffer virtual channels
iio: mxs-lradc: make ADC reads not disable touchscreen interrupts
iio: mxs-lradc: make ADC reads not unschedule touchscreen conversions
iio: mxs-lradc: only update the buffer when its conversions have finished
Nicholas Mc Guire (1):
iio: iadc: wait_for_completion_timeout time in jiffies
Rasmus Villemoes (2):
staging: iio: ad2s1200: Fix sign extension
iio: imu: adis16400: Fix sign extension
Richard Weinberger (3):
iio: dht11: Fix out-of-bounds read
iio: dht11: Add locking
iio: dht11: IRQ fixes
Roberta Dobrescu (1):
iio: light: gp2ap020a00f: Select REGMAP_I2C
Srinivas Pandruvada (1):
iio: imu: inv_mpu6050: Prevent dereferencing NULL
Stefan Wahren (1):
iio: mxs-lradc: fix iio channel map regression
Urs Fässler (1):
iio: ad5686: fix optional reference voltage declaration
drivers/iio/adc/mcp3422.c | 17 +--
drivers/iio/adc/qcom-spmi-iadc.c | 3 +-
drivers/iio/dac/ad5686.c | 2 +-
drivers/iio/humidity/dht11.c | 69 ++++++----
drivers/iio/humidity/si7020.c | 6 +-
drivers/iio/imu/adis16400_core.c | 3 +-
drivers/iio/imu/inv_mpu6050/inv_mpu_core.c | 6 +-
drivers/iio/light/Kconfig | 1 +
drivers/staging/iio/adc/mxs-lradc.c | 207 +++++++++++++++--------------
drivers/staging/iio/resolver/ad2s1200.c | 3 +-
10 files changed, 166 insertions(+), 151 deletions(-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-02-28 11:21 Jonathan Cameron
@ 2015-02-28 11:22 ` Jonathan Cameron
0 siblings, 0 replies; 1651+ messages in thread
From: Jonathan Cameron @ 2015-02-28 11:22 UTC (permalink / raw)
To: Greg KH, linux-iio@vger.kernel.org
Sorry all, let's try this again with a subject!
On 28/02/15 11:21, Jonathan Cameron wrote:
> The following changes since commit 8ecb55b849b74dff026681b41266970072b207dd:
>
> Merge tag 'iio-fixes-for-3.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus (2015-01-08 17:59:04 -0800)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git tags/iio-fixes-for-4.0a
>
> for you to fetch changes up to e01becbad300712a28f29b666e685536f45e83bc:
>
> IIO: si7020: Allocate correct amount of memory in devm_iio_device_alloc (2015-02-14 11:35:12 +0000)
>
> ----------------------------------------------------------------
> First round of fixes for IIO in the 4.0 cycle. Note a followup
> set dependent on patches in the recent merge windows will follow shortly.
>
> * dht11 - fix a read off the end of an array, add some locking to prevent
> the read function being interrupted and make sure gpio/irq lines
> are not enabled for irqs during output.
> * iadc - timeout should be in jiffies not msecs
> * mpu6050 - avoid a null id from ACPI enumeration being dereferenced.
> * mxs-lradc - fix up some interaction issues between the touchscreen driver
> and iio driver. Mostly about making sure that the adc driver
> only affects channels that are not being used for the
> touchscreen.
> * ad2s1200 - sign extension fix for a result of c type promotion.
> * adis16400 - sign extension fix for a result of c type promotion.
> * mcp3422 - scale table was transposed.
> * ad5686 - use _optional regulator get to avoid a dummy regulator being allocated
> which would cause the driver to fail to initialize.
> * gp2ap020a00f - select REGMAP_I2C
> * si7020 - revert an incorrect cleanup and then fix the issue that made
> that cleanup seem like a good idea.
>
> ----------------------------------------------------------------
> Andrey Smirnov (1):
> IIO: si7020: Allocate correct amount of memory in devm_iio_device_alloc
>
> Angelo Compagnucci (1):
> iio:adc:mcp3422 Fix incorrect scales table
>
> Jonathan Cameron (1):
> Revert "iio:humidity:si7020: fix pointer to i2c client"
>
> Kristina Martšenko (4):
> iio: mxs-lradc: separate touchscreen and buffer virtual channels
> iio: mxs-lradc: make ADC reads not disable touchscreen interrupts
> iio: mxs-lradc: make ADC reads not unschedule touchscreen conversions
> iio: mxs-lradc: only update the buffer when its conversions have finished
>
> Nicholas Mc Guire (1):
> iio: iadc: wait_for_completion_timeout time in jiffies
>
> Rasmus Villemoes (2):
> staging: iio: ad2s1200: Fix sign extension
> iio: imu: adis16400: Fix sign extension
>
> Richard Weinberger (3):
> iio: dht11: Fix out-of-bounds read
> iio: dht11: Add locking
> iio: dht11: IRQ fixes
>
> Roberta Dobrescu (1):
> iio: light: gp2ap020a00f: Select REGMAP_I2C
>
> Srinivas Pandruvada (1):
> iio: imu: inv_mpu6050: Prevent dereferencing NULL
>
> Stefan Wahren (1):
> iio: mxs-lradc: fix iio channel map regression
>
> Urs Fässler (1):
> iio: ad5686: fix optional reference voltage declaration
>
> drivers/iio/adc/mcp3422.c | 17 +--
> drivers/iio/adc/qcom-spmi-iadc.c | 3 +-
> drivers/iio/dac/ad5686.c | 2 +-
> drivers/iio/humidity/dht11.c | 69 ++++++----
> drivers/iio/humidity/si7020.c | 6 +-
> drivers/iio/imu/adis16400_core.c | 3 +-
> drivers/iio/imu/inv_mpu6050/inv_mpu_core.c | 6 +-
> drivers/iio/light/Kconfig | 1 +
> drivers/staging/iio/adc/mxs-lradc.c | 207 +++++++++++++++--------------
> drivers/staging/iio/resolver/ad2s1200.c | 3 +-
> 10 files changed, 166 insertions(+), 151 deletions(-)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-iio" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown)
@ 2015-03-13 1:34 cody.taylor
2015-03-13 2:00 ` Duy Nguyen
0 siblings, 1 reply; 1651+ messages in thread
From: cody.taylor @ 2015-03-13 1:34 UTC (permalink / raw)
To: git; +Cc: szeder, felipe.contreras
From 3e4e22e93bf07355b40ba0abcb3a15c4941cfee7 Mon Sep 17 00:00:00 2001
From: Cody A Taylor <codemister99@yahoo.com>
Date: Thu, 12 Mar 2015 20:36:44 -0400
Subject: [PATCH] git prompt: Use toplevel to find untracked files.
The __git_ps1() prompt function would not show an untracked
state when the current working directory was not a parent of
the untracked file.
Signed-off-by: Cody A Taylor <codemister99@yahoo.com>
---
contrib/completion/git-prompt.sh | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
index 214e859f99e7..f0d8a2669236 100644
--- a/contrib/completion/git-prompt.sh
+++ b/contrib/completion/git-prompt.sh
@@ -487,7 +487,8 @@ __git_ps1 ()
if [ -n "${GIT_PS1_SHOWUNTRACKEDFILES-}" ] &&
[ "$(git config --bool bash.showUntrackedFiles)" != "false" ] &&
- git ls-files --others --exclude-standard --error-unmatch -- '*' >/dev/null 2>/dev/null
+ git ls-files --others --exclude-standard --error-unmatch -- \
+ "$(git rev-parse --show-toplevel)/*" >/dev/null 2>/dev/null
then
u="%${ZSH_VERSION+%}"
fi
--
2.3.2
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2015-03-13 1:34 (unknown) cody.taylor
@ 2015-03-13 2:00 ` Duy Nguyen
0 siblings, 0 replies; 1651+ messages in thread
From: Duy Nguyen @ 2015-03-13 2:00 UTC (permalink / raw)
To: cody.taylor; +Cc: Git Mailing List, SZEDER Gábor, Felipe Contreras
On Fri, Mar 13, 2015 at 8:34 AM, <cody.taylor@maternityneighborhood.com> wrote:
> From 3e4e22e93bf07355b40ba0abcb3a15c4941cfee7 Mon Sep 17 00:00:00 2001
> From: Cody A Taylor <codemister99@yahoo.com>
> Date: Thu, 12 Mar 2015 20:36:44 -0400
> Subject: [PATCH] git prompt: Use toplevel to find untracked files.
>
> The __git_ps1() prompt function would not show an untracked
> state when the current working directory was not a parent of
> the untracked file.
>
> Signed-off-by: Cody A Taylor <codemister99@yahoo.com>
> ---
> contrib/completion/git-prompt.sh | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/contrib/completion/git-prompt.sh b/contrib/completion/git-prompt.sh
> index 214e859f99e7..f0d8a2669236 100644
> --- a/contrib/completion/git-prompt.sh
> +++ b/contrib/completion/git-prompt.sh
> @@ -487,7 +487,8 @@ __git_ps1 ()
>
> if [ -n "${GIT_PS1_SHOWUNTRACKEDFILES-}" ] &&
> [ "$(git config --bool bash.showUntrackedFiles)" != "false" ] &&
> - git ls-files --others --exclude-standard --error-unmatch -- '*' >/dev/null 2>/dev/null
> + git ls-files --others --exclude-standard --error-unmatch -- \
> + "$(git rev-parse --show-toplevel)/*" >/dev/null 2>/dev/null
Or make it a bit simpler, just replace '*' with ':/*'.
--
Duy
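For readers following along, here is a rough sketch (assuming git is
installed, using a hypothetical throwaway repo) of why the two pathspecs
behave differently: '*' is resolved relative to the current directory,
while the ':/*' magic anchors the match at the repository toplevel.

```shell
# Sketch, assuming git is available: '*' vs ':/*' pathspecs when
# looking for untracked files from inside a subdirectory.
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
mkdir -p "$tmp/repo/a" "$tmp/repo/b"
touch "$tmp/repo/a/untracked.txt"   # untracked file outside b/
cd "$tmp/repo/b"
# '*' is relative to the cwd (b/), so the untracked file is missed:
git ls-files --others --exclude-standard -- '*'
# ':/*' matches from the toplevel, so the untracked file is found:
git ls-files --others --exclude-standard -- ':/*'
```

This is why the one-character pathspec change is equivalent to the
longer `git rev-parse --show-toplevel` form in the patch above.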
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CANSxx61FaNp5SBXJ8Y+pWn0eDcunmibKR5g8rttnWGdGwEMHCA@mail.gmail.com>]
* Re:
[not found] <CANSxx61FaNp5SBXJ8Y+pWn0eDcunmibKR5g8rttnWGdGwEMHCA@mail.gmail.com>
@ 2015-03-18 20:45 ` Junio C Hamano
2015-03-18 21:06 ` Re: Stefan Beller
0 siblings, 1 reply; 1651+ messages in thread
From: Junio C Hamano @ 2015-03-18 20:45 UTC (permalink / raw)
To: Alessandro Zanardi; +Cc: Git Mailing List
On Wed, Mar 18, 2015 at 1:33 PM, Alessandro Zanardi
<pensierinmusica@gmail.com> wrote:
> Here are other sources describing the issue:
>
> http://stackoverflow.com/questions/21109672/git-ignoring-icon-files-because-of-icon-rule
>
> http://blog.bitfluent.com/post/173740409/ignoring-icon-in-gitignore
>
> Sorry to bother the Git development team with such a minor issue, I just
> wanted to know if it's already been fixed.
I do not ship your ~/.gitignore_global file as part of my software,
so the problem is not mine to fix in the first place ;-)
Where did you get that file from? We need to find whoever is
responsible and notify them so that these users who are having
the issue will be helped.
Thanks.
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2015-03-18 20:45 ` Re: Junio C Hamano
@ 2015-03-18 21:06 ` Stefan Beller
2015-03-18 21:17 ` Re: Jeff King
0 siblings, 1 reply; 1651+ messages in thread
From: Stefan Beller @ 2015-03-18 21:06 UTC (permalink / raw)
To: Alessandro Zanardi; +Cc: Git Mailing List, Junio C Hamano
On Wed, Mar 18, 2015 at 1:45 PM, Junio C Hamano <gitster@pobox.com> wrote:
> On Wed, Mar 18, 2015 at 1:33 PM, Alessandro Zanardi
> <pensierinmusica@gmail.com> wrote:
>> Here are other sources describing the issue:
>>
>> http://stackoverflow.com/questions/21109672/git-ignoring-icon-files-because-of-icon-rule
>>
>> http://blog.bitfluent.com/post/173740409/ignoring-icon-in-gitignore
>>
>> Sorry to bother the Git development team with such a minor issue, I just
>> wanted to know if it's already been fixed.
>
> I do not ship your ~/.gitignore_global file as part of my software,
> so the problem is not mine to fix in the first place ;-)
Maybe this can be understood as a critique of the .gitignore format specifier
for paths. (Maybe not, I dunno.)
So the `gitignore` script/executable which generated your .gitignore file
for you introduced a bug that also ignored files in "Icons/...." when all you
wanted was to ignore the file "Icon\r\r".
(Mind that \r is an escape sequence used here to explain the meaning;
gitignore itself cannot understand it. Sometimes it also shows up as ^M^M
depending on the operating system/editor used.)
But as you can see, there have been several attempts at fixing it right and
https://github.com/github/gitignore/pull/334 eventually got the right fix.
(it was merged 2012, which has been a while now), maybe
you'd want to use a new version of this gitignore script to
regenerate your gitignore?
>
> Where did you get that file from? We need to find whoever is
> responsible and notify them so that these users who are having
> the issue will be helped.
Given that this is part of https://github.com/github/gitignore
which is the official collection of .gitignore files from Github,
the company, we could ask Jeff or Michael if it is urgent.
The actual fix being merged 3 years ago makes me believe
it is not urgent though.
Thanks,
Stefan
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-03-18 21:06 ` Re: Stefan Beller
@ 2015-03-18 21:17 ` Jeff King
2015-03-18 21:28 ` Re: Jeff King
0 siblings, 1 reply; 1651+ messages in thread
From: Jeff King @ 2015-03-18 21:17 UTC (permalink / raw)
To: Stefan Beller; +Cc: Alessandro Zanardi, Git Mailing List, Junio C Hamano
On Wed, Mar 18, 2015 at 02:06:22PM -0700, Stefan Beller wrote:
> > Where did you get that file from? We need to find whoever is
> > responsible and notify them so that these users who are having
> > the issue will be helped.
>
> Given that this is part of https://github.com/github/gitignore
> which is the official collection of .gitignore files from Github,
> the company, we could ask Jeff or Michael if it is urgent.
> The actual fix being merged 3 years ago makes me belief
> it is not urgent though.
It looks like the fix they have in that repo does the right thing[1],
but for reference, you are much more likely to get results by creating
an issue or PR on that repository, rather than asking me.
-Peff
[1] The double-CR fix works because we strip a single CR from the end of
the line (as a convenience for CRLF systems), and then the remaining
CR is syntactically significant. But I am surprised that quoting
like:
printf '"Icon\r"' >.gitignore
does not seem to work.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-03-18 21:17 ` Re: Jeff King
@ 2015-03-18 21:28 ` Jeff King
2015-03-18 21:33 ` Re: Junio C Hamano
0 siblings, 1 reply; 1651+ messages in thread
From: Jeff King @ 2015-03-18 21:28 UTC (permalink / raw)
To: Stefan Beller; +Cc: Alessandro Zanardi, Git Mailing List, Junio C Hamano
On Wed, Mar 18, 2015 at 05:17:16PM -0400, Jeff King wrote:
> [1] The double-CR fix works because we strip a single CR from the end of
> the line (as a convenience for CRLF systems), and then the remaining
> CR is syntactically significant. But I am surprised that quoting
> like:
>
> printf '"Icon\r"' >.gitignore
>
> does not seem to work.
Answering myself: we don't do quoting like this in .gitignore. We allow
backslashing to escape particular characters, like trailing whitespace.
So in theory:
Icon\\r
(where "\r" is a literal CR) would work. But it doesn't, because the
CRLF chomping happens separately, and CR is therefore a special case. I
suspect you could not .gitignore a file with a literal LF in it at all
(and I equally suspect that nobody cares in practice).
-Peff
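For the curious, the double-CR workaround described above can be checked
with `git check-ignore` (a rough sketch, assuming git and a filesystem
that permits a literal CR in filenames):

```shell
# Sketch: a single trailing CR on a .gitignore line is chomped as a
# CRLF convenience, so ignoring a file literally named "Icon<CR>"
# takes two CRs -- one is stripped, the other matches the filename.
set -e
tmp=$(mktemp -d)
git init -q "$tmp"
cd "$tmp"
printf 'Icon\r\r\n' > .gitignore
icon_file=$(printf 'Icon\r')
touch "$icon_file"
if git check-ignore -q "$icon_file"; then
    echo ignored
fi
```

With the double CR the file is reported as ignored; with a single CR the
pattern is chomped down to plain "Icon" and no longer matches.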
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-03-18 21:28 ` Re: Jeff King
@ 2015-03-18 21:33 ` Junio C Hamano
2015-03-18 21:45 ` Re: Stefan Beller
0 siblings, 1 reply; 1651+ messages in thread
From: Junio C Hamano @ 2015-03-18 21:33 UTC (permalink / raw)
To: Jeff King; +Cc: Stefan Beller, Alessandro Zanardi, Git Mailing List
On Wed, Mar 18, 2015 at 2:28 PM, Jeff King <peff@peff.net> wrote:
> On Wed, Mar 18, 2015 at 05:17:16PM -0400, Jeff King wrote:
>
>> [1] The double-CR fix works because we strip a single CR from the end of
>> the line (as a convenience for CRLF systems), and then the remaining
>> CR is syntactically significant. But I am surprised that quoting
>> like:
>>
>> printf '"Icon\r"' >.gitignore
>>
>> does not seem to work.
>
> Answering myself: we don't do quoting like this in .gitignore. We allow
> backslashing to escape particular characters, like trailing whitespace.
> So in theory:
>
> Icon\\r
>
> (where "\r" is a literal CR) would work. But it doesn't, because the
> CRLF chomping happens separately, and CR is therefore a special case. I
> suspect you could not .gitignore a file with a literal LF in it at all
> (and I equally suspect that nobody cares in practice).
What does the Icon^M try to catch, exactly? Is it a file? Is it a directory?
Is it "anything that begins with Icon^M"?
I am wondering if we need an opposite of '/' prefix in the .gitignore file
to say "the pattern does not match a directory, only a file".
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-03-18 21:33 ` Re: Junio C Hamano
@ 2015-03-18 21:45 ` Stefan Beller
0 siblings, 0 replies; 1651+ messages in thread
From: Stefan Beller @ 2015-03-18 21:45 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jeff King, Alessandro Zanardi, Git Mailing List
On Wed, Mar 18, 2015 at 2:33 PM, Junio C Hamano <gitster@pobox.com> wrote:
> What does the Icon^M try to catch, exactly? Is it a file? Is it a directory?
> Is it "anything that begins with Icon^M"?
It seems to be a special hidden file on Macs for UI convenience.
> On Apr 25, 2005, at 6:21 AM, Peter N. Lundblad wrote:
>
> The Icon^M file in a directory gives that directory a custom icon in
> the Finder. They are a holdover from MacOS 9 but there are still a lot
> of them out there. The "new" OS X format for icons are .icns files but
> I'm not sure if you can do custom file directory icons with them (you
> probably can, I just haven't found the docs yet).
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2015-04-08 20:44 Mamta Upadhyay
2015-04-08 21:58 ` Thomas Braun
0 siblings, 1 reply; 1651+ messages in thread
From: Mamta Upadhyay @ 2015-04-08 20:44 UTC (permalink / raw)
To: git
Hi git team,
I tried to research everywhere on an issue I am facing and am emailing you
as a last resort. This is critical for me and I need your help.
I am trying to run the latest git 1.9.5 installer on windows. When I
run strings on libneon-25.dll it shows this:
./libneon-25.dll: OpenSSL 1.0.1h 5 Jun 2014
But when I load this dll in dependency walker, it picks up
msys-openssl 1.0.1m and has no trace of openssl-1.0.1h. My questions
to you:
1. Is libneon-25.dll statically linked with openssl-1.0.1h?
2. If not, where is the reference to 1.0.1h coming from?
I am asked to rebuild git with libneon-25.dll linked against
openssl-1.0.1m. But I am having a feeling that this is not needed,
since libneon is already picking the latest openssl version. Can you
please confirm?
Thanks,
Mamta
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-04-08 20:44 (unknown), Mamta Upadhyay
@ 2015-04-08 21:58 ` Thomas Braun
2015-04-09 11:27 ` Re: Konstantin Khomoutov
0 siblings, 1 reply; 1651+ messages in thread
From: Thomas Braun @ 2015-04-08 21:58 UTC (permalink / raw)
To: git; +Cc: Mamta Upadhyay, msysGit
Am 08.04.2015 um 22:44 schrieb Mamta Upadhyay:
> Hi git team,
(CC'ing msysgit as this is the git for windows list)
Hi Mamta,
> I tried to research everywhere on a issue I am facing and emailing you
> as the last resource. This is critical for me and I needed your help.
>
> I am trying to run the latest git 1.9.5 installer on windows. When I
> run strings on libneon-25.dll it shows this:
>
> ./libneon-25.dll: OpenSSL 1.0.1h 5 Jun 2014
>
> But when I load this dll in dependency walker, it picks up
> msys-openssl 1.0.1m and has no trace of openssl-1.0.1h. My questions
> to you:
>
> 1. Is libneon-25.dll statically linked with openssl-1.0.1h?
> 2. If not, where is the reference to 1.0.1h coming from?
I would be surprised if we link openssl statically into libneon. I guess
libneon just reports against which openssl version it was *built*.
> I am asked to rebuild git with libneon-25.dll linked against
> openssl-1.0.1m. But I am having a feeling that this is not needed,
> since libneon is already picking the latest openssl version. Can you
> please confirm?
You can download the development environment for git for windows here
[1]. After installation, check out the msys branch and then you can try
to recompile libneon using /src/subversion/release.sh.
[1]:
https://github.com/msysgit/msysgit/releases/download/Git-1.9.5-preview20150319/msysGit-netinstall-1.9.5-preview20150319.exe
Hope that helps
Thomas
--
--
*** Please reply-to-all at all times ***
*** (do not pretend to know who is subscribed and who is not) ***
*** Please avoid top-posting. ***
The msysGit Wiki is here: https://github.com/msysgit/msysgit/wiki - Github accounts are free.
You received this message because you are subscribed to the Google
Groups "msysGit" group.
To post to this group, send email to msysgit@googlegroups.com
To unsubscribe from this group, send email to
msysgit+unsubscribe@googlegroups.com
For more options, and view previous threads, visit this group at
http://groups.google.com/group/msysgit?hl=en_US?hl=en
---
You received this message because you are subscribed to the Google Groups "Git for Windows" group.
To unsubscribe from this group and stop receiving emails from it, send an email to msysgit+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: Re:
2015-04-08 21:58 ` Thomas Braun
@ 2015-04-09 11:27 ` Konstantin Khomoutov
0 siblings, 0 replies; 1651+ messages in thread
From: Konstantin Khomoutov @ 2015-04-09 11:27 UTC (permalink / raw)
To: Thomas Braun; +Cc: git, Mamta Upadhyay, msysGit
On Wed, 08 Apr 2015 23:58:58 +0200
Thomas Braun <thomas.braun@virtuell-zuhause.de> wrote:
[...]
> > I am trying to run the latest git 1.9.5 installer on windows. When I
> > run strings on libneon-25.dll it shows this:
> >
> > ./libneon-25.dll: OpenSSL 1.0.1h 5 Jun 2014
> >
> > But when I load this dll in dependency walker, it picks up
> > msys-openssl 1.0.1m and has no trace of openssl-1.0.1h. My questions
> > to you:
> >
> > 1. Is libneon-25.dll statically linked with openssl-1.0.1h?
> > 2. If not, where is the reference to 1.0.1h coming from?
>
> I would be suprised if we link openssl statically into libneon. I
> guess libneon just reports against which openssl version it was
> *built*.
>
> > I am asked to rebuild git with libneon-25.dll linked against
> > openssl-1.0.1m. But I am having a feeling that this is not needed,
> > since libneon is already picking the latest openssl version. Can you
> > please confirm?
>
> You can download the development enviroment for git for windows here
> [1]. After installation, checkout the msys branch and then you can try
> to recomplile libneon using /src/subversion/release.sh.
>
> [1]:
> https://github.com/msysgit/msysgit/releases/download/Git-1.9.5-preview20150319/msysGit-netinstall-1.9.5-preview20150319.exe
[...]
JFTR, the discussion about the same issue has been brought up on
git-users as well [2].
(People should really somehow use the basics of netiquette and mention
in their posts where they cross-post things.)
2. https://groups.google.com/d/topic/git-users/WXyWE5_JfNc/discussion
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CA+yqC4Y2oi4ji-FHuOrXEsxLoYsnckFoX2WYHZwqh5ZGuq7snA@mail.gmail.com>]
[parent not found: <E1Yz4NQ-0000Cw-B5@feisty.vs19.net>]
* Re:
[not found] <E1Yz4NQ-0000Cw-B5@feisty.vs19.net>
@ 2015-05-31 15:37 ` Roman Volkov
2015-05-31 15:53 ` Re: Hans de Goede
0 siblings, 1 reply; 1651+ messages in thread
From: Roman Volkov @ 2015-05-31 15:37 UTC (permalink / raw)
To: Dmitry Torokhov
Cc: Mark Rutland, Rob Herring, Pawel Moll, Ian Campbell, Kumar Gala,
grant.likely@linaro.org, Hans de Goede, Jiri Kosina, Wolfram Sang,
linux-input@vger.kernel.org, linux-kernel@vger.kernel.org,
devicetree@vger.kernel.org, Tony Prisk
On Sat, 14 Mar 2015 20:20:38 -0700
Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
>
> Hi Roman,
>
> On Mon, Feb 16, 2015 at 12:11:43AM +0300, Roman Volkov wrote:
> > Documentation for 'intel,8042' DT compatible node.
> >
> > Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
> > Signed-off-by: Roman Volkov <v1ron@v1ros.org>
> > ---
> > .../devicetree/bindings/input/intel-8042.txt | 26
> > ++++++++++++++++++++++ 1 file changed, 26 insertions(+)
> > create mode 100644
> > Documentation/devicetree/bindings/input/intel-8042.txt
> >
> > diff --git a/Documentation/devicetree/bindings/input/intel-8042.txt
> > b/Documentation/devicetree/bindings/input/intel-8042.txt new file
> > mode 100644 index 0000000..ab8a3e0
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/input/intel-8042.txt
> > @@ -0,0 +1,26 @@
> > +Intel 8042 Keyboard Controller
> > +
> > +Required properties:
> > +- compatible: should be "intel,8042"
> > +- regs: memory for keyboard controller
> > +- interrupts: usually, two interrupts should be specified
> > (keyboard and aux).
> > + However, only one interrupt is also allowed in case of
> > absence of the
> > + physical port in the controller. The i8042 driver must be
> > loaded with
> > + nokbd/noaux option in this case.
> > +- interrupt-names: interrupt names corresponding to numbers in the
> > list.
> > + "kbd" is the keyboard interrupt and "aux" is the auxiliary
> > (mouse)
> > + interrupt.
> > +- command-reg: offset in memory for command register
> > +- status-reg: offset in memory for status register
> > +- data-reg: offset in memory for data register
> > +
> > +Example:
> > + i8042@d8008800 {
> > + compatible = "intel,8042";
> > + regs = <0xd8008800 0x100>;
> > + interrupts = <23>, <4>;
> > + interrupt-names = "kbd", "aux";
> > + command-reg = <0x04>;
> > + status-reg = <0x04>;
> > + data-reg = <0x00>;
> > + };
>
> No, we already have existing OF bindings for i8042 on sparc and
> powerpc, I do not think we need to invent a brand new one.
>
> Thanks.
>
Hi Dmitry,
I see some OF code in the i8042-sparcio.h file. There are node
definitions like "kb_ps2", "keyboard", "kdmouse", "mouse". Are these
documented somewhere?
It would be great if vt8500 is not unique in having OF bindings for
i8042. The code from sparc even looks compatible; only the register
offsets are hardcoded for a specific machine. Is it possible to read
the offsets from the Device Tree using these existing bindings, without
dealing with the kernel configuration?
Regards,
Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-05-31 15:37 ` Re: Roman Volkov
@ 2015-05-31 15:53 ` Hans de Goede
0 siblings, 0 replies; 1651+ messages in thread
From: Hans de Goede @ 2015-05-31 15:53 UTC (permalink / raw)
To: Roman Volkov, Dmitry Torokhov
Cc: Mark Rutland, Rob Herring, Pawel Moll, Ian Campbell, Kumar Gala,
grant.likely@linaro.org, Jiri Kosina, Wolfram Sang,
linux-input@vger.kernel.org, linux-kernel@vger.kernel.org,
devicetree@vger.kernel.org, Tony Prisk
Hi Roman,
On 31-05-15 17:37, Roman Volkov wrote:
> On Sat, 14 Mar 2015 20:20:38 -0700
> Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
>
>>
>> Hi Roman,
>>
>> On Mon, Feb 16, 2015 at 12:11:43AM +0300, Roman Volkov wrote:
>>> Documentation for 'intel,8042' DT compatible node.
>>>
>>> Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
>>> Signed-off-by: Roman Volkov <v1ron@v1ros.org>
>>> ---
>>> .../devicetree/bindings/input/intel-8042.txt | 26
>>> ++++++++++++++++++++++ 1 file changed, 26 insertions(+)
>>> create mode 100644
>>> Documentation/devicetree/bindings/input/intel-8042.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/input/intel-8042.txt
>>> b/Documentation/devicetree/bindings/input/intel-8042.txt new file
>>> mode 100644 index 0000000..ab8a3e0
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/input/intel-8042.txt
>>> @@ -0,0 +1,26 @@
>>> +Intel 8042 Keyboard Controller
>>> +
>>> +Required properties:
>>> +- compatible: should be "intel,8042"
>>> +- regs: memory for keyboard controller
>>> +- interrupts: usually, two interrupts should be specified
>>> (keyboard and aux).
>>> + However, only one interrupt is also allowed in case of
>>> absence of the
>>> + physical port in the controller. The i8042 driver must be
>>> loaded with
>>> + nokbd/noaux option in this case.
>>> +- interrupt-names: interrupt names corresponding to numbers in the
>>> list.
>>> + "kbd" is the keyboard interrupt and "aux" is the auxiliary
>>> (mouse)
>>> + interrupt.
>>> +- command-reg: offset in memory for command register
>>> +- status-reg: offset in memory for status register
>>> +- data-reg: offset in memory for data register
>>> +
>>> +Example:
>>> + i8042@d8008800 {
>>> + compatible = "intel,8042";
>>> + regs = <0xd8008800 0x100>;
>>> + interrupts = <23>, <4>;
>>> + interrupt-names = "kbd", "aux";
>>> + command-reg = <0x04>;
>>> + status-reg = <0x04>;
>>> + data-reg = <0x00>;
>>> + };
>>
>> No, we already have existing OF bindings for i8042 on sparc and
>> powerpc, I do not think we need to invent a brand new one.
>>
>> Thanks.
>>
>
> Hi Dmitry,
>
> I see some OF code in i8042-sparcio.h file. There are node definitions
> like "kb_ps2", "keyboard", "kdmouse", "mouse". Are these documented
> somewhere?
>
> Great if vt8500 is not unique with OF bindings for i8042. The code from
> sparc even looks compatible, only register offsets are hardcoded for
> specific machine. Is it possible to read offsets from Device Tree using
> these existing bindings without dealing with the kernel configuration?
Have you looked at the existing bindings for ps/2 controllers
under Documentation/devicetree/bindings/serio ?
Regards,
Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <132D0DB4B968F242BE373429794F35C22559D38329@NHS-PCLI-MBC011.AD1.NHS.NET>]
* RE:
[not found] <132D0DB4B968F242BE373429794F35C22559D38329@NHS-PCLI-MBC011.AD1.NHS.NET>
@ 2015-06-08 11:09 ` Practice Trinity (NHS SOUTH SEFTON CCG)
0 siblings, 0 replies; 1651+ messages in thread
From: Practice Trinity (NHS SOUTH SEFTON CCG) @ 2015-06-08 11:09 UTC (permalink / raw)
To: Practice Trinity (NHS SOUTH SEFTON CCG)
$1.5 C.A.D for you email ( leonh2800@gmail.com ) for info
********************************************************************************************************************
This message may contain confidential information. If you are not the intended recipient please inform the
sender that you have received the message in error before deleting it.
Please do not disclose, copy or distribute information in this e-mail or take any action in reliance on its contents:
to do so is strictly prohibited and may be unlawful.
Thank you for your co-operation.
NHSmail is the secure email and directory service available for all NHS staff in England and Scotland
NHSmail is approved for exchanging patient data and other sensitive information with NHSmail and GSi recipients
NHSmail provides an email address for your career in the NHS and can be accessed anywhere
********************************************************************************************************************
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2015-06-10 18:17 Robert Reynolds
0 siblings, 0 replies; 1651+ messages in thread
From: Robert Reynolds @ 2015-06-10 18:17 UTC (permalink / raw)
To: sparclinux
Your email address has brought you an unexpected luck, which was selected in The Euro Millions Lottery and subsequently won you the sum of 1,000,000 Euros. Contact Monica Torres Email: euromillionsdpt@qq.com to claim your prize.
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAHxZcryF7pNoENh8vpo-uvcEo5HYA5XgkZFWrLEHM5Hhf5ay+Q@mail.gmail.com>]
[parent not found: <CACy=+DtdZOUT4soNZ=zz+_qhCfM=C8Oa0D5gjRC7QM3nYi4oEw@mail.gmail.com>]
* Re:
[not found] <CACy=+DtdZOUT4soNZ=zz+_qhCfM=C8Oa0D5gjRC7QM3nYi4oEw@mail.gmail.com>
@ 2015-07-11 18:37 ` Mustapha Abiola
0 siblings, 0 replies; 1651+ messages in thread
From: Mustapha Abiola @ 2015-07-11 18:37 UTC (permalink / raw)
To: eparis, paul, linux-kernel, linux-audit, mingo
[-- Attachment #1: Type: text/plain, Size: 1 bytes --]
[-- Attachment #2: 0001-Fix-redundant-check-against-unsigned-int-in-broken-a.patch --]
[-- Type: application/octet-stream, Size: 930 bytes --]
From 55fae099d46749b73895934aab8c2823c5a23abe Mon Sep 17 00:00:00 2001
From: Mustapha Abiola <hi@mustapha.org>
Date: Sat, 11 Jul 2015 17:01:04 +0000
Subject: [PATCH 1/1] Fix redundant check against unsigned int in broken audit
test fix for exec arg len
Quick patch to fix the needless check of `len` being < 0, as it is an
unsigned int.
Signed-off-by: Mustapha Abiola <hi@mustapha.org>
---
kernel/auditsc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index e85bdfd..0012476 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1021,7 +1021,7 @@ static int audit_log_single_execve_arg(struct audit_context *context,
* for strings that are too long, we should not have created
* any.
*/
- if (WARN_ON_ONCE(len < 0 || len > MAX_ARG_STRLEN - 1)) {
+ if (WARN_ON_ONCE(len > MAX_ARG_STRLEN - 1)) {
send_sig(SIGKILL, current, 0);
return -1;
}
--
1.9.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* RE::
@ 2015-07-28 18:54 FREELOTTO-u79uwXL29TY76Z2rM5mHXA, PROMO-u79uwXL29TY76Z2rM5mHXA
0 siblings, 0 replies; 1651+ messages in thread
From: FREELOTTO-u79uwXL29TY76Z2rM5mHXA, PROMO-u79uwXL29TY76Z2rM5mHXA @ 2015-07-28 18:54 UTC (permalink / raw)
To: Recipients-u79uwXL29TY76Z2rM5mHXA
YOU WON 2,000,000.00 USD IN UK LOTTO
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown)
@ 2015-08-05 12:47 Ivan Chernyavsky
2015-08-15 9:19 ` Duy Nguyen
0 siblings, 1 reply; 1651+ messages in thread
From: Ivan Chernyavsky @ 2015-08-05 12:47 UTC (permalink / raw)
To: git
Dear community,
For some time I have been wondering why there is no "--grep" option to the "git branch" command, which would print only the branches having the specified string/regexp in their history.
So for example:
$ git branch -r --grep=BUG12345
should be roughly equivalent to following expression I'm using now for the same task:
$ for r in `git rev-list --grep=BUG12345 --remotes=origin`; do git branch -r --list --contains=$r 'origin/*'; done | sort -u
Am I missing something, is there some smarter/simpler way to do this?
Thanks a lot in advance!
--
Ivan
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-08-05 12:47 (unknown) Ivan Chernyavsky
@ 2015-08-15 9:19 ` Duy Nguyen
2015-08-17 17:49 ` Re: Junio C Hamano
0 siblings, 1 reply; 1651+ messages in thread
From: Duy Nguyen @ 2015-08-15 9:19 UTC (permalink / raw)
To: Ivan Chernyavsky; +Cc: Git Mailing List
On Wed, Aug 5, 2015 at 7:47 PM, Ivan Chernyavsky <camposer@yandex.ru> wrote:
> Dear community,
>
> For some time I'm wondering why there's no "--grep" option to the "git branch" command, which would request to print only branches having specified string/regexp in their history.
Probably because nobody has been interested enough to step up and do
it. The lack of response to your mail is a sign. Maybe you can try to
make a patch? I imagine it would not be so different from the current
--contains code, but this time we need to look into the commits
themselves, not just the commit ids.
> So for example:
>
> $ git branch -r --grep=BUG12345
>
> should be roughly equivalent to following expression I'm using now for the same task:
>
> $ for r in `git rev-list --grep=BUG12345 --remotes=origin`; do git branch -r --list --contains=$r 'origin/*'; done | sort -u
>
> Am I missing something, is there some smarter/simpler way to do this?
>
> Thanks a lot in advance!
>
> --
> Ivan
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Duy
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2015-08-15 9:19 ` Duy Nguyen
@ 2015-08-17 17:49 ` Junio C Hamano
0 siblings, 0 replies; 1651+ messages in thread
From: Junio C Hamano @ 2015-08-17 17:49 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Ivan Chernyavsky, Git Mailing List
Duy Nguyen <pclouds@gmail.com> writes:
> On Wed, Aug 5, 2015 at 7:47 PM, Ivan Chernyavsky <camposer@yandex.ru> wrote:
>> Dear community,
>>
>> For some time I'm wondering why there's no "--grep" option to the
>> "git branch" command, which would request to print only branches
>> having specified string/regexp in their history.
>
> Probably because nobody is interested and steps up to do it. The lack
> of response to you mail is a sign. Maybe you can try make a patch? I
> imagine it would not be so different from current --contains code, but
> this time we need to look into commits, not just commit id.
That is a dangerous thought. I'd understand if it were internally a
two-step process, i.e. (1) the first pass finds the commits that hit
the --grep criteria and then (2) the second pass does "--contains"
for all the hits found in the first pass using existing code; but
still, this operation is bound to dig all the way down to the root
of the history when asked to find something that does not exist.
>> So for example:
>>
>> $ git branch -r --grep=BUG12345
>>
>> should be roughly equivalent to following expression I'm using now for the same task:
>>
>> $ for r in `git rev-list --grep=BUG12345 --remotes=origin`; do git branch -r --list --contains=$r 'origin/*'; done | sort -u
You should at least feed all --contains to a single invocation of
"git branch". They are designed to be OR'ed together.
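The single-invocation approach suggested above can be sketched as follows. This is a hedged example run against a throwaway local repository; the marker string "BUG12345" is a hypothetical ticket id, and `git init -b` assumes git >= 2.28.

```shell
# Collect all commits matching the grep first, then pass every match to
# ONE `git branch` invocation, since multiple --contains options are
# OR'ed together.
set -e
repo=$(mktemp -d)
git -C "$repo" init -q -b main
g() { git -C "$repo" -c user.email=a@b -c user.name=a "$@"; }

g commit -q --allow-empty -m "initial"
g branch unrelated                    # forked before the fix lands
g commit -q --allow-empty -m "fix: BUG12345 resolved"
g branch topic                        # contains the fix

# One --contains per matching commit, all fed to a single `git branch`.
contains=$(g rev-list --grep=BUG12345 --all | sed 's/^/--contains=/')
g branch --list $contains             # lists main and topic, not unrelated
```

With several matching commits the `sed` expansion produces one `--contains=<sha>` per hit, so the branch listing is still a single process rather than one `git branch` call per commit.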
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2015-08-11 10:57 zso2bytom
0 siblings, 0 replies; 1651+ messages in thread
From: zso2bytom @ 2015-08-11 10:57 UTC (permalink / raw)
To: Recipients
Now you can get a loan at a 2% rate and take up to 40 years or more to repay it. These are not short-term loans that make you pay back within a few weeks or months. Our offer includes: * Refinancing * Home Improvement * Car Loans * Debt Consolidation * Line of Credit * Second Mortgage * Business Loans * Personal Loans
Get the money you need today, with plenty of time to make the payments back. To apply, send all questions or inquiries to: flowellhelpdesk@gmail.com +1-435-241-5945
---
This email is free from viruses and malware because avast! Antivirus protection is active.
https://www.avast.com/antivirus
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2015-08-19 11:09 christain147
0 siblings, 0 replies; 1651+ messages in thread
From: christain147 @ 2015-08-19 11:09 UTC (permalink / raw)
To: Recipients
Good day,hoping you read this email and respond to me in good time.I do not intend to solicit for funds but your time and energy in using my own resources to assist the less privileged.I am medically confined at the moment hence I request your indulgence.
I will give you a comprehensive brief once I hear from you.
Please forward your response to my private email address:
gudworks104@yahoo.com
Thanks and reply.
Robert Grondahl
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2015-09-01 12:01 Zariya
0 siblings, 0 replies; 1651+ messages in thread
From: Zariya @ 2015-09-01 12:01 UTC (permalink / raw)
To: Recipients
Help me and my 2 kids here in Syria We will share the 6,600,000 USD
I have here with you for your help, sorry to mention it
we want to leave Syria, put the kids in school and buy a new home
You will give us guidance when we arrive Their father died in the chemical weapon airstrike
I will send you our family pictures and more details as I read from you
Yours
Zariya
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2015-09-30 12:06 Apple-Free-Lotto
0 siblings, 0 replies; 1651+ messages in thread
From: Apple-Free-Lotto @ 2015-09-30 12:06 UTC (permalink / raw)
To: Recipients
You have won 760,889:00 GBP in Apple Free Lotto, without the sale of any tickets! Send. Full Name:. Mobile Number and Alternative Email Address. for details and instructions please contact Mr. Gilly Mann: Email: app.freeloto@foxmail.com
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2015-10-24 5:02 JO Bower
0 siblings, 0 replies; 1651+ messages in thread
From: JO Bower @ 2015-10-24 5:02 UTC (permalink / raw)
To: Recipients
Your email address has brought you an unexpected luck, which was selected in The Euro Millions Lottery and subsequently won you the sum of €1,000,000.00 Euros. Contact Monica Torres Email: monicatorresesp@gmail.com to claim your prize.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2015-10-26 7:30 Davies
0 siblings, 0 replies; 1651+ messages in thread
From: Davies @ 2015-10-26 7:30 UTC (permalink / raw)
To: Recipients
Hello
Do you need 100% finance? We Fund any long term or short term project at 3%. We have a passion for empowering people to improve their financial well-being.If you need a loan, kindly Contact us at: bendackgroup10@gmail.com
Full name:
Loan amount:
Loan duration:
Country:
Phone number:
Note:- All reply must be sent via: bendackgroup10@gmail.com
Thanks,
Anouncer
Daniel I. MarreroGoyco
---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2015-10-29 2:40 Unknown,
0 siblings, 0 replies; 1651+ messages in thread
From: Unknown, @ 2015-10-29 2:40 UTC (permalink / raw)
To: Recipients-u79uwXL29TY76Z2rM5mHXA
Hello,
I am Major. Alan Edward, in the military unit here in Afghanistan and i need an urgent assistance with someone i can trust,It's risk free and legal.
---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2015-11-01 20:03 Mario, Franco
0 siblings, 0 replies; 1651+ messages in thread
From: Mario, Franco @ 2015-11-01 20:03 UTC (permalink / raw)
To: Recipients
Confirm your email if it current!!!
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D@MGCCCMAIL2010-5.mgccc.cc.ms.us>]
[parent not found: <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D-np6RRm/yoI0WMyNdQYMtvx125T75Kgqw2GnX7Qjzz7g@public.gmane.org>]
* RE:
[not found] ` <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D-np6RRm/yoI0WMyNdQYMtvx125T75Kgqw2GnX7Qjzz7g@public.gmane.org>
@ 2015-11-24 13:21 ` Amis, Ryann
0 siblings, 0 replies; 1651+ messages in thread
From: Amis, Ryann @ 2015-11-24 13:21 UTC (permalink / raw)
To: MGCCC Helpdesk
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1286 bytes --]
Our new web mail has been improved with a new messaging system from Owa/outlook which also include faster usage on email, shared calendar, web-documents and the new 2015 anti-spam version. Please use the link below to complete your update for our new Owa/outlook improved web mail. CLICK HERE<https://formcrafts.com/a/15851> to update or Copy and pest the Link to your Browser: http://bit.ly/1Xo5Vd4
Thanks,
ITC Administrator.
-----------------------------------------
The information contained in this e-mail message is intended only for the personal and confidential use of the recipient(s) named above. This message may be an attorney-client communication and/or work product and as such is privileged and confidential. If the reader of this message is not the intended recipient or an agent responsible for delivering it to the intended recipient, you are hereby notified that you have received this document in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify us immediately by e-mail, and delete the original message.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
[not found] <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D@MGCCCMAIL2010-5.mgccc.cc.ms.us>
[not found] ` <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D-np6RRm/yoI0WMyNdQYMtvx125T75Kgqw2GnX7Qjzz7g@public.gmane.org>
@ 2015-11-24 13:21 ` Amis, Ryann
2015-11-24 13:21 ` RE: Amis, Ryann
2 siblings, 0 replies; 1651+ messages in thread
From: Amis, Ryann @ 2015-11-24 13:21 UTC (permalink / raw)
To: MGCCC Helpdesk
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
[not found] <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D@MGCCCMAIL2010-5.mgccc.cc.ms.us>
[not found] ` <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D-np6RRm/yoI0WMyNdQYMtvx125T75Kgqw2GnX7Qjzz7g@public.gmane.org>
2015-11-24 13:21 ` RE: Amis, Ryann
@ 2015-11-24 13:21 ` Amis, Ryann
2 siblings, 0 replies; 1651+ messages in thread
From: Amis, Ryann @ 2015-11-24 13:21 UTC (permalink / raw)
To: MGCCC Helpdesk
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2016-01-13 11:34 Alexey Ivanov
2016-01-13 13:12 ` Michal Kazior
0 siblings, 1 reply; 1651+ messages in thread
From: Alexey Ivanov @ 2016-01-13 11:34 UTC (permalink / raw)
To: ath10k
I'm trying to run OpenWrt (r48016) with the ath10k driver on this device
(https://wikidevi.com/wiki/EnGenius_EAP1750H)
The calibration data is present at 0x5000@ART. I've copied it to board.bin
But the driver fails to load:
[ 10.313982] ath10k_pci 0000:00:00.0: pci irq legacy interrupts 0
irq_mode 0 reset_mode 0
[ 10.526214] ath10k_pci 0000:00:00.0: Direct firmware load for
ath10k/cal-pci-0000:00:00.0.bin failed with error -2
[ 10.536740] ath10k_pci 0000:00:00.0: Falling back to user helper
[ 10.611930] firmware ath10k!cal-pci-0000:00:00.0.bin:
firmware_loading_store: map pages failed
[ 10.749685] ath10k_pci 0000:00:00.0: qca988x hw2.0 target
0x4100016c chip_id 0x043202ff sub 0000:0000
[ 10.759085] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 1
tracing 0 dfs 0 testmode 1
[ 10.772154] ath10k_pci 0000:00:00.0: firmware ver 10.2.4.97 api 5
features no-p2p crc32 f91e34f2
[ 10.821280] ath10k_pci 0000:00:00.0: Direct firmware load for
ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
[ 10.831887] ath10k_pci 0000:00:00.0: Falling back to user helper
[ 10.907421] firmware ath10k!QCA988X!hw2.0!board-2.bin:
firmware_loading_store: map pages failed
[ 10.916658] ath10k_pci 0000:00:00.0: board_file api 1 bmi_id N/A
crc32 bebc7c08
[ 11.012769] ath10k_pci 0000:00:00.0: otp calibration failed: 2
[ 11.018708] ath10k_pci 0000:00:00.0: failed to run otp: -22
[ 11.024367] ath10k_pci 0000:00:00.0: could not init core (-22)
[ 11.030370] ath10k_pci 0000:00:00.0: could not probe fw (-22)
The output after insmod ath10k_core skip_otp=Y:
[ 1346.161436] ath10k_pci 0000:00:00.0: pci irq legacy interrupts 0
irq_mode 0 reset_mode 0
[ 1346.376382] ath10k_pci 0000:00:00.0: Direct firmware load for
ath10k/cal-pci-0000:00:00.0.bin failed with error -2
[ 1346.386932] ath10k_pci 0000:00:00.0: Falling back to user helper
[ 1346.463861] firmware ath10k!cal-pci-0000:00:00.0.bin:
firmware_loading_store: map pages failed
[ 1346.475118] ath10k_pci 0000:00:00.0: qca988x hw2.0 target
0x4100016c chip_id 0x043202ff sub 0000:0000
[ 1346.484521] ath10k_pci 0000:00:00.0: kconfig debug 1 debugfs 1
tracing 0 dfs 0 testmode 1
[ 1346.497614] ath10k_pci 0000:00:00.0: firmware ver 10.2.4.97 api 5
features no-p2p crc32 f91e34f2
[ 1346.548363] ath10k_pci 0000:00:00.0: Direct firmware load for
ath10k/QCA988X/hw2.0/board-2.bin failed with error -2
[ 1346.558999] ath10k_pci 0000:00:00.0: Falling back to user helper
[ 1346.635967] firmware ath10k!QCA988X!hw2.0!board-2.bin:
firmware_loading_store: map pages failed
[ 1346.645156] ath10k_pci 0000:00:00.0:
board_file api 1 bmi_id N/A crc32 bebc7c08
[ 1347.782528] ath10k_pci 0000:00:00.0: htt-ver 2.1 wmi-op 5 htt-op 2
cal otp max-sta 128 raw 0 hwcrypto 1
The original QSDK version by EnGenius works fine.
What am I doing wrong? Or is
board_file api 1 not supported by ath10k?
--
Best regards,
Alex Ivanov
_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k
^ permalink raw reply [flat|nested] 1651+ messages in thread
* re:
@ 2016-01-13 12:46 Adam Richter
0 siblings, 0 replies; 1651+ messages in thread
From: Adam Richter @ 2016-01-13 12:46 UTC (permalink / raw)
To: zh1001, FRoss Perry, alexander deucher, adam richter2004,
nana5kids, barrykendall, containers, ann zhang888, sca38018,
westglen, scott, stephanie bertron
http://ruspartner.su/next.php Adam Richter
Sent from Yahoo Mail for iPhone
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/containers
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <569A640D.801@gmail.com>]
* (no subject)
@ 2016-02-26 1:19 Fredrick Prashanth John Berchmans
2016-02-26 7:37 ` Richard Weinberger
0 siblings, 1 reply; 1651+ messages in thread
From: Fredrick Prashanth John Berchmans @ 2016-02-26 1:19 UTC (permalink / raw)
To: dwmw2; +Cc: linux-mtd, Suresh Siddha
Hi,
We are using UBIFS on our NOR flash.
We are observing that the filesystem frequently goes read-only and is
unable to recover.
Most of the time it's due to
a) ubifs_scan_a_node failing due to a bad CRC or unclean free space.
b) ubifs_leb_write failing due to an erase timeout
[ The above would have happened due to unclean power cuts. In our
environment this happens often ]
I checked the code in jffs2. It looks like jffs2 tolerates the above two
failures and moves on without remounting read-only.
Is my understanding right?
Could we change ubifs_scan_a_node to skip corrupted bytes and move to the
next node, instead of returning an error?
Thanks,
Fredrick
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-02-26 1:19 Fredrick Prashanth John Berchmans
@ 2016-02-26 7:37 ` Richard Weinberger
0 siblings, 0 replies; 1651+ messages in thread
From: Richard Weinberger @ 2016-02-26 7:37 UTC (permalink / raw)
To: Fredrick Prashanth John Berchmans
Cc: David Woodhouse, linux-mtd@lists.infradead.org, Suresh Siddha
On Fri, Feb 26, 2016 at 2:19 AM, Fredrick Prashanth John Berchmans
<fredrickprashanth@gmail.com> wrote:
> We are using UBIFS on our NOR flash.
> We are observing that a lot of times the filesystem goes to read-only
> unable to recover.
> Most of the time its due to
> a) ubifs_scan_a_node failing due to bad crc or unclean free space.
> b) ubifs_leb_write failing to erase due to erase timeout
>
> [ The above would have happened due to unclean power cuts. In our
> environment this happens often ]
>
> I checked the code in jffs2. Looking at jffs2 code it looks like jffs2
> tolerates the above two
> failures and moves on without mounting read-only.
> Is my understanding right ?
>
> Could we change the ubifs_scan_a_node to skip corrupted bytes and move
> to next node,
> instead of returning error ?
Not without a detailed analysis of what exactly is going on.
It sounds more like an ad-hoc hack. :-)
--
Thanks,
//richard
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2016-04-11 19:04 miwilliams
2016-04-12 4:33 ` Stefan Beller
0 siblings, 1 reply; 1651+ messages in thread
From: miwilliams @ 2016-04-11 19:04 UTC (permalink / raw)
To: git
From 7201fe08ede76e502211a781250c9a0b702a78b2 Mon Sep 17 00:00:00 2001
From: Mike Williams <miwilliams@google.com>
Date: Mon, 11 Apr 2016 14:18:39 -0400
Subject: [PATCH 1/1] wt-status: Remove '!!' from
wt_status_collect_changed_cb
The wt_status_collect_changed_cb function uses an extraneous double
negation (!!)
when determining whether or not a submodule has new commits.
Signed-off-by: Mike Williams <miwilliams@google.com>
---
wt-status.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/wt-status.c b/wt-status.c
index ef74864..b955179 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -431,7 +431,7 @@ static void wt_status_collect_changed_cb(struct
diff_queue_struct *q,
d->worktree_status = p->status;
d->dirty_submodule = p->two->dirty_submodule;
if (S_ISGITLINK(p->two->mode))
- d->new_submodule_commits = !!hashcmp(p->one->sha1, p->two->sha1);
+ d->new_submodule_commits = hashcmp(p->one->sha1, p->two->sha1);
}
}
--
2.8.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2016-04-11 19:04 (unknown), miwilliams
@ 2016-04-12 4:33 ` Stefan Beller
0 siblings, 0 replies; 1651+ messages in thread
From: Stefan Beller @ 2016-04-12 4:33 UTC (permalink / raw)
To: Mike Williams; +Cc: git@vger.kernel.org
On Mon, Apr 11, 2016 at 12:04 PM, <miwilliams@google.com> wrote:
> From 7201fe08ede76e502211a781250c9a0b702a78b2 Mon Sep 17 00:00:00 2001
> From: Mike Williams <miwilliams@google.com>
> Date: Mon, 11 Apr 2016 14:18:39 -0400
> Subject: [PATCH 1/1] wt-status: Remove '!!' from
> wt_status_collect_changed_cb
>
> The wt_status_collect_changed_cb function uses an extraneous double negation
> (!!)
How is a !! erroneous?
It serves to map an integer value (-1, 0, 1, 2, 3, 4)
to a boolean (0 or 1, or a real bit in a bit field).
> when determining whether or not a submodule has new commits.
>
> Signed-off-by: Mike Williams <miwilliams@google.com>
> ---
> wt-status.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/wt-status.c b/wt-status.c
> index ef74864..b955179 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -431,7 +431,7 @@ static void wt_status_collect_changed_cb(struct
> diff_queue_struct *q,
> d->worktree_status = p->status;
> d->dirty_submodule = p->two->dirty_submodule;
> if (S_ISGITLINK(p->two->mode))
> - d->new_submodule_commits = !!hashcmp(p->one->sha1,
> p->two->sha1);
> + d->new_submodule_commits = hashcmp(p->one->sha1,
> p->two->sha1);
> }
> }
>
> --
> 2.8.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown)
@ 2016-04-22 8:25 Daniel Lezcano
2016-04-22 8:27 ` Daniel Lezcano
0 siblings, 1 reply; 1651+ messages in thread
From: Daniel Lezcano @ 2016-04-22 8:25 UTC (permalink / raw)
To: rjw; +Cc: jszhang, lorenzo.pieralisi, andy.gross, linux-pm,
linux-arm-kernel
Hi Rafael,
please pull the following changes for 4.7.
* Constify the cpuidle_ops structure and the types returned by the
  functions using it (Jisheng Zhang)
Thanks !
-- Daniel
The following changes since commit c3b46c73264b03000d1e18b22f5caf63332547c9:
Linux 4.6-rc4 (2016-04-17 19:13:32 -0700)
are available in the git repository at:
http://git.linaro.org/people/daniel.lezcano/linux.git cpuidle/4.7
for you to fetch changes up to 5e7c17df795e462c70a43f1b3b670e08efefe8fd:
drivers: firmware: psci: use const and __initconst for psci_cpuidle_ops
(2016-04-20 10:44:32 +0200)
----------------------------------------------------------------
Jisheng Zhang (4):
ARM: cpuidle: add const qualifier to cpuidle_ops member in structures
ARM: cpuidle: constify return value of arm_cpuidle_get_ops()
soc: qcom: spm: Use const and __initconst for qcom_cpuidle_ops
drivers: firmware: psci: use const and __initconst for
psci_cpuidle_ops
arch/arm/include/asm/cpuidle.h | 2 +-
arch/arm/kernel/cpuidle.c | 6 +++---
drivers/firmware/psci.c | 2 +-
drivers/soc/qcom/spm.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <E5ACCB586875944EB0AE0E3EFA32B4F526FAD24C@exchange0.winona.edu>]
* (unknown),
@ 2016-05-18 16:26 Warner Losh
[not found] ` <CANCZdfow154vh3kHqUNUM6CoBsC9Vu3_+SEjFG1dz=FOkc9vsg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
0 siblings, 1 reply; 1651+ messages in thread
From: Warner Losh @ 2016-05-18 16:26 UTC (permalink / raw)
To: devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Greetings,
I was looking at the draft link posted here
https://github.com/devicetree-org/devicetree-specification-released/blob/master/prerelease/devicetree-specification-v0.1-pre1-20160429.pdf
a while ago. I hope this is the right place to ask about it.
It raised a bit of a question. There's nothing in it talking about the
current
practice of using CPP to pre-process the .dts/.dtsi files before passing
them
into dtc to compile them into dtb.
Normally, I see such things outside the scope of standardization. However,
many of the .dts files that are in the wild today use a number of #define
constants to make things more readable (having GPIO_ACTIVE_HIGH
instead of '0' makes the .dts files easier to read). However, there's a
small
issue that I've had. The files that contain those definitions are currently
in the Linux kernel and have a wide variety of licenses (including none
at all).
So before even getting to the notion of licenses and such (which past
experience suggests may be the worst place to start a discussion), I'm
wondering where that will be defined, and if these #defines will become
part of the standard for each of the bindings that are defined.
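For illustration, the idiom described above looks roughly like this (a sketch; the include path and constant follow common Linux kernel practice and are assumptions here, not part of the draft spec):

```dts
/* With cpp preprocessing, the symbolic name resolves before dtc runs. */
#include <dt-bindings/gpio/gpio.h>   /* defines GPIO_ACTIVE_HIGH as 0 */

/ {
	leds {
		compatible = "gpio-leds";
		status {
			/* readable form, requires cpp: */
			gpios = <&gpio0 12 GPIO_ACTIVE_HIGH>;
			/* without cpp, one must write the raw cell: <&gpio0 12 0> */
		};
	};
};
```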
I'm also wondering where the larger issue of using cpp to process the dts
files will be discussed, since FreeBSD's BSDL dtc suffers interoperability problems
due to this issue. Having the formal spec will also be helpful for its care and
feeding since many fine points have had to be decided based on .dts
files in the wild rather than a clear spec.
Thanks again for spear-heading the effort to get a new version out now
that ePAPR has fallen on hard times.
Warner
P.S. I'm mostly a FreeBSD guy, but just spent some time digging into this
issue for another of the BSDs that's considering adopting DTS files.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread* (no subject)
@ 2016-06-14 7:06 Raphael Poggi
2016-06-24 8:17 ` Raphaël Poggi
0 siblings, 1 reply; 1651+ messages in thread
From: Raphael Poggi @ 2016-06-14 7:06 UTC (permalink / raw)
To: barebox
Change since v1:
PATCH 2/12: remove hunk which belongs to patch adding mach-qemu
PATCH 3/12: remove unused files
PATCH 4/12: create lowlevel64
PATCH 11/12: create pgtables64 (nothing in common with the arm32 version)
PATCH 12/12: rename "mach-virt" => "mach-qemu"
rename board "qemu_virt64"
remove board env files
Hello,
This patch series introduces a basic support for arm64.
The arm64 code is merged in the current arch/arm directory.
I tried to be iterative in the merge process and find correct solutions
to handle both architectures in some places.
I tested the patch series by compiling the arm64 virt machine and the arm32
vexpress-a9 and running both in qemu; everything seems to work.
Thanks,
Raphaël
arch/arm/Kconfig | 28 ++
arch/arm/Makefile | 30 +-
arch/arm/boards/Makefile | 1 +
arch/arm/boards/qemu-virt64/Kconfig | 8 +
arch/arm/boards/qemu-virt64/Makefile | 1 +
arch/arm/boards/qemu-virt64/init.c | 67 ++++
arch/arm/configs/qemu_virt64_defconfig | 55 +++
arch/arm/cpu/Kconfig | 29 +-
arch/arm/cpu/Makefile | 26 +-
arch/arm/cpu/cache-armv8.S | 168 +++++++++
arch/arm/cpu/cache.c | 19 +
arch/arm/cpu/cpu.c | 5 +
arch/arm/cpu/cpuinfo.c | 58 ++-
arch/arm/cpu/exceptions_64.S | 127 +++++++
arch/arm/cpu/interrupts.c | 47 +++
arch/arm/cpu/lowlevel_64.S | 40 ++
arch/arm/cpu/mmu.h | 54 +++
arch/arm/cpu/mmu_64.c | 333 +++++++++++++++++
arch/arm/cpu/start.c | 2 +
arch/arm/include/asm/bitops.h | 5 +
arch/arm/include/asm/cache.h | 9 +
arch/arm/include/asm/mmu.h | 14 +-
arch/arm/include/asm/pgtable64.h | 140 +++++++
arch/arm/include/asm/system.h | 46 ++-
arch/arm/include/asm/system_info.h | 38 ++
arch/arm/lib64/Makefile | 10 +
arch/arm/lib64/armlinux.c | 275 ++++++++++++++
arch/arm/lib64/asm-offsets.c | 16 +
arch/arm/lib64/barebox.lds.S | 125 +++++++
arch/arm/lib64/bootm.c | 572 +++++++++++++++++++++++++++++
arch/arm/lib64/copy_template.S | 192 ++++++++++
arch/arm/lib64/div0.c | 27 ++
arch/arm/lib64/memcpy.S | 74 ++++
arch/arm/lib64/memset.S | 215 +++++++++++
arch/arm/lib64/module.c | 98 +++++
arch/arm/mach-qemu/Kconfig | 18 +
arch/arm/mach-qemu/Makefile | 2 +
arch/arm/mach-qemu/include/mach/debug_ll.h | 24 ++
arch/arm/mach-qemu/include/mach/devices.h | 13 +
arch/arm/mach-qemu/virt_devices.c | 30 ++
arch/arm/mach-qemu/virt_lowlevel.c | 19 +
41 files changed, 3044 insertions(+), 16 deletions(-)
_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-06-14 7:06 Raphael Poggi
@ 2016-06-24 8:17 ` Raphaël Poggi
2016-06-24 11:49 ` Re: Sascha Hauer
0 siblings, 1 reply; 1651+ messages in thread
From: Raphaël Poggi @ 2016-06-24 8:17 UTC (permalink / raw)
To: barebox, Sascha Hauer
Hi Sascha,
Beside the comments on [PATCH 01/12] and [PATCH 03/12], do you have
any comments about the series ? I have a v3 series ready to be sent
(with your recent suggestions).
Thanks,
Raphaël
2016-06-14 9:06 GMT+02:00 Raphael Poggi <poggi.raph@gmail.com>:
> Change since v1:
> PATCH 2/12: remove hunk which belongs to patch adding mach-qemu
>
> PATCH 3/12: remove unused files
>
> PATCH 4/12: create lowlevel64
>
> PATCH 11/12: create pgtables64 (nothing in common with the arm32 version)
>
> PATCH 12/12: rename "mach-virt" => "mach-qemu"
> rename board "qemu_virt64"
> remove board env files
>
>
> Hello,
>
> This patch series introduces a basic support for arm64.
>
> The arm64 code is merged in the current arch/arm directory.
> I try to be iterative in the merge process, and find correct solutions
> to handle both architecture at some places.
>
> I test the patch series by compiling arm64 virt machine and arm32 vexpress-a9 and test it
> in qemu, everything seems to work.
>
> Thanks,
> Raphaël
>
> arch/arm/Kconfig | 28 ++
> arch/arm/Makefile | 30 +-
> arch/arm/boards/Makefile | 1 +
> arch/arm/boards/qemu-virt64/Kconfig | 8 +
> arch/arm/boards/qemu-virt64/Makefile | 1 +
> arch/arm/boards/qemu-virt64/init.c | 67 ++++
> arch/arm/configs/qemu_virt64_defconfig | 55 +++
> arch/arm/cpu/Kconfig | 29 +-
> arch/arm/cpu/Makefile | 26 +-
> arch/arm/cpu/cache-armv8.S | 168 +++++++++
> arch/arm/cpu/cache.c | 19 +
> arch/arm/cpu/cpu.c | 5 +
> arch/arm/cpu/cpuinfo.c | 58 ++-
> arch/arm/cpu/exceptions_64.S | 127 +++++++
> arch/arm/cpu/interrupts.c | 47 +++
> arch/arm/cpu/lowlevel_64.S | 40 ++
> arch/arm/cpu/mmu.h | 54 +++
> arch/arm/cpu/mmu_64.c | 333 +++++++++++++++++
> arch/arm/cpu/start.c | 2 +
> arch/arm/include/asm/bitops.h | 5 +
> arch/arm/include/asm/cache.h | 9 +
> arch/arm/include/asm/mmu.h | 14 +-
> arch/arm/include/asm/pgtable64.h | 140 +++++++
> arch/arm/include/asm/system.h | 46 ++-
> arch/arm/include/asm/system_info.h | 38 ++
> arch/arm/lib64/Makefile | 10 +
> arch/arm/lib64/armlinux.c | 275 ++++++++++++++
> arch/arm/lib64/asm-offsets.c | 16 +
> arch/arm/lib64/barebox.lds.S | 125 +++++++
> arch/arm/lib64/bootm.c | 572 +++++++++++++++++++++++++++++
> arch/arm/lib64/copy_template.S | 192 ++++++++++
> arch/arm/lib64/div0.c | 27 ++
> arch/arm/lib64/memcpy.S | 74 ++++
> arch/arm/lib64/memset.S | 215 +++++++++++
> arch/arm/lib64/module.c | 98 +++++
> arch/arm/mach-qemu/Kconfig | 18 +
> arch/arm/mach-qemu/Makefile | 2 +
> arch/arm/mach-qemu/include/mach/debug_ll.h | 24 ++
> arch/arm/mach-qemu/include/mach/devices.h | 13 +
> arch/arm/mach-qemu/virt_devices.c | 30 ++
> arch/arm/mach-qemu/virt_lowlevel.c | 19 +
> 41 files changed, 3044 insertions(+), 16 deletions(-)
>
>
> _______________________________________________
> barebox mailing list
> barebox@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/barebox
_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-06-24 8:17 ` Raphaël Poggi
@ 2016-06-24 11:49 ` Sascha Hauer
0 siblings, 0 replies; 1651+ messages in thread
From: Sascha Hauer @ 2016-06-24 11:49 UTC (permalink / raw)
To: Raphaël Poggi; +Cc: barebox
Hi Raphaël,
On Fri, Jun 24, 2016 at 10:17:45AM +0200, Raphaël Poggi wrote:
> Hi Sascha,
>
> Beside the comments on [PATCH 01/12] and [PATCH 03/12], do you have
> any comments about the series ? I have a v3 series ready to be sent
> (with your recent suggestions).
No more comments for now, go ahead with the new series.
Sascha
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2016-07-15 18:16 Arnold Zeigler
0 siblings, 0 replies; 1651+ messages in thread
From: Arnold Zeigler @ 2016-07-15 18:16 UTC (permalink / raw)
To: sparclinux
Hello Friend,
I'm sorry to reach out in this manner but I had no choice other than this. My
name and contact can be seen below. I would like to discuss a partnership
with you. I expect your response so I can send more details.
Regards,
Arnold Zeigler
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2016-09-01 2:02 Fennec Fox
2016-09-01 3:10 ` Jeff Mahoney
0 siblings, 1 reply; 1651+ messages in thread
From: Fennec Fox @ 2016-09-01 2:02 UTC (permalink / raw)
To: linux-btrfs
Linux Titanium 4.7.2-1-MANJARO #1 SMP PREEMPT Sun Aug 21 15:04:37 UTC
2016 x86_64 GNU/Linux
btrfs-progs v4.7
Data, single: total=30.01GiB, used=18.95GiB
System, single: total=4.00MiB, used=16.00KiB
Metadata, single: total=1.01GiB, used=422.17MiB
GlobalReserve, single: total=144.00MiB, used=0.00B
{02:50} Wed Aug 31
[fennectech@Titanium ~]$ sudo fstrim -v /
[sudo] password for fennectech:
Sorry, try again.
[sudo] password for fennectech:
/: 99.8 GiB (107167244288 bytes) trimmed
{03:08} Wed Aug 31
[fennectech@Titanium ~]$ sudo fstrim -v /
[sudo] password for fennectech:
/: 99.9 GiB (107262181376 bytes) trimmed
I ran these commands minutes after each other, and each time it trimmed
the entire free space.
Anyone else seen this? The filesystem is the root FS and is compressed.
--
Fennec
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-09-01 2:02 Fennec Fox
@ 2016-09-01 3:10 ` Jeff Mahoney
2016-09-01 19:32 ` Re: Kai Krakow
0 siblings, 1 reply; 1651+ messages in thread
From: Jeff Mahoney @ 2016-09-01 3:10 UTC (permalink / raw)
To: Fennec Fox, linux-btrfs
[-- Attachment #1.1: Type: text/plain, Size: 1087 bytes --]
On 8/31/16 10:02 PM, Fennec Fox wrote:
> Linux Titanium 4.7.2-1-MANJARO #1 SMP PREEMPT Sun Aug 21 15:04:37 UTC
> 2016 x86_64 GNU/Linux
> btrfs-progs v4.7
>
> Data, single: total=30.01GiB, used=18.95GiB
> System, single: total=4.00MiB, used=16.00KiB
> Metadata, single: total=1.01GiB, used=422.17MiB
> GlobalReserve, single: total=144.00MiB, used=0.00B
>
> {02:50} Wed Aug 31
> [fennectech@Titanium ~]$ sudo fstrim -v /
> [sudo] password for fennectech:
> Sorry, try again.
> [sudo] password for fennectech:
> /: 99.8 GiB (107167244288 bytes) trimmed
>
> {03:08} Wed Aug 31
> [fennectech@Titanium ~]$ sudo fstrim -v /
> [sudo] password for fennectech:
> /: 99.9 GiB (107262181376 bytes) trimmed
>
> I ran these commands minutes after echother ane each time it is
> trimming the entire free space
>
> Anyone else seen this? the filesystem is the root FS and is compressed
>
Yes. It's working as intended. We don't track what space has already
been trimmed anywhere, so it trims all unallocated space.
-Jeff
--
Jeff Mahoney
SUSE Labs
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 827 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-09-01 3:10 ` Jeff Mahoney
@ 2016-09-01 19:32 ` Kai Krakow
0 siblings, 0 replies; 1651+ messages in thread
From: Kai Krakow @ 2016-09-01 19:32 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 1823 bytes --]
Am Wed, 31 Aug 2016 23:10:13 -0400
schrieb Jeff Mahoney <jeffm@suse.com>:
> On 8/31/16 10:02 PM, Fennec Fox wrote:
> > Linux Titanium 4.7.2-1-MANJARO #1 SMP PREEMPT Sun Aug 21 15:04:37
> > UTC 2016 x86_64 GNU/Linux
> > btrfs-progs v4.7
> >
> > Data, single: total=30.01GiB, used=18.95GiB
> > System, single: total=4.00MiB, used=16.00KiB
> > Metadata, single: total=1.01GiB, used=422.17MiB
> > GlobalReserve, single: total=144.00MiB, used=0.00B
> >
> > {02:50} Wed Aug 31
> > [fennectech@Titanium ~]$ sudo fstrim -v /
> > [sudo] password for fennectech:
> > Sorry, try again.
> > [sudo] password for fennectech:
> > /: 99.8 GiB (107167244288 bytes) trimmed
> >
> > {03:08} Wed Aug 31
> > [fennectech@Titanium ~]$ sudo fstrim -v /
> > [sudo] password for fennectech:
> > /: 99.9 GiB (107262181376 bytes) trimmed
> >
> > I ran these commands minutes after echother ane each time it is
> > trimming the entire free space
> >
> > Anyone else seen this? the filesystem is the root FS and is
> > compressed
>
> Yes. It's working as intended. We don't track what space has already
> been trimmed anywhere, so it trims all unallocated space.
I wonder, does it work in a multi-device scenario, when btrfs pools
multiple devices together?
I ask because fstrim seems to always report the estimated free space,
not the raw free space, as trimmed.
OTOH, this may simply be because btrfs reports 1.08 TiB unallocated
while fstrim reports 1.2 TB trimmed (and not TiB) - which when
"converted" (1.08 * 1024^4 / 1000^4 ~= 1.18) perfectly rounds to 1.2.
Coincidentally, the estimated free space is 1.19 TiB for me (which would
also round to 1.2), and these numbers, being in the TB range, won't
change very fast for me.
--
Regards,
Kai
Replies to list-only preferred.
[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 181 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2016-09-10 21:51 Michelle Ouellette
0 siblings, 0 replies; 1651+ messages in thread
From: Michelle Ouellette @ 2016-09-10 21:51 UTC (permalink / raw)
To: barebox
Hello,My name is Gloria Mackenzie, i have a donation to make for less privilege, am writing you with a friend's email, please contact me on gloria.mackenzie001@rogers.com
_______________________________________________
barebox mailing list
barebox@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/barebox
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2016-09-27 16:50 Rajat Jain
2016-09-27 16:57 ` Rajat Jain
0 siblings, 1 reply; 1651+ messages in thread
From: Rajat Jain @ 2016-09-27 16:50 UTC (permalink / raw)
To: Amitkumar Karwar, Nishant Sarmukadam, Kalle Valo, linux-wireless,
netdev
Cc: Wei-Ning Huang, Brian Norris, Eric Caruso, rajatxjain, Rajat Jain
From: Wei-Ning Huang <wnhuang@google.com>
Date: Thu, 17 Mar 2016 11:43:16 +0800
Subject: [PATCH] mwifiex: report wakeup for wowlan
Enable notifying the wakeup source to the PM core. This allows darkresume
to correctly track the wakeup source and mark mwifiex_plt as an
'automatic' wakeup source.
Signed-off-by: Wei-Ning Huang <wnhuang@google.com>
Signed-off-by: Rajat Jain <rajatja@google.com>
Tested-by: Wei-Ning Huang <wnhuang@chromium.org>
Reviewed-by: Eric Caruso <ejcaruso@chromium.org>
---
drivers/net/wireless/marvell/mwifiex/sdio.c | 8 ++++++++
drivers/net/wireless/marvell/mwifiex/sdio.h | 1 +
2 files changed, 9 insertions(+)
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index d3e1561..a5f63e4 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -89,6 +89,9 @@ static irqreturn_t mwifiex_wake_irq_wifi(int irq, void *priv)
disable_irq_nosync(irq);
}
+ /* Notify PM core we are wakeup source */
+ pm_wakeup_event(cfg->dev, 0);
+
return IRQ_HANDLED;
}
@@ -112,6 +115,7 @@ static int mwifiex_sdio_probe_of(struct device *dev, struct sdio_mmc_card *card)
GFP_KERNEL);
cfg = card->plt_wake_cfg;
if (cfg && card->plt_of_node) {
+ cfg->dev = dev;
cfg->irq_wifi = irq_of_parse_and_map(card->plt_of_node, 0);
if (!cfg->irq_wifi) {
dev_dbg(dev,
@@ -130,6 +134,10 @@ static int mwifiex_sdio_probe_of(struct device *dev, struct sdio_mmc_card *card)
}
}
+ ret = device_init_wakeup(dev, true);
+ if (ret)
+ dev_err(dev, "fail to init wakeup for mwifiex");
+
return 0;
}
diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.h b/drivers/net/wireless/marvell/mwifiex/sdio.h
index db837f1..07cdd23 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.h
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.h
@@ -155,6 +155,7 @@
} while (0)
struct mwifiex_plt_wake_cfg {
+ struct device *dev;
int irq_wifi;
bool wake_by_wifi;
};
--
2.8.0.rc3.226.g39d4020
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2016-09-27 16:50 Rajat Jain
@ 2016-09-27 16:57 ` Rajat Jain
0 siblings, 0 replies; 1651+ messages in thread
From: Rajat Jain @ 2016-09-27 16:57 UTC (permalink / raw)
To: Amitkumar Karwar, Nishant Sarmukadam, Kalle Valo, linux-wireless,
netdev
Cc: Wei-Ning Huang, Brian Norris, Eric Caruso, Rajat Jain, Rajat Jain
Please ignore, not sure why this landed without a subject.
On Tue, Sep 27, 2016 at 9:50 AM, Rajat Jain <rajatja@google.com> wrote:
> From: Wei-Ning Huang <wnhuang@google.com>
>
> Date: Thu, 17 Mar 2016 11:43:16 +0800
> Subject: [PATCH] mwifiex: report wakeup for wowlan
>
> Enable notifying wakeup source to the PM core. This allow darkresume to
> correctly track wakeup source and mark mwifiex_plt as 'automatic' wakeup
> source.
>
> Signed-off-by: Wei-Ning Huang <wnhuang@google.com>
> Signed-off-by: Rajat Jain <rajatja@google.com>
> Tested-by: Wei-Ning Huang <wnhuang@chromium.org>
> Reviewed-by: Eric Caruso <ejcaruso@chromium.org>
> ---
> drivers/net/wireless/marvell/mwifiex/sdio.c | 8 ++++++++
> drivers/net/wireless/marvell/mwifiex/sdio.h | 1 +
> 2 files changed, 9 insertions(+)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index d3e1561..a5f63e4 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -89,6 +89,9 @@ static irqreturn_t mwifiex_wake_irq_wifi(int irq, void *priv)
> disable_irq_nosync(irq);
> }
>
> + /* Notify PM core we are wakeup source */
> + pm_wakeup_event(cfg->dev, 0);
> +
> return IRQ_HANDLED;
> }
>
> @@ -112,6 +115,7 @@ static int mwifiex_sdio_probe_of(struct device *dev, struct sdio_mmc_card *card)
> GFP_KERNEL);
> cfg = card->plt_wake_cfg;
> if (cfg && card->plt_of_node) {
> + cfg->dev = dev;
> cfg->irq_wifi = irq_of_parse_and_map(card->plt_of_node, 0);
> if (!cfg->irq_wifi) {
> dev_dbg(dev,
> @@ -130,6 +134,10 @@ static int mwifiex_sdio_probe_of(struct device *dev, struct sdio_mmc_card *card)
> }
> }
>
> + ret = device_init_wakeup(dev, true);
> + if (ret)
> + dev_err(dev, "fail to init wakeup for mwifiex");
> +
> return 0;
> }
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.h b/drivers/net/wireless/marvell/mwifiex/sdio.h
> index db837f1..07cdd23 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.h
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.h
> @@ -155,6 +155,7 @@
> } while (0)
>
> struct mwifiex_plt_wake_cfg {
> + struct device *dev;
> int irq_wifi;
> bool wake_by_wifi;
> };
> --
> 2.8.0.rc3.226.g39d4020
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2016-11-06 21:00 Dennis Dataopslag
2016-11-07 16:50 ` Wols Lists
2016-11-17 20:33 ` Re: Dennis Dataopslag
0 siblings, 2 replies; 1651+ messages in thread
From: Dennis Dataopslag @ 2016-11-06 21:00 UTC (permalink / raw)
To: linux-raid
Help wanted very much!
My setup:
Thecus N5550 NAS with 5 1TB drives installed.
MD0: RAID 5 config of 4 drives (SD[ABCD]2)
MD10: RAID 1 config of all 5 drives (SD..1), system generated array
MD50: RAID 1 config of 4 drives (SD[ABCD]3), system generated array
1 drive (SDE) set as global hot spare.
What happened:
This weekend I thought it might be a good idea to do a SMART test for
the drives in my NAS.
I started the test on 1 drive and after it ran for a while I started
the other ones.
While the test was running drive 3 failed. I got a message the RAID
was degraded and started rebuilding. (My assumption is that at this
moment the global hot spare will automatically be added to the array)
I stopped the SMART tests of all drives at this moment since it seemed
logical to me the SMART test (or the outcomes) made the drive fail.
In stopping the tests, drive 1 also failed!!
I left it alone for a little while, but the admin interface kept telling
me it was degraded and did not seem to take any action to start rebuilding.
At this point I started googling and found I should remove and reseat
the drives. This is also what I did, but nothing seemed to happen.
They turned up as new drives in the admin interface and I re-added them
to the array; they were added as spares.
Even after adding them the array didn't start rebuilding.
I checked the state in mdadm and it told me clean, FAILED, as opposed to
the degraded state shown in the admin interface.
I rebooted the NAS since it didn't seem to be doing anything I might interrupt.
After rebooting it seemed as if the entire array had disappeared!!
I started looking for options in mdadm and tried every "normal" option
to rebuild the array (--assemble --scan, for example).
Unfortunately I cannot produce a complete list since I cannot find how
to retrieve it from the logs.
Finally I ran mdadm --create for a new array with the original 4 drives
with all the right settings (taken from one of the original volumes).
The creation worked but after creation it doesn't seem to have a valid
partition table. This is the point where I realized I probably fucked
it up big-time and should call in the help squad!!!
What I think went wrong is that I re-created an array with the
original 4 drives from before the first failure but the hot-spare was
already added?
The most important data from the array is saved in an offline backup
luckily but I would very much like it if there is any way I could
restore the data from the array.
Is there any way I could get it back online?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-11-06 21:00 (unknown), Dennis Dataopslag
@ 2016-11-07 16:50 ` Wols Lists
2016-11-07 17:13 ` Re: Wols Lists
2016-11-17 20:33 ` Re: Dennis Dataopslag
1 sibling, 1 reply; 1651+ messages in thread
From: Wols Lists @ 2016-11-07 16:50 UTC (permalink / raw)
To: Dennis Dataopslag, linux-raid
On 06/11/16 21:00, Dennis Dataopslag wrote:
> Help wanted very much!
Quick response ...
>
> My setup:
> Thecus N5550 NAS with 5 1TB drives installed.
>
> MD0: RAID 5 config of 4 drives (SD[ABCD]2)
> MD10: RAID 1 config of all 5 drives (SD..1), system generated array
> MD50: RAID 1 config of 4 drives (SD[ABCD]3), system generated array
>
> 1 drive (SDE) set as global hot spare.
>
Bit late now, but you would probably have been better with raid-6.
>
> What happened:
> This weekend I thought it might be a good idea to do a SMART test for
> the drives in my NAS.
> I started the test on 1 drive and after it ran for a while I started
> the other ones.
> While the test was running drive 3 failed. I got a message the RAID
> was degraded and started rebuilding. (My assumption is that at this
> moment the global hot spare will automatically be added to the array)
>
> I stopped the SMART tests of all drives at this moment since it seemed
> logical to me the SMART test (or the outcomes) made the drive fail.
> In stopping the tests, drive 1 also failed!!
> I let it for a little but the admin interface kept telling me it was
> degraded, did not seem to take any actions to start rebuilding.
It can't - there's no spare drive to rebuild on, and there aren't enough
drives to build a working array.
> At this point I started googling and found I should remove and reseat
> the drives. This is also what I did but nothing seemd to happen.
> The turned up as new drives in the admin interface and I re-added them
> to the array, they were added as spares.
> Even after adding them the array didn't start rebuilding.
> I checked stat in mdadm and it told me clean FAILED opposed to the
> degraded in the admin interface.
Yup. You've only got two drives of a four-drive raid 5.
Where did you google? Did you read the linux raid wiki?
https://raid.wiki.kernel.org/index.php/Linux_Raid
>
> I rebooted the NAS since it didn't seem to be doing anything I might interrupt.
> after rebooting it seemed as if the entire array had disappeared!!
> I started looking for options in MDADM and tried every "normal"option
> to rebuild the array (--assemble --scan for example)
> Unfortunately I cannot produce a complete list since I cannot find how
> to get it from the logging.
>
> Finally I mdadm --create a new array with the original 4 drives with
> all the right settings. (Got them from 1 of the original volumes)
OUCH OUCH OUCH!
Are you sure you've got the right settings? A lot of "hidden" settings
have changed their values over the years. Do you know which mdadm was
used to create the array in the first place?
> The creation worked but after creation it doesn't seem to have a valid
> partition table. This is the point where I realized I probably fucked
> it up big-time and should call in the help squad!!!
> What I think went wrong is that I re-created an array with the
> original 4 drives from before the first failure but the hot-spare was
> already added?
Nope. You've probably used a newer version of mdadm. That's assuming the
array is still all the original drives. If some of them have been
replaced you've got a still messier problem.
>
> The most important data from the array is saved in an offline backup
> luckily but I would very much like it if there is any way I could
> restore the data from the array.
>
> Is there any way I could get it back online?
You're looking at a big forensic job. I've moved the relevant page to
the archaeology area - probably a bit too soon - but you need to read
the following page
https://raid.wiki.kernel.org/index.php/Reconstruction
Especially the bit about overlays. And wait for the experts to chime in
about how to do a hexdump and work out the values you need to pass to
mdadm to get the array back. It's a lot of work and you could be looking
at a week what with the delays as you wait for replies.
I think it's recoverable. Is it worth it?
Cheers,
Wol
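For reference, the overlay technique from that page boils down to a
device-mapper snapshot per member disk. A minimal sketch (the device name
and size are illustrative, and the root-only steps are shown as comments
rather than executed):

```shell
# Copy-on-write overlay so recovery experiments never touch the real disk.
# /dev/sdb2 and the sector count are examples; read the real size from
# "mdadm --examine" (Avail Dev Size) for each member device.
DEV=/dev/sdb2
SECTORS=1948250112
OVL=/tmp/overlay-sdb2.img

# Sparse file that will absorb every write made through the overlay
dd if=/dev/zero of="$OVL" bs=1 count=0 seek=4G 2>/dev/null

# Device-mapper snapshot table: reads fall through to $DEV, writes go to
# the overlay file, so the member disk itself stays untouched
TABLE="0 $SECTORS snapshot $DEV /dev/loop0 P 8"
echo "$TABLE"

# As root one would then run (not executed in this sketch):
#   LOOP=$(losetup -f --show "$OVL")
#   echo "0 $SECTORS snapshot $DEV $LOOP P 8" | dmsetup create sdb2-overlay
# and point mdadm at /dev/mapper/sdb2-overlay instead of /dev/sdb2.
```

Once every member has an overlay, --create experiments can be repeated
safely: deleting the overlays returns the drives to their current state.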
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-11-06 21:00 (unknown), Dennis Dataopslag
2016-11-07 16:50 ` Wols Lists
@ 2016-11-17 20:33 ` Dennis Dataopslag
2016-11-17 22:12 ` Re: Wols Lists
1 sibling, 1 reply; 1651+ messages in thread
From: Dennis Dataopslag @ 2016-11-17 20:33 UTC (permalink / raw)
To: linux-raid
Cheers for the reaction, and sorry for my late response; I've been away
on business.
Trying to rebuild this RAID is definitely worth it for me. The
learning experience alone already makes it worthwhile.
I did read the wiki page and tried several of the steps there, but they
didn't seem to get me out of trouble.
I used this information from the drive; obviously I didn't search for
any "hidden" settings:
" Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 36fdeb4b:c5360009:0958ad1e:17da451b
Name : TRD106:0 (local to host TRD106)
Creation Time : Fri Oct 10 12:27:27 2014
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 1948250112 (929.00 GiB 997.50 GB)
Array Size : 5844750336 (2786.99 GiB 2992.51 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : b49e2752:d37dac6c:8764c52a:372277bd
Update Time : Sat Nov 5 14:40:33 2016
Checksum : d47a9ad4 - correct
Events : 14934
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 0
Array State : AAAA ('A' == active, '.' == missing)"
Can anybody give me a little extra push?
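For what it's worth, the numbers in a dump like that map directly onto
mdadm --create parameters. A sketch of the translation (pure arithmetic;
the device order cannot be read from one member's superblock, so the
trailing device list below is a placeholder, not a known value):

```shell
# Values transcribed from the superblock dump above
CHUNK_KIB=64
DATA_OFFSET_SECTORS=2048
AVAIL_DEV_SIZE=1948250112      # per-device size, in sectors

# mdadm's --data-offset takes KiB by default; 2048 sectors * 512 B = 1024 KiB
DATA_OFFSET_KIB=$(( DATA_OFFSET_SECTORS * 512 / 1024 ))

# Sanity check: a 4-drive RAID5 stores data on 3 members, so the array
# size in the dump should be exactly 3x the per-device size
ARRAY_SIZE=$(( AVAIL_DEV_SIZE * 3 ))

echo "$DATA_OFFSET_KIB $ARRAY_SIZE"

# The corresponding create, run only against overlays (device order unknown):
#   mdadm --create /dev/md0 --assume-clean --readonly --metadata=1.2 \
#         --level=5 --raid-devices=4 --chunk=$CHUNK_KIB \
#         --layout=left-symmetric --data-offset=$DATA_OFFSET_KIB <devices...>
```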
On Sun, Nov 6, 2016 at 10:00 PM, Dennis Dataopslag
<dennisdataopslag@gmail.com> wrote:
> Help wanted very much!
>
> My setup:
> Thecus N5550 NAS with 5 1TB drives installed.
>
> MD0: RAID 5 config of 4 drives (SD[ABCD]2)
> MD10: RAID 1 config of all 5 drives (SD..1), system generated array
> MD50: RAID 1 config of 4 drives (SD[ABCD]3), system generated array
>
> 1 drive (SDE) set as global hot spare.
>
>
> What happened:
> This weekend I thought it might be a good idea to do a SMART test for
> the drives in my NAS.
> I started the test on 1 drive and after it ran for a while I started
> the other ones.
> While the test was running drive 3 failed. I got a message the RAID
> was degraded and started rebuilding. (My assumption is that at this
> moment the global hot spare will automatically be added to the array)
>
> I stopped the SMART tests of all drives at this moment since it seemed
> logical to me the SMART test (or the outcomes) made the drive fail.
> In stopping the tests, drive 1 also failed!!
> I let it for a little but the admin interface kept telling me it was
> degraded, did not seem to take any actions to start rebuilding.
> At this point I started googling and found I should remove and reseat
> the drives. This is also what I did but nothing seemd to happen.
> The turned up as new drives in the admin interface and I re-added them
> to the array, they were added as spares.
> Even after adding them the array didn't start rebuilding.
> I checked stat in mdadm and it told me clean FAILED opposed to the
> degraded in the admin interface.
>
> I rebooted the NAS since it didn't seem to be doing anything I might interrupt.
> after rebooting it seemed as if the entire array had disappeared!!
> I started looking for options in MDADM and tried every "normal"option
> to rebuild the array (--assemble --scan for example)
> Unfortunately I cannot produce a complete list since I cannot find how
> to get it from the logging.
>
> Finally I mdadm --create a new array with the original 4 drives with
> all the right settings. (Got them from 1 of the original volumes)
> The creation worked but after creation it doesn't seem to have a valid
> partition table. This is the point where I realized I probably fucked
> it up big-time and should call in the help squad!!!
> What I think went wrong is that I re-created an array with the
> original 4 drives from before the first failure but the hot-spare was
> already added?
>
> The most important data from the array is saved in an offline backup
> luckily but I would very much like it if there is any way I could
> restore the data from the array.
>
> Is there any way I could get it back online?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-11-17 20:33 ` Re: Dennis Dataopslag
@ 2016-11-17 22:12 ` Wols Lists
0 siblings, 0 replies; 1651+ messages in thread
From: Wols Lists @ 2016-11-17 22:12 UTC (permalink / raw)
To: Dennis Dataopslag, linux-raid
On 17/11/16 20:33, Dennis Dataopslag wrote:
> CHeers for the reaction and sorry for my late response, I've been out
> for business.
>
> Trying to rebuild this RAID is definately worth it for me. The
> learning experience alone already makes it worth.
>
> I did read the wiki page and tried several steps that are on there but
> it didn't seem to get me out of trouble.
>
> I used this information from the drive, obviously didn't search for
> any "hidden" settings:
> " Magic : a92b4efc
> Version : 1.2
> Feature Map : 0x0
> Array UUID : 36fdeb4b:c5360009:0958ad1e:17da451b
> Name : TRD106:0 (local to host TRD106)
> Creation Time : Fri Oct 10 12:27:27 2014
> Raid Level : raid5
> Raid Devices : 4
>
> Avail Dev Size : 1948250112 (929.00 GiB 997.50 GB)
> Array Size : 5844750336 (2786.99 GiB 2992.51 GB)
> Data Offset : 2048 sectors
> Super Offset : 8 sectors
> State : clean
> Device UUID : b49e2752:d37dac6c:8764c52a:372277bd
>
> Update Time : Sat Nov 5 14:40:33 2016
> Checksum : d47a9ad4 - correct
> Events : 14934
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Device Role : Active device 0
> Array State : AAAA ('A' == active, '.' == missing)"
>
> Anybody that can give me a little extra push?
>
Others will be able to help better than me, but you might want to look
for the thread "RAID10 with 2 drives auto-assembled as RAID1".
This will give you some information about how to run hexdump and find
where your filesystems are on the array.
There's plenty of other threads with this sort of information, but this
will give you a starting point. If Phil Turmel sees this, he'll chime in
with better detail.
Cheers,
Wol
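(The hexdump idea in a nutshell: a filesystem's superblock magic gives a
fixed landmark. For ext2/3/4 the magic 0xEF53 sits at byte 0x438 of the
filesystem, so scanning the raw array for it shows where a filesystem
starts. An illustrative sketch against a file standing in for the device;
paths and offsets are made up for the demo:)

```shell
# Build a fake "device": 1 MiB of zeros with the ext4 magic planted where
# a filesystem starting at offset 64 KiB would put it (0x10000 + 0x438)
IMG=/tmp/fakedev.img
dd if=/dev/zero of="$IMG" bs=1024 count=1024 2>/dev/null
printf '\123\357' | dd of="$IMG" bs=1 seek=$(( 0x10000 + 0x438 )) conv=notrunc 2>/dev/null

# Scan for the little-endian magic bytes "53 ef" and report the offset
# (pairs straddling a 16-byte od line are ignored; good enough for a demo)
OFFS=$(od -A d -t x1 "$IMG" | awk '
  { for (i = 2; i < NF; i++)
      if ($i == "53" && $(i+1) == "ef")
        print $1 + (i - 2) }')

echo "magic at byte $OFFS; filesystem starts at $(( OFFS - 0x438 ))"
```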
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2016-11-09 17:55 bepi
2016-11-10 6:57 ` Alex Powell
0 siblings, 1 reply; 1651+ messages in thread
From: bepi @ 2016-11-09 17:55 UTC (permalink / raw)
To: linux-btrfs
Hi.
I'm writing a script for managing btrfs.
It performs scrubs, and creates and sends backup snapshots (even to a
remote system), or a copy of the current state of the data.
The script is designed to:
- Be easy to use:
- The preparation is carried out automatically.
- Autodetect the mounted subvolumes.
- Be safe and robust:
- Check that another instance of the script is not already running.
- Subvolumes for created and received snapshots are mounted and accessible
only for the time needed to perform the requested operation.
- Verify that snapshot creation and sending completed successfully.
- Number the snapshots progressively, to identify the latest snapshot
with certainty.
Commands are also available to list the snapshots present and to delete
snapshots.
For example:
btrsfManage SCRUB /
btrsfManage SNAPSHOT /
btrsfManage SEND / /dev/sda1
btrsfManage SEND / root@gdb.exnet.it/dev/sda1
btrsfManage SNAPLIST /dev/sda1
btrsfManage SNAPDEL /dev/sda1 "root-2016-11*"
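To illustrate the progressive-numbering idea (the names, paths, and helper
logic here are made up for the example, not taken from the script itself):

```shell
# Derive the next snapshot name from an existing, sortable series
last=$(printf '%s\n' root-0001 root-0002 root-0003 | sort | tail -n 1)
num=${last##*-}                 # numeric suffix, e.g. "0003"
num=$(expr "$num" + 0)          # expr parses decimal, avoiding octal traps
next=$(printf 'root-%04d' $(( num + 1 )))
echo "$next"

# The pair of root-only commands the script would then wrap (not run here):
#   btrfs subvolume snapshot -r / "/.snapshots/$next"
#   btrfs send -p "/.snapshots/$last" "/.snapshots/$next" | btrfs receive /mnt/backup
```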
Are you interested?
Gdb
----------------------------------------------------
This mail has been sent using Alpikom webmail system
http://www.alpikom.it
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-11-09 17:55 bepi
@ 2016-11-10 6:57 ` Alex Powell
2016-11-10 13:00 ` Re: bepi
0 siblings, 1 reply; 1651+ messages in thread
From: Alex Powell @ 2016-11-10 6:57 UTC (permalink / raw)
To: bepi; +Cc: linux-btrfs
Hi,
It would be good but perhaps each task should be created via cronjobs
instead of having a script running all the time or one script via one
cronjob
Working in the enterprise environment for a major bank, we quickly
learn that these sort of daily tasks should be split up
Kind Regards,
Alex
On Thu, Nov 10, 2016 at 4:25 AM, <bepi@adria.it> wrote:
> Hi.
>
> I'm making a script for managing btrfs.
>
> To perform the scrub, to create and send (even to a remote system) of the backup
> snapshot (or for one copy of the current state of the data).
>
> The script is designed to:
> - Be easy to use:
> - The preparation is carried out automatically.
> - Autodetect of the subvolume mounted.
> - Be safe and robust:
> - Check that not exist a another btrfs managing already started.
> - Subvolume for created and received snapshot are mounted and accessible only
> for the time necessary to perform the requested operation.
> - Verify that the snapshot and sending snapshot are been executed completely.
> - Progressive numbering of the snapshots for identify with certainty
> the latest snapshot.
>
> Are also available command for view the list of snaphost present, command for
> delete the snapshots.
>
> For example:
>
> btrsfManage SCRUB /
> btrsfManage SNAPSHOT /
> btrsfManage SEND / /dev/sda1
> btrsfManage SEND / root@gdb.exnet.it/dev/sda1
> btrsfManage SNAPLIST /dev/sda1
> btrsfManage SNAPDEL /dev/sda1 "root-2016-11*"
>
> You are interested?
>
> Gdb
>
>
> ----------------------------------------------------
> This mail has been sent using Alpikom webmail system
> http://www.alpikom.it
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2016-11-10 6:57 ` Alex Powell
@ 2016-11-10 13:00 ` bepi
0 siblings, 0 replies; 1651+ messages in thread
From: bepi @ 2016-11-10 13:00 UTC (permalink / raw)
To: Alex Powell; +Cc: linux-btrfs
Hi.
P.S. Sorry for the duplicate sending and for the blank email subject.
Yes.
The various commands are designed to be used separately, and can be
launched either as cronjobs or manually.
For example, you can create a series of snapshots
btrsfManage SNAPSHOT /
and send the new snapshots (as an incremental stream)
btrsfManage SEND / /dev/sda1
from cronjobs or manually; it makes no difference.
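Split into separate cron entries, that could look like this (the times,
user, and device below are only examples):

```
# Illustrative /etc/crontab entries; each task runs on its own schedule
30 2 * * *   root   btrsfManage SNAPSHOT /
45 2 * * *   root   btrsfManage SEND / /dev/sda1
0  3 * * 0   root   btrsfManage SCRUB /
```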
Best regards.
Gdb
Scrive Alex Powell <alexj.powellalt@googlemail.com>:
> Hi,
> It would be good but perhaps each task should be created via cronjobs
> instead of having a script running all the time or one script via one
> cronjob
>
> Working in the enterprise environment for a major bank, we quickly
> learn that these sort of daily tasks should be split up
>
> Kind Regards,
> Alex
>
> On Thu, Nov 10, 2016 at 4:25 AM, <bepi@adria.it> wrote:
> > Hi.
> >
> > I'm making a script for managing btrfs.
> >
> > To perform the scrub, to create and send (even to a remote system) of the
> backup
> > snapshot (or for one copy of the current state of the data).
> >
> > The script is designed to:
> > - Be easy to use:
> > - The preparation is carried out automatically.
> > - Autodetect of the subvolume mounted.
> > - Be safe and robust:
> > - Check that not exist a another btrfs managing already started.
> > - Subvolume for created and received snapshot are mounted and accessible
> only
> > for the time necessary to perform the requested operation.
> > - Verify that the snapshot and sending snapshot are been executed
> completely.
> > - Progressive numbering of the snapshots for identify with certainty
> > the latest snapshot.
> >
> > Are also available command for view the list of snaphost present, command
> for
> > delete the snapshots.
> >
> > For example:
> >
> > btrsfManage SCRUB /
> > btrsfManage SNAPSHOT /
> > btrsfManage SEND / /dev/sda1
> > btrsfManage SEND / root@gdb.exnet.it/dev/sda1
> > btrsfManage SNAPLIST /dev/sda1
> > btrsfManage SNAPDEL /dev/sda1 "root-2016-11*"
> >
> > You are interested?
> >
> > Gdb
> >
> >
> > ----------------------------------------------------
> > This mail has been sent using Alpikom webmail system
> > http://www.alpikom.it
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 7/7] completion: recognize more long-options
@ 2017-01-25 0:11 Cornelius Weig
2017-01-25 0:21 ` Stefan Beller
0 siblings, 1 reply; 1651+ messages in thread
From: Cornelius Weig @ 2017-01-25 0:11 UTC (permalink / raw)
To: Stefan Beller
Cc: Junio C Hamano, Johannes Sixt, bitte.keine.werbung.einwerfen,
git@vger.kernel.org, thomas.braun, John Keeping
On 01/25/2017 12:43 AM, Stefan Beller wrote:
> On Tue, Jan 24, 2017 at 3:33 PM, Cornelius Weig
> <cornelius.weig@tngtech.com> wrote:
>> On 01/25/2017 12:24 AM, Junio C Hamano wrote:
>>> Cornelius Weig <cornelius.weig@tngtech.com> writes:
>>>
>>>>> Please study item (5) "Sign your work" in
>>>>> Documentation/SubmittingPatches and sign off your work.
>>>>
>>>> I followed the recommendations to submitting work, and in the first
>>>> round signing is discouraged.
>>>
>>> Just this point. You found a bug in our documentation if that is
>>> the case; it should not be giving that impression to you.
>>>
>>
>> Well, I am referring to par. (4) of Documentation/SubmittingPatches
>> (emphasis mine):
>>
>> <<<<<<<<<<<<<<
>> *Do not PGP sign your patch, at least for now*. Most likely, your
>> maintainer or other people on the list would not have your PGP
>> key and would not bother obtaining it anyway. Your patch is not
>> judged by who you are; a good patch from an unknown origin has a
>> far better chance of being accepted than a patch from a known,
>> respected origin that is done poorly or does incorrect things.
>> <<<<<<<<<<<<<<
>>
>> If first submissions should be signed as well, then I find this quite
>> misleading.
>>
>
> Please read on; While this part addresses PGP signing,
> which is discouraged at any round,
> later on we talk about another type of signing.
> (not cryptographic strong signing, but signing the intent;)
> the DCO in the commit message.
>
> So no PGP signing (in any round of the patch).
>
> But DCO signed (in anything that you deem useful for the
> project and that you are allowed to contribute)
>
Right, it's crystal clear now. What confused me was the combination of
> Do not PGP sign your patch, at least *for now*. (...)
and then the section with heading
> (5) *Sign* your work
So I didn't even bother to read (5) because I deemed it irrelevant. I
think if it had said `(5) *Certify* your work` this would not have happened.
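For what it's worth, the certification in (5) is nothing more than a
trailer that git writes for you; a throwaway demonstration (temporary
repo, made-up identity):

```shell
# Show that `git commit -s` appends the DCO Signed-off-by trailer
dir=$(mktemp -d)
cd "$dir"
git init -q .
git config user.name  "Jane Dev"
git config user.email "jane@example.com"
echo hello > file.txt
git add file.txt
git commit -q -s -m "demo: add file"
git log -1 --format=%B | grep '^Signed-off-by:'
```

The last command prints `Signed-off-by: Jane Dev <jane@example.com>`,
exactly the trailer that section (5) asks for; no PGP involved.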
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2017-01-25 0:11 [PATCH 7/7] completion: recognize more long-options Cornelius Weig
@ 2017-01-25 0:21 ` Stefan Beller
2017-01-25 0:43 ` Cornelius Weig
2017-01-25 0:54 ` Re: Linus Torvalds
0 siblings, 2 replies; 1651+ messages in thread
From: Stefan Beller @ 2017-01-25 0:21 UTC (permalink / raw)
To: cornelius.weig
Cc: j6t, bitte.keine.werbung.einwerfen, git, gitster, thomas.braun,
john, Stefan Beller
>
>> Do not PGP sign your patch, at least *for now*. (...)
>
And maybe these 2 small words are the bug in the documentation?
Shall we drop the "at least for now" part, like so:
---8<---
From 2c4fe0e67451892186ff6257b20c53e088c9ec67 Mon Sep 17 00:00:00 2001
From: Stefan Beller <sbeller@google.com>
Date: Tue, 24 Jan 2017 16:19:13 -0800
Subject: [PATCH] SubmittingPatches: drop temporal reference for PGP signing
Signed-off-by: Stefan Beller <sbeller@google.com>
---
Documentation/SubmittingPatches | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 08352deaae..28da4ad2d4 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -216,12 +216,12 @@ that it will be postponed.
Exception: If your mailer is mangling patches then someone may ask
you to re-send them using MIME, that is OK.
-Do not PGP sign your patch, at least for now. Most likely, your
-maintainer or other people on the list would not have your PGP
-key and would not bother obtaining it anyway. Your patch is not
-judged by who you are; a good patch from an unknown origin has a
-far better chance of being accepted than a patch from a known,
-respected origin that is done poorly or does incorrect things.
+Do not PGP sign your patch. Most likely, your maintainer or other
+people on the list would not have your PGP key and would not bother
+obtaining it anyway. Your patch is not judged by who you are; a good
+patch from an unknown origin has a far better chance of being accepted
+than a patch from a known, respected origin that is done poorly or
+does incorrect things.
If you really really really really want to do a PGP signed
patch, format it as "multipart/signed", not a text/plain message
--
2.11.0.495.g04f60290a0.dirty
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2017-01-25 0:21 ` Stefan Beller
@ 2017-01-25 0:43 ` Cornelius Weig
2017-01-25 0:52 ` Re: Stefan Beller
2017-01-25 0:54 ` Re: Linus Torvalds
1 sibling, 1 reply; 1651+ messages in thread
From: Cornelius Weig @ 2017-01-25 0:43 UTC (permalink / raw)
To: Stefan Beller
Cc: j6t, bitte.keine.werbung.einwerfen, git, gitster, thomas.braun,
john
On 01/25/2017 01:21 AM, Stefan Beller wrote:
>>
>>> Do not PGP sign your patch, at least *for now*. (...)
>>
>
> And maybe these 2 small words are the bug in the documentation?
> Shall we drop the "at least for now" part, like so:
>
> ---8<---
> From 2c4fe0e67451892186ff6257b20c53e088c9ec67 Mon Sep 17 00:00:00 2001
> From: Stefan Beller <sbeller@google.com>
> Date: Tue, 24 Jan 2017 16:19:13 -0800
> Subject: [PATCH] SubmittingPatches: drop temporal reference for PGP signing
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> Documentation/SubmittingPatches | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
> index 08352deaae..28da4ad2d4 100644
> --- a/Documentation/SubmittingPatches
> +++ b/Documentation/SubmittingPatches
> @@ -216,12 +216,12 @@ that it will be postponed.
> Exception: If your mailer is mangling patches then someone may ask
> you to re-send them using MIME, that is OK.
>
> -Do not PGP sign your patch, at least for now. Most likely, your
> -maintainer or other people on the list would not have your PGP
> -key and would not bother obtaining it anyway. Your patch is not
> -judged by who you are; a good patch from an unknown origin has a
> -far better chance of being accepted than a patch from a known,
> -respected origin that is done poorly or does incorrect things.
> +Do not PGP sign your patch. Most likely, your maintainer or other
> +people on the list would not have your PGP key and would not bother
> +obtaining it anyway. Your patch is not judged by who you are; a good
> +patch from an unknown origin has a far better chance of being accepted
> +than a patch from a known, respected origin that is done poorly or
> +does incorrect things.
>
> If you really really really really want to do a PGP signed
> patch, format it as "multipart/signed", not a text/plain message
>
It definitely is an improvement, though it would still leave me puzzled
to find a section about signing just below.
Is changing heading (5) too big a change? Like so:
diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index 08352de..71898dc 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -246,7 +246,7 @@ patch.
*2* The mailing list: git@vger.kernel.org
-(5) Sign your work
+(5) Certify your work by signing off.
To improve tracking of who did what, we've borrowed the
"sign-off" procedure from the Linux kernel project on patches
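For context, the section being retitled is about the Signed-off-by trailer, not cryptographic signing. A minimal sketch of what a certified commit message looks like (the name and address are placeholders; in practice `git commit -s` appends the trailer with your own identity):

```shell
# Print a commit message carrying the sign-off trailer that section (5) describes.
# "Jane Developer" is a hypothetical identity; `git commit -s` uses your own.
cat <<'EOF'
SubmittingPatches: drop temporal reference for PGP signing

Signed-off-by: Jane Developer <jane@example.com>
EOF
```

The trailer certifies the Developer's Certificate of Origin; it is a plain-text line, which is part of why the "(5) Sign your work" heading can be mistaken for PGP signing.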
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2017-01-25 0:43 ` Cornelius Weig
@ 2017-01-25 0:52 ` Stefan Beller
0 siblings, 0 replies; 1651+ messages in thread
From: Stefan Beller @ 2017-01-25 0:52 UTC (permalink / raw)
To: Cornelius Weig
Cc: Johannes Sixt, bitte.keine.werbung.einwerfen, git@vger.kernel.org,
Junio C Hamano, thomas.braun, John Keeping
On Tue, Jan 24, 2017 at 4:43 PM, Cornelius Weig
<cornelius.weig@tngtech.com> wrote:
> -(5) Sign your work
> +(5) Certify your work by signing off.
That sounds better than what I proposed.
Thanks,
Stefan
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2017-01-25 0:21 ` Stefan Beller
2017-01-25 0:43 ` Cornelius Weig
@ 2017-01-25 0:54 ` Linus Torvalds
2017-01-25 1:32 ` Re: Eric Wong
1 sibling, 1 reply; 1651+ messages in thread
From: Linus Torvalds @ 2017-01-25 0:54 UTC (permalink / raw)
To: Stefan Beller
Cc: cornelius.weig, Johannes Sixt, bitte.keine.werbung.einwerfen,
Git Mailing List, Junio C Hamano, thomas.braun, John Keeping
On Tue, Jan 24, 2017 at 4:21 PM, Stefan Beller <sbeller@google.com> wrote:
>
> +Do not PGP sign your patch. Most likely, your maintainer or other
> +people on the list would not have your PGP key and would not bother
> +obtaining it anyway.
I think even that could be further simplified - by just removing all
comments about pgp email
Because it's not that the PGP keys would be hard to get, it's that
PGP-signed email is an abject failure, and nobody sane does it.
Google for "phil zimmerman doesn't use pgp email".
It's dead. So I'm not sure it's worth mentioning at all.
You might as well talk about how you shouldn't use EBCDIC encoding for
your patches, or about why git assumes that an email address has an
'@' sign in it, instead of being an UUCP bang path address.
Linus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2017-01-25 0:54 ` Re: Linus Torvalds
@ 2017-01-25 1:32 ` Eric Wong
0 siblings, 0 replies; 1651+ messages in thread
From: Eric Wong @ 2017-01-25 1:32 UTC (permalink / raw)
To: Linus Torvalds
Cc: Stefan Beller, cornelius.weig, Johannes Sixt,
bitte.keine.werbung.einwerfen, git, Junio C Hamano, thomas.braun,
John Keeping
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Tue, Jan 24, 2017 at 4:21 PM, Stefan Beller <sbeller@google.com> wrote:
> >
> > +Do not PGP sign your patch. Most likely, your maintainer or other
> > +people on the list would not have your PGP key and would not bother
> > +obtaining it anyway.
>
> I think even that could be further simplified - by just removing all
> comments about pgp email
>
> Because it's not that the PGP keys would be hard to get, it's that
> PGP-signed email is an abject failure, and nobody sane does it.
>
> Google for "phil zimmerman doesn't use pgp email".
>
> It's dead. So I'm not sure it's worth mentioning at all.
I disagree, we still see it, and Debian still advocates it.
In fact, we may also want to mention S/MIME in the same breath:
https://public-inbox.org/git/20170110004031.57985-2-hansenr@google.com/
Richard's more recent mails seem to indicate he's reformed :)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2017-02-16 19:41 simran singhal
2017-02-16 19:44 ` SIMRAN SINGHAL
0 siblings, 1 reply; 1651+ messages in thread
From: simran singhal @ 2017-02-16 19:41 UTC (permalink / raw)
To: gregkh; +Cc: outreachy-kernel, devel
linux-kernel@vger.kernel.org
Bcc:
Subject: [PATCH 1/3] staging: rtl8192u: Replace symbolic permissions with
octal permissions
Reply-To:
WARNING: Symbolic permissions 'S_IRUGO | S_IWUSR' are not preferred.
Consider using octal permissions '0644'.
This warning is detected by checkpatch.pl
Signed-off-by: simran singhal <singhalsimran0@gmail.com>
---
drivers/staging/rtl8192u/ieee80211/ieee80211_module.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c b/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
index a9a92d8..2ebc320 100644
--- a/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
+++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
@@ -283,7 +283,7 @@ int __init ieee80211_debug_init(void)
" proc directory\n");
return -EIO;
}
- e = proc_create("debug_level", S_IRUGO | S_IWUSR,
+ e = proc_create("debug_level", 0644,
ieee80211_proc, &fops);
if (!e) {
remove_proc_entry(DRV_NAME, init_net.proc_net);
--
2.7.4
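For readers checking the equivalence the patch relies on: S_IRUGO is read permission for user, group, and other (0444), and S_IWUSR adds owner write (0200), so the combination is exactly 0644. A quick shell sketch of the arithmetic:

```shell
# S_IRUGO = S_IRUSR|S_IRGRP|S_IROTH = 0400|0040|0004 = 0444 (read for all)
# S_IWUSR = 0200 (write for owner)
# Shell arithmetic treats a leading 0 as octal, matching the C constants.
mode=$(( 0444 | 0200 ))
printf '0%o\n' "$mode"    # prints 0644
```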
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2017-02-16 19:41 simran singhal
@ 2017-02-16 19:44 ` SIMRAN SINGHAL
0 siblings, 0 replies; 1651+ messages in thread
From: SIMRAN SINGHAL @ 2017-02-16 19:44 UTC (permalink / raw)
To: outreachy-kernel
[-- Attachment #1.1: Type: text/plain, Size: 1333 bytes --]
On Friday, February 17, 2017 at 1:11:19 AM UTC+5:30, SIMRAN SINGHAL wrote:
>
> linux-kernel@vger.kernel.org
> Bcc:
> Subject: [PATCH 1/3] staging: rtl8192u: Replace symbolic permissions with
> octal permissions
> Reply-To:
>
> WARNING: Symbolic permissions 'S_IRUGO | S_IWUSR' are not preferred.
> Consider using octal permissions '0644'.
> This warning is detected by checkpatch.pl
>
> Signed-off-by: simran singhal <singhalsimran0@gmail.com>
> ---
> drivers/staging/rtl8192u/ieee80211/ieee80211_module.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
> b/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
> index a9a92d8..2ebc320 100644
> --- a/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
> +++ b/drivers/staging/rtl8192u/ieee80211/ieee80211_module.c
> @@ -283,7 +283,7 @@ int __init ieee80211_debug_init(void)
> " proc directory\n");
> return -EIO;
> }
> - e = proc_create("debug_level", S_IRUGO | S_IWUSR,
> + e = proc_create("debug_level", 0644,
> ieee80211_proc, &fops);
> if (!e) {
> remove_proc_entry(DRV_NAME, init_net.proc_net);
> --
> 2.7.4
>
>
Sorry, Ignore this.
[-- Attachment #1.2: Type: text/html, Size: 2679 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2017-02-23 15:09 Qin's Yanjun
0 siblings, 0 replies; 1651+ messages in thread
From: Qin's Yanjun @ 2017-02-23 15:09 UTC (permalink / raw)
To: kernel-janitors
How are you today and your family? I require your attention and honest
co-operation about some issues which i will really want to discuss with you
which. Looking forward to read from you soon.
Qin's
______________________________
Sky Silk, http://aknet.kz
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2017-03-19 15:00 Ilan Schwarts
2017-03-23 17:12 ` Jeff Mahoney
0 siblings, 1 reply; 1651+ messages in thread
From: Ilan Schwarts @ 2017-03-19 15:00 UTC (permalink / raw)
To: linux-btrfs
Hi,
Sorry if this is a newbie question; I am a newbie.
In my kernel driver, I get the device id by converting a struct inode
to a btrfs_inode; I use the code:
struct btrfs_inode *btrfsInode;
btrfsInode = BTRFS_I(inode);
I usually download the kernel-headers rpm package, but this is not
enough: it fails to find the btrfs header files.
I had to download them outside the rpm package and declare:
#include "/data/kernel/linux-4.1.21-x86_64/fs/btrfs/ctree.h"
#include "/data/kernel/linux-4.1.21-x86_64/fs/btrfs/btrfs_inode.h"
This is not good. Why are ctree.h and btrfs_inode.h not in the kernel headers?
Is there another package I need to download in order to get them, in
addition to kernel-headers?
I see they are not provided in kernel-header package, e.g:
https://rpmfind.net/linux/RPM/fedora/23/x86_64/k/kernel-headers-4.2.3-300.fc23.x86_64.html
Thanks!
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2017-03-19 15:00 Ilan Schwarts
@ 2017-03-23 17:12 ` Jeff Mahoney
0 siblings, 0 replies; 1651+ messages in thread
From: Jeff Mahoney @ 2017-03-23 17:12 UTC (permalink / raw)
To: Ilan Schwarts, linux-btrfs
[-- Attachment #1.1: Type: text/plain, Size: 1289 bytes --]
On 3/19/17 11:00 AM, Ilan Schwarts wrote:
> Hi,
> Sorry if this is a newbie question; I am a newbie.
>
> In my kernel driver, I get the device id by converting a struct inode
> to a btrfs_inode; I use the code:
> struct btrfs_inode *btrfsInode;
> btrfsInode = BTRFS_I(inode);
>
> I usually download the kernel-headers rpm package, but this is not
> enough: it fails to find the btrfs header files.
>
> I had to download them outside the rpm package and declare:
> #include "/data/kernel/linux-4.1.21-x86_64/fs/btrfs/ctree.h"
> #include "/data/kernel/linux-4.1.21-x86_64/fs/btrfs/btrfs_inode.h"
>
> This is not good. Why are ctree.h and btrfs_inode.h not in the kernel headers?
> Is there another package I need to download in order to get them, in
> addition to kernel-headers?
>
>
> I see they are not provided in kernel-header package, e.g:
> https://rpmfind.net/linux/RPM/fedora/23/x86_64/k/kernel-headers-4.2.3-300.fc23.x86_64.html
I don't know what Fedora package you'd use, but the core problem is that
you're trying to use internal structures in an external module. We've
properly exported the constants and structures required for userspace to
interact with btrfs, but there are no plans to export internal structures.
-Jeff
--
Jeff Mahoney
SUSE Labs
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-04-01 5:31 USPS Delivery
0 siblings, 0 replies; 1651+ messages in thread
From: USPS Delivery @ 2017-04-01 5:31 UTC (permalink / raw)
To: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw
Hello,
Your item has arrived at Sat, 01 Apr 2017 06:31:34 +0100, but our courier
was not able to deliver the parcel.
Review the document that is attached to this e-mail!
Most sincerely.
Gertruda Hendry - USPS Mail Delivery Agent.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown),
@ 2017-04-13 15:58 Scott Ellentuch
[not found] ` <CAK2H+efb3iKA5P3yd7uRqJomci6ENvrB1JRBBmtQEpEvyPMe7w@mail.gmail.com>
0 siblings, 1 reply; 1651+ messages in thread
From: Scott Ellentuch @ 2017-04-13 15:58 UTC (permalink / raw)
To: linux-raid
for disk in a b c d g h i j k l m n
do
disklist="${disklist} /dev/sd${disk}1"
done
mdadm --create --verbose /dev/md2 --level=5 --raid=devices=12 ${disklist}
But its telling me :
mdadm: invalid number of raid devices: devices=12
I can't find any definition of a limit anywhere.
Thank you, Tuc
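The error message itself points at the cause: the option was typed as `--raid=devices=12`, so mdadm parses the literal string "devices=12" as the device count. The long option is `--raid-devices` (a hyphen, not an `=`). A sketch of the corrected invocation, echoed rather than executed here since running it would destroy data on those disks:

```shell
# Build the device list exactly as in the original script.
disklist=""
for disk in a b c d g h i j k l m n; do
    disklist="${disklist} /dev/sd${disk}1"
done
# Note --raid-devices (hyphenated option), not --raid=devices.
echo mdadm --create --verbose /dev/md2 --level=5 --raid-devices=12 ${disklist}
```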
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CALDO+SZPQGmp4VH0LvCh95uXWvwzAoj+wN-rm0pGu5e0wCcyNw@mail.gmail.com>]
* (unknown),
@ 2017-04-28 8:20 Anatolij Gustschin
2017-04-28 8:43 ` Linus Walleij
0 siblings, 1 reply; 1651+ messages in thread
From: Anatolij Gustschin @ 2017-04-28 8:20 UTC (permalink / raw)
To: linus.walleij, gnurou; +Cc: andy.shevchenko, linux-gpio, linux-kernel
Subject: [PATCH v3] gpiolib: Add stubs for gpiod lookup table interface
Add stubs for gpiod_add_lookup_table() and gpiod_remove_lookup_table()
for the !GPIOLIB case to prevent build errors. Also add prototypes.
Signed-off-by: Anatolij Gustschin <agust@denx.de>
---
Changes in v3:
- add stubs for !GPIOLIB case. Drop prototypes, these are
already in gpio/machine.h
Changes in v2:
- move gpiod_lookup_table out of #ifdef
include/linux/gpio/consumer.h | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/include/linux/gpio/consumer.h b/include/linux/gpio/consumer.h
index 8f702fc..cf3fee2 100644
--- a/include/linux/gpio/consumer.h
+++ b/include/linux/gpio/consumer.h
@@ -41,6 +41,8 @@ enum gpiod_flags {
GPIOD_FLAGS_BIT_DIR_VAL,
};
+struct gpiod_lookup_table;
+
#ifdef CONFIG_GPIOLIB
/* Return the number of GPIOs associated with a device / function */
@@ -435,6 +437,12 @@ struct gpio_desc *devm_fwnode_get_index_gpiod_from_child(struct device *dev,
return ERR_PTR(-ENOSYS);
}
+static inline
+void gpiod_add_lookup_table(struct gpiod_lookup_table *table) {}
+
+static inline
+void gpiod_remove_lookup_table(struct gpiod_lookup_table *table) {}
+
#endif /* CONFIG_GPIOLIB */
static inline
--
2.7.4
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2017-04-28 8:20 (unknown), Anatolij Gustschin
@ 2017-04-28 8:43 ` Linus Walleij
2017-04-28 9:26 ` Re: Anatolij Gustschin
0 siblings, 1 reply; 1651+ messages in thread
From: Linus Walleij @ 2017-04-28 8:43 UTC (permalink / raw)
To: Anatolij Gustschin
Cc: Alexandre Courbot, Andy Shevchenko, linux-gpio@vger.kernel.org,
linux-kernel@vger.kernel.org
On Fri, Apr 28, 2017 at 10:20 AM, Anatolij Gustschin <agust@denx.de> wrote:
> Subject: [PATCH v3] gpiolib: Add stubs for gpiod lookup table interface
>
> Add stubs for gpiod_add_lookup_table() and gpiod_remove_lookup_table()
> for the !GPIOLIB case to prevent build errors. Also add prototypes.
>
> Signed-off-by: Anatolij Gustschin <agust@denx.de>
> ---
> Changes in v3:
> - add stubs for !GPIOLIB case. Drop prototypes, these are
> already in gpio/machine.h
Yeah...
> --- a/include/linux/gpio/consumer.h
> +++ b/include/linux/gpio/consumer.h
So why should the stubs be in <linux/gpio/consumer.h>
and not in <linux/gpio/machine.h>?
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-04-28 18:27 USPS Ground Support
0 siblings, 0 replies; 1651+ messages in thread
From: USPS Ground Support @ 2017-04-28 18:27 UTC (permalink / raw)
To: linux-nvdimm-y27Ovi1pjclAfugRpC6u6w
Hello,
Your item has arrived at the USPS Post Office at Fri, 28 Apr 2017 11:27:27
-0700, but the courier was unable to deliver parcel to you.
You can download the shipment label attached!
With gratitude.
Lashan Simmering - USPS Support Manager.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-04-29 22:53 USPS Station Management
0 siblings, 0 replies; 1651+ messages in thread
From: USPS Station Management @ 2017-04-29 22:53 UTC (permalink / raw)
To: linux-nvdimm-y27Ovi1pjclAfugRpC6u6w
Hello,
Your item has arrived at the USPS Post Office at Sat, 29 Apr 2017 15:53:09
-0700, but the courier was unable to deliver parcel to you.
You can find more details in this e-mail attachment!
With thanks and appreciation.
Fermina Khan - USPS Support Agent.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-05-03 5:59 H.A
0 siblings, 0 replies; 1651+ messages in thread
From: H.A @ 2017-05-03 5:59 UTC (permalink / raw)
To: kernel-janitors
With profound love in my heart, I Kindly Oblige your interest to very important proposal.. It is Truly Divine and require your utmost attention..........
[Czech rendering of the above:] With deep love in my heart, I kindly oblige your interest in a proposal .. It is very important, truly divine, and requires your utmost attention.
Contact me directly via: helenaroberts99@gmail.com for complete details.
HELINA .A ROBERTS
---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-05-03 11:26 Paul Lopez-Bravo
0 siblings, 0 replies; 1651+ messages in thread
From: Paul Lopez-Bravo @ 2017-05-03 11:26 UTC (permalink / raw)
--
Hello,
Allow me to make this very important request through this medium owing
to its confidential nature. My name is Mr. Paul Lopez-Bravo, a lawyer in
Spain. I represent the late Philip, who was a wealthy businessman before
his death in 2009. I am confiding in you on an urgent matter concerning
a deposit made by this particular client of mine before his death. I am
seeking your consent to authorize me to present you as his heir, so that
his bank will hand over the sum of $7.5 million (seven million five
hundred thousand dollars) held in a suspended bank account.
His bank has issued me, as his lawyer, a final ultimatum to present his
heir, as the legally permitted time for such a claim has expired;
otherwise the fund will be confiscated.
The intended transaction will be carried out in a legitimate manner that
will protect you and me from any breach of the law. I will use my
position as the client's lawyer to ensure the processing of the required
legal documentation and the successful completion of this transaction.
All I ask is your understanding and honest cooperation for its success.
Note that after the successful completion of the transaction, you keep
40% of the entire fund after all costs.
I will give you full details once you confirm your interest.
I hope to hear from you soon.
Yours sincerely,
Paul Lopez-Bravo Esq
Tel: +34692899384
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAMj-D2DO_CfvD77izsGfggoKP45HSC9aD6auUPAYC9Yeq_aX7w@mail.gmail.com>]
* Re:
[not found] <CAMj-D2DO_CfvD77izsGfggoKP45HSC9aD6auUPAYC9Yeq_aX7w@mail.gmail.com>
@ 2017-05-04 16:44 ` gengdongjiu
0 siblings, 0 replies; 1651+ messages in thread
From: gengdongjiu @ 2017-05-04 16:44 UTC (permalink / raw)
To: mtsirkin, kvm, Tyler Baicar, qemu-devel, Xiongfeng Wang, ben,
linux, kvmarm, huangshaoyu, lersek, songwenjun, wuquanming,
Marc Zyngier, qemu-arm, imammedo, linux-arm-kernel,
Ard Biesheuvel, pbonzini, James Morse
Dear James,
Thanks a lot for your review and comments. I am very sorry for the
late response.
2017-05-04 23:42 GMT+08:00 gengdongjiu <gengdj.1984@gmail.com>:
> Hi Dongjiu Geng,
>
> On 30/04/17 06:37, Dongjiu Geng wrote:
>> when happen SEA, deliver signal bus and handle the ioctl that
>> inject SEA abort to guest, so that guest can handle the SEA error.
>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 105b6ab..a96594f 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -20,8 +20,10 @@
>> @@ -1238,6 +1240,36 @@ static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
>> __coherent_cache_guest_page(vcpu, pfn, size);
>> }
>>
>> +static void kvm_send_signal(unsigned long address, bool hugetlb, bool hwpoison)
>> +{
>> + siginfo_t info;
>> +
>> + info.si_signo = SIGBUS;
>> + info.si_errno = 0;
>> + if (hwpoison)
>> + info.si_code = BUS_MCEERR_AR;
>> + else
>> + info.si_code = 0;
>> +
>> + info.si_addr = (void __user *)address;
>> + if (hugetlb)
>> + info.si_addr_lsb = PMD_SHIFT;
>> + else
>> + info.si_addr_lsb = PAGE_SHIFT;
>> +
>> + send_sig_info(SIGBUS, &info, current);
>> +}
>> +
>
> Punit reviewed the other version of this patch, this PMD_SHIFT is not the right
> thing to do, it needs a more accurate set of calls and shifts as there may be
> hugetlbfs pages other than PMD_SIZE.
>
> https://www.spinics.net/lists/arm-kernel/msg568919.html
>
> I haven't posted a new version of that patch because I was still hunting a bug
> in the hugepage/hwpoison code, even with Punit's fixes series I see -EFAULT
> returned to userspace instead of this hwpoison code being invoked.
Ok, got it, thanks for your information.
>
> Please avoid duplicating functionality between patches, it wastes reviewers
> time, especially when we know there are problems with this approach.
>
>
>> +static void kvm_handle_bad_page(unsigned long address,
>> + bool hugetlb, bool hwpoison)
>> +{
>> + /* handle both hwpoison and other synchronous external Abort */
>> + if (hwpoison)
>> + kvm_send_signal(address, hugetlb, true);
>> + else
>> + kvm_send_signal(address, hugetlb, false);
>> +}
>
> Why the extra level of indirection? We only want to signal userspace like this
> from KVM for hwpoison. Signals for RAS related reasons should come from the bits
> of the kernel that decoded the error.
For the SEA, there are mainly two types:
0b010000 Synchronous External Abort on memory access.
0b0101xx Synchronous External Abort on page table walk. DFSC[1:0]
encodes the level.
hwpoison should belong to "Synchronous External Abort on memory access".
If the SEA type is not hwpoison, such as a page table walk, do you mean
KVM should not deliver the SIGBUS?
If so, how should KVM handle SEA types other than hwpoison?
>
> (hwpoison for KVM is a corner case as Qemu's memory effectively has two users,
> Qemu and KVM. This isn't the example of how user-space gets signalled.)
>
>
>> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
>> index b37446a..780e3c4 100644
>> --- a/arch/arm64/kvm/guest.c
>> +++ b/arch/arm64/kvm/guest.c
>> @@ -277,6 +277,13 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>> return -EINVAL;
>> }
>>
>> +int kvm_vcpu_ioctl_sea(struct kvm_vcpu *vcpu)
>> +{
>> + kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
>> +
>> + return 0;
>> +}
>
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index bb02909..1d2e2e7 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -1306,6 +1306,7 @@ struct kvm_s390_ucas_mapping {
>> #define KVM_S390_GET_IRQ_STATE _IOW(KVMIO, 0xb6, struct kvm_s390_irq_state)
>> /* Available with KVM_CAP_X86_SMM */
>> #define KVM_SMI _IO(KVMIO, 0xb7)
>> +#define KVM_ARM_SEA _IO(KVMIO, 0xb8)
>>
>> #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
>> #define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1)
>>
>
> Why do we need a userspace API for SEA? It can also be done by using
> KVM_{G,S}ET_ONE_REG to change the vcpu registers. The advantage of doing it this
> way is you can choose which ESR value to use.
>
> Adding a new API call to do something you could do with an old one doesn't look
> right.
James, I considered your earlier suggestion to use
KVM_{G,S}ET_ONE_REG to change the vcpu registers, but I found it makes
no difference compared with using the already existing KVM API, so
changing the vcpu registers in QEMU may just duplicate the KVM APIs.
Injecting a SEA is no more than setting some registers: elr_el1, PC,
PSTATE, SPSR_el1, far_el1, esr_el1.
I have seen this KVM API do the same thing QEMU would. Did you find that
calling this API has an issue that makes it necessary to choose another
ESR value?
I have pasted the already existing KVM API code:
static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr)
{
	unsigned long cpsr = *vcpu_cpsr(vcpu);
	bool is_aarch32 = vcpu_mode_is_32bit(vcpu);
	u32 esr = 0;

	*vcpu_elr_el1(vcpu) = *vcpu_pc(vcpu);
	*vcpu_pc(vcpu) = get_except_vector(vcpu, except_type_sync);

	*vcpu_cpsr(vcpu) = PSTATE_FAULT_BITS_64;
	*vcpu_spsr(vcpu) = cpsr;

	vcpu_sys_reg(vcpu, FAR_EL1) = addr;

	/*
	 * Build an {i,d}abort, depending on the level and the
	 * instruction set. Report an external synchronous abort.
	 */
	if (kvm_vcpu_trap_il_is32bit(vcpu))
		esr |= ESR_ELx_IL;

	/*
	 * Here, the guest runs in AArch64 mode when in EL1. If we get
	 * an AArch32 fault, it means we managed to trap an EL0 fault.
	 */
	if (is_aarch32 || (cpsr & PSR_MODE_MASK) == PSR_MODE_EL0t)
		esr |= (ESR_ELx_EC_IABT_LOW << ESR_ELx_EC_SHIFT);
	else
		esr |= (ESR_ELx_EC_IABT_CUR << ESR_ELx_EC_SHIFT);

	if (!is_iabt)
		esr |= ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT;

	vcpu_sys_reg(vcpu, ESR_EL1) = esr | ESR_ELx_FSC_EXTABT;
}

static void inject_abt32(struct kvm_vcpu *vcpu, bool is_pabt, unsigned long addr)
{
	u32 vect_offset;
	u32 *far, *fsr;
	bool is_lpae;

	if (is_pabt) {
		vect_offset = 12;
		far = &vcpu_cp15(vcpu, c6_IFAR);
		fsr = &vcpu_cp15(vcpu, c5_IFSR);
	} else { /* !iabt */
		vect_offset = 16;
		far = &vcpu_cp15(vcpu, c6_DFAR);
		fsr = &vcpu_cp15(vcpu, c5_DFSR);
	}

	prepare_fault32(vcpu, COMPAT_PSR_MODE_ABT | COMPAT_PSR_A_BIT, vect_offset);

	*far = addr;

	/* Give the guest an IMPLEMENTATION DEFINED exception */
	is_lpae = (vcpu_cp15(vcpu, c2_TTBCR) >> 31);
	if (is_lpae)
		*fsr = 1 << 9 | 0x34;
	else
		*fsr = 0x14;
}

/**
 * kvm_inject_dabt - inject a data abort into the guest
 * @vcpu: The VCPU to receive the undefined exception
 * @addr: The address to report in the DFAR
 *
 * It is assumed that this code is called from the VCPU thread and that the
 * VCPU therefore is not currently executing guest code.
 */
void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr)
{
	if (!(vcpu->arch.hcr_el2 & HCR_RW))
		inject_abt32(vcpu, false, addr);
	else
		inject_abt64(vcpu, false, addr);
}
>
>
> Thanks,
>
> James
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-05-04 23:57 Tammy
0 siblings, 0 replies; 1651+ messages in thread
From: Tammy @ 2017-05-04 23:57 UTC (permalink / raw)
To: Recipients
Hello,
I am Maj Gen. Tammy Smith. I would like to discuss with you privately. Contact me via my personal email below for further information.
Maj Gen. Tammy Smith
MajGenTammySm-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-05-16 22:46 USPS Parcels Delivery
0 siblings, 0 replies; 1651+ messages in thread
From: USPS Parcels Delivery @ 2017-05-16 22:46 UTC (permalink / raw)
To: linux-nvdimm-y27Ovi1pjclAfugRpC6u6w
Hello,
Your item has arrived at the USPS Post Office at Wed, 17 May 2017 00:46:58
+0200, but the courier was unable to deliver parcel to you.
You can download the shipment label attached!
Yours respectfully.
Mellicent Northan - USPS Operation Manager.
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20170519213731.21484-1-mrugiero@gmail.com>]
* Re:
[not found] <20170519213731.21484-1-mrugiero@gmail.com>
@ 2017-05-20 8:48 ` Boris Brezillon
0 siblings, 0 replies; 1651+ messages in thread
From: Boris Brezillon @ 2017-05-20 8:48 UTC (permalink / raw)
To: Mario J. Rugiero
Cc: linux-mtd, computersforpeace, marek.vasut, richard,
cyrille.pitchen
Hi Mario,
Not sure how you created this patchset, but it is missing a Subject and
the diff-stat.
Please use
git format-patch -o <output-dir> -3 --cover-letter
to generate patches, then fill the cover letter in.
Once your cover letter is ready, you can send the patches with
git send-email --to ... --cc ... <output-dir>/*.patch
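The two steps above can be tried end-to-end against a throwaway repository (the list address is a placeholder, and the actual send step is left commented out):

```shell
# Sketch of the format-patch + cover-letter workflow, run in a
# temporary repo so it is safe to execute anywhere.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3; do echo "$i" > "file$i"; git add "file$i"; git commit -qm "commit $i"; done

# Last 3 commits plus an editable 0000-cover-letter.patch (Subject goes there):
git format-patch -o outgoing/ -3 --cover-letter
ls outgoing/    # 0000-cover-letter.patch plus 0001..0003

# After filling in the cover letter, the send step would be:
# git send-email --to <list-address> --cc <maintainers> outgoing/*.patch
```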
Regards,
Boris
On Fri, 19 May 2017 18:37:28 -0300,
"Mario J. Rugiero" <mrugiero@gmail.com> wrote:
> Some manufacturers use different layouts than MTD for the NAND, creating
> incompatibilities when going from a vendor-specific kernel to mainline.
> In particular, NAND devices for AllWinner boards write non-FF values to
> the bad block marker, and thus false positives arise when detecting bad
> blocks with the MTD driver. Sometimes there are enough false positives
> to make the device unusable.
> A proposed solution is NAND scrubbing, something a user who knows what
> she's doing (TM) could do to avoid this. It consists in erasing blocks
> disregarding the BBM. Since the user must know what she's doing, the
> only way to enable this feature is through a per-chip debugfs entry.
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2017-07-07 17:04 Mrs Alice Walton
0 siblings, 0 replies; 1651+ messages in thread
From: Mrs Alice Walton @ 2017-07-07 17:04 UTC (permalink / raw)
--
my name is Mrs. Alice Walton, a business woman an America Citizen and
the heiress to the fortune of Walmart stores, born October 7, 1949. I
have a mission for you worth $100,000,000.00(Hundred Million United
State Dollars) which I intend using for CHARITY PROJECT to help the
less privilege and orphanage
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-07-19 8:03 Lynne Smith
0 siblings, 0 replies; 1651+ messages in thread
From: Lynne Smith @ 2017-07-19 8:03 UTC (permalink / raw)
To: sparclinux
My Name is lynne Smith please i really need your help?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-07-27 1:12 Marie Angèle Ouattara
0 siblings, 0 replies; 1651+ messages in thread
From: Marie Angèle Ouattara @ 2017-07-27 1:12 UTC (permalink / raw)
I need your cooperation in a profitable transaction and details will
be disclosed to you once i receive your reply.
Thanks,
Mrs. Marie.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2017-09-24 16:59 Estrin, Alex
[not found] ` <F3529576D8E232409F431C309E29399336CD9886-8k97q/ur5Z1cIJlls4ac1rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
0 siblings, 1 reply; 1651+ messages in thread
From: Estrin, Alex @ 2017-09-24 16:59 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Leon,
The only intention I had with that note was to try to help with the process of screening patches
to maintain and improve the quality of common rdma code.
As it was taken as an insult, I apologize; that was never my intent.
I should have scripted it better.
Perhaps, put the note through the checkpatch :)
And yes, as a member of rdma list I accept my share of blame for missing that bug.
Thanks,
Alex.
P.S.
My apologies to the community for the spam.
> On Sat, Sep 23, 2017 at 07:20:53PM +0000, Estrin, Alex wrote:
> > > > Hello,
> > > >
> > > > One minor note regarding the original commit 523633359224
> > > > that broke the core.
> > > > It seems it was let through without trivial validation,
> > > > otherwise it wouldn't pass the checkpatch.
> > >
> > > Can you be more specific? Are you referring to "WARNING: line over 80
> > > characters" or to something else? If yes, I feel really bad for you and
> > > your workplace.
> > Please don't be. Keep doing a great job at your workplace, I will do the same at
> mine.
> >
> > > Readability is a first priority for the submitted code.
> > I can agree with you on that, considering easy readable submitted code
> > does not introduce a trivial bugs.
>
> It will be very helpful to everyone if you stop throwing claims without any actual
> support.
> 1. Doug allows enough time to respond to the patches, and neither you nor
> your colleagues saw such a "trivial bug" back then.
> 2. It fixed another "trivial bug" introduced by your colleague which
> broke RoCE (one of the most popular fabrics in the stack) and we didn't
> cry over the internet about it.
>
> Before you are rushing to reply me, please consult with Denny, he can
> give you a short update on how hard the recent OPA changes in AH and
> LIDs broke the stack and RoCE/IB devices.
>
> >
> > > ➜ linux-rdma git:(rdma-rc) git fp -1 523633359224 -o /tmp/
> > > /tmp/0001-IB-core-Fix-the-validations-of-a-multicast-LID-in-at.patch
> > > ➜ linux-rdma git:(rdma-rc) ./scripts/checkpatch.pl --strict /tmp/0001-IB-core-
> Fix-
> > > the-validations-of-a-multicast-LID-in-at.patch
> > > WARNING: line over 80 characters
> > > #45: FILE: drivers/infiniband/core/verbs.c:1584:
> > > + if (qp->device->get_link_layer(qp->device, attr.port_num) !=
> > >
> > > total: 0 errors, 1 warnings, 0 checks, 62 lines checked
> > >
> > > NOTE: For some of the reported defects, checkpatch may be able to
> > > mechanically convert to the typical style using --fix or --fix-inplace.
> > >
> > > /tmp/0001-IB-core-Fix-the-validations-of-a-multicast-LID-in-at.patch has
> style
> > > problems, please review.
> > >
> > > NOTE: If any of the errors are false positives, please report
> > > them to the maintainer, see CHECKPATCH in MAINTAINERS.
> > >
> > >
> > > >
> > > > Thanks,
> > > > Alex.
> > > >
> > > > > On Fri, Sep 22, 2017 at 06:42:41PM -0400, Doug Ledford wrote:
> > > > > > On Fri, 2017-09-22 at 15:17 -0600, Jason Gunthorpe wrote:
> > > > > > > On Fri, Sep 22, 2017 at 05:06:26PM -0400, Doug Ledford wrote:
> > > > > > >
> > > > > > > > Sure, I get that, but I was already out on PTO on the 30th. What
> > > > > > > > sucks
> > > > > > > > is that it landed right after I was out. But I plan to have the
> > > > > > > > pull
> > > > > > > > request in before EOB today, so the difference between the 20th
> and
> > > > > > > > today is neglible. Especially since lots of people doing QA
> > > > > > > > testing
> > > > > > > > prefer to take -rc tags, in that case, the difference is non-
> > > > > > > > existent.
> > > > > > >
> > > > > > > My thinking was that people should test -rc,
> > > > > >
> > > > > > Great, with you here...
> > > > > >
> > > > > > > but if they have problems
> > > > > > > they could grab your for-rc branch and check if their issue is
> > > > > > > already
> > > > > > > fixed..
> > > > > >
> > > > > > They can do this too...
> > > > > >
> > > > > > But if that still doesn't resolve their problem, a quick check of the
> > > > > > mailing list contents isn't out of the question either. In that case,
> > > > > > they would have found the solution to their problem. But, when you
> get
> > > > > > right down to it, only one person reported it in addition to the
> > > > > > original poster, so either other people saw the original post and
> > > > > > compensated in their own testing, or (the more likely I think), most
> > > > > > people don't start testing -rcs until after -rc2.
> > > > >
> > > > > I don't know about other people, but our testing of -rc starts on -rc1
> > > > > and we are not waiting for -rc2. From my observe of netdev, they also
> > > > > start to test -rc immediately.
> > > > >
> > > > > Otherwise, what is the point of the week between -rc1 and -rc2?
> > > > >
> > > > > > Which is why I try to set -rc2 as a milestone for several purposes.
> > > > > > For getting in the bulk of the known fixes, but also as a branching
> > > > > > point for for-next.
> > > > > >
> > > > > > --
> > > > > > Doug Ledford <dledford@redhat.com>
> > > > > > GPG KeyID: B826A3330E572FDD
> > > > > > Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333
> 0E57
> > > 2FDD
> > > > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2017-10-01 10:53 Pierre
0 siblings, 0 replies; 1651+ messages in thread
From: Pierre @ 2017-10-01 10:53 UTC (permalink / raw)
To: sparclinux
Do you need a loan ,You want to pay off bills,Expand your business ?,Look no further we offer all kinds of loans both long and short term loan,for only 3% interest.If yes you need a loan Please email : pierrewolf07@gmail.com now to apply for a secured loan with the Applicant form below ,
Name :
Age:
Country:
State:
Loan amount :
Loan duration :
Phone number :
---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-10-18 14:31 Mrs. Marie Angèle O
0 siblings, 0 replies; 1651+ messages in thread
From: Mrs. Marie Angèle O @ 2017-10-18 14:31 UTC (permalink / raw)
--
I solicit for your partnership to claim $11 million. You will be
entitled to 40% of the sum reply for more details.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE::
@ 2017-11-01 14:57 Mrs Hsu Wealther
0 siblings, 0 replies; 1651+ messages in thread
From: Mrs Hsu Wealther @ 2017-11-01 14:57 UTC (permalink / raw)
To: linux-sparse
Are you available at your desk? I need you to please check your email box for a business letter.
With Regards,
Ms. Hui Weather
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2017-11-12 2:21 hsed
2017-11-13 18:56 ` Stefan Beller
0 siblings, 1 reply; 1651+ messages in thread
From: hsed @ 2017-11-12 2:21 UTC (permalink / raw)
To: git
subscribe git
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:42 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:42 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:44 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:44 UTC (permalink / raw)
To: keyrings
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:44 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:44 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:55 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:55 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:55 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)
To: linux-security-module
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:55 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
--
To unsubscribe from this list: send the line "unsubscribe linux-spi" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:55 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:55 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:56 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:56 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 14:57 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 14:57 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 15:00 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 15:00 UTC (permalink / raw)
To: sparclinux
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 15:01 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 15:01 UTC (permalink / raw)
To: target-devel
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2017-11-13 15:04 Amos Kalonzo
0 siblings, 0 replies; 1651+ messages in thread
From: Amos Kalonzo @ 2017-11-13 15:04 UTC (permalink / raw)
Attn:
I am wondering why You haven't respond to my email for some days now.
reference to my client's contract balance payment of (11.7M,USD)
Kindly get back to me for more details.
Best Regards
Amos Kalonzo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2017-11-20 15:10 Viet Nguyen
2017-11-20 20:07 ` Stefan Beller
0 siblings, 1 reply; 1651+ messages in thread
From: Viet Nguyen @ 2017-11-20 15:10 UTC (permalink / raw)
To: git
unsubscribe git
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2018-01-24 19:54 Amy Riddering
0 siblings, 0 replies; 1651+ messages in thread
From: Amy Riddering @ 2018-01-24 19:54 UTC (permalink / raw)
--
Mrs. Amy Riddering contacting you for missionary work and i pray you
will be kind enough to deliver my $7 million donation to the less
privileged ones in your country and God will bless your generation for
doing this humanitarian work.
I am a widow suffering of lung cancer which has damaged my liver and
back bone, i decided to entrust this fund to a God fearing person that
will use it for Charity works since i do not have any child who will
inherit this money after i die. Please i want your sincere reply to
know if you will be able to execute this project, and I will give you
more information on how the fund will be transferred to your bank
account. I am waiting for your reply.
Thanks and God bless you.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2018-01-24 22:11 Amy Riddering
0 siblings, 0 replies; 1651+ messages in thread
From: Amy Riddering @ 2018-01-24 22:11 UTC (permalink / raw)
--
Mrs. Amy Riddering contacting you for missionary work and i pray you
will be kind enough to deliver my $7 million donation to the less
privileged ones and God will bless your generation for doing this
humanitarian assignment.
I am a widow suffering of lung cancer which has damaged my liver and
back bone, i decided to entrust $7M Dollars to a God fearing person
that will use it for Charity works since i do not have any child who
will inherit this money after i die. Please i want your urgent reply
to know if you can handle this project and I will relocate this fund
to you.
Thanks and remain blessed.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2018-01-27 3:56 Emile Kenold
0 siblings, 0 replies; 1651+ messages in thread
From: Emile Kenold @ 2018-01-27 3:56 UTC (permalink / raw)
--
This is Mrs. Emile Kenold contacting you for missionary work and i
pray you will be kind enough to deliver my £7 million donation to the
less privileged ones.
I am a widow suffering of lung cancer which has damaged my liver and
back bone, i decided to entrust this fund to a God fearing person that
will use it for Charity works and i want your sincere reply to know if
you will be able to execute this project, I will give you more
information on how the fund will be transferred to you immediately I
receive your positive response.
Thanks and God bless you.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2018-02-05 5:28 Fahama Vaserman
0 siblings, 0 replies; 1651+ messages in thread
From: Fahama Vaserman @ 2018-02-05 5:28 UTC (permalink / raw)
To: info; +Cc: info
Can i confide in you: ?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2018-02-27 1:18 Alan Gage
2018-02-27 10:26 ` René Scharfe
0 siblings, 1 reply; 1651+ messages in thread
From: Alan Gage @ 2018-02-27 1:18 UTC (permalink / raw)
To: git
Hello, I recently noticed a bug involving GitBash and Python. I was
running a function that would post the system time once every second
using a while loop but the text was only sent after the while loop
ended due to a timer I had set. Essentially, instead of it being
entered every second into the terminal, it was entered all at once,
when the loop ended. I tried this with the Command Line, among other
things, and it worked as intended, with the text being entered every
second. This is on Windows 10 Pro with the Fall Creators Update and
the most recent version of GitBash.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-02-27 1:18 Alan Gage
@ 2018-02-27 10:26 ` René Scharfe
0 siblings, 0 replies; 1651+ messages in thread
From: René Scharfe @ 2018-02-27 10:26 UTC (permalink / raw)
To: Alan Gage; +Cc: git
Am 27.02.2018 um 02:18 schrieb Alan Gage:
> Hello, I recently noticed a bug involving GitBash and Python. I was
> running a function that would post the system time once every second
> using a while loop but the text was only sent after the while loop
> ended due to a timer I had set. Essentially, instead of it being
> entered every second into the terminal, it was entered all at once,
> when the loop ended. I tried this with the Command Line, among other
> things, and it worked as intended, with the text being entered every
> second. This is on Windows 10 Pro with the Fall Creators Update and
> the most recent version of GitBash.
Python buffers its output by default. On terminals it enables line
buffering, i.e. the accumulated output is flushed when a newline
character is reached. Otherwise it uses a system-dependent buffer
size in the range of a few kilobytes. You can check if your output
is a terminal e.g. with:
python -c "import sys; print(sys.stdout.isatty())"
You can disable buffering by running your script with "python -u".
This discussion mentions more options:
https://stackoverflow.com/questions/107705/disable-output-buffering
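The buffering behavior described above can be sketched in a few lines of Python. (This is an illustrative reconstruction, not code from the reported script: the stream objects here stand in for stdout when it is, and is not, a terminal.)

```python
import io

def make_stream(raw, line_buffering):
    # A text stream layered over a block buffer, the same shape as
    # Python's stdout (TextIOWrapper over a BufferedWriter).
    return io.TextIOWrapper(io.BufferedWriter(raw, buffer_size=4096),
                            encoding="utf-8",
                            line_buffering=line_buffering)

# Block-buffered: how stdout behaves when it is NOT a terminal
# (e.g. redirected, or under some terminal emulators).
raw_block = io.BytesIO()
s = make_stream(raw_block, line_buffering=False)
s.write("tick 0\n")
held = raw_block.getvalue()     # still empty: the line sits in the buffer
s.flush()                       # what flush=True or `python -u` arranges
flushed = raw_block.getvalue()  # now the line has reached the raw stream

# Line-buffered: how stdout behaves on a terminal — each newline
# triggers a flush, so output appears immediately.
raw_line = io.BytesIO()
t = make_stream(raw_line, line_buffering=True)
t.write("tick 1\n")
line_out = raw_line.getvalue()
```

Running this shows `held` empty while `flushed` and `line_out` contain the written lines, which is exactly the difference between the delayed output seen in Git BASH and the immediate output seen on a real terminal.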
You can also start bash on the command line. I do wonder why Git CMD
seems to be started in what passes as a terminal, while Git BASH is
not, though.
You may want to check out https://gitforwindows.org/ and report your
findings using their issue tracker.
(This mailing list here, git@vger.kernel.org, is mostly used for
discussing Git itself, not so much about extra tools like bash or
Python that are packaged with Git for Windows.)
René
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [Outreachy kernel] Re: Re: [PATCH] h [Patch] Fixed unnecessary typecasting to in. Error found with checkpatch. Signed-off-by: Nishka Dasgupta <nishka.dasgupta_ug18@ashoka.edu.in>
@ 2018-02-27 13:39 Julia Lawall
2018-02-27 13:53 ` Nishka Dasgupta
0 siblings, 1 reply; 1651+ messages in thread
From: Julia Lawall @ 2018-02-27 13:39 UTC (permalink / raw)
To: Nishka Dasgupta; +Cc: outreachy-kernel
On Tue, 27 Feb 2018, Nishka Dasgupta wrote:
> Hi,
> Thank you for pointing out the formatting errors. I have deleted the unnecessary [Patch] and added the driver information.
> I could not, however, work out how to separate the subject line from the body of the text.
> I sent the commit to the maintainer of the patch in a separate email as I was (and am) not comfortable with git send-email, though I am doing my best to learn. I have sent him the revised patch, however.
> I don't know why a and b are prepended, unless you are referring to the variables that I added, in which case I would like to clarify that I did not want to use names as generic and vague as a and b for the new variables, so I appended part of their respective values after the initial character for ease of understanding. Should I have used more descriptive variable names?
> Finally, I did compile the this particular driver (make path/ as on the website) and an output file was generated for the modified file, so I took that to mean it was working as it should. Was I wrong?
I think that the code is actually better as is. Normally, there are not
floating point numbers in kernel code, so it is nice to see the int cast,
and to see what the numbers are being used for. Checkpatch is not always
correct...
julia
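The point about the literals can be illustrated in Python terms (a hedged sketch of the C expression under discussion; the helper names and the unit comments are mine, not from the driver): a constant like 2.412e8 is a floating-point literal even though its value is a whole number, and the explicit cast is what produces the integer bound.

```python
# 2.412e8 and 2.487e8 are floating-point literals; the cast truncates
# them to the integer bounds used in the driver's range check.
assert isinstance(2.412e8, float)

LOW = int(2.412e8)    # 241200000 — 2.412 GHz, assuming units of 10 Hz
HIGH = int(2.487e8)   # 248700000 — 2.487 GHz

def freq_in_24ghz_band(m):
    """Mirror of the driver's range check on fwrq->m."""
    return LOW <= m <= HIGH

def freq_mhz(m):
    """Mirror of the follow-up `int f = fwrq->m / 100000;` (MHz)."""
    return m // 100000
```

Keeping the `(int)` cast in the C source thus documents that a floating-point constant is being deliberately truncated, which is why checkpatch's complaint is misleading here.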
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [Outreachy kernel] Re:
2018-02-27 13:53 ` Nishka Dasgupta
@ 2018-02-27 13:58 Julia Lawall
2018-02-27 14:07 ` Re: Nishka Dasgupta
0 siblings, 1 reply; 1651+ messages in thread
From: Julia Lawall @ 2018-02-27 13:58 UTC (permalink / raw)
To: Nishka Dasgupta; +Cc: outreachy-kernel
On Tue, 27 Feb 2018, Nishka Dasgupta wrote:
> Hi,
> Weren't the original values already integers because of the "e8" appended?
> If this was an unnecessary patch, I will try to do better next time.
I think that they remain floats.
julia
> Thanking you,
> Nishka Dasgupta
>
> --
> You received this message because you are subscribed to the Google Groups "outreachy-kernel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to outreachy-kernel+unsubscribe@googlegroups.com.
> To post to this group, send email to outreachy-kernel@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/outreachy-kernel/1519739588-3783-1-git-send-email-nishka.dasgupta_ug18%40ashoka.edu.in.
> For more options, visit https://groups.google.com/d/optout.
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH v2] staging: vc04_services: bcm2835-camera: Add blank line
@ 2018-03-01 19:33 Greg KH
2018-03-01 20:20 ` Nishka Dasgupta
0 siblings, 1 reply; 1651+ messages in thread
From: Greg KH @ 2018-03-01 19:33 UTC (permalink / raw)
To: Nishka Dasgupta
Cc: eric, stefan.wahren, f.fainelli, rjui, sbranden,
bcm-kernel-feedback-list, outreachy-kernel
On Fri, Mar 02, 2018 at 12:45:48AM +0530, Nishka Dasgupta wrote:
> Checkpatch suggested two warnings for this file in consecutive lines: "static VCHI_CONNECTION_T *vchi_connection;" and "static VCHI_INSTANCE_T vchi_instance;". Both warnings said to add a blank line after the declaration. If checkpatch was wrong, is it okay if I submit a version 2 with a blank line only after "vchi_instance" and not below "*vchi_connection" (effectively undoing one of my commits)?
First off, I have no context as to what you are responding to here,
please always quote the email you are responding to properly. Reviewers
deal with hundreds of emails a day, and not having context for what you
say doesn't really work :(
Also, properly wrap your email lines at 72 columns, your email client
should do this for you, right?
Please respond to the email with the correct context and I will be glad
to respond. Right now, I have no idea what you are referring to.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-01 19:33 [PATCH v2] staging: vc04_services: bcm2835-camera: Add blank line Greg KH
@ 2018-03-01 20:20 ` Nishka Dasgupta
2018-03-01 20:31 ` Re: Greg KH
0 siblings, 1 reply; 1651+ messages in thread
From: Nishka Dasgupta @ 2018-03-01 20:20 UTC (permalink / raw)
To: gregkh, outreachy-kernel
Cc: eric, stefan.wahren, f.fainelli, rjui, sbranden,
bcm-kernel-feedback-list
This is with reference to your last email pointing out that you have no
context for what I am responding to. Unfortunately, I have been unable
to get mutt to load my inbox, so I could not and cannot quote text
directly. I have done my best to reproduce the conversation below in a
coherent fashion; nonetheless, I apologise for any persisting lack of
clarity.
An hour ago I submitted the following commit with respect to
drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c: [PATCH v2]
staging: vc04_services: bcm2835-camera: Add blank line
Your reply: Checkpatch is wrong here, don't you think? Are you sure this
is actually doing what you think it is?
My reply: Checkpatch suggested two warnings for this file in consecutive
lines: "static VCHI_CONNECTION_T *vchi_connection;" and "static
VCHI_INSTANCE_T vchi_instance;". Both warnings said to add a blank line
after the declaration. If checkpatch was wrong, is it okay if I submit a
version 2 with a blank line only after "vchi_instance" and not below
"*vchi_connection" (effectively undoing one of my commits)?
Your response: First off, I have no context as to what you are
responding to here, please always quote the email you are responding to
properly. Reviewers deal with hundreds of emails a day, and not having
context for what you say doesn't really work :( Also, properly wrap your
email lines at 72 columns, your email client should do this for you,
right? Please respond to the email with the correct context and I will
be glad to respond. Right now, I have no idea what you are referring
to.
(I am sorry for the suboptimal formatting. I think I managed to linewrap
this email properly, however.) My question remains: may I submit a
revised commit that effectively undoes the first "add blank line"
commit, i.e. the one that added a line between "static VCHI_CONNECTION_T
*vchi_connection" and "static VCHI_INSTANCE_T vchi_instance"? Thank you
for your time, and apologies again for the confusion.
Regards,
Nishka Dasgupta
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-01 20:20 ` Nishka Dasgupta
@ 2018-03-01 20:31 ` Greg KH
2018-03-08 18:23 ` Re: Nishka Dasgupta
0 siblings, 1 reply; 1651+ messages in thread
From: Greg KH @ 2018-03-01 20:31 UTC (permalink / raw)
To: Nishka Dasgupta
Cc: outreachy-kernel, eric, stefan.wahren, f.fainelli, rjui, sbranden,
bcm-kernel-feedback-list
On Fri, Mar 02, 2018 at 01:50:10AM +0530, Nishka Dasgupta wrote:
> This is with reference to your last email pointing out that you have no
> context for what I am responding to. Unfortunately, I have been unable
> to get mutt to load my inbox, so I could not and cannot quote text
> directly. I have done my best to reproduce the conversation below in a
> coherent fashion; nonetheless, I apologise for any persisting lack of
> clarity.
If mutt can't read your inbox, how are you reading it at all? :)
> An hour ago I submitted the following commit with respect to
> drivers/staging/vc04_services/bcm2835-camera/mmal-vchiq.c: [PATCH v2]
> staging: vc04_services: bcm2835-camera: Add blank line
>
> Your reply: Checkpatch is wrong here, don't you think? Are you sure this
> is actually doing what you think it is?
>
> My reply: Checkpatch suggested two warnings for this file in consecutive
> lines: "static VCHI_CONNECTION_T *vchi_connection;" and "static
> VCHI_INSTANCE_T vchi_instance;". Both warnings said to add a blank line
> after the declaration. If checkpatch was wrong, is it okay if I submit a
> version 2 with a blank line only after "vchi_instance" and not below
> "*vchi_connection" (effectively undoing one of my commits)?
Look at the code, and see what checkpatch is telling you and see if that
actually matches with what the code shows. Then make the change that
you know is correct based on your knowledge of C, and how the code
should look. Hint, checkpatch is wrong here, but the code is also wrong
as-is.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-01 20:31 ` Re: Greg KH
@ 2018-03-08 18:23 ` Nishka Dasgupta
2018-03-08 18:33 ` Re: Greg KH
0 siblings, 1 reply; 1651+ messages in thread
From: Nishka Dasgupta @ 2018-03-08 18:23 UTC (permalink / raw)
To: gregkh
Cc: outreachy-kernel, eric, stefan.wahren, f.fainelli, rjui, sbranden,
bcm-kernel-feedback-list
This is with reference to the Patch I submitted for
staging/vc04_services/bcm2835-camera: Add blank line. (Since the last
message on that thread, I have managed to configure mutt, but several of
the more recent emails, however, failed to load in mutt, which is why I
am still not replying inline. I should be able to respond to all future
emails inline, however.)
The last email on the thread ended:
> Look at the code, and see what checkpatch is telling you and see if
> that actually matches with what the code shows. Then make the change
> that you know is correct based on your knowledge of C, and how the
> code should look. Hint, checkpatch is wrong here, but the code is also
> wrong as-is.
>
> thanks, greg k-h
I am afraid I still have not been able to locate the error. Is it a
problem of too many or too few asterisks in the line "static
VCHI_CONNECTION_T *vchi_connection"?
Thank you for your help!
Regards,
Nishka Dasgupta
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-08 18:23 ` Re: Nishka Dasgupta
@ 2018-03-08 18:33 ` Greg KH
0 siblings, 0 replies; 1651+ messages in thread
From: Greg KH @ 2018-03-08 18:33 UTC (permalink / raw)
To: Nishka Dasgupta
Cc: outreachy-kernel, eric, stefan.wahren, f.fainelli, rjui, sbranden,
bcm-kernel-feedback-list
On Thu, Mar 08, 2018 at 11:53:22PM +0530, Nishka Dasgupta wrote:
> The last email on the thread ended:
> > Look at the code, and see what checkpatch is telling you and see if
> > that actually matches with what the code shows. Then make the change
> > that you know is correct based on your knowledge of C, and how the
> > code should look. Hint, checkpatch is wrong here, but the code is also
> > wrong as-is.
> >
> > thanks, greg k-h
>
> I am afraid I still have not been able to locate the error. Is it a
> problem of too many or too few asterisks in the line "static
> VCHI_CONNECTION_T *vchi_connection"?
I have no context at all here, to know what you are talking about, and
your emails are long-gone from my queue, sorry.
That being said, asterisks mean special things in the C language. Be
sure you understand what they are, how they work, and when they are
needed. Perhaps some more C language experience is needed here before
you work on the kernel?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [Outreachy kernel] [PATCH v2] staging: ks7010: Remove spaces after typecast to int
@ 2018-03-01 20:04 Julia Lawall
2018-03-01 21:20 ` Nishka Dasgupta
0 siblings, 1 reply; 1651+ messages in thread
From: Julia Lawall @ 2018-03-01 20:04 UTC (permalink / raw)
To: Nishka Dasgupta; +Cc: gregkh, outreachy-kernel
On Fri, 2 Mar 2018, Nishka Dasgupta wrote:
> Remove spaces after typecast. Issue found with checkpatch.
I don't see any spaces in the staging tree. Maybe a previous change of
yours added them, or maybe someone else already fixed the issue.
julia
>
> Signed-off-by: Nishka Dasgupta <nishka.dasgupta_ug18@ashoka.edu.in>
> ---
> Changes in v2:
> - Revert to typecasting after removing in version 1.
> - Remove space after typecasting.
>
> drivers/staging/ks7010/ks_wlan_net.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/ks7010/ks_wlan_net.c b/drivers/staging/ks7010/ks_wlan_net.c
> index 482af50..91acf87 100644
> --- a/drivers/staging/ks7010/ks_wlan_net.c
> +++ b/drivers/staging/ks7010/ks_wlan_net.c
> @@ -208,7 +208,7 @@ static int ks_wlan_set_freq(struct net_device *dev,
> /* for SLEEP MODE */
> /* If setting by frequency, convert to a channel */
> if ((fwrq->e == 1) &&
> - (fwrq->m >= (int) 2.412e8) && (fwrq->m <= (int) 2.487e8)) {
> + (fwrq->m >= (int)2.412e8) && (fwrq->m <= (int)2.487e8)) {
> int f = fwrq->m / 100000;
> int c = 0;
>
> --
> 1.9.1
>
> --
> You received this message because you are subscribed to the Google Groups "outreachy-kernel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to outreachy-kernel+unsubscribe@googlegroups.com.
> To post to this group, send email to outreachy-kernel@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/outreachy-kernel/1519931110-2763-1-git-send-email-nishka.dasgupta_ug18%40ashoka.edu.in.
> For more options, visit https://groups.google.com/d/optout.
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2018-03-01 21:17 Nishka Dasgupta
0 siblings, 0 replies; 1651+ messages in thread
From: Nishka Dasgupta @ 2018-03-01 21:17 UTC (permalink / raw)
To: outreachy-kernel, julia.lawall; +Cc: gregkh
This is in response to your message that you don't see any spaces in
the staging tree.
Yes, the commit "remove spaces after typecast to int" was an unnecessary
commit since I added the spaces in the first place. This patch was
submitted before I read your email regarding not submitting patches to
fix the incorrect changes I have proposed. Sorry about that. Will focus
on new patches from next time.
Thank you for your time, and apologies for the confusion.
Regards,
Nishka Dasgupta
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [Outreachy kernel] Help with Mutt
@ 2018-03-02 18:01 Julia Lawall
2018-03-03 18:27 ` Nishka Dasgupta
0 siblings, 1 reply; 1651+ messages in thread
From: Julia Lawall @ 2018-03-02 18:01 UTC (permalink / raw)
To: Nishka Dasgupta; +Cc: outreachy-kernel
On Fri, 2 Mar 2018, Nishka Dasgupta wrote:
> This is with response to your email: I think you will need to give more
> information. Do you get some error message? Does nothing happen? Does it
> crash? What did you do before, in terms of setting up configuration
> files? etc. Any particular thing about the situation that you can think
> of.
>
> My response to your email is as follows: I followed the instructions at
> OutreachyfirstpatchSetup and Outreachyfirstpatch at Linux kernel
> newbies. The instructions said to "say 'no' to creating an inbox for
> now". That is what I did. If I send an email to myself from within mutt
> (using the "m" command) it shows up in the inbox; but no other email
> sent to my account is visible. (Up until now I have been viewing emails
> and responses in a browser and composing replies in vim.)
>
> In accordance with the instructions, my muttrc file contains the
> following lines:
> set sendmail="/usr/bin/esmtp"
> set envelope_from=yes
> set from="Nishka Dasgupta <nishka....ashoka.edu.in>"
> set use_from=yes
> set edit_headers=yes
>
> My smtp and git appear to be set up fine, since those are how I have
> been sending emails so far. However, I don't know how to ensure that
> emails sent to my email account show up in mutt. How can I address this?
Maybe this will be helpful:
https://www.linux.com/news/fetching-email-mutt
julia
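A minimal muttrc addition along the lines that article covers, fetching mail over IMAP directly in mutt; a hedged sketch only, with server and account names as placeholders rather than values from this thread:

```
# Read mail over IMAP from within mutt (all values are placeholders)
set imap_user  = "user@example.com"
set folder     = "imaps://imap.example.com/"
set spoolfile  = "+INBOX"
set mail_check = 60    # poll for new mail every 60 seconds
```

With `spoolfile` pointing at the IMAP INBOX, mail sent to the account shows up in mutt's index instead of only in the browser.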
>
> Thank you for your time.
>
> Regards,
> Nishka Dasgupta
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-02 18:01 [Outreachy kernel] Help with Mutt Julia Lawall
@ 2018-03-03 18:27 ` Nishka Dasgupta
2018-03-03 18:38 ` Re: Julia Lawall
0 siblings, 1 reply; 1651+ messages in thread
From: Nishka Dasgupta @ 2018-03-03 18:27 UTC (permalink / raw)
To: julia.lawall, outreachy-kernel
This is with reference to your link with instructions on how to load my inbox in mutt. The instructions worked and I should be able to access all future correspondence in mutt. (The most recent emails, however, failed to load in mutt, which is why I am still not replying inline.)
Thank you for your help!
Regards,
Nishka Dasgupta
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-03 18:27 ` Nishka Dasgupta
@ 2018-03-03 18:38 ` Julia Lawall
0 siblings, 0 replies; 1651+ messages in thread
From: Julia Lawall @ 2018-03-03 18:38 UTC (permalink / raw)
To: Nishka Dasgupta; +Cc: outreachy-kernel
On Sat, 3 Mar 2018, Nishka Dasgupta wrote:
> This is with reference to your link with instructions on how to load my inbox in mutt. The instructions worked and I should be able to access all future correspondence in mutt. (The most recent emails, however, failed to load in mutt, which is why I am still not replying inline.)
OK, it's great that you got the problem solved :)
julia
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <[PATCH xf86-video-amdgpu 0/3] Add non-desktop and leasing support>]
* (unknown)
@ 2018-03-05 17:06 Meghana Madhyastha
2018-03-05 19:24 ` Noralf Trønnes
0 siblings, 1 reply; 1651+ messages in thread
From: Meghana Madhyastha @ 2018-03-05 17:06 UTC (permalink / raw)
To: Noralf Trønnes, Daniel Vetter, dri-devel
linux-spi@vger.kernel.org,Noralf Trønnes <noralf@tronnes.org>,Sean Paul <seanpaul@chromium.org>,kernel@martin.sperl.org
Cc:
Bcc:
Subject: Re: [PATCH v2 0/2] Chunk splitting of spi transfers
Reply-To:
In-Reply-To: <f6dbf3ca-4c1b-90cc-c4af-8889f7407180@tronnes.org>
On Sun, Mar 04, 2018 at 06:38:42PM +0100, Noralf Trønnes wrote:
>
> Den 02.03.2018 12.11, skrev Meghana Madhyastha:
> >On Sun, Feb 25, 2018 at 02:19:10PM +0100, Lukas Wunner wrote:
> >>[cc += linux-rpi-kernel@lists.infradead.org]
> >>
> >>On Sat, Feb 24, 2018 at 06:15:59PM +0000, Meghana Madhyastha wrote:
> >>>I've added bcm2835_spi_transfer_one_message in spi-bcm2835. This calls
> >>>spi_split_transfers_maxsize to split large chunks for spi dma transfers.
> >>>I then removed chunk splitting in the tinydrm spi helper (as now the core
> >>>is handling the chunk splitting). However, although the SPI HW should be
>>>able to accommodate up to 65535 bytes for dma transfers, the splitting of
> >>>chunks to 65535 bytes results in a dma transfer time out error. However,
> >>>when the chunks are split to < 64 bytes it seems to work fine.
> >>Hm, that is really odd, how did you test this exactly, what did you
> >>use as SPI slave? It contradicts our own experience, we're using
> >>Micrel KSZ8851 Ethernet chips as SPI slave on spi0 of a BCM2837
> >>and can send/receive messages via DMA to the tune of several hundred
> >>bytes without any issues. In fact, for messages < 96 bytes, DMA is
> >>not used at all, so you've probably been using interrupt mode,
> >>see the BCM2835_SPI_DMA_MIN_LENGTH macro in spi-bcm2835.c.
> >Hi Lukas,
> >
> >I think you are right. I checked it and it's not using the DMA mode which
> >is why it's working with 64 bytes.
> >Noralf, that leaves us back to the
> >initial time out problem. I've tried doing the message splitting in
> >spi_sync as well as spi_pump_messages. Martin had explained that DMA
> >will wait for
> >the SPI HW to set the send_more_data line, but the SPI-HW itself will
> >stop triggering it when SPI_LEN is 0 causing DMA to wait forever. I
> >thought if we split it before itself, the SPI_LEN will not go to zero
> >thus preventing this problem, however it didn't work and started
> >hanging. So I'm a little uncertain as to how to proceed and debug what
> >exactly has caused the time out due to the asynchronous methods.
>
> I did a quick test and at least this is working:
>
> int tinydrm_spi_transfer(struct spi_device *spi, u32 speed_hz,
> struct spi_transfer *header, u8 bpw, const void *buf,
> size_t len)
> {
> struct spi_transfer tr = {
> .bits_per_word = bpw,
> .speed_hz = speed_hz,
> .tx_buf = buf,
> .len = len,
> };
> struct spi_message m;
> size_t maxsize;
> int ret;
>
> maxsize = tinydrm_spi_max_transfer_size(spi, 0);
>
> if (drm_debug & DRM_UT_DRIVER)
> pr_debug("[drm:%s] bpw=%u, maxsize=%zu, transfers:\n",
> __func__, bpw, maxsize);
>
> spi_message_init(&m);
> m.spi = spi;
> if (header)
> spi_message_add_tail(header, &m);
> spi_message_add_tail(&tr, &m);
>
> ret = spi_split_transfers_maxsize(spi->controller, &m, maxsize,
> GFP_KERNEL);
> if (ret)
> return ret;
>
> tinydrm_dbg_spi_message(spi, &m);
>
> return spi_sync(spi, &m);
> }
> EXPORT_SYMBOL(tinydrm_spi_transfer);
>
>
> Log:
> [ 39.015644] [drm:mipi_dbi_fb_dirty [mipi_dbi]] Flushing [FB:36] x1=0,
> x2=320, y1=0, y2=240
>
> [ 39.018079] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2a, par=00 00 01
> 3f
> [ 39.018129] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
> [ 39.018152] tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2a]
> [ 39.018231] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
> [ 39.018248] tr(0): speed=10MHz, bpw=8, len=4, tx_buf=[00 00 01 3f]
>
> [ 39.018330] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2b, par=00 00 00
> ef
> [ 39.018347] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
> [ 39.018362] tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2b]
> [ 39.018396] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
> [ 39.018428] tr(0): speed=10MHz, bpw=8, len=4, tx_buf=[00 00 00 ef]
>
> [ 39.018487] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2c, len=153600
> [ 39.018502] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
> [ 39.018517] tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2c]
> [ 39.018565] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
> [ 39.018594] tr(0): speed=48MHz, bpw=8, len=65532, tx_buf=[c6 18 c6 18
> c6 18 c6 18 c6 18 c6 18 c6 18 c6 18 ...]
> [ 39.018608] tr(1): speed=48MHz, bpw=8, len=65532, tx_buf=[06 18 06 18
> 06 18 06 18 06 18 06 18 06 18 06 18 ...]
> [ 39.018621] tr(2): speed=48MHz, bpw=8, len=22536, tx_buf=[10 82 10 82
> 10 82 10 82 10 82 10 82 18 e3 18 e3 ...]
Hi Noralf,
Yes this works but splitting in the spi subsystem doesn't seem to work.
So this means that spi_split_transfers_maxsize is working.
Should I just send in a patch with splitting done here in tinydrm? (I
had thought we wanted to avoid splitting in the tinydrm helper).
Thanks and regards,
Meghana
>
> Noralf.
>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-05 17:06 (unknown) Meghana Madhyastha
@ 2018-03-05 19:24 ` Noralf Trønnes
0 siblings, 0 replies; 1651+ messages in thread
From: Noralf Trønnes @ 2018-03-05 19:24 UTC (permalink / raw)
To: Meghana Madhyastha, Daniel Vetter, dri-devel
Den 05.03.2018 18.06, skrev Meghana Madhyastha:
> linux-spi@vger.kernel.org,Noralf Trønnes <noralf@tronnes.org>,Sean Paul <seanpaul@chromium.org>,kernel@martin.sperl.org
> Cc:
> Bcc:
> Subject: Re: [PATCH v2 0/2] Chunk splitting of spi transfers
> Reply-To:
> In-Reply-To: <f6dbf3ca-4c1b-90cc-c4af-8889f7407180@tronnes.org>
>
> On Sun, Mar 04, 2018 at 06:38:42PM +0100, Noralf Trønnes wrote:
>> Den 02.03.2018 12.11, skrev Meghana Madhyastha:
>>> On Sun, Feb 25, 2018 at 02:19:10PM +0100, Lukas Wunner wrote:
>>>> [cc += linux-rpi-kernel@lists.infradead.org]
>>>>
>>>> On Sat, Feb 24, 2018 at 06:15:59PM +0000, Meghana Madhyastha wrote:
>>>>> I've added bcm2835_spi_transfer_one_message in spi-bcm2835. This calls
>>>>> spi_split_transfers_maxsize to split large chunks for spi dma transfers.
>>>>> I then removed chunk splitting in the tinydrm spi helper (as now the core
>>>>> is handling the chunk splitting). However, although the SPI HW should be
>>>>> able to accommodate up to 65535 bytes for dma transfers, the splitting of
>>>>> chunks to 65535 bytes results in a dma transfer time out error. However,
>>>>> when the chunks are split to < 64 bytes it seems to work fine.
>>>> Hm, that is really odd, how did you test this exactly, what did you
>>>> use as SPI slave? It contradicts our own experience, we're using
>>>> Micrel KSZ8851 Ethernet chips as SPI slave on spi0 of a BCM2837
>>>> and can send/receive messages via DMA to the tune of several hundred
>>>> bytes without any issues. In fact, for messages < 96 bytes, DMA is
>>>> not used at all, so you've probably been using interrupt mode,
>>>> see the BCM2835_SPI_DMA_MIN_LENGTH macro in spi-bcm2835.c.
>>> Hi Lukas,
>>>
>>> I think you are right. I checked it and it's not using the DMA mode which
>>> is why it's working with 64 bytes.
>>> Noralf, that leaves us back to the
>>> initial time out problem. I've tried doing the message splitting in
>>> spi_sync as well as spi_pump_messages. Martin had explained that DMA
>>> will wait for
>>> the SPI HW to set the send_more_data line, but the SPI-HW itself will
>>> stop triggering it when SPI_LEN is 0 causing DMA to wait forever. I
>>> thought if we split it before itself, the SPI_LEN will not go to zero
>>> thus preventing this problem, however it didn't work and started
>>> hanging. So I'm a little uncertain as to how to proceed and debug what
>>> exactly has caused the time out due to the asynchronous methods.
>> I did a quick test and at least this is working:
>>
>> int tinydrm_spi_transfer(struct spi_device *spi, u32 speed_hz,
>> struct spi_transfer *header, u8 bpw, const void *buf,
>> size_t len)
>> {
>> struct spi_transfer tr = {
>> .bits_per_word = bpw,
>> .speed_hz = speed_hz,
>> .tx_buf = buf,
>> .len = len,
>> };
>> struct spi_message m;
>> size_t maxsize;
>> int ret;
>>
>> maxsize = tinydrm_spi_max_transfer_size(spi, 0);
>>
>> if (drm_debug & DRM_UT_DRIVER)
>> pr_debug("[drm:%s] bpw=%u, maxsize=%zu, transfers:\n",
>> __func__, bpw, maxsize);
>>
>> spi_message_init(&m);
>> m.spi = spi;
>> if (header)
>> spi_message_add_tail(header, &m);
>> spi_message_add_tail(&tr, &m);
>>
>> ret = spi_split_transfers_maxsize(spi->controller, &m, maxsize,
>> GFP_KERNEL);
>> if (ret)
>> return ret;
>>
>> tinydrm_dbg_spi_message(spi, &m);
>>
>> return spi_sync(spi, &m);
>> }
>> EXPORT_SYMBOL(tinydrm_spi_transfer);
>>
>>
>> Log:
>> [ 39.015644] [drm:mipi_dbi_fb_dirty [mipi_dbi]] Flushing [FB:36] x1=0,
>> x2=320, y1=0, y2=240
>>
>> [ 39.018079] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2a, par=00 00 01
>> 3f
>> [ 39.018129] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [ 39.018152] tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2a]
>> [ 39.018231] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [ 39.018248] tr(0): speed=10MHz, bpw=8, len=4, tx_buf=[00 00 01 3f]
>>
>> [ 39.018330] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2b, par=00 00 00
>> ef
>> [ 39.018347] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [ 39.018362] tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2b]
>> [ 39.018396] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [ 39.018428] tr(0): speed=10MHz, bpw=8, len=4, tx_buf=[00 00 00 ef]
>>
>> [ 39.018487] [drm:mipi_dbi_typec3_command [mipi_dbi]] cmd=2c, len=153600
>> [ 39.018502] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [ 39.018517] tr(0): speed=10MHz, bpw=8, len=1, tx_buf=[2c]
>> [ 39.018565] [drm:tinydrm_spi_transfer] bpw=8, maxsize=65532, transfers:
>> [ 39.018594] tr(0): speed=48MHz, bpw=8, len=65532, tx_buf=[c6 18 c6 18
>> c6 18 c6 18 c6 18 c6 18 c6 18 c6 18 ...]
>> [ 39.018608] tr(1): speed=48MHz, bpw=8, len=65532, tx_buf=[06 18 06 18
>> 06 18 06 18 06 18 06 18 06 18 06 18 ...]
>> [ 39.018621] tr(2): speed=48MHz, bpw=8, len=22536, tx_buf=[10 82 10 82
>> 10 82 10 82 10 82 10 82 18 e3 18 e3 ...]
> Hi Noralf,
>
> Yes this works but splitting in the spi subsystem doesn't seem to work.
> So this means that spi_split_transfers_maxsize is working.
> Should I just send in a patch with splitting done here in tinydrm? (I
> had thought we wanted to avoid splitting in the tinydrm helper).
Oh, I assumed you didn't get this to work in any way.
Yes, I prefer splitting without the client's knowledge.
Looking at the code the splitting has to happen before spi_map_msg() is
called. Have you tried to do it in the prepare_message callback?
static void __spi_pump_messages(struct spi_controller *ctlr, bool
in_kthread)
{
<...>
if (ctlr->prepare_message) {
ret = ctlr->prepare_message(ctlr, ctlr->cur_msg);
<...>
ret = spi_map_msg(ctlr, ctlr->cur_msg);
<...>
ret = ctlr->transfer_one_message(ctlr, ctlr->cur_msg);
<...>
}
There was something wrong with this email: it was missing a subject and
several recipients.
Noralf.
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CABxXbAeQTGbiAEaFHK4RUTFGxt0A+KnCCmhJNU9XDivW5=SL-Q@mail.gmail.com>]
* (no subject)
[not found] <CABxXbAeQTGbiAEaFHK4RUTFGxt0A+KnCCmhJNU9XDivW5=SL-Q@mail.gmail.com>
@ 2018-03-08 18:23 ` Ivan Lapuz
2018-03-08 18:33 ` Tommy Bowditch
2018-03-08 18:36 ` Re: Ibrahim Tachijian
0 siblings, 2 replies; 1651+ messages in thread
From: Ivan Lapuz @ 2018-03-08 18:23 UTC (permalink / raw)
To: wireguard
[-- Attachment #1: Type: text/plain, Size: 90 bytes --]
Hi there pls can you help me how to set up correctly for my wireguard
interface..thankyou
[-- Attachment #2: Type: text/html, Size: 112 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-08 18:23 ` Ivan Lapuz
@ 2018-03-08 18:33 ` Tommy Bowditch
2018-03-08 18:36 ` Re: Ibrahim Tachijian
1 sibling, 0 replies; 1651+ messages in thread
From: Tommy Bowditch @ 2018-03-08 18:33 UTC (permalink / raw)
To: Ivan Lapuz; +Cc: wireguard
[-- Attachment #1: Type: text/plain, Size: 390 bytes --]
What step do you need help with? What are you stuck on?
T
On Thu, Mar 8, 2018 at 6:23 PM, Ivan Lapuz <ivanlapuz9@gmail.com> wrote:
> Hi there pls can you help me how to set up correctly for my wireguard
> interface..thankyou
>
> _______________________________________________
> WireGuard mailing list
> WireGuard@lists.zx2c4.com
> https://lists.zx2c4.com/mailman/listinfo/wireguard
>
>
[-- Attachment #2: Type: text/html, Size: 897 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-03-08 18:23 ` Ivan Lapuz
2018-03-08 18:33 ` Tommy Bowditch
@ 2018-03-08 18:36 ` Ibrahim Tachijian
1 sibling, 0 replies; 1651+ messages in thread
From: Ibrahim Tachijian @ 2018-03-08 18:36 UTC (permalink / raw)
To: Ivan Lapuz; +Cc: wireguard
[-- Attachment #1: Type: text/plain, Size: 413 bytes --]
Sure we can.
You can follow the instructions here:
https://www.wireguard.com/quickstart/
On Thu, Mar 8, 2018, 19:26 Ivan Lapuz <ivanlapuz9@gmail.com> wrote:
> Hi there pls can you help me how to set up correctly for my wireguard
> interface..thankyou
> _______________________________________________
> WireGuard mailing list
> WireGuard@lists.zx2c4.com
> https://lists.zx2c4.com/mailman/listinfo/wireguard
>
[-- Attachment #2: Type: text/html, Size: 1005 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAAEAJfB76xseRqnYQfRihXY6g0Jyqwt8zfddU1W7CXDg3xEFFg@mail.gmail.com>]
* Re:
[not found] <CAAEAJfB76xseRqnYQfRihXY6g0Jyqwt8zfddU1W7CXDg3xEFFg@mail.gmail.com>
@ 2018-04-02 11:20 ` Ratheendran R
2018-04-02 17:19 ` Re: Steve deRosier
0 siblings, 1 reply; 1651+ messages in thread
From: Ratheendran R @ 2018-04-02 11:20 UTC (permalink / raw)
To: Ezequiel Garcia; +Cc: backports
Hi All,
I am doing a mt7601 backport 4.14 on linux 2.6.37 kernel.
Now when I build the backports against the 2.6.37 linux build I am
getting the below error.
'error: bit-field ‘<anonymous>’ width not an integer constant'
compilation error actuals
/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:181:3:
error: bit-field ‘<anonymous>’ width not an integer constant
/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:
In function ‘mt7601u_set_rts_threshold’:
/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:329:2:
error: bit-field ‘<anonymous>’ width not an integer constant
make[8]: *** [/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.o]
Error 1
Can anyone let me know how to fix this.
Thanks in Advance.
Ratheendran
On 3/15/17, Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> wrote:
> subscribe backports
> --
> To unsubscribe from this list: send the line "unsubscribe backports" in
>
--
To unsubscribe from this list: send the line "unsubscribe backports" in
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-04-02 11:20 ` Re: Ratheendran R
@ 2018-04-02 17:19 ` Steve deRosier
2018-04-04 7:31 ` Re: Arend van Spriel
0 siblings, 1 reply; 1651+ messages in thread
From: Steve deRosier @ 2018-04-02 17:19 UTC (permalink / raw)
To: Ratheendran R; +Cc: Ezequiel Garcia, backports
I apologize for the resend... backports ML didn't pick it up because
apparently there's no way to set my tablet's email client to NOT send
HTML emails. Live and learn.
Hi Ratheendran,
On Mon, Apr 2, 2018 at 4:21 AM Ratheendran R <ratheendran.s@gmail.com> wrote:
>
> Hi All,
>
> I am doing a mt7601 backport 4.14 on linux 2.6.37 kernel.
>
> Now when I build the backports against the 2.6.37 linux build I am
> getting the below error.
> 'error: bit-field ‘<anonymous>’ width not an integer constant'
>
> compilation error actuals
> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:181:3:
> error: bit-field ‘<anonymous>’ width not an integer constant
> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:
> In function ‘mt7601u_set_rts_threshold’:
> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:329:2:
> error: bit-field ‘<anonymous>’ width not an integer constant
> make[8]: *** [/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.o]
> Error 1
In order to backport 4.14, I assume you’re using a current backports
build. I’m sorry to tell you, but the current version of backports
isn’t supported to backport that far. The backports wiki says we
support only back to 3.0, and IIRC, in more recent discussions, we
abandoned support even back that far. Though I honestly don’t recall
to which version we’re testing for anymore.
No one ever likes this answer (including me), but perhaps you might
consider upgrading your base kernel release. 2.6.37 was released eight
years ago and 37 major version releases happened between there and
4.14. Your version of the kernel is missing support for some major
features and gobs of security fixes. It’ll be hard work to push it
forward eight years, but once you get there, keeping the kernel
up-to-date is pretty easy.
You might get it to work against such an old kernel, but it might take
a bit of work and ingenuity.
Good luck,
- Steve
--
Steve deRosier
Cal-Sierra Consulting LLC
https://www.cal-sierra.com/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-04-02 17:19 ` Re: Steve deRosier
@ 2018-04-04 7:31 ` Arend van Spriel
0 siblings, 0 replies; 1651+ messages in thread
From: Arend van Spriel @ 2018-04-04 7:31 UTC (permalink / raw)
To: Steve deRosier, Ratheendran R; +Cc: Ezequiel Garcia, backports
On 4/2/2018 7:19 PM, Steve deRosier wrote:
> I apologize for the resend... backports ML didn't pick it up because
> apparently there's no way to set my tablet's email client to NOT send
> HTML emails. Live and learn.
>
> Hi Ratheendran,
>
> On Mon, Apr 2, 2018 at 4:21 AM Ratheendran R <ratheendran.s@gmail.com> wrote:
>>
>> Hi All,
>>
>> I am doing a mt7601 backport 4.14 on linux 2.6.37 kernel.
>>
>> Now when I build the backports against the 2.6.37 linux build I am
>> getting the below error.
>> 'error: bit-field ‘<anonymous>’ width not an integer constant'
>>
>> compilation error actuals
>> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:181:3:
>> error: bit-field ‘<anonymous>’ width not an integer constant
>> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:
>> In function ‘mt7601u_set_rts_threshold’:
>> /home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.c:329:2:
>> error: bit-field ‘<anonymous>’ width not an integer constant
>> make[8]: *** [/home/hitem//software/source/wifi-drivers/backports-4.14-rc2-1/drivers/net/wireless/mediatek/mt7601u/main.o]
>> Error 1
>
>
> In order to backport 4.14, I assume you’re using a current backports
> build. I’m sorry to tell you, but the current version of backports
> isn’t supported to backport that far. The backports wiki says we
> support only back to 3.0, and IIRC, in more recent discussions, we
> abandoned support even back that far. Though I honestly don’t recall
> to which version we’re testing for anymore.
If my recollection is correct we currently support backports to 3.10 and
above. Actually, it is in the README. Now the README also refers to the
wiki for "more up-to-date" information, but for this particular piece of
information that is not true.
> No one ever likes this answer (including me), but perhaps you might
> consider upgrading your base kernel release. 2.6.37 was released eight
> years ago and 37 major version releases happened between there and
> 4.14. Your version of the kernel is missing support for some major
> features and gobs of security fixes. It’ll be hard work to push it
> forward eight years, but once you get there, keeping the kernel
> up-to-date is pretty easy.
>
> You might get it to work against such an old kernel, but it might take
> a bit of work and ingenuity.
You could try to include mt7601 into a backports-3.14 package, which
supports backporting to 2.6.26. It depends on what ieee80211_hw ops it
implements and what linux infrastructure is used. The bitfields stuff
would need backporting, which you can probably get from the latest
backports package. It is going to be a lot more work to get it going so
the easier path is upgrading your kernel.
Regards,
Arend
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH v3 2/3] merge: Add merge.renames config setting
@ 2018-04-27 0:54 Ben Peart
2018-04-27 18:19 ` Elijah Newren
0 siblings, 1 reply; 1651+ messages in thread
From: Ben Peart @ 2018-04-27 0:54 UTC (permalink / raw)
To: Elijah Newren, Ben Peart
Cc: git@vger.kernel.org, peff@peff.net, gitster@pobox.com,
pclouds@gmail.com, vmiklos@frugalware.org, Kevin Willford,
Johannes.Schindelin@gmx.de, eckhard.s.maass@googlemail.com
On 4/26/2018 6:52 PM, Elijah Newren wrote:
> On Thu, Apr 26, 2018 at 1:52 PM, Ben Peart <Ben.Peart@microsoft.com> wrote:
>
>> +merge.renames::
>> + Whether and how Git detects renames. If set to "false",
>> + rename detection is disabled. If set to "true", basic rename
>> + detection is enabled. If set to "copies" or "copy", Git will
>> + detect copies, as well. Defaults to the value of diff.renames.
>> +
>
> We shouldn't allow users to force copy detection on for merges. The
> diff side of the code will detect them correctly but the code in
> merge-recursive will mishandle the copy pairs. I think fixing it is
> somewhere between big can of worms and
> it's-a-configuration-that-doesn't-even-make-sense, but it's been a
> while since I thought about it.
Color me puzzled. :) The consensus was that the default value for
merge.renames comes from diff.renames. diff.renames supports copy
detection which means that merge.renames will inherit that value. My
assumption was that is what was intended so when I reimplemented it, I
fully implemented it that way.
Are you now requesting to only use diff.renames as the default if the
value is true or false but not if it is copy? What should happen if
diff.renames is actually set to copy? Should merge silently change that
to true, display a warning, error out, or something else? Do you have
some other behavior for how to handle copy being inherited from
diff.renames you'd like to see?
Can you write the documentation that clearly explains the exact behavior
you want? That would kill two birds with one stone... :)
>
>> diff --git a/merge-recursive.h b/merge-recursive.h
>> index 80d69d1401..0c5f7eff98 100644
>> --- a/merge-recursive.h
>> +++ b/merge-recursive.h
>> @@ -17,7 +17,8 @@ struct merge_options {
>> unsigned renormalize : 1;
>> long xdl_opts;
>> int verbosity;
>> - int detect_rename;
>> + int diff_detect_rename;
>> + int merge_detect_rename;
>> int diff_rename_limit;
>> int merge_rename_limit;
>> int rename_score;
>> @@ -28,6 +29,11 @@ struct merge_options {
>> struct hashmap current_file_dir_set;
>> struct string_list df_conflict_file_set;
>> };
>> +inline int merge_detect_rename(struct merge_options *o)
>> +{
>> + return o->merge_detect_rename >= 0 ? o->merge_detect_rename :
>> + o->diff_detect_rename >= 0 ? o->diff_detect_rename : 1;
>> +}
>
> Why did you split o->detect_rename into two fields? You then
> recombine them in merge_detect_rename(), and after initial setup only
> ever access them through that function. Having two fields worries me
> that people will accidentally introduce bugs by using one of them
> instead of the merge_detect_rename() function. Is there a reason you
> decided against having the initial setup just set a single value and
> then use it directly?
>
The setup of this value is split into 3 places that may or may not all
get called. The initial values, the values that come from the config
settings and then any values passed on the command line.
Because the merge value can now inherit from the diff value, you only
know the final value after you have received all possible inputs. That
makes it necessary to be a calculated value.
If you look at diff_rename_limit/merge_rename_limit, detect_rename
follows the same pattern for the same reasons. It turns out
detect_rename was a little more complex because it is used in 3
different locations (vs just one) which is why I wrapped the inheritance
logic into the helper function merge_detect_rename().
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2018-04-27 0:54 [PATCH v3 2/3] merge: Add merge.renames config setting Ben Peart
@ 2018-04-27 18:19 ` Elijah Newren
2018-04-30 13:11 ` Ben Peart
0 siblings, 1 reply; 1651+ messages in thread
From: Elijah Newren @ 2018-04-27 18:19 UTC (permalink / raw)
To: peartben
Cc: Ben.Peart, Johannes.Schindelin, eckhard.s.maass, git, gitster,
kewillf, pclouds, peff, vmiklos, Elijah Newren
From: Elijah Newren <newren@gmail.com>
On Thu, Apr 26, 2018 at 5:54 PM, Ben Peart <peartben@gmail.com> wrote:
> Can you write the documentation that clearly explains the exact behavior you
> want? That would kill two birds with one stone... :)
Sure, something like the following is what I envision, and I've tried to
include the suggestion from Junio to document the copy behavior in the
merge-recursive documentation.
-- 8< --
Subject: [PATCH] fixup! merge: Add merge.renames config setting
---
Documentation/merge-config.txt | 3 +--
Documentation/merge-strategies.txt | 5 +++--
merge-recursive.c | 8 ++++++++
3 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
index 59848e5634..662c2713ca 100644
--- a/Documentation/merge-config.txt
+++ b/Documentation/merge-config.txt
@@ -41,8 +41,7 @@ merge.renameLimit::
merge.renames::
Whether and how Git detects renames. If set to "false",
rename detection is disabled. If set to "true", basic rename
- detection is enabled. If set to "copies" or "copy", Git will
- detect copies, as well. Defaults to the value of diff.renames.
+ detection is enabled. Defaults to the value of diff.renames.
merge.renormalize::
Tell Git that canonical representation of files in the
diff --git a/Documentation/merge-strategies.txt b/Documentation/merge-strategies.txt
index 1e0728aa12..aa66cbe41e 100644
--- a/Documentation/merge-strategies.txt
+++ b/Documentation/merge-strategies.txt
@@ -23,8 +23,9 @@ recursive::
causing mismerges by tests done on actual merge commits
taken from Linux 2.6 kernel development history.
Additionally this can detect and handle merges involving
- renames. This is the default merge strategy when
- pulling or merging one branch.
+ renames, but currently cannot make use of detected
+ copies. This is the default merge strategy when pulling
+ or merging one branch.
+
The 'recursive' strategy can take the following options:
diff --git a/merge-recursive.c b/merge-recursive.c
index 6cc4404144..b618f134d2 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -564,6 +564,14 @@ static struct string_list *get_renames(struct merge_options *o,
opts.flags.recursive = 1;
opts.flags.rename_empty = 0;
opts.detect_rename = merge_detect_rename(o);
+ /*
+ * We do not have logic to handle the detection of copies. In
+ * fact, it may not even make sense to add such logic: would we
+ * really want a change to a base file to be propagated through
+ * multiple other files by a merge?
+ */
+ if (opts.detect_rename > DIFF_DETECT_RENAME)
+ opts.detect_rename = DIFF_DETECT_RENAME;
opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
o->diff_rename_limit >= 0 ? o->diff_rename_limit :
1000;
--
2.17.0.294.g8e3b83c0e3
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2018-04-27 18:19 ` Elijah Newren
@ 2018-04-30 13:11 ` Ben Peart
2018-04-30 16:12 ` Re: Elijah Newren
0 siblings, 1 reply; 1651+ messages in thread
From: Ben Peart @ 2018-04-30 13:11 UTC (permalink / raw)
To: Elijah Newren
Cc: Ben.Peart, Johannes.Schindelin, eckhard.s.maass, git, gitster,
kewillf, pclouds, peff, vmiklos, Elijah Newren
On 4/27/2018 2:19 PM, Elijah Newren wrote:
> From: Elijah Newren <newren@gmail.com>
>
> On Thu, Apr 26, 2018 at 5:54 PM, Ben Peart <peartben@gmail.com> wrote:
>
>> Can you write the documentation that clearly explains the exact behavior you
>> want? That would kill two birds with one stone... :)
>
> Sure, something like the following is what I envision, and I've tried to
> include the suggestion from Junio to document the copy behavior in the
> merge-recursive documentation.
>
> -- 8< --
> Subject: [PATCH] fixup! merge: Add merge.renames config setting
>
> ---
> Documentation/merge-config.txt | 3 +--
> Documentation/merge-strategies.txt | 5 +++--
> merge-recursive.c | 8 ++++++++
> 3 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/merge-config.txt b/Documentation/merge-config.txt
> index 59848e5634..662c2713ca 100644
> --- a/Documentation/merge-config.txt
> +++ b/Documentation/merge-config.txt
> @@ -41,8 +41,7 @@ merge.renameLimit::
> merge.renames::
> Whether and how Git detects renames. If set to "false",
> rename detection is disabled. If set to "true", basic rename
> - detection is enabled. If set to "copies" or "copy", Git will
> - detect copies, as well. Defaults to the value of diff.renames.
> + detection is enabled. Defaults to the value of diff.renames.
>
> merge.renormalize::
> Tell Git that canonical representation of files in the
> diff --git a/Documentation/merge-strategies.txt b/Documentation/merge-strategies.txt
> index 1e0728aa12..aa66cbe41e 100644
> --- a/Documentation/merge-strategies.txt
> +++ b/Documentation/merge-strategies.txt
> @@ -23,8 +23,9 @@ recursive::
> causing mismerges by tests done on actual merge commits
> taken from Linux 2.6 kernel development history.
> Additionally this can detect and handle merges involving
> - renames. This is the default merge strategy when
> - pulling or merging one branch.
> + renames, but currently cannot make use of detected
> + copies. This is the default merge strategy when pulling
> + or merging one branch.
> +
> The 'recursive' strategy can take the following options:
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 6cc4404144..b618f134d2 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -564,6 +564,14 @@ static struct string_list *get_renames(struct merge_options *o,
> opts.flags.recursive = 1;
> opts.flags.rename_empty = 0;
> opts.detect_rename = merge_detect_rename(o);
> + /*
> + * We do not have logic to handle the detection of copies. In
> + * fact, it may not even make sense to add such logic: would we
> + * really want a change to a base file to be propagated through
> + * multiple other files by a merge?
> + */
> + if (opts.detect_rename > DIFF_DETECT_RENAME)
> + opts.detect_rename = DIFF_DETECT_RENAME;
> opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
> o->diff_rename_limit >= 0 ? o->diff_rename_limit :
> 1000;
>
Thanks Elijah. I've applied this patch and reviewed and tested it. It
works and addresses the concerns around the settings inheritance from
diff.renames. I still _prefer_ the simpler model that doesn't do the
partial inheritance but I can use this model as well.
I'm unsure on the protocol here. Should I incorporate this patch and
submit a reroll or can it just be applied as is?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-04-30 13:11 ` Ben Peart
@ 2018-04-30 16:12 ` Elijah Newren
2018-05-02 14:33 ` Re: Ben Peart
0 siblings, 1 reply; 1651+ messages in thread
From: Elijah Newren @ 2018-04-30 16:12 UTC (permalink / raw)
To: Ben Peart
Cc: Elijah Newren, Ben Peart, Johannes Schindelin, Eckhard Maaß,
Git Mailing List, Junio C Hamano, Kevin Willford,
Nguyễn Thái Ngọc, Jeff King, Miklos Vajna
On Mon, Apr 30, 2018 at 6:11 AM, Ben Peart <peartben@gmail.com> wrote:
> On 4/27/2018 2:19 PM, Elijah Newren wrote:
>>
>> From: Elijah Newren <newren@gmail.com>
>>
>> On Thu, Apr 26, 2018 at 5:54 PM, Ben Peart <peartben@gmail.com> wrote:
>>
>>> Can you write the documentation that clearly explains the exact behavior
>>> you
>>> want? That would kill two birds with one stone... :)
>>
>>
>> Sure, something like the following is what I envision, and I've tried to
>> include the suggestion from Junio to document the copy behavior in the
>> merge-recursive documentation.
>>
<snip>
>
> Thanks Elijah. I've applied this patch and reviewed and tested it. It works
> and addresses the concerns around the settings inheritance from
> diff.renames. I still _prefer_ the simpler model that doesn't do the
> partial inheritance but I can use this model as well.
>
> I'm unsure on the protocol here. Should I incorporate this patch and submit
> a reroll or can it just be applied as is?
I suspect you'll want to re-roll anyway, to base your series on
en/rename-directory-detection-reboot instead of on master. (Junio
plans to merge it down to next, and your series has four different
merge conflicts with it.)
There are two other loose ends with this series that Junio will need
to weigh in on:
- I'm obviously a strong proponent of the inherited setting, but Junio
may change his mind after reading Dscho's arguments against it (or
after reading my arguments for it).
- I like the setting as-is, and think we could allow a "copy" setting
for merge.renames to specify that the post-merge diffstat should
detect copies (not part of your series, but a useful addition I'd like
to tackle afterwards). However, Junio had comments in
xmqqwox19ohw.fsf@gitster-ct.c.googlers.com about merge.renames
handling the scoring as well, like -Xfind-renames. Those sound
incompatible to me for a single setting, and I'm unsure if Junio would
resolve them the way I do or still feels strongly about the scoring.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-04-30 16:12 ` Re: Elijah Newren
@ 2018-05-02 14:33 ` Ben Peart
0 siblings, 0 replies; 1651+ messages in thread
From: Ben Peart @ 2018-05-02 14:33 UTC (permalink / raw)
To: Elijah Newren
Cc: Elijah Newren, Ben Peart, Johannes Schindelin, Eckhard Maaß,
Git Mailing List, Junio C Hamano, Kevin Willford,
Nguyễn Thái Ngọc, Jeff King, Miklos Vajna
On 4/30/2018 12:12 PM, Elijah Newren wrote:
> On Mon, Apr 30, 2018 at 6:11 AM, Ben Peart <peartben@gmail.com> wrote:
>> On 4/27/2018 2:19 PM, Elijah Newren wrote:
>>>
>>> From: Elijah Newren <newren@gmail.com>
>>>
>>> On Thu, Apr 26, 2018 at 5:54 PM, Ben Peart <peartben@gmail.com> wrote:
>>>
>>>> Can you write the documentation that clearly explains the exact behavior
>>>> you
>>>> want? That would kill two birds with one stone... :)
>>>
>>>
>>> Sure, something like the following is what I envision, and I've tried to
>>> include the suggestion from Junio to document the copy behavior in the
>>> merge-recursive documentation.
>>>
> <snip>
>>
>> Thanks Elijah. I've applied this patch and reviewed and tested it. It works
>> and addresses the concerns around the settings inheritance from
>> diff.renames. I still _prefer_ the simpler model that doesn't do the
>> partial inheritance but I can use this model as well.
>>
>> I'm unsure on the protocol here. Should I incorporate this patch and submit
>> a reroll or can it just be applied as is?
>
> I suspect you'll want to re-roll anyway, to base your series on
> en/rename-directory-detection-reboot instead of on master. (Junio
> plans to merge it down to next, and your series has four different
> merge conflicts with it.)
>
> There are two other loose ends with this series that Junio will need
> to weigh in on:
>
> - I'm obviously a strong proponent of the inherited setting, but Junio
> may change his mind after reading Dscho's arguments against it (or
> after reading my arguments for it).
>
> - I like the setting as-is, and think we could allow a "copy" setting
> for merge.renames to specify that the post-merge diffstat should
> detect copies (not part of your series, but a useful addition I'd like
> to tackle afterwards). However, Junio had comments in
> xmqqwox19ohw.fsf@gitster-ct.c.googlers.com about merge.renames
> handling the scoring as well, like -Xfind-renames. Those sound
> incompatible to me for a single setting, and I'm unsure if Junio would
> resolve them the way I do or still feels strongly about the scoring.
>
I think this patch series (including Elijah's fixup!) improves the
situation from where we were and it provides the necessary functionality
to solve the problem I started out to solve. While there are other
changes that could be made, I think they should be done in separate
follow up patches.
I'm happy to reroll this incorporating the fixup! so that we can make
progress. Junio, would you prefer I reroll this based on
en/rename-directory-detection-reboot or master?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2018-08-28 17:34 Bills, Jason M
2018-08-28 17:59 ` Brad Bishop
0 siblings, 1 reply; 1651+ messages in thread
From: Bills, Jason M @ 2018-08-28 17:34 UTC (permalink / raw)
To: openbmc@lists.ozlabs.org
I just added a comment to a github discussion about the IPMI SEL and thought I should share it here as well:
I have been working on a proof-of-concept to move the IPMI SEL entries out of D-Bus into journald instead.
Since journald allows custom metadata for log entries, I've thought of having the SEL message logged to the journal and using metadata to store the necessary IPMI info associated with the entry. Here is an example of logging a type 0x02 system event entry to journald:
sd_journal_send("MESSAGE=%s", message.c_str(),
"PRIORITY=%i", selPriority,
"MESSAGE_ID=%s", selMessageId,
"IPMI_SEL_RECORD_ID=%d", recordId,
"IPMI_SEL_RECORD_TYPE=%x", selSystemType,
"IPMI_SEL_GENERATOR_ID=%x", genId,
"IPMI_SEL_SENSOR_PATH=%s", path.c_str(),
"IPMI_SEL_EVENT_DIR=%x", assert,
"IPMI_SEL_DATA=%s", selDataStr,
NULL);
Using journald should allow for scaling to more SEL entries which should also enable us to support more generic IPMI behavior such as the Add SEL command.
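On the read side, a consumer would normally use libsystemd's sd-journal match API to pull these entries back out. Purely as a self-contained illustration of the selection rule (an entry is an IPMI SEL record iff it carries the SEL metadata key), not the actual journal API:

```c
#include <stddef.h>
#include <string.h>

/* Illustrative only: real code would use sd_journal_add_match() and
 * sd_journal_get_data() from libsystemd. An entry is modeled as a
 * NULL-terminated array of "KEY=value" strings, like journal fields. */
static const char *find_field(const char *const *entry, const char *key)
{
	size_t klen = strlen(key);
	for (size_t i = 0; entry[i]; i++)
		if (!strncmp(entry[i], key, klen) && entry[i][klen] == '=')
			return entry[i] + klen + 1;  /* value part */
	return NULL;
}

static int is_ipmi_sel(const char *const *entry)
{
	return find_field(entry, "IPMI_SEL_RECORD_ID") != NULL;
}
```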
-Jason
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-08-28 17:34 Bills, Jason M
@ 2018-08-28 17:59 ` Brad Bishop
2018-08-28 23:26 ` Bills, Jason M
0 siblings, 1 reply; 1651+ messages in thread
From: Brad Bishop @ 2018-08-28 17:59 UTC (permalink / raw)
To: Bills, Jason M; +Cc: openbmc@lists.ozlabs.org
> On Aug 28, 2018, at 1:34 PM, Bills, Jason M <jason.m.bills@intel.com> wrote:
>
> I just added a comment to a github discussion about the IPMI SEL and thought I should share it here as well:
Can you send a link to the issue?
>
> I have been working on a proof-of-concept to move the IPMI SEL entries out of D-Bus into journald instead.
>
> Since journald allows custom metadata for log entries, I've thought of having the SEL message logged to the journal and using metadata to store the necessary IPMI info associated with the entry. Here is an example of logging a type 0x02 system event entry to journald:
>
> sd_journal_send("MESSAGE=%s", message.c_str(),
> "PRIORITY=%i", selPriority,
> "MESSAGE_ID=%s", selMessageId,
> "IPMI_SEL_RECORD_ID=%d", recordId,
> "IPMI_SEL_RECORD_TYPE=%x", selSystemType,
> "IPMI_SEL_GENERATOR_ID=%x", genId,
> "IPMI_SEL_SENSOR_PATH=%s", path.c_str(),
> "IPMI_SEL_EVENT_DIR=%x", assert,
> "IPMI_SEL_DATA=%s", selDataStr,
> NULL);
> Using journald should allow for scaling to more SEL entries which should also enable us to support more generic IPMI behavior such as the Add SEL command.
A design point of OpenBMC from day one was to not design it around IPMI.
At a glance this feels counter to that goal.
I’m not immediately opposed to moving our error logs out of DBus, but can you provide
an extendible abstraction? Not everyone uses SEL, or IPMI even. At a minimum please
drop the letters ‘ipmi’ and ‘sel’ :-) from the base design, and save those for something
that translates to IPMI-speak.
As some background, our systems tend towards fewer ‘error logs’ with much more data per
log (4-16k), and yes I admit the current design is biased towards that and does not scale
when we approach 1000s of small SEL entries.
thx - brad
>
> -Jason
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
2018-08-28 17:59 ` Brad Bishop
@ 2018-08-28 23:26 ` Bills, Jason M
2018-09-04 20:46 ` Brad Bishop
0 siblings, 1 reply; 1651+ messages in thread
From: Bills, Jason M @ 2018-08-28 23:26 UTC (permalink / raw)
To: Brad Bishop; +Cc: openbmc@lists.ozlabs.org
Here is a link to the issue: https://github.com/openbmc/openbmc/issues/3283#issuecomment-414361325.
The main things that started this proof-of-concept are that we have requirements to be fully IPMI compliant and to support 4000+ SEL entries. Our attempts to scale the D-Bus logs to that level were not successful, so we started considering directly accessing journald as an alternative.
So far, I've been focused only on IPMI SEL, so I hadn't considered extending the change to non-IPMI error logs; however, these IPMI SEL entries should still fit in well as a subset of all other error logs which could also be moved to the journal.
My goal is to align with the OpenBMC design and keep anything IPMI-related isolated only to things that care about IPMI. My thinking was that the metadata is a bit like background info, so it is a good place to hide data that only matters to the minority, such as the IPMI-specific data. With this, the IPMI SEL logs can be included among all the existing error logs but still have the metadata for additional IPMI stuff that doesn't matter for anyone else.
So, for writing logs:
A. non-IPMI error logs can be written as normal
B. IPMI SEL entries are written with the IPMI-specific metadata populated
For reading logs:
A. non-IPMI readers see IPMI SEL entries as normal text logs
B. IPMI readers dump just the IPMI SEL entries and get the associated IPMI-specific info from the metadata
Thanks,
-Jason
-----Original Message-----
From: Brad Bishop <bradleyb@fuzziesquirrel.com>
Sent: Tuesday, August 28, 2018 10:59 AM
To: Bills, Jason M <jason.m.bills@intel.com>
Cc: openbmc@lists.ozlabs.org
Subject: Re:
> On Aug 28, 2018, at 1:34 PM, Bills, Jason M <jason.m.bills@intel.com> wrote:
>
> I just added a comment to a github discussion about the IPMI SEL and thought I should share it here as well:
Can you send a link to the issue?
>
> I have been working on a proof-of-concept to move the IPMI SEL entries out of D-Bus into journald instead.
>
> Since journald allows custom metadata for log entries, I've thought of having the SEL message logged to the journal and using metadata to store the necessary IPMI info associated with the entry. Here is an example of logging a type 0x02 system event entry to journald:
>
> sd_journal_send("MESSAGE=%s", message.c_str(),
> "PRIORITY=%i", selPriority,
> "MESSAGE_ID=%s", selMessageId,
> "IPMI_SEL_RECORD_ID=%d", recordId,
> "IPMI_SEL_RECORD_TYPE=%x", selSystemType,
> "IPMI_SEL_GENERATOR_ID=%x", genId,
> "IPMI_SEL_SENSOR_PATH=%s", path.c_str(),
> "IPMI_SEL_EVENT_DIR=%x", assert,
> "IPMI_SEL_DATA=%s", selDataStr,
> NULL);
> Using journald should allow for scaling to more SEL entries which should also enable us to support more generic IPMI behavior such as the Add SEL command.
A design point of OpenBMC from day one was to not design it around IPMI.
At a glance this feels counter to that goal.
I’m not immediately opposed to moving our error logs out of DBus, but can you provide an extendible abstraction? Not everyone uses SEL, or IPMI even. At a minimum please drop the letters ‘ipmi’ and ‘sel’ :-) from the base design, and save those for something that translates to IPMI-speak.
As some background, our systems tend towards fewer ‘error logs’ with much more data per log (4-16k), and yes I admit the current design is biased towards that and does not scale when we approach 1000s of small SEL entries.
thx - brad
>
> -Jason
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-08-28 23:26 ` Bills, Jason M
@ 2018-09-04 20:46 ` Brad Bishop
2018-09-04 21:28 ` Re: Ed Tanous
0 siblings, 1 reply; 1651+ messages in thread
From: Brad Bishop @ 2018-09-04 20:46 UTC (permalink / raw)
To: Bills, Jason M, Deepak Kodihalli; +Cc: openbmc@lists.ozlabs.org
> On Aug 28, 2018, at 7:26 PM, Bills, Jason M <jason.m.bills@intel.com> wrote:
>
> Here is a link to the issue: https://github.com/openbmc/openbmc/issues/3283#issuecomment-414361325.
>
> The main things that started this proof-of-concept are that we have requirements to be fully IPMI compliant and to support 4000+ SEL entries. Our attempts to scale the D-Bus logs to that level were not successful, so we started considering directly accessing journald as an alternative.
>
> So far, I've been focused only on IPMI SEL, so I hadn't considered extending the change to non-IPMI error logs; however, these IPMI SEL entries should still fit in well as a subset of all other error logs which could also be moved to the journal.
>
> My goal is to align with the OpenBMC design and keep anything IPMI-related isolated only to things that care about IPMI.
But it seems like you are proposing that every application that wants to make
a log needs to have the logic to translate its internal data model to IPMI speak,
so it can make a journal call with all the IPMI metadata populated. Am I
understanding correctly? That doesn’t seem aligned with keeping IPMI isolated.
A concrete example - phosphor-hwmon. How do you intend to figure out something
like IPMI_SEL_SENSOR_PATH in the phosphor-hwmon application? Actually it would
help quite a bit to understand how each of the fields in your sample below would
be determined by an arbitrary dbus application (like phosphor-hwmon).
Further, if you expand this approach to further log formats other than SEL,
won’t the applications become a mess of translation logic from the applications
data mode <-> log format in use?
> My thinking was that the metadata is a bit like background info, so it is a good place to hide data that only matters to the minority, such as the IPMI-specific data. With this, the IPMI SEL logs can be included among all the existing error logs but still have the metadata for additional IPMI stuff that doesn't matter for anyone else.
>
> So, for writing logs:
> A. non-IPMI error logs can be written as normal
> B. IPMI SEL entries are written with the IPMI-specific metadata populated
>
> For reading logs:
> A. non-IPMI readers see IPMI SEL entries as normal text logs
> B. IPMI readers dump just the IPMI SEL entries and get the associated IPMI-specific info from the metadata
I’d rather have a single approach that works for everyone; although, I’m
not sure how that would look.
>
> Thanks,
> -Jason
This is called top posting, please try to avoid when using the mail-list.
It makes threaded conversation hard to follow and respond to. thx.
>
> -----Original Message-----
> From: Brad Bishop <bradleyb@fuzziesquirrel.com>
> Sent: Tuesday, August 28, 2018 10:59 AM
> To: Bills, Jason M <jason.m.bills@intel.com>
> Cc: openbmc@lists.ozlabs.org
> Subject: Re:
>
>
>
>> On Aug 28, 2018, at 1:34 PM, Bills, Jason M <jason.m.bills@intel.com> wrote:
>>
>> I just added a comment to a github discussion about the IPMI SEL and thought I should share it here as well:
>
> Can you send a link to the issue?
>
>>
>> I have been working on a proof-of-concept to move the IPMI SEL entries out of D-Bus into journald instead.
>>
>> Since journald allows custom metadata for log entries, I've thought of having the SEL message logged to the journal and using metadata to store the necessary IPMI info associated with the entry. Here is an example of logging a type 0x02 system event entry to journald:
>>
>> sd_journal_send("MESSAGE=%s", message.c_str(),
>> "PRIORITY=%i", selPriority,
>> "MESSAGE_ID=%s", selMessageId,
>> "IPMI_SEL_RECORD_ID=%d", recordId,
>> "IPMI_SEL_RECORD_TYPE=%x", selSystemType,
>> "IPMI_SEL_GENERATOR_ID=%x", genId,
>> "IPMI_SEL_SENSOR_PATH=%s", path.c_str(),
>> "IPMI_SEL_EVENT_DIR=%x", assert,
>> "IPMI_SEL_DATA=%s", selDataStr,
>> NULL);
>> Using journald should allow for scaling to more SEL entries which should also enable us to support more generic IPMI behavior such as the Add SEL command.
>
> A design point of OpenBMC from day one was to not design it around IPMI.
> At a glance this feels counter to that goal.
>
> I’m not immediately opposed to moving our error logs out of DBus, but can you provide an extendible abstraction? Not everyone uses SEL, or IPMI even. At a minimum please drop the letters ‘ipmi’ and ‘sel’ :-) from the base design, and save those for something that translates to IPMI-speak.
>
> As some background, our systems tend towards fewer ‘error logs’ with much more data per log (4-16k), and yes I admit the current design is biased towards that and does not scale when we approach 1000s of small SEL entries.
>
> thx - brad
>
>>
>> -Jason
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re: Re:
2018-09-04 20:46 ` Brad Bishop
@ 2018-09-04 21:28 ` Ed Tanous
2018-09-04 22:34 ` Re: Brad Bishop
0 siblings, 1 reply; 1651+ messages in thread
From: Ed Tanous @ 2018-09-04 21:28 UTC (permalink / raw)
To: openbmc
On 09/04/2018 01:46 PM, Brad Bishop wrote:
>
> But it seems like you are proposing that every application that wants to make
> a log needs to have the logic to translate its internal data model to IPMI speak,
> so it can make a journal call with all the IPMI metadata populated. Am I
> understanding correctly? That doesn’t seem aligned with keeping IPMI isolated.
>
I think a key here is that not all logs will be implicitly converted to
IPMI logs. Having them be identical was the design that we started
with, and abandoned because IPMI has some requirements that don't
cleanly map from a standard syslog/text style to IPMI.
> A concrete example - phosphor-hwmon. How do you intend to figure out something
> like IPMI_SEL_SENSOR_PATH in the phosphor-hwmon application? Actually it would
> help quite a bit to understand how each of the fields in your sample below would
> be determined by an arbitrary dbus application (like phosphor-hwmon).
I'm not really understanding the root of the question. If
phosphor-hwmon is generating a threshold crossing log that stemmed from
the /xyz/openbmc_project/sensors/my_super_awesome_temp_sensor, then it
would simply fill that path into the IPMI_SEL_SENSOR_PATH field. This
is the same kind of mapping that the associations produce today, but
captured in journald instead of the mapper.
Our thinking was that we could build either a static library, or a dbus
daemon to simplify producing IPMI logs. Because of the IPMI
requirements around unique record ids, right now we're leaning toward
the dbus interface with a single daemon responsible for IPMI SEL creation.
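A daemon owning SEL creation also centralizes the record-ID allocation: per the IPMI spec, record IDs are 16-bit, with 0000h and FFFFh reserved as the "first"/"last" entry designators used by Get SEL Entry. A hypothetical allocator sketch (not from any OpenBMC code):

```c
#include <stdint.h>

/* Hypothetical sketch of the "single daemon owns record IDs" idea.
 * IPMI SEL record IDs are 16-bit; 0x0000 and 0xFFFF are reserved as
 * first/last designators, so usable IDs wrap within 0x0001-0xFFFE. */
static uint16_t next_sel_record_id(uint16_t last)
{
	uint16_t next = (uint16_t)(last + 1);
	if (next == 0x0000 || next == 0xFFFF)
		next = 0x0001;  /* skip the reserved designators */
	return next;
}
```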
While technically it could be a part of phosphor-logging, we really want
it to be easily removable for future platforms that have no need for
IPMI, so the thought at this time is to keep it separate.
>
> Further, if you expand this approach to further log formats other than SEL,
> won’t the applications become a mess of translation logic from the applications
> data mode <-> log format in use?
>
I'm not really following this question. Are there other binary log
formats that we expect to come in the future that aren't text based, and
could just be a journald translation? So far as I know, IPMI SEL is the
only one on my road map that has weird requirements, and needs some
translation. I don't expect it to be a mess, and I'm running under the
assumption that _most_ daemons won't care about or support IPMI given
its limitations.
You're right, this isn't intended to be a general solution for all
binary logging formats, it's intended to be a short term hack while the
industry transitions away from IPMI and toward something easier to
generate arbitrarily.
>
> I’d rather have a single approach that works for everyone; although, I’m
> not sure how that would look.
>
The single approach is where we started, and weren't able to come up
with anything that even came close to working in a production sense. If
you have ideas here on how this could be built that are cleaner than
what we're proposing, we're very much interested.
>
> This is called top posting, please try to avoid when using the mail-list.
> It makes threaded conversation hard to follow and respond to. thx.
>
(Ed beats Jason with very big stick)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-09-04 21:28 ` Re: Ed Tanous
@ 2018-09-04 22:34 ` Brad Bishop
2018-09-04 23:18 ` Re: Ed Tanous
0 siblings, 1 reply; 1651+ messages in thread
From: Brad Bishop @ 2018-09-04 22:34 UTC (permalink / raw)
To: Ed Tanous; +Cc: OpenBMC Maillist, Deepak Kodihalli
> On Sep 4, 2018, at 5:28 PM, Ed Tanous <ed.tanous@intel.com> wrote:
>
> On 09/04/2018 01:46 PM, Brad Bishop wrote:
>> But it seems like you are proposing that every application that wants to make
>> a log needs to have the logic to translate its internal data model to IPMI speak,
>> so it can make a journal call with all the IPMI metadata populated. Am I
>> understanding correctly? That doesn’t seem aligned with keeping IPMI isolated.
>
> I think a key here is that not all logs will be implicitly converted to IPMI logs. Having them be identical was the design that we started with, and abandoned because IPMI has some requirements that don't cleanly map from a standard syslog/text style to IPMI.
>
>
>> A concrete example - phosphor-hwmon. How do you intend to figure out something
>> like IPMI_SEL_SENSOR_PATH in the phosphor-hwmon application? Actually it would
>> help quite a bit to understand how each of the fields in your sample below would
>> be determined by an arbitrary dbus application (like phosphor-hwmon).
>
> I'm not really understanding the root of the question. If phosphor-hwmon is generating a threshold crossing log that stemmed from the /xyz/openbmc_project/sensors/my_super_awesome_temp_sensor, then it would simply fill that path into the IPMI_SEL_SENSOR_PATH field.
ok, then this is my ignorance of IPMI showing. I thought IPMI_SEL_SENSOR_PATH was
an IPMI construct...
If this is the case then why not just call it SENSOR_PATH? Then other logging formats
could use that metadata key without it being weird that it has ‘ipmi_sel’ in the name
of it. And can we apply the same logic to the other keys or do some of the other keys
have more complicated translation logic (than none at all as in the case of the sensor
path) ?
> This is the same kind of mapping that the associations produce today, but captured in journald instead of the mapper.
>
> Our thinking was that we could build either a static library, or a dbus daemon to simplify producing IPMI logs. Because of the IPMI requirements around unique record ids, right now we're leaning toward the dbus interface with a single daemon responsible for IPMI SEL creation.
Thats great! This is similar to how the phosphor-logging daemon creates dbus error
objects today.
Would you mind elaborating on this daemon and its dbus API? I’m guessing it would probably
clear up any concerns I have.
> While technically it could be a part of phosphor-logging,
That isn’t what I was going for. If you plan to implement a (separate) daemon that acts on
the journald metadata I think that is right approach too.
> we really want it to be easily removable for future platforms that have no need for IPMI, so the thought at this time it to keep it separate.
Agreed.
>
>> Further, if you expand this approach to further log formats other than SEL,
>> won’t the applications become a mess of translation logic from the applications
>> data mode <-> log format in use?
>
> I'm not really following this question. Are there other binary log formats that we expect to come in the future that aren't text based, and could just be a journald translation?
Yes. We have a binary format called PEL. I doubt anyone would be interested in using
it but we need a foundation in OpenBMC that enables us to use it...
> So far as I know, IPMI SEL is the only one on my road map that has weird requirements, and needs some translation.
Where is the translation happening? In the new ipmi-sel daemon? Or somewhere else?
> I don't expect it to be a mess, and I'm running under the assumption that _most_ daemons won't care about or support IPMI given its limitations.
Well _all_ daemons already support IPMI SEL today. The problem is just that the
implementation doesn’t scale. I’m confused by the suggestion that _most_
daemons wouldn’t support IPMI?
> You're right, this isn't intended to be a general solution for all binary logging formats, it's intended to be a short term hack while the industry transitions away from IPMI and toward something easier to generate arbitrarily.
>
>> I’d rather have a single approach that works for everyone; although, I’m
>> not sure how that would look.
> The single approach is where we started, and weren't able to come up with anything that even came close to working in a production sense. If you have ideas here on how this could be built that are cleaner than what we're proposing, we're very much interested.
I’m still trying to understand what is being proposed.
>
>> This is called top posting, please try to avoid when using the mail-list.
>> It makes threaded conversation hard to follow and respond to. thx.
>
> (Ed beats Jason with very big stick)
* Re: Re:
2018-09-04 22:34 ` Re: Brad Bishop
@ 2018-09-04 23:18 ` Ed Tanous
2018-09-04 23:42 ` Re: Brad Bishop
0 siblings, 1 reply; 1651+ messages in thread
From: Ed Tanous @ 2018-09-04 23:18 UTC (permalink / raw)
To: Brad Bishop; +Cc: OpenBMC Maillist, Deepak Kodihalli
On 09/04/2018 03:34 PM, Brad Bishop wrote:
>
> ok, then this is my ignorance of IPMI showing. I thought IPMI_SEL_SENSOR_PATH was
> an IPMI construct...
>
> If this is the case then why not just call it SENSOR_PATH? Then other logging formats
> could use that metadata key without it being weird that it has ‘ipmi_sel’ in the name
> of it. And can we apply the same logic to the other keys or do some of the other keys
> have more complicated translation logic (than none at all as in the case of the sensor
> path) ?
The thinking was that we would namespace all the parameters using
IPMI_SEL to make it clear that was the only place they were used, and to
avoid someone else using it inadvertently. With that said, I could
understand how it could be confusing. Jason, any objections to
un-namespacing the parameters?
>
> That's great! This is similar to how the phosphor-logging daemon creates dbus error
> objects today.
>
> Would you mind elaborating on this daemon and its dbus API? I’m guessing it would probably
> clear up any concerns I have.
>
Patches to phosphor-dbus-interfaces for a suggested interface are being
put together as we speak. Hopefully that will clarify it a little bit.
>> While technically it could be a part of phosphor-logging,
>
> That isn’t what I was going for. If you plan to implement a (separate) daemon that acts on
> the journald metadata I think that is the right approach too.
>
Agreed.
>> we really want it to be easily removable for future platforms that have no need for IPMI, so the thought at this time is to keep it separate.
>
> Agreed.
>
>>
>>> Further, if you expand this approach to further log formats other than SEL,
>>> won’t the applications become a mess of translation logic from the application's
>>> data model <-> log format in use?
>>
>> I'm not really following this question. Are there other binary log formats that we expect to come in the future that aren't text based, and could just be a journald translation?
>
> Yes. We have a binary format called PEL. I doubt anyone would be interested in using
> it but we need a foundation in OpenBMC that enables us to use it...
>
That makes more sense now. A quick google on PEL makes it look like it
could follow a similar model to what we're doing with IPMI by adding the
extra metadata to journald when needed, while still producing the string
versions for the basic cases. By foundation do you mean shared code? A
quick skim of the implementation makes me suspect that there isn't going
to be a lot of shared code, although they could share a similar design
with a different implementation.
>> So far as I know, IPMI SEL is the only one on my road map that has weird requirements, and needs some translation.
>
> Where is the translation happening? In the new ipmi-sel daemon? Or somewhere else?
The translation would happen on the "addSel" IPMI command that gets used
in-band by most BIOS implementations. The ipmi-sel daemon will
translate the raw bytes to a string, to be used in most modern loggers,
along with the IPMI metadata, to be used in IPMI to source the various
"get" SEL entry commands.
>
>> I don't expect it to be a mess, and I'm running under the assumption that _most_ daemons won't care about or support IPMI given its limitations.
>
> Well _all_ daemons already support IPMI SEL today. The problem is just that the
> implementation doesn’t scale. I’m confused by the suggestion that _most_ daemons
> wouldn’t support IPMI.
>
I should've clarified that most daemons won't care about IPMI _SEL_,
given the extra calls and metadata that need to be provided to
implement it correctly. My team's intention was to support the minimum
subset of SEL that we can for backward compatibility, while providing
the advanced logging (journald/syslog/redfish LogService) for a greater
level of detail and capability.
If this assumption turns out not to be true and we end up adding IPMI
SEL logging to all the daemons, so be it; I think it will still scale,
but I really hope that's not the path we go down.
>> You're right, this isn't intended to be a general solution for all binary logging formats, it's intended to be a short term hack while the industry transitions away from IPMI and toward something easier to generate arbitrarily.
>>
>>> I’d rather have a single approach that works for everyone; although, I’m
>>> not sure how that would look.
>> The single approach is where we started, and weren't able to come up with anything that even came close to working in a production sense. If you have ideas here on how this could be built that are cleaner than what we're proposing, we're very much interested.
>
> I’m still trying to understand what is being proposed.
>
>>
>>> This is called top posting, please try to avoid when using the mail-list.
>>> It makes threaded conversation hard to follow and respond to. thx.
>>
>> (Ed beats Jason with very big stick)
* Re:
2018-09-04 23:18 ` Re: Ed Tanous
@ 2018-09-04 23:42 ` Brad Bishop
2018-09-05 21:20 ` Re: Bills, Jason M
0 siblings, 1 reply; 1651+ messages in thread
From: Brad Bishop @ 2018-09-04 23:42 UTC (permalink / raw)
To: Ed Tanous; +Cc: OpenBMC Maillist, Deepak Kodihalli
> On Sep 4, 2018, at 7:18 PM, Ed Tanous <ed.tanous@intel.com> wrote:
>
> On 09/04/2018 03:34 PM, Brad Bishop wrote:
>> ok, then this is my ignorance of IPMI showing. I thought IPMI_SEL_SENSOR_PATH was
>> an IPMI construct...
>> If this is the case then why not just call it SENSOR_PATH? Then other logging formats
>> could use that metadata key without it being weird that it has ‘ipmi_sel’ in the name
>> of it. And can we apply the same logic to the other keys or do some of the other keys
>> have more complicated translation logic (than none at all as in the case of the sensor
>> path) ?
>
> The thinking was that we would namespace all the parameters using IPMI_SEL to make it clear that was the only place they were used, and to avoid someone else using it inadvertently.
> With that said, I could understand how it could be confusing. Jason, any objections to un-namespacing the parameters?
Thanks for being flexible on this, but let's wait until we are on the same page before
changing anything. Why would you want to discourage it from being used in another
context?
>
>> That's great! This is similar to how the phosphor-logging daemon creates dbus error
>> objects today.
>> Would you mind elaborating on this daemon and its dbus API? I’m guessing it would probably
>> clear up any concerns I have.
>
> Patches to phosphor-dbus-interfaces for a suggested interface are being put together as we speak. Hopefully that will clarify it a little bit.
Great, thank you.
>
>>> While technically it could be a part of phosphor-logging,
>> That isn’t what I was going for. If you plan to implement a (separate) daemon that acts on
>> the journald metadata I think that is the right approach too.
> Agreed.
>
>>> we really want it to be easily removable for future platforms that have no need for IPMI, so the thought at this time is to keep it separate.
>> Agreed.
>>>
>>>> Further, if you expand this approach to further log formats other than SEL,
>>>> won’t the applications become a mess of translation logic from the application's
>>>> data model <-> log format in use?
>>>
>>> I'm not really following this question. Are there other binary log formats that we expect to come in the future that aren't text based, and could just be a journald translation?
>> Yes. We have a binary format called PEL. I doubt anyone would be interested in using
>> it but we need a foundation in OpenBMC that enables us to use it...
>
> That makes more sense now. A quick google on PEL makes it look like it could follow a similar model to what we're doing with IPMI by adding the extra metadata to journald when needed, while still producing the string versions for the basic cases. By foundation do you mean shared code? A quick skim of the implementation makes me suspect that there isn't going to be a lot of shared code, although they could share a similar design with a different implementation.
By foundation I simply mean we need a way to support multiple logging formats that doesn’t
require every OpenBMC application to know how to translate from its internal data model
(usually dbus) to N logging formats.
>
>>> So far as I know, IPMI SEL is the only one on my road map that has weird requirements, and needs some translation.
>> Where is the translation happening? In the new ipmi-sel daemon? Or somewhere else?
>
> The translation would happen on the "addSel" IPMI command that gets used in-band by most BIOS implementations. The ipmi-sel daemon will translate the raw bytes to a string, to be used in most modern loggers, along with the IPMI metadata, to be used in IPMI to source the various "get" SEL entry commands.
That all sounds fine. But what about applications on the BMC creating SELs for their
own errors? Do you want to do that? How will that work?
>
>>> I don't expect it to be a mess, and I'm running under the assumption that _most_ daemons won't care about or support IPMI given its limitations.
>> Well _all_ daemons already support IPMI SEL today. The problem is just that the
>> implementation doesn’t scale. I’m confused by the suggestion that _most_ daemons
>> wouldn’t support IPMI.
>
> I should've clarified that most daemons won't care about IPMI _SEL_, given the extra calls and metadata that need to be provided to implement it correctly. My team's intention was to support the minimum subset of SEL that we can for backward compatibility, while providing the advanced logging (journald/syslog/redfish LogService) for a greater level of detail and capability.
> If this assumption turns out to not be true, and we end up adding IPMI SEL logging to all the daemons, so be it, I think it will still scale, but I really hope that's not the path we go down.
Oh. Does this mean you intend for code like Jason originally proposed to _only_ appear in
the ipmi-sel daemon? And not in applications like phosphor-hwmon?
* Re:
2018-09-04 23:42 ` Re: Brad Bishop
@ 2018-09-05 21:20 ` Bills, Jason M
0 siblings, 0 replies; 1651+ messages in thread
From: Bills, Jason M @ 2018-09-05 21:20 UTC (permalink / raw)
To: openbmc
>>> ok, then this is my ignorance of IPMI showing. I thought IPMI_SEL_SENSOR_PATH was
>>> an IPMI construct...
>>> If this is the case then why not just call it SENSOR_PATH? Then other logging formats
>>> could use that metadata key without it being weird that it has ‘ipmi_sel’ in the name
>>> of it. And can we apply the same logic to the other keys or do some of the other keys
>>> have more complicated translation logic (than none at all as in the case of the sensor
>>> path) ?
>>
>> The thinking was that we would namespace all the parameters using IPMI_SEL to make it clear that was the only place they were used, and to avoid someone else using it inadvertently.
>> With that said, I could understand how it could be confusing. Jason, any objections to un-namespacing the parameters?
>
> Thanks for being flexible on this, but let's wait until we are on the same page before
> changing anything. Why would you want to discourage it from being used in another
> context?
>
Except for the sensor path, all of the proposed IPMI metadata is
specific to IPMI:
"IPMI_SEL_RECORD_ID" = Two byte unique ID number for each SEL entry
"IPMI_SEL_RECORD_TYPE" = The type of SEL entry (system or OEM) which
determines the definition of the remaining bytes
"IPMI_SEL_GENERATOR_ID" = The IPMI Generator ID (usually the IPMB Slave
Address) of the SEL entry requester
"IPMI_SEL_SENSOR_PATH" = Path of the sensor used to find IPMI data (such
as sensor number) for the sensor
"IPMI_SEL_EVENT_DIR" = Whether the sensor is asserting or de-asserting
"IPMI_SEL_DATA" = Raw binary data included in the SEL entry
I named them all with the IPMI_SEL prefix so they would be clearly
separate as a group and easy to remove in the future. However, if any of the
metadata would be useful elsewhere, the names can be made more generic.
>>
>>> That's great! This is similar to how the phosphor-logging daemon creates dbus error
>>> objects today.
>>> Would you mind elaborating on this daemon and its dbus API? I’m guessing it would probably
>>> clear up any concerns I have.
>>
>> Patches to phosphor-dbus-interfaces for a suggested interface are being put together as we speak. Hopefully that will clarify it a little bit.
>
> Great, thank you.
>
I have pushed a suggestion for the interface here:
https://gerrit.openbmc-project.xyz/#/c/openbmc/phosphor-dbus-interfaces/+/12494/
>>
>>>> While technically it could be a part of phosphor-logging,
>>> That isn’t what I was going for. If you plan to implement a (separate) daemon that acts on
>>> the journald metadata I think that is the right approach too.
>> Agreed.
>>
>>>> we really want it to be easily removable for future platforms that have no need for IPMI, so the thought at this time is to keep it separate.
>>> Agreed.
>>>>
>>>>> Further, if you expand this approach to further log formats other than SEL,
>>>>> won’t the applications become a mess of translation logic from the application's
>>>>> data model <-> log format in use?
>>>>
>>>> I'm not really following this question. Are there other binary log formats that we expect to come in the future that aren't text based, and could just be a journald translation?
>>> Yes. We have a binary format called PEL. I doubt anyone would be interested in using
>>> it but we need a foundation in OpenBMC that enables us to use it...
>>
>> That makes more sense now. A quick google on PEL makes it look like it could follow a similar model to what we're doing with IPMI by adding the extra metadata to journald when needed, while still producing the string versions for the basic cases. By foundation do you mean shared code? A quick skim of the implementation makes me suspect that there isn't going to be a lot of shared code, although they could share a similar design with a different implementation.
>
> By foundation I simply mean we need a way to support multiple logging formats that doesn’t
> require every OpenBMC application to know how to translate from its internal data model
> (usually dbus) to N logging formats.
>
>>
>>>> So far as I know, IPMI SEL is the only one on my road map that has weird requirements, and needs some translation.
>>> Where is the translation happening? In the new ipmi-sel daemon? Or somewhere else?
>>
>> The translation would happen on the "addSel" IPMI command that gets used in-band by most BIOS implementations. The ipmi-sel daemon will translate the raw bytes to a string, to be used in most modern loggers, along with the IPMI metadata, to be used in IPMI to source the various "get" SEL entry commands.
>
> That all sounds fine. But what about applications on the BMC creating SELs for their
> own errors? Do you want to do that? How will that work?
>
An application on the BMC that needs to create a SEL can call the
IpmiSelAdd method to request a new SEL entry in the journal.
>>
>>>> I don't expect it to be a mess, and I'm running under the assumption that _most_ daemons won't care about or support IPMI given its limitations.
>>> Well _all_ daemons already support IPMI SEL today. The problem is just that the
>>> implementation doesn’t scale. I’m confused by the suggestion that _most_ daemons
>>> wouldn’t support IPMI.
>>
>> I should've clarified that most daemons won't care about IPMI _SEL_, given the extra calls and metadata that need to be provided to implement it correctly. My team's intention was to support the minimum subset of SEL that we can for backward compatibility, while providing the advanced logging (journald/syslog/redfish LogService) for a greater level of detail and capability.
>> If this assumption turns out to not be true, and we end up adding IPMI SEL logging to all the daemons, so be it, I think it will still scale, but I really hope that's not the path we go down.
>
> Oh. Does this mean you intend for code like Jason originally proposed to _only_ appear in
> the ipmi-sel daemon? And not in applications like phosphor-hwmon?
>
Yes, the original proposed code:
sd_journal_send("MESSAGE=%s", message.c_str(),
"PRIORITY=%i", selPriority,
"MESSAGE_ID=%s", selMessageId,
"IPMI_SEL_RECORD_ID=%d", recordId,
"IPMI_SEL_RECORD_TYPE=%x", selSystemType,
"IPMI_SEL_GENERATOR_ID=%x", genId,
"IPMI_SEL_SENSOR_PATH=%s", path.c_str(),
"IPMI_SEL_EVENT_DIR=%x", assert,
"IPMI_SEL_DATA=%s", selDataStr,
NULL);
will only be in the ipmi-sel daemon. Applications like phosphor-hwmon
would use the IpmiSelAdd method to request SEL entries.
* (no subject)
@ 2018-10-08 13:33 Netravnen
2018-10-08 13:34 ` Inderpreet Saini
0 siblings, 1 reply; 1651+ messages in thread
From: Netravnen @ 2018-10-08 13:33 UTC (permalink / raw)
To: git
unsubscribe git
* (unknown),
@ 2018-10-21 16:25 Michael Tirado
2018-10-22 0:26 ` Dave Airlie
0 siblings, 1 reply; 1651+ messages in thread
From: Michael Tirado @ 2018-10-21 16:25 UTC (permalink / raw)
To: Airlied, dri-devel, LKML, kraxel, alexander.deucher,
christian.koenig, David1.zhou, Hongbo.He
Cc: seanpaul, Gustavo, maarten.lankhorst
[-- Attachment #1: Type: text/plain, Size: 5516 bytes --]
Mapping a drm "dumb" buffer fails on 32-bit system (i686) from what
appears to be a truncated memory address that has been copied
throughout several files. The bug manifests as an -EINVAL when calling
mmap with the offset gathered from DRM_IOCTL_MODE_MAP_DUMB <--
DRM_IOCTL_MODE_ADDFB <-- DRM_IOCTL_MODE_CREATE_DUMB. I can provide
test code if needed.
The following patch will apply to 4.18, though I've only been able to
test it with the qemu bochs driver and nouveau. The Intel driver worked
without any issues. I'm not sure if everyone is going to want to
share a constant, and the whitespace is mangled by gmail's awful
javascript client, so let me know if I should resend this with any
specific changes. I have also attached the file with preserved
whitespace.
--- linux-4.13.8/drivers/gpu/drm/bochs/bochs.h 2017-10-18
07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/bochs/bochs.h 2017-10-20
14:34:50.308633773 +0000
@@ -115,8 +115,6 @@
return container_of(gem, struct bochs_bo, gem);
}
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
static inline u64 bochs_bo_mmap_offset(struct bochs_bo *bo)
{
return drm_vma_node_offset_addr(&bo->bo.vma_node);
--- linux-4.13.8/drivers/gpu/drm/nouveau/nouveau_drv.h 2017-10-18
07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/nouveau/nouveau_drv.h
2017-10-20 14:34:51.581633751 +0000
@@ -57,8 +57,6 @@
struct nouveau_channel;
struct platform_device;
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
#include "nouveau_fence.h"
#include "nouveau_bios.h"
--- linux-4.13.8/drivers/gpu/drm/ast/ast_drv.h 2017-10-18
07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/ast/ast_drv.h 2017-10-20
14:34:50.289633773 +0000
@@ -356,8 +356,6 @@
uint32_t handle,
uint64_t *offset);
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
int ast_mm_init(struct ast_private *ast);
void ast_mm_fini(struct ast_private *ast);
--- linux-4.13.8/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
2017-10-20 14:34:50.644633767 +0000
@@ -21,8 +21,6 @@
#include "hibmc_drm_drv.h"
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
static inline struct hibmc_drm_private *
hibmc_bdev(struct ttm_bo_device *bd)
{
--- linux-4.13.8/drivers/gpu/drm/virtio/virtgpu_ttm.c 2017-10-18
07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/virtio/virtgpu_ttm.c
2017-10-20 14:34:53.055633725 +0000
@@ -37,8 +37,6 @@
#include <linux/delay.h>
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
static struct
virtio_gpu_device *virtio_gpu_get_vgdev(struct ttm_bo_device *bdev)
{
--- linux-4.13.8/drivers/gpu/drm/qxl/qxl_drv.h 2017-10-18
07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/qxl/qxl_drv.h 2017-10-20
14:34:52.072633742 +0000
@@ -88,9 +88,6 @@
} \
} while (0)
-#define DRM_FILE_OFFSET 0x100000000ULL
-#define DRM_FILE_PAGE_OFFSET (DRM_FILE_OFFSET >> PAGE_SHIFT)
-
#define QXL_INTERRUPT_MASK (\
QXL_INTERRUPT_DISPLAY |\
QXL_INTERRUPT_CURSOR |\
--- linux-4.13.8/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
2017-10-20 14:34:43.264633895 +0000
@@ -48,3 +48,1 @@
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
--- linux-4.13.8/drivers/gpu/drm/mgag200/mgag200_drv.h 2017-10-18
07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/mgag200/mgag200_drv.h
2017-10-20 14:34:51.404633754 +0000
@@ -276,7 +276,6 @@
struct mga_i2c_chan *mgag200_i2c_create(struct drm_device *dev);
void mgag200_i2c_destroy(struct mga_i2c_chan *i2c);
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
void mgag200_ttm_placement(struct mgag200_bo *bo, int domain);
static inline int mgag200_bo_reserve(struct mgag200_bo *bo, bool no_wait)
--- linux-4.13.8/drivers/gpu/drm/radeon/radeon_ttm.c 2017-10-18
07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/radeon/radeon_ttm.c
2017-10-20 14:34:52.588633733 +0000
@@ -45,8 +45,6 @@
#include "radeon_reg.h"
#include "radeon.h"
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
static int radeon_ttm_debugfs_init(struct radeon_device *rdev);
static void radeon_ttm_debugfs_fini(struct radeon_device *rdev);
--- linux-4.13.8/drivers/gpu/drm/cirrus/cirrus_drv.h 2017-10-18
07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/cirrus/cirrus_drv.h
2017-10-20 14:34:50.333633772 +0000
@@ -178,7 +178,6 @@
#define to_cirrus_obj(x) container_of(x, struct cirrus_gem_object, base)
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
/* cirrus_mode.c */
void cirrus_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
--- linux-4.13.8/include/drm/drmP.h 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/include/drm/drmP.h 2017-10-20
14:35:31.300633060 +0000
@@ -503,4 +503,10 @@
/* helper for handling conditionals in various for_each macros */
#define for_each_if(condition) if (!(condition)) {} else
+#if BITS_PER_LONG == 64
+#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
+#else
+#define DRM_FILE_PAGE_OFFSET (0x10000000ULL >> PAGE_SHIFT)
+#endif
+
#endif
[-- Attachment #2: drm_file_offset.patch --]
[-- Type: application/octet-stream, Size: 4581 bytes --]
--- linux-4.13.8/drivers/gpu/drm/bochs/bochs.h 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/bochs/bochs.h 2017-10-20 14:34:50.308633773 +0000
@@ -115,8 +115,6 @@
return container_of(gem, struct bochs_bo, gem);
}
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
static inline u64 bochs_bo_mmap_offset(struct bochs_bo *bo)
{
return drm_vma_node_offset_addr(&bo->bo.vma_node);
--- linux-4.13.8/drivers/gpu/drm/nouveau/nouveau_drv.h 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/nouveau/nouveau_drv.h 2017-10-20 14:34:51.581633751 +0000
@@ -57,8 +57,6 @@
struct nouveau_channel;
struct platform_device;
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
#include "nouveau_fence.h"
#include "nouveau_bios.h"
--- linux-4.13.8/drivers/gpu/drm/ast/ast_drv.h 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/ast/ast_drv.h 2017-10-20 14:34:50.289633773 +0000
@@ -356,8 +356,6 @@
uint32_t handle,
uint64_t *offset);
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
int ast_mm_init(struct ast_private *ast);
void ast_mm_fini(struct ast_private *ast);
--- linux-4.13.8/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c 2017-10-20 14:34:50.644633767 +0000
@@ -21,8 +21,6 @@
#include "hibmc_drm_drv.h"
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
static inline struct hibmc_drm_private *
hibmc_bdev(struct ttm_bo_device *bd)
{
--- linux-4.13.8/drivers/gpu/drm/virtio/virtgpu_ttm.c 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/virtio/virtgpu_ttm.c 2017-10-20 14:34:53.055633725 +0000
@@ -37,8 +37,6 @@
#include <linux/delay.h>
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
static struct
virtio_gpu_device *virtio_gpu_get_vgdev(struct ttm_bo_device *bdev)
{
--- linux-4.13.8/drivers/gpu/drm/qxl/qxl_drv.h 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/qxl/qxl_drv.h 2017-10-20 14:34:52.072633742 +0000
@@ -88,9 +88,6 @@
} \
} while (0)
-#define DRM_FILE_OFFSET 0x100000000ULL
-#define DRM_FILE_PAGE_OFFSET (DRM_FILE_OFFSET >> PAGE_SHIFT)
-
#define QXL_INTERRUPT_MASK (\
QXL_INTERRUPT_DISPLAY |\
QXL_INTERRUPT_CURSOR |\
--- linux-4.13.8/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c 2017-10-20 14:34:43.264633895 +0000
@@ -48,3 +48,1 @@
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
--- linux-4.13.8/drivers/gpu/drm/mgag200/mgag200_drv.h 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/mgag200/mgag200_drv.h 2017-10-20 14:34:51.404633754 +0000
@@ -276,7 +276,6 @@
struct mga_i2c_chan *mgag200_i2c_create(struct drm_device *dev);
void mgag200_i2c_destroy(struct mga_i2c_chan *i2c);
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
void mgag200_ttm_placement(struct mgag200_bo *bo, int domain);
static inline int mgag200_bo_reserve(struct mgag200_bo *bo, bool no_wait)
--- linux-4.13.8/drivers/gpu/drm/radeon/radeon_ttm.c 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/radeon/radeon_ttm.c 2017-10-20 14:34:52.588633733 +0000
@@ -45,8 +45,6 @@
#include "radeon_reg.h"
#include "radeon.h"
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
-
static int radeon_ttm_debugfs_init(struct radeon_device *rdev);
static void radeon_ttm_debugfs_fini(struct radeon_device *rdev);
--- linux-4.13.8/drivers/gpu/drm/cirrus/cirrus_drv.h 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/drivers/gpu/drm/cirrus/cirrus_drv.h 2017-10-20 14:34:50.333633772 +0000
@@ -178,7 +178,6 @@
#define to_cirrus_obj(x) container_of(x, struct cirrus_gem_object, base)
-#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
/* cirrus_mode.c */
void cirrus_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
--- linux-4.13.8/include/drm/drmP.h 2017-10-18 07:38:33.000000000 +0000
+++ linux-4.13.8-modified/include/drm/drmP.h 2017-10-20 14:35:31.300633060 +0000
@@ -503,4 +503,10 @@
/* helper for handling conditionals in various for_each macros */
#define for_each_if(condition) if (!(condition)) {} else
+#if BITS_PER_LONG == 64
+#define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
+#else
+#define DRM_FILE_PAGE_OFFSET (0x10000000ULL >> PAGE_SHIFT)
+#endif
+
#endif
* Re:
2018-10-21 16:25 (unknown), Michael Tirado
@ 2018-10-22 0:26 ` Dave Airlie
2018-10-21 20:23 ` Re: Michael Tirado
0 siblings, 1 reply; 1651+ messages in thread
From: Dave Airlie @ 2018-10-22 0:26 UTC (permalink / raw)
To: mtirado418
Cc: Dave Airlie, LKML, dri-devel, Hongbo.He, Gerd Hoffmann,
Deucher, Alexander, Sean Paul, Koenig, Christian
On Mon, 22 Oct 2018 at 07:22, Michael Tirado <mtirado418@gmail.com> wrote:
>
> Mapping a drm "dumb" buffer fails on 32-bit system (i686) from what
> appears to be a truncated memory address that has been copied
> throughout several files. The bug manifests as an -EINVAL when calling
> mmap with the offset gathered from DRM_IOCTL_MODE_MAP_DUMB <--
> DRM_IOCTL_MODE_ADDFB <-- DRM_IOCTL_MODE_CREATE_DUMB. I can provide
> test code if needed.
>
> The following patch will apply to 4.18 though I've only been able to
> test through qemu bochs driver and nouveau. Intel driver worked
> without any issues. I'm not sure if everyone is going to want to
> share a constant, and the whitespace is screwed up from gmail's awful
> javascript client, so let me know if I should resend this with any
> specific changes. I have also attached the file with preserved
> whitespace.
>
This shouldn't be necessary; did someone misbackport the mmap changes without:
drm: set FMODE_UNSIGNED_OFFSET for drm files
Dave.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re:
2018-10-22 0:26 ` Dave Airlie
@ 2018-10-21 20:23 ` Michael Tirado
2018-10-22 1:50 ` Re: Dave Airlie
0 siblings, 1 reply; 1651+ messages in thread
From: Michael Tirado @ 2018-10-21 20:23 UTC (permalink / raw)
To: airlied
Cc: Airlied, dri-devel, LKML, kraxel, alexander.deucher,
christian.koenig, David1.zhou, Hongbo.He, Sean Paul, Gustavo,
maarten.lankhorst
On Mon, Oct 22, 2018 at 12:26 AM Dave Airlie <airlied@gmail.com> wrote:
>
> This shouldn't be necessary, did someone misbackport the mmap changes without:
>
> drm: set FMODE_UNSIGNED_OFFSET for drm files
>
> Dave.
The latest kernel I have had to patch was a 4.18-rc6. I'll try with a
newer 4.19 and let you know if it decides to work. If not I'll
prepare a test case for demonstration on qemu-system-i386.
* Re:
2018-10-21 20:23 ` Re: Michael Tirado
@ 2018-10-22 1:50 ` Dave Airlie
2018-10-21 22:20 ` Re: Michael Tirado
2018-10-23 1:47 ` Re: Michael Tirado
0 siblings, 2 replies; 1651+ messages in thread
From: Dave Airlie @ 2018-10-22 1:50 UTC (permalink / raw)
To: mtirado418
Cc: Dave Airlie, LKML, dri-devel, Hongbo.He, Gerd Hoffmann,
Deucher, Alexander, Sean Paul, Koenig, Christian
On Mon, 22 Oct 2018 at 10:49, Michael Tirado <mtirado418@gmail.com> wrote:
>
> On Mon, Oct 22, 2018 at 12:26 AM Dave Airlie <airlied@gmail.com> wrote:
> >
> > This shouldn't be necessary, did someone misbackport the mmap changes without:
> >
> > drm: set FMODE_UNSIGNED_OFFSET for drm files
> >
> > Dave.
>
> The latest kernel I have had to patch was a 4.18-rc6. I'll try with a
> newer 4.19 and let you know if it decides to work. If not I'll
> prepare a test case for demonstration on qemu-system-i386.
If you have custom userspace software, make sure it's using
AC_SYS_LARGEFILE or whatever the equivalent is in your build system.
64-bit file offsets are important.
Dave.
* Re:
2018-10-22 1:50 ` Re: Dave Airlie
@ 2018-10-21 22:20 ` Michael Tirado
2018-10-23 1:47 ` Re: Michael Tirado
1 sibling, 0 replies; 1651+ messages in thread
From: Michael Tirado @ 2018-10-21 22:20 UTC (permalink / raw)
To: Dave Airlie
Cc: Airlied, dri-devel, LKML, kraxel, alexander.deucher,
christian.koenig, David1.zhou, Hongbo.He, Sean Paul, Gustavo,
maarten.lankhorst
[-- Attachment #1: Type: text/plain, Size: 837 bytes --]
On Mon, Oct 22, 2018 at 1:50 AM Dave Airlie <airlied@gmail.com> wrote:
>
> On Mon, 22 Oct 2018 at 10:49, Michael Tirado <mtirado418@gmail.com> wrote:
> >
> > On Mon, Oct 22, 2018 at 12:26 AM Dave Airlie <airlied@gmail.com> wrote:
> > >
> > > This shouldn't be necessary, did someone misbackport the mmap changes without:
> If you have custom userspace software, make sure it's using
> AC_SYS_LARGEFILE or whatever the equivalent is in your build system.
>
> 64-bit file offsets are important.
>
That fixed it! -D_FILE_OFFSET_BITS=64 is the pre-processor define
needed. It's a bit unintuitive, but I'm glad I don't need this stupid
patch anymore. Thanks.
In case anyone is further interested, I have attached the test program,
since I spent the last hour or so chopping it up anyway :S [ gcc -o
kms -D_FILE_OFFSET_BITS=64 main.c ]
[-- Attachment #2: main.c --]
[-- Type: application/octet-stream, Size: 17153 bytes --]
/* Copyright (C) 2017 Michael R. Tirado <mtirado418@gmail.com> -- GPLv3+
*
* This program is libre software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details. You should have
* received a copy of the GNU General Public License version 3
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <stdint.h>
#include <stddef.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <drm/drm.h>
#include <drm/drm_mode.h>
#define STRERR strerror(errno)
#ifndef PTRBITCOUNT
#define PTRBITCOUNT 32
#endif
/* kernel structs use __u64 for pointer types */
#ifndef MAX_FBS
#define MAX_FBS 12
#endif
#ifndef MAX_CRTCS
#define MAX_CRTCS 12
#endif
#ifndef MAX_CONNECTORS
#define MAX_CONNECTORS 12
#endif
#ifndef MAX_ENCODERS
#define MAX_ENCODERS 12
#endif
#ifndef MAX_PROPS
#define MAX_PROPS 256
#endif
#ifndef MAX_MODES
#define MAX_MODES 256
#endif
#if (PTRBITCOUNT == 32)
#define drm_to_ptr(ptr) ((void *)(uint32_t)(ptr))
#define drm_from_ptr(ptr) ((uint32_t)(ptr))
#elif (PTRBITCOUNT == 64)
#define drm_to_ptr(ptr) ((void *)(uint64_t)(ptr))
#define drm_from_ptr(ptr) ((uint64_t)(ptr))
#else
#error "unsupported PTRBITCOUNT value"
#endif
#define drm_alloc(size) (drm_from_ptr(calloc(1,size)))
struct drm_buffer
{
uint32_t drm_id;
uint32_t fb_id;
uint32_t pitch;
uint32_t width;
uint32_t height;
uint32_t depth;
uint32_t bpp;
char *addr;
size_t size;
};
struct drm_display
{
struct drm_mode_get_encoder encoder;
struct drm_mode_crtc crtc;
struct drm_mode_get_connector *conn; /* do we need array for multi-screen? */
struct drm_mode_modeinfo *modes; /* these both point to conn's mode array */
struct drm_mode_modeinfo *cur_mode;
uint32_t cur_mode_idx;
uint32_t mode_count;
uint32_t conn_id;
};
struct drm_kms
{
struct drm_display display;
struct drm_buffer *sfb;
struct drm_mode_card_res *res;
int card_fd;
};
/* get id out of drm_id_ptr */
static uint32_t drm_get_id(uint64_t addr, uint32_t idx)
{
return ((uint32_t *)drm_to_ptr(addr))[idx];
}
static int free_mode_card_res(struct drm_mode_card_res *res)
{
if (!res)
return -1;
if (res->fb_id_ptr)
free(drm_to_ptr(res->fb_id_ptr));
if (res->crtc_id_ptr)
free(drm_to_ptr(res->crtc_id_ptr));
if (res->encoder_id_ptr)
free(drm_to_ptr(res->encoder_id_ptr));
if (res->connector_id_ptr)
free(drm_to_ptr(res->connector_id_ptr));
free(res);
return 0;
}
static struct drm_mode_card_res *alloc_mode_card_res(int fd)
{
struct drm_mode_card_res res;
struct drm_mode_card_res *ret;
uint32_t count_fbs, count_crtcs, count_connectors, count_encoders;
memset(&res, 0, sizeof(struct drm_mode_card_res));
if (ioctl(fd, DRM_IOCTL_MODE_GETRESOURCES, &res)) {
printf("ioctl(DRM_IOCTL_MODE_GETRESOURCES, &res): %s\n", STRERR);
return NULL;
}
if (res.count_fbs > MAX_FBS
|| res.count_crtcs > MAX_CRTCS
|| res.count_encoders > MAX_ENCODERS
|| res.count_connectors > MAX_CONNECTORS) {
printf("resource limit reached, see MAX_* defines\n");
return NULL;
}
if (res.count_fbs) {
res.fb_id_ptr = drm_alloc(sizeof(uint32_t)*res.count_fbs);
if (!res.fb_id_ptr)
goto alloc_err;
}
if (res.count_crtcs) {
res.crtc_id_ptr = drm_alloc(sizeof(uint32_t)*res.count_crtcs);
if (!res.crtc_id_ptr)
goto alloc_err;
}
if (res.count_encoders) {
res.encoder_id_ptr = drm_alloc(sizeof(uint32_t)*res.count_encoders);
if (!res.encoder_id_ptr)
goto alloc_err;
}
if (res.count_connectors) {
res.connector_id_ptr = drm_alloc(sizeof(uint32_t)*res.count_connectors);
if (!res.connector_id_ptr)
goto alloc_err;
}
count_fbs = res.count_fbs;
count_crtcs = res.count_crtcs;
count_encoders = res.count_encoders;
count_connectors = res.count_connectors;
if (ioctl(fd, DRM_IOCTL_MODE_GETRESOURCES, &res) == -1) {
printf("ioctl(DRM_IOCTL_MODE_GETRESOURCES, &res): %s\n", STRERR);
goto free_err;
}
if (count_fbs != res.count_fbs
|| count_crtcs != res.count_crtcs
|| count_encoders != res.count_encoders
|| count_connectors != res.count_connectors) {
errno = EAGAIN;
goto free_err;
}
ret = calloc(1, sizeof(struct drm_mode_card_res));
if (!ret)
goto alloc_err;
memcpy(ret, &res, sizeof(struct drm_mode_card_res));
return ret;
alloc_err:
errno = ENOMEM;
free_err:
free(drm_to_ptr(res.fb_id_ptr));
free(drm_to_ptr(res.crtc_id_ptr));
free(drm_to_ptr(res.connector_id_ptr));
free(drm_to_ptr(res.encoder_id_ptr));
return NULL;
}
static struct drm_mode_get_connector *alloc_connector(int fd, uint32_t conn_id)
{
struct drm_mode_get_connector conn;
struct drm_mode_get_connector *ret;
uint32_t count_modes, count_props, count_encoders;
memset(&conn, 0, sizeof(struct drm_mode_get_connector));
conn.connector_id = conn_id;
if (ioctl(fd, DRM_IOCTL_MODE_GETCONNECTOR, &conn) == -1) {
printf("ioctl(DRM_IOCTL_MODE_GETCONNECTOR, &conn): %s\n", STRERR);
return NULL;
}
if (conn.count_modes > MAX_MODES
|| conn.count_props > MAX_PROPS
|| conn.count_encoders > MAX_ENCODERS) {
printf("resource limit reached, see MAX_* defines\n");
return NULL;
}
if (conn.count_modes) {
conn.modes_ptr = drm_alloc(sizeof(struct drm_mode_modeinfo)
* conn.count_modes);
if (!conn.modes_ptr)
goto alloc_err;
}
if (conn.count_props) {
conn.props_ptr = drm_alloc(sizeof(uint32_t)*conn.count_props);
if (!conn.props_ptr)
goto alloc_err;
conn.prop_values_ptr = drm_alloc(sizeof(uint64_t)*conn.count_props);
if (!conn.prop_values_ptr)
goto alloc_err;
}
if (conn.count_encoders) {
conn.encoders_ptr = drm_alloc(sizeof(uint32_t)*conn.count_encoders);
if (!conn.encoders_ptr)
goto alloc_err;
}
count_modes = conn.count_modes;
count_props = conn.count_props;
count_encoders = conn.count_encoders;
if (ioctl(fd, DRM_IOCTL_MODE_GETCONNECTOR, &conn) == -1) {
printf("ioctl(DRM_IOCTL_MODE_GETCONNECTOR, &conn): %s\n", STRERR);
goto free_err;
}
if (count_modes != conn.count_modes
|| count_props != conn.count_props
|| count_encoders != conn.count_encoders) {
errno = EAGAIN;
goto free_err;
}
ret = calloc(1, sizeof(struct drm_mode_get_connector));
if (!ret)
goto alloc_err;
memcpy(ret, &conn, sizeof(struct drm_mode_get_connector));
return ret;
alloc_err:
errno = ENOMEM;
free_err:
free(drm_to_ptr(conn.modes_ptr));
free(drm_to_ptr(conn.props_ptr));
free(drm_to_ptr(conn.encoders_ptr));
free(drm_to_ptr(conn.prop_values_ptr));
return NULL;
}
static struct drm_mode_modeinfo *get_connector_modeinfo(struct drm_mode_get_connector *conn, uint32_t *count)
{
if (!conn || !count)
return NULL;
*count = conn->count_modes;
return drm_to_ptr(conn->modes_ptr);
}
static int free_connector(struct drm_mode_get_connector *conn)
{
if (!conn)
return -1;
if (conn->modes_ptr)
free(drm_to_ptr(conn->modes_ptr));
if (conn->props_ptr)
free(drm_to_ptr(conn->props_ptr));
if (conn->encoders_ptr)
free(drm_to_ptr(conn->encoders_ptr));
if (conn->prop_values_ptr)
free(drm_to_ptr(conn->prop_values_ptr));
free(conn);
return 0;
}
static int drm_kms_connect_sfb(struct drm_kms *self)
{
struct drm_display *display = &self->display;
struct drm_mode_get_connector *conn;
struct drm_mode_modeinfo *cur_mode;
struct drm_mode_get_encoder *encoder;
struct drm_mode_crtc *crtc;
if (!display || !display->conn || !display->cur_mode || !self->sfb)
return -1;
conn = display->conn;
cur_mode = display->cur_mode;
encoder = &self->display.encoder;
crtc = &self->display.crtc;
memset(crtc, 0, sizeof(struct drm_mode_crtc));
memset(encoder, 0, sizeof(struct drm_mode_get_encoder));
/* XXX: there can be multiple encoders, have not investigated this much */
if (conn->encoder_id == 0) {
printf("conn->encoder_id was 0, defaulting to encoder[0]\n");
conn->encoder_id = ((uint32_t *)drm_to_ptr(conn->encoders_ptr))[0];
}
encoder->encoder_id = conn->encoder_id;
if (ioctl(self->card_fd, DRM_IOCTL_MODE_GETENCODER, encoder) == -1) {
printf("ioctl(DRM_IOCTL_MODE_GETENCODER): %s\n", STRERR);
return -1;
}
if (encoder->crtc_id == 0) {
printf("encoder->crtc_id was 0, defaulting to crtc[0]\n");
encoder->crtc_id = ((uint32_t *)drm_to_ptr(self->res->crtc_id_ptr))[0];
}
crtc->crtc_id = encoder->crtc_id;
if (ioctl(self->card_fd, DRM_IOCTL_MODE_GETCRTC, crtc) == -1) {
printf("ioctl(DRM_IOCTL_MODE_GETCRTC): %s\n", STRERR);
return -1;
}
/* set crtc mode */
crtc->fb_id = self->sfb->fb_id;
crtc->set_connectors_ptr = drm_from_ptr((void *)&conn->connector_id);
crtc->count_connectors = 1;
crtc->mode = *cur_mode;
/*printf("\nsetting mode:\n\n");
print_mode_modeinfo(cur_mode);*/
crtc->mode_valid = 1;
if (ioctl(self->card_fd, DRM_IOCTL_MODE_SETCRTC, crtc) == -1) {
printf("ioctl(DRM_IOCTL_MODE_SETCRTC): %s\n", STRERR);
return -1;
}
return 0;
}
/* stupid frame buffer */
static struct drm_buffer *alloc_sfb(int card_fd,
uint32_t width,
uint32_t height,
uint32_t depth,
uint32_t bpp)
{
struct drm_mode_create_dumb cdumb;
struct drm_mode_map_dumb moff;
struct drm_mode_fb_cmd cmd;
struct drm_buffer *ret;
void *fbmap;
memset(&cdumb, 0, sizeof(cdumb));
memset(&moff, 0, sizeof(moff));
memset(&cmd, 0, sizeof(cmd));
/* create dumb buffer */
cdumb.width = width;
cdumb.height = height;
cdumb.bpp = bpp;
cdumb.flags = 0;
cdumb.pitch = 0;
cdumb.size = 0;
cdumb.handle = 0;
if (ioctl(card_fd, DRM_IOCTL_MODE_CREATE_DUMB, &cdumb) == -1) {
printf("ioctl(DRM_IOCTL_MODE_CREATE_DUMB): %s\n", STRERR);
return NULL;
}
/* add framebuffer object */
cmd.width = cdumb.width;
cmd.height = cdumb.height;
cmd.bpp = cdumb.bpp;
cmd.pitch = cdumb.pitch;
cmd.depth = depth;
cmd.handle = cdumb.handle;
if (ioctl(card_fd, DRM_IOCTL_MODE_ADDFB, &cmd) == -1) {
printf("ioctl(DRM_IOCTL_MODE_ADDFB): %s\n", STRERR);
ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &cdumb.handle);
return NULL;
}
/* get mmap offset */
moff.handle = cdumb.handle;
if (ioctl(card_fd, DRM_IOCTL_MODE_MAP_DUMB, &moff) == -1) {
printf("ioctl(DRM_IOCTL_MODE_MAP_DUMB): %s\n", STRERR);
ioctl(card_fd, DRM_IOCTL_MODE_RMFB, &cmd.fb_id);
ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &cdumb.handle);
return NULL;
}
/* XXX this is probably better off as MAP_PRIVATE, we can't prime
* the main framebuffer if it's "dumb", AFAIK */
fbmap = mmap(0, (size_t)cdumb.size, PROT_READ|PROT_WRITE,
MAP_SHARED, card_fd, (off_t)moff.offset);
if (fbmap == MAP_FAILED) {
printf("framebuffer mmap failed: %s\n", STRERR);
ioctl(card_fd, DRM_IOCTL_MODE_RMFB, &cmd.fb_id);
ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &cdumb.handle);
return NULL;
}
ret = calloc(1, sizeof(struct drm_buffer));
if (!ret) {
printf("-ENOMEM\n");
munmap(fbmap, cdumb.size);
ioctl(card_fd, DRM_IOCTL_MODE_RMFB, &cmd.fb_id);
ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &cdumb.handle);
return NULL;
}
ret->addr = fbmap;
ret->size = cdumb.size;
ret->pitch = cdumb.pitch;
ret->width = cdumb.width;
ret->height = cdumb.height;
ret->bpp = cdumb.bpp;
ret->depth = cmd.depth;
ret->fb_id = cmd.fb_id;
ret->drm_id = cdumb.handle;
memset(fbmap, 0x27, cdumb.size);
return ret;
}
static int destroy_sfb(int card_fd, struct drm_buffer *sfb)
{
if (!sfb)
return -1;
if (munmap(sfb->addr, sfb->size) == -1)
printf("munmap: %s\n", STRERR);
if (ioctl(card_fd, DRM_IOCTL_MODE_RMFB, &sfb->fb_id))
printf("ioctl(DRM_IOCTL_MODE_RMFB): %s\n", STRERR);
if (ioctl(card_fd, DRM_IOCTL_MODE_DESTROY_DUMB, &sfb->drm_id))
printf("ioctl(DRM_IOCTL_MODE_DESTROY_DUMB): %s\n", STRERR);
free(sfb);
return 0;
}
static int card_set_master(int card_fd)
{
if (ioctl(card_fd, DRM_IOCTL_SET_MASTER, 0)) {
printf("ioctl(DRM_IOCTL_SET_MASTER, 0): %s\n", STRERR);
return -1;
}
return 0;
}
static int card_drop_master(int card_fd)
{
if (ioctl(card_fd, DRM_IOCTL_DROP_MASTER, 0)) {
printf("ioctl(DRM_IOCTL_DROP_MASTER, 0): %s\n", STRERR);
return -1;
}
return 0;
}
static int drm_display_destroy(struct drm_display *display)
{
if (display->conn)
free_connector(display->conn);
memset(display, 0, sizeof(struct drm_display));
return 0;
}
int drm_kms_destroy(struct drm_kms *self)
{
if (self->sfb)
destroy_sfb(self->card_fd, self->sfb);
if (self->res)
free_mode_card_res(self->res);
drm_display_destroy(&self->display);
close(self->card_fd);
memset(self, 0, sizeof(struct drm_kms));
free(self);
return 0;
}
static int get_mode_idx(struct drm_mode_modeinfo *modes,
uint16_t count,
uint16_t width,
uint16_t height,
uint16_t refresh)
{
int i;
int pick = -1;
if (width == 0)
width = 0xffff;
if (height == 0)
height = 0xffff;
for (i = 0; i < count; ++i)
{
if (modes[i].hdisplay > width || modes[i].vdisplay > height)
continue;
/* pretend these radical modes don't exist for now */
if (modes[i].hdisplay % 16 == 0) {
if (pick < 0) {
pick = i;
continue;
}
if (modes[i].hdisplay > modes[pick].hdisplay)
pick = i;
else if (modes[i].vdisplay > modes[pick].vdisplay)
pick = i;
else if (modes[i].hdisplay == modes[pick].hdisplay
&& modes[i].vdisplay == modes[pick].vdisplay) {
/* compare in signed arithmetic to avoid unsigned wrap */
if (abs((int)refresh - (int)modes[i].vrefresh)
< abs((int)refresh - (int)modes[pick].vrefresh)) {
pick = i;
}
}
}
}
if (pick < 0) {
printf("could not find any usable modes for (%dx%d@%dhz)\n",
width, height, refresh);
return -1;
}
return pick;
}
/* TODO handle hotplugging */
static int drm_display_load(struct drm_kms *self,
uint16_t req_width,
uint16_t req_height,
uint16_t req_refresh,
struct drm_display *out)
{
uint32_t conn_id;
int idx = -1;
/* FIXME uses primary connector? "0" */
conn_id = drm_get_id(self->res->connector_id_ptr, 0);
out->conn = alloc_connector(self->card_fd, conn_id);
if (!out->conn) {
printf("unable to create drm connector structure\n");
return -1;
}
out->conn_id = conn_id;
out->modes = get_connector_modeinfo(out->conn, &out->mode_count);
idx = get_mode_idx(out->modes, out->mode_count,
req_width, req_height, req_refresh);
if (idx < 0)
goto free_err;
out->cur_mode_idx = (uint32_t)idx;
out->cur_mode = &out->modes[out->cur_mode_idx];
return 0;
free_err:
drm_display_destroy(out);
return -1;
}
struct drm_kms *drm_mode_create(char *devname,
int no_connect,
uint16_t req_width,
uint16_t req_height,
uint16_t req_refresh)
{
char devpath[128];
struct drm_kms *self;
struct drm_mode_modeinfo *cur_mode;
int card_fd;
snprintf(devpath, sizeof(devpath), "/dev/dri/%s", devname);
card_fd = open(devpath, O_RDWR|O_CLOEXEC);
if (card_fd == -1) {
printf("open(%s): %s\n", devpath, STRERR);
return NULL;
}
if (card_set_master(card_fd)) {
printf("card_set_master failed\n");
close(card_fd); /* don't leak the card fd on failure */
return NULL;
}
self = calloc(1, sizeof(struct drm_kms));
if (!self) {
close(card_fd);
return NULL;
}
self->card_fd = card_fd;
self->res = alloc_mode_card_res(card_fd);
if (!self->res) {
printf("unable to create drm structure\n");
goto free_err;
}
if (drm_display_load(self, req_width, req_height, req_refresh, &self->display)) {
printf("drm_display_load failed\n");
goto free_err;
}
cur_mode = self->display.cur_mode;
printf("connector(%d) using mode[%d] (%dx%d@%dhz)\n",
self->display.conn_id,
self->display.cur_mode_idx,
cur_mode->hdisplay,
cur_mode->vdisplay,
cur_mode->vrefresh);
/* buffer pitch must divide evenly by 16,
* TODO check against bpp here when that is variable instead of 32 */
self->sfb = alloc_sfb(card_fd, cur_mode->hdisplay, cur_mode->vdisplay, 24, 32);
if (!self->sfb) {
printf("alloc_sfb failed\n");
goto free_err;
}
if (!no_connect && drm_kms_connect_sfb(self)) {
printf("drm_kms_connect_sfb failed\n");
goto free_err;
}
return self;
free_err:
drm_kms_destroy(self);
return NULL;
}
int main(int argc, char *argv[])
{
int ret = -1;
struct drm_kms *card0;
/*card0 = drm_mode_create("card0", g_srv_opts.inactive_vt,
g_srv_opts.request_width,
g_srv_opts.request_height,
g_srv_opts.request_refresh);*/
/* do not connect to vt */
card0 = drm_mode_create("card0", 1, 640, 480, 60);
if (card0 == NULL) {
printf("drm_mode_create failed\n");
return -1;
}
drm_kms_destroy(card0);
printf("looks ok, returning 0\n");
return 0;
}
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-10-22 1:50 ` Re: Dave Airlie
2018-10-21 22:20 ` Re: Michael Tirado
@ 2018-10-23 1:47 ` Michael Tirado
2018-10-23 6:23 ` Re: Dave Airlie
1 sibling, 1 reply; 1651+ messages in thread
From: Michael Tirado @ 2018-10-23 1:47 UTC (permalink / raw)
To: Dave Airlie, LKML, dri-devel
That preprocessor define worked, but I'm still confused about this
DRM_FILE_PAGE_OFFSET thing. Check out drivers/gpu/drm/drm_gem.c
right above drm_gem_init.
---
/*
* We make up offsets for buffer objects so we can recognize them at
* mmap time.
*/
/* pgoff in mmap is an unsigned long, so we need to make sure that
* the faked up offset will fit
*/
#if BITS_PER_LONG == 64
#define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFFUL >> PAGE_SHIFT) + 1)
#define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFFUL >> PAGE_SHIFT) * 16)
#else
#define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFUL >> PAGE_SHIFT) + 1)
#define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFUL >> PAGE_SHIFT) * 16)
#endif
---
Why are 64-bit file offsets critical, causing -EINVAL on mmap?
What problems might be associated with using (0x10000000UL >>
PAGE_SHIFT)?
On Mon, Oct 22, 2018 at 1:50 AM Dave Airlie <airlied@gmail.com> wrote:
>
> On Mon, 22 Oct 2018 at 10:49, Michael Tirado <mtirado418@gmail.com> wrote:
> >
> > On Mon, Oct 22, 2018 at 12:26 AM Dave Airlie <airlied@gmail.com> wrote:
> > >
> > > This shouldn't be necessary, did someone misbackport the mmap changes without:
> > >
> > > drm: set FMODE_UNSIGNED_OFFSET for drm files
> > >
> > > Dave.
> >
> > The latest kernel I have had to patch was a 4.18-rc6. I'll try with a
> > newer 4.19 and let you know if it decides to work. If not I'll
> > prepare a test case for demonstration on qemu-system-i386.
>
> If you have custom userspace software, make sure it's using
> AC_SYS_LARGEFILE or whatever the equivalent is in your build system.
>
> 64-bit file offsets are important.
>
> Dave.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-10-23 1:47 ` Re: Michael Tirado
@ 2018-10-23 6:23 ` Dave Airlie
0 siblings, 0 replies; 1651+ messages in thread
From: Dave Airlie @ 2018-10-23 6:23 UTC (permalink / raw)
To: mtirado418; +Cc: LKML, dri-devel
On Tue, 23 Oct 2018 at 16:13, Michael Tirado <mtirado418@gmail.com> wrote:
>
> That preprocessor define worked but I'm still confused about this
> DRM_FILE_PAGE_OFFSET thing. Check out drivers/gpu/drm/drm_gem.c
> right above drm_gem_init.
>
> ---
>
> /*
> * We make up offsets for buffer objects so we can recognize them at
> * mmap time.
> */
>
> /* pgoff in mmap is an unsigned long, so we need to make sure that
> * the faked up offset will fit
> */
>
> #if BITS_PER_LONG == 64
> #define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFFUL >> PAGE_SHIFT) + 1)
> #define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFFUL >> PAGE_SHIFT) * 16)
> #else
> #define DRM_FILE_PAGE_OFFSET_START ((0xFFFFFFFUL >> PAGE_SHIFT) + 1)
> #define DRM_FILE_PAGE_OFFSET_SIZE ((0xFFFFFFFUL >> PAGE_SHIFT) * 16)
> #endif
>
>
> ---
>
> Why is having a 64-bit file offsets critical, causing -EINVAL on mmap?
> What problems might be associated with using (0x10000000UL >>
> PAGE_SHIFT) ?
a) it finds people not using the correct userspace defines; mostly
libdrm should handle this, and possibly mesa.
b) there used to be legacy maps below that address on older drivers,
so we decided to never put stuff in the first 32-bit range that they
could clash with.
Dave.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2018-10-26 12:54 Mohanraj B
2018-10-27 16:55 ` Jens Axboe
0 siblings, 1 reply; 1651+ messages in thread
From: Mohanraj B @ 2018-10-26 12:54 UTC (permalink / raw)
To: fio
Hello,
I am trying to check how option --clocksource works.
bash# fio --name job1 --size 10m --clocksource 2
valid values: gettimeofday Use gettimeofday(2) for timing
: clock_gettime Use clock_gettime(2) for timing
: cpu Use CPU private clock
fio: failed parsing clocksource=2
bash# fio --name job1 --size 10m --clocksource gettimeofday(2)
bash: syntax error near unexpected token `('
Below command works fine.
bash# fio --name job1 --size 10m --clocksource gettimeofday
It runs without error, but I'm not quite sure how to see the effect of
this option. I also tried the other options (clock_gettime, cpu,
gettimeofday) and don't see any difference.
Also, is there an error in the documentation? Passing gettimeofday(2)
throws a parse error.
Thanks and Regards,
Mohan
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-10-26 12:54 Mohanraj B
@ 2018-10-27 16:55 ` Jens Axboe
0 siblings, 0 replies; 1651+ messages in thread
From: Jens Axboe @ 2018-10-27 16:55 UTC (permalink / raw)
To: Mohanraj B, fio
On 10/26/18 6:54 AM, Mohanraj B wrote:
> Hello,
>
> I am trying to check how option --clocksource works.
>
>
> bash# fio --name job1 --size 10m --clocksource 2
> valid values: gettimeofday Use gettimeofday(2) for timing
> : clock_gettime Use clock_gettime(2) for timing
> : cpu Use CPU private clock
>
> fio: failed parsing clocksource=2
>
> bash# fio --name job1 --size 10m --clocksource gettimeofday(2)
> bash: syntax error near unexpected token `('
>
> Below command works fine.
> bash# fio --name job1 --size 10m --clocksource gettimeofday
>
> It runs without error but quiet not sure how to see the effect of this
> option. also tried other options - clock_gettime, cpu gettimeofday and
> dont see any difference.
>
> Also is there any error in documentation passing gettimeofday(2)
> throws parse error.
The format is 'value' 'help', so you'd want to do:
--clocksource=gettimeofday
for instance.
--
Jens Axboe
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2018-10-29 14:20 Beierl, Mark
2018-10-29 14:37 ` Re: Mohanraj B
0 siblings, 1 reply; 1651+ messages in thread
From: Beierl, Mark @ 2018-10-29 14:20 UTC (permalink / raw)
To: Jens Axboe, Mohanraj B, fio@vger.kernel.org
On 2018-10-27, 12:55, "fio-owner@vger.kernel.org on behalf of Jens Axboe" <fio-owner@vger.kernel.org on behalf of axboe@kernel.dk> wrote:
[EXTERNAL EMAIL]
Please report any suspicious attachments, links, or requests for sensitive information.
On 10/26/18 6:54 AM, Mohanraj B wrote:
> Hello,
>
> I am trying to check how option --clocksource works.
>
>
> bash# fio --name job1 --size 10m --clocksource 2
> valid values: gettimeofday Use gettimeofday(2) for timing
> : clock_gettime Use clock_gettime(2) for timing
> : cpu Use CPU private clock
>
> fio: failed parsing clocksource=2
>
> bash# fio --name job1 --size 10m --clocksource gettimeofday(2)
> bash: syntax error near unexpected token `('
>
> Below command works fine.
> bash# fio --name job1 --size 10m --clocksource gettimeofday
>
> It runs without error but quiet not sure how to see the effect of this
> option. also tried other options - clock_gettime, cpu gettimeofday and
> dont see any difference.
>
> Also is there any error in documentation passing gettimeofday(2)
> throws parse error.
The format is 'value' 'help', so you'd want to do:
--clocksource=gettimeofday
for instance.
--
Jens Axboe
Hello, Mohanraj
The help output that you see above states that using --clocksource=gettimeofday will use the gettimeofday function as defined in section 2 of the man pages, which is where all the system call manuals are stored. The (2) is not meant to be part of the command line; it is part of the help text's description, telling you where to find more information on what is being used to implement the clocksource.
Hope that helps clarify the help text.
Regards,
Mark
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-10-29 14:20 Re: Beierl, Mark
@ 2018-10-29 14:37 ` Mohanraj B
0 siblings, 0 replies; 1651+ messages in thread
From: Mohanraj B @ 2018-10-29 14:37 UTC (permalink / raw)
To: Beierl, Mark; +Cc: Jens Axboe, fio
[-- Attachment #1: Type: text/plain, Size: 2053 bytes --]
Thanks Axboe and Mark.
On Mon 29 Oct, 2018, 7:50 PM Beierl, Mark, <Mark.Beierl@dell.com> wrote:
> On 2018-10-27, 12:55, "fio-owner@vger.kernel.org on behalf of Jens Axboe"
> <fio-owner@vger.kernel.org on behalf of axboe@kernel.dk> wrote:
>
> [EXTERNAL EMAIL]
> Please report any suspicious attachments, links, or requests for
> sensitive information.
>
>
> On 10/26/18 6:54 AM, Mohanraj B wrote:
> > Hello,
> >
> > I am trying to check how option --clocksource works.
> >
> >
> > bash# fio --name job1 --size 10m --clocksource 2
> > valid values: gettimeofday Use gettimeofday(2) for timing
> > : clock_gettime Use clock_gettime(2) for timing
> > : cpu Use CPU private clock
> >
> > fio: failed parsing clocksource=2
> >
> > bash# fio --name job1 --size 10m --clocksource gettimeofday(2)
> > bash: syntax error near unexpected token `('
> >
> > Below command works fine.
> > bash# fio --name job1 --size 10m --clocksource gettimeofday
> >
> > It runs without error but quiet not sure how to see the effect of
> this
> > option. also tried other options - clock_gettime, cpu gettimeofday
> and
> > dont see any difference.
> >
> > Also is there any error in documentation passing gettimeofday(2)
> > throws parse error.
>
> The format is 'value' 'help', so you'd want to do:
>
> --clocksource=gettimeofday
>
> for instance.
>
> --
> Jens Axboe
>
> Hello, Mohanraj
>
> The help output that you see above states that using
> --clocksource=gettimeofday will use the gettimeofday function as defined in
> the man page in the section (2), which is where all the system calls
> manuals are stored. The (2) is not meant to be part of the command line,
> it is part of the description of the help text, which tells you where to
> find more information on what is being used to implement the clocksource.
>
> Hope that helps clarify the help text.
>
> Regards,
> Mark
>
>
>
>
[-- Attachment #2: Type: text/html, Size: 2895 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE,
@ 2018-11-06 1:21 Miss Juliet Muhammad
0 siblings, 0 replies; 1651+ messages in thread
From: Miss Juliet Muhammad @ 2018-11-06 1:21 UTC (permalink / raw)
To: Recipients
I have a deal for you, in your region.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE,
@ 2018-11-11 4:20 Miss Juliet Muhammad
0 siblings, 0 replies; 1651+ messages in thread
From: Miss Juliet Muhammad @ 2018-11-11 4:20 UTC (permalink / raw)
To: Recipients
I have a deal for you, in your region.
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CACikiw1uNCYKzo9vjG=AZHpARWv-nzkCX=D-aWBssM7vYZrQdQ@mail.gmail.com>]
* Re:
[not found] <CACikiw1uNCYKzo9vjG=AZHpARWv-nzkCX=D-aWBssM7vYZrQdQ@mail.gmail.com>
@ 2018-11-12 10:09 ` Ravi Kumar
2018-11-15 13:11 ` Re: Ondrej Mosnacek
1 sibling, 0 replies; 1651+ messages in thread
From: Ravi Kumar @ 2018-11-12 10:09 UTC (permalink / raw)
To: selinux
<<Sorry, re-sending in plain text>>
Hi team,
On Android, with the latest 4.14 kernels, we are seeing some denials
that seem genuine and worth addressing: the kernel is trying to kill
its own created processes (perhaps for maintenance).
These are seen during long stress testing. But I don't see anyone
adding such a rule in general, so the question is: is there any risk
that kept us from adding such rules?
1. avc: denied { kill } for pid=2432 comm="irq/66-90b6300."
capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0
tclass=capability permissive=0
2. avc: denied { kill } for pid=69 comm="rcuop/6" capability=5
scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability
permissive=0
3. avc: denied { kill } for pid=0 comm="swapper/1" capability=5
scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability
permissive=0
4. avc: denied { kill } for pid=4185 comm="kworker/0:4" capability=5
scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability
permissive=0
This is a self capability; anything running in the kernel context
should be able to perform such operations, I guess.
Regards,
Ravi
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
[not found] <CACikiw1uNCYKzo9vjG=AZHpARWv-nzkCX=D-aWBssM7vYZrQdQ@mail.gmail.com>
2018-11-12 10:09 ` Ravi Kumar
@ 2018-11-15 13:11 ` Ondrej Mosnacek
1 sibling, 0 replies; 1651+ messages in thread
From: Ondrej Mosnacek @ 2018-11-15 13:11 UTC (permalink / raw)
To: nxp.ravi; +Cc: selinux, Paul Moore, Stephen Smalley, SElinux list
On Mon, Nov 12, 2018 at 7:56 AM Ravi Kumar <nxp.ravi@gmail.com> wrote:
> Hi team ,
>
> On android- with latest kernels 4.14 we are seeing some denials which seem to be very much genuine to be address . Where kernel is trying to kill its own created process ( might be for maintenance) .
> These are seen in long Stress testing . But I dont see any one adding such rule in general so the question is do we see any risk which made us not to add such rules ?
>
> 1. avc: denied { kill } for pid=2432 comm="irq/66-90b6300." capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability permissive=0
> 2. avc: denied { kill } for pid=69 comm="rcuop/6" capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability permissive=0
> 3. avc: denied { kill } for pid=0 comm="swapper/1" capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability permissive=0
> 4. avc: denied { kill } for pid=4185 comm="kworker/0:4" capability=5 scontext=u:r:kernel:s0 tcontext=u:r:kernel:s0 tclass=capability permissive=0
>
> This is self capability any one in kernel context should be able to do such operations I guess.
The reference policy does contain a rule that allows this kind of
operations, see:
https://github.com/SELinuxProject/refpolicy/blob/master/policy/modules/kernel/kernel.te#L203
It is also present in the Fedora policy on my system:
$ sesearch -A -s kernel_t -t kernel_t -c capability -p kill
allow kernel_t kernel_t:capability { audit_control audit_write chown
dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill
lease linux_immutable mknod net_admin net_bind_service net_broadcast
net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot
sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time
sys_tty_config };
Therefore I would say it is perfectly fine to add such a rule to your
policy as well.
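For an Android device policy, the needed addition would look something like the following one-liner (a sketch in AOSP sepolicy syntax, assuming the standard `kernel` domain from the denials above; the file name kernel.te is the conventional location, not something from this thread):

```
# Allow kernel-domain threads to signal kernel-created processes.
# capability 5 in the denials above is CAP_KILL.
allow kernel kernel:capability kill;
```

In refpolicy syntax the equivalent rule uses the kernel_t type, as the sesearch output above shows.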
Cheers,
--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAJUWh6qyHerKg=-oaFN+USa10_Aag5+SYjBOeLCX1qM+WcDUwA@mail.gmail.com>]
* Re:
[not found] <CAJUWh6qyHerKg=-oaFN+USa10_Aag5+SYjBOeLCX1qM+WcDUwA@mail.gmail.com>
@ 2018-11-23 7:52 ` Chris Murphy
2018-11-23 9:34 ` Re: Andy Leadbetter
0 siblings, 1 reply; 1651+ messages in thread
From: Chris Murphy @ 2018-11-23 7:52 UTC (permalink / raw)
To: andy.leadbetter, Btrfs BTRFS
On Thu, Nov 22, 2018 at 11:41 PM Andy Leadbetter
<andy.leadbetter@theleadbetters.com> wrote:
>
> I have a failing 2TB disk that is part of a 4 disk RAID 6 system. I
> have added a new 2TB disk to the computer, and started a BTRFS replace
> for the old and new disk. The process starts correctly, however some
> hours into the job there is an error and kernel oops. Relevant log
> below.
The relevant log is the entire dmesg, not a snippet. It's decently
likely there's more than one thing going on here. We also need full
output of 'smartctl -x' for all four drives, and also 'smartctl -l
scterc' for all four drives, and also 'cat
/sys/block/sda/device/timeout' for all four drives. And which bcache
mode you're using.
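Something along these lines collects all of that in one pass (a rough sketch; the drive names sda..sdd are an assumption, and it degrades gracefully if smartmontools is not installed):

```shell
#!/bin/sh
# Hypothetical helper: gather the diagnostics requested above for each
# backing drive. The drive list is a guess; edit it to match your system.
drives="sda sdb sdc sdd"
for d in $drives; do
        echo "=== /dev/$d ==="
        if command -v smartctl >/dev/null 2>&1; then
                smartctl -x "/dev/$d"          # full SMART data
                smartctl -l scterc "/dev/$d"   # error-recovery timeouts
        else
                echo "smartctl not installed"
        fi
        # Kernel SCSI command timeout for this device
        cat "/sys/block/$d/device/timeout" 2>/dev/null || echo "timeout: n/a"
done
```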
The call trace provided is from kernel 4.15, which is long enough
ago that I think any dev working on raid56 would want to see where
it's getting tripped up on something a lot newer, and this is why:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/fs/btrfs/raid56.c?id=v4.19.3&id2=v4.15.1
That's a lot of changes in just the raid56 code between 4.15 and 4.19.
And then in your call trace, btrfs_dev_replace_start is found in
dev-replace.c, which likewise has a lot of changes. But then also, I
think 4.15 might still be in the era where it was not recommended to
use 'btrfs dev replace' for raid56, only non-raid56. I'm not sure if
the problems with device replace were fixed, and if they were fixed
kernel or progs side. Anyway, the latest I recall, it was recommended
on raid56 to 'btrfs dev add' then 'btrfs dev remove'.
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/fs/btrfs/dev-replace.c?id=v4.19.3&id2=v4.15.1
And that's only a few hundred changes for each. Check out inode.c -
there are over 2000 changes.
> The disks are configured on top of bcache, in 5 arrays with a small
> 128GB SSD cache shared. The system in this configuration has worked
> perfectly for 3 years, until 2 weeks ago csum errors started
> appearing. I have a crashplan backup of all files on the disk, so I
> am not concerned about data loss, but I would like to avoid rebuilding
> the system.
btrfs-progs 4.17 still considers raid56 experimental, not for
production use. And three years ago, the current upstream kernel
released was 4.3, so I'm gonna guess the history of this file
system goes back further than that, very close to the birth of the
raid56 code. And
then adding bcache to this mix just makes it all the more complicated.
>
> btrfs dev stats shows
> [/dev/bcache0].write_io_errs 0
> [/dev/bcache0].read_io_errs 0
> [/dev/bcache0].flush_io_errs 0
> [/dev/bcache0].corruption_errs 0
> [/dev/bcache0].generation_errs 0
> [/dev/bcache1].write_io_errs 0
> [/dev/bcache1].read_io_errs 20
> [/dev/bcache1].flush_io_errs 0
> [/dev/bcache1].corruption_errs 0
> [/dev/bcache1].generation_errs 14
> [/dev/bcache3].write_io_errs 0
> [/dev/bcache3].read_io_errs 0
> [/dev/bcache3].flush_io_errs 0
> [/dev/bcache3].corruption_errs 0
> [/dev/bcache3].generation_errs 19
> [/dev/bcache2].write_io_errs 0
> [/dev/bcache2].read_io_errs 0
> [/dev/bcache2].flush_io_errs 0
> [/dev/bcache2].corruption_errs 0
> [/dev/bcache2].generation_errs 2
3 of 4 drives have at least one generation error. While there are no
corruptions reported, generation errors can be really tricky to
recover from at all. If only one device had only read errors, this
would be a lot less difficult.
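As an aside, nonzero counters are easy to pick out of 'btrfs dev stats' output mechanically; a small sketch, fed the counters pasted above so it can be tried without the btrfs tools:

```shell
# Flag every nonzero error counter in `btrfs dev stats`-style output.
# The sample input reproduces counters from the report above.
stats='[/dev/bcache0].write_io_errs 0
[/dev/bcache1].read_io_errs 20
[/dev/bcache1].generation_errs 14
[/dev/bcache3].generation_errs 19
[/dev/bcache2].generation_errs 2'

# awk compares $2 numerically, so only counters greater than 0 survive
nonzero=$(echo "$stats" | awk '$2 != 0 { print $1, $2 }')
echo "$nonzero"
```

On a live system the input would come from 'btrfs dev stats <mountpoint>' instead of the inline sample.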
> I've tried the latest kernel, and the latest tools, but nothing will
> allow me to replace, or delete the failed disk.
If the file system is mounted, I would try to make a local backup ASAP
before you lose the whole volume. Whether it's LVM pool of two drives
(linear/concat) with XFS, or if you go with Btrfs -dsingle -mraid1
(also basically a concat) doesn't really matter, but I'd get whatever
you can off the drive. I expect avoiding a rebuild in some form or
another is very wishful thinking and not very likely.
The more changes that are made to the file system, whether repair
attempts or other writes, the lower the chance of recovery.
--
Chris Murphy
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2018-11-23 7:52 ` Re: Chris Murphy
@ 2018-11-23 9:34 ` Andy Leadbetter
0 siblings, 0 replies; 1651+ messages in thread
From: Andy Leadbetter @ 2018-11-23 9:34 UTC (permalink / raw)
To: lists; +Cc: linux-btrfs
Will capture all of that this evening, and try it with the latest
kernel and tools. Thanks for the input on what info is relevant; will
gather it asap.
On Fri, 23 Nov 2018 at 07:53, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Thu, Nov 22, 2018 at 11:41 PM Andy Leadbetter
> <andy.leadbetter@theleadbetters.com> wrote:
> >
> > I have a failing 2TB disk that is part of a 4 disk RAID 6 system. I
> > have added a new 2TB disk to the computer, and started a BTRFS replace
> > for the old and new disk. The process starts correctly however some
> > hours into the job, there is an error and kernel oops. relevant log
> > below.
>
> The relevant log is the entire dmesg, not a snippet. It's decently
> likely there's more than one thing going on here. We also need full
> output of 'smartctl -x' for all four drives, and also 'smartctl -l
> scterc' for all four drives, and also 'cat
> /sys/block/sda/device/timeout' for all four drives. And which bcache
> mode you're using.
>
> The call trace provided is from kernel 4.15 which is sufficiently long
> ago I think any dev working on raid56 might want to see where it's
> getting tripped up on something a lot newer, and this is why:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/fs/btrfs/raid56.c?id=v4.19.3&id2=v4.15.1
>
> That's a lot of changes in just the raid56 code between 4.15 and 4.19.
> And then in you call trace, btrfs_dev_replace_start is found in
> dev-replace.c which likewise has a lot of changes. But then also, I
> think 4.15 might still be in the era where it was not recommended to
> use 'btrfs dev replace' for raid56, only non-raid56. I'm not sure if
> the problems with device replace were fixed, and if they were fixed
> kernel or progs side. Anyway, the latest I recall, it was recommended
> on raid56 to 'btrfs dev add' then 'btrfs dev remove'.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/diff/fs/btrfs/dev-replace.c?id=v4.19.3&id2=v4.15.1
>
> And that's only a few hundred changes for each. Check out inode.c -
> there are over 2000 changes.
>
>
> > The disks are configured on top of bcache, in 5 arrays with a small
> > 128GB SSD cache shared. The system in this configuration has worked
> > perfectly for 3 years, until 2 weeks ago csum errors started
> > appearing. I have a crashplan backup of all files on the disk, so I
> > am not concerned about data loss, but I would like to avoid rebuild
> > the system.
>
> btrfs-progs 4.17 still considers raid56 experimental, not for
> production use. And three years ago, the current upstream kernel
> released was 4.3 so I'm gonna guess the kernel history of this file
> system goes back older than that, very close to raid56 code birth. And
> then adding bcache to this mix just makes it all the more complicated.
>
>
>
> >
> > btrfs dev stats shows
> > [/dev/bcache0].write_io_errs 0
> > [/dev/bcache0].read_io_errs 0
> > [/dev/bcache0].flush_io_errs 0
> > [/dev/bcache0].corruption_errs 0
> > [/dev/bcache0].generation_errs 0
> > [/dev/bcache1].write_io_errs 0
> > [/dev/bcache1].read_io_errs 20
> > [/dev/bcache1].flush_io_errs 0
> > [/dev/bcache1].corruption_errs 0
> > [/dev/bcache1].generation_errs 14
> > [/dev/bcache3].write_io_errs 0
> > [/dev/bcache3].read_io_errs 0
> > [/dev/bcache3].flush_io_errs 0
> > [/dev/bcache3].corruption_errs 0
> > [/dev/bcache3].generation_errs 19
> > [/dev/bcache2].write_io_errs 0
> > [/dev/bcache2].read_io_errs 0
> > [/dev/bcache2].flush_io_errs 0
> > [/dev/bcache2].corruption_errs 0
> > [/dev/bcache2].generation_errs 2
>
>
> 3 of 4 drives have at least one generation error. While there are no
> corruptions reported, generation errors can be really tricky to
> recover from at all. If only one device had only read errors, this
> would be a lot less difficult.
>
>
> > I've tried the latest kernel, and the latest tools, but nothing will
> > allow me to replace, or delete the failed disk.
>
> If the file system is mounted, I would try to make a local backup ASAP
> before you lose the whole volume. Whether it's LVM pool of two drives
> (linear/concat) with XFS, or if you go with Btrfs -dsingle -mraid1
> (also basically a concat) doesn't really matter, but I'd get whatever
> you can off the drive. I expect avoiding a rebuild in some form or
> another is very wishful thinking and not very likely.
>
> The more changes are made to the file system, repair attempts or
> otherwise writing to it, decreases the chance of recovery.
>
> --
> Chris Murphy
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE,
@ 2018-11-24 14:03 Miss Sharifah Ahmad Mustahfa
0 siblings, 0 replies; 1651+ messages in thread
From: Miss Sharifah Ahmad Mustahfa @ 2018-11-24 14:03 UTC (permalink / raw)
To: Recipients
Hello,
First of all i will like to apologies for my manner of communication because you do not know me personally, its due to the fact that i have a very important proposal for you.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE,
@ 2018-11-24 14:16 Miss Sharifah Ahmad Mustahfa
0 siblings, 0 replies; 1651+ messages in thread
From: Miss Sharifah Ahmad Mustahfa @ 2018-11-24 14:16 UTC (permalink / raw)
To: Recipients
Hello,
First of all i will like to apologies for my manner of communication because you do not know me personally, its due to the fact that i have a very important proposal for you.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE,
@ 2018-11-24 14:16 Miss Sharifah Ahmad Mustahfa
0 siblings, 0 replies; 1651+ messages in thread
From: Miss Sharifah Ahmad Mustahfa @ 2018-11-24 14:16 UTC (permalink / raw)
To: Recipients
Hello,
First of all i will like to apologies for my manner of communication because you do not know me personally, its due to the fact that i have a very important proposal for you.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE,
@ 2018-11-24 14:19 Miss Sharifah Ahmad Mustahfa
0 siblings, 0 replies; 1651+ messages in thread
From: Miss Sharifah Ahmad Mustahfa @ 2018-11-24 14:19 UTC (permalink / raw)
To: Recipients
Hello,
First of all i will like to apologies for my manner of communication because you do not know me personally, its due to the fact that i have a very important proposal for you.
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20181130011234.32674-1-axboe@kernel.dk>]
* RE,
@ 2018-12-04 2:28 Ms Sharifah Ahmad Mustahfa
0 siblings, 0 replies; 1651+ messages in thread
From: Ms Sharifah Ahmad Mustahfa @ 2018-12-04 2:28 UTC (permalink / raw)
--
Hello,
First of all i will like to apologies for my manner of communication
because you do not know me personally, its due to the fact that i have a
very important proposal for you.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2018-12-21 15:22 kenneth johansson
2018-12-22 8:18 ` Richard Weinberger
0 siblings, 1 reply; 1651+ messages in thread
From: kenneth johansson @ 2018-12-21 15:22 UTC (permalink / raw)
To: linux-mtd
From 9815710fa078241c683de1b49d9a0c9631502e17 Mon Sep 17 00:00:00 2001
From: Kenneth Johansson <kenjo@kenjo.org>
Date: Fri, 21 Dec 2018 15:46:24 +0100
Subject: [PATCH] mtd: rawnand: nandsim: Add support to disable subpage writes.
This is needed if you try to use an already existing ubifs image that
was created for hardware that does not support subpage writes.
It is not enough to be able to select which NAND chip to emulate, as
subpage support might not exist in the actual NAND driver.
Signed-off-by: Kenneth Johansson <ken@kenjo.org>
---
drivers/mtd/nand/raw/nandsim.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/mtd/nand/raw/nandsim.c b/drivers/mtd/nand/raw/nandsim.c
index c452819..858ef30b 100644
--- a/drivers/mtd/nand/raw/nandsim.c
+++ b/drivers/mtd/nand/raw/nandsim.c
@@ -110,6 +110,7 @@ static unsigned int overridesize = 0;
static char *cache_file = NULL;
static unsigned int bbt;
static unsigned int bch;
+static unsigned int no_subpage;
static u_char id_bytes[8] = {
[0] = CONFIG_NANDSIM_FIRST_ID_BYTE,
[1] = CONFIG_NANDSIM_SECOND_ID_BYTE,
@@ -142,6 +143,7 @@ module_param(overridesize, uint, 0400);
module_param(cache_file, charp, 0400);
module_param(bbt, uint, 0400);
module_param(bch, uint, 0400);
+module_param(no_subpage, uint, 0400);
MODULE_PARM_DESC(id_bytes, "The ID bytes returned by NAND Flash 'read ID' command");
MODULE_PARM_DESC(first_id_byte, "The first byte returned by NAND Flash 'read ID' command (manufacturer ID) (obsolete)");
@@ -177,6 +179,7 @@ MODULE_PARM_DESC(cache_file, "File to use to cache nand pages instead of mem
MODULE_PARM_DESC(bbt, "0 OOB, 1 BBT with marker in OOB, 2 BBT with marker in data area");
MODULE_PARM_DESC(bch, "Enable BCH ecc and set how many bits should "
"be correctable in 512-byte blocks");
+MODULE_PARM_DESC(no_subpage, "Disable use of subpage write");
/* The largest possible page size */
#define NS_LARGEST_PAGE_SIZE 4096
@@ -2260,6 +2263,10 @@ static int __init ns_init_module(void)
/* and 'badblocks' parameters to work */
chip->options |= NAND_SKIP_BBTSCAN;
+ /* turn off subpage to be able to read ubifs images created for hardware without subpage support */
+ if (no_subpage)
+ chip->options |= NAND_NO_SUBPAGE_WRITE;
+
switch (bbt) {
case 2:
chip->bbt_options |= NAND_BBT_NO_OOB;
--
2.7.4
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2018-12-21 15:22 kenneth johansson
@ 2018-12-22 8:18 ` Richard Weinberger
0 siblings, 0 replies; 1651+ messages in thread
From: Richard Weinberger @ 2018-12-22 8:18 UTC (permalink / raw)
To: kenneth johansson; +Cc: linux-mtd
On Fri, Dec 21, 2018 at 4:24 PM kenneth johansson <kenjo@kenjo.org> wrote:
>
> From 9815710fa078241c683de1b49d9a0c9631502e17 Mon Sep 17 00:00:00 2001
> From: Kenneth Johansson <kenjo@kenjo.org>
> Date: Fri, 21 Dec 2018 15:46:24 +0100
> Subject: [PATCH] mtd: rawnand: nandsim: Add support to disable subpage writes.
> X-IMAPbase: 1545405463 2
> Status: O
> X-UID: 1
>
> This is needed if you try to use an already existing ubifs image that
> is created for hardware that do not support subpage write.
>
> It is not enough that you can select what nandchip to emulate as the
> subpage support might not exist in the actual nand driver.
Is this really needed? Usually using ubiattach's -O parameter does the trick.
--
Thanks,
//richard
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH] arch/arm/mm: Remove duplicate header
@ 2019-01-07 17:28 Souptick Joarder
2019-01-17 11:23 ` Souptick Joarder
0 siblings, 1 reply; 1651+ messages in thread
From: Souptick Joarder @ 2019-01-07 17:28 UTC (permalink / raw)
To: linux, mhocko, rppt, akpm
Cc: sabyasachi.linux, linux-kernel, linux-arm-kernel, brajeswar.linux
Remove duplicate headers which are included twice.
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
---
arch/arm/mm/mmu.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index f5cc1cc..dde3032 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -23,7 +23,6 @@
#include <asm/sections.h>
#include <asm/cachetype.h>
#include <asm/fixmap.h>
-#include <asm/sections.h>
#include <asm/setup.h>
#include <asm/smp_plat.h>
#include <asm/tlb.h>
@@ -36,7 +35,6 @@
#include <asm/mach/arch.h>
#include <asm/mach/map.h>
#include <asm/mach/pci.h>
-#include <asm/fixmap.h>
#include "fault.h"
#include "mm.h"
--
1.9.1
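Duplicate includes of this kind can be spotted mechanically; a rough sketch, run against an inlined sample rather than the real mmu.c:

```shell
# Print #include lines that occur more than once in a source file.
# The sample mimics the duplication removed by this patch.
cat > /tmp/dup_sample.c <<'EOF'
#include <asm/sections.h>
#include <asm/cachetype.h>
#include <asm/fixmap.h>
#include <asm/sections.h>
#include <asm/fixmap.h>
EOF

# sort groups identical lines; uniq -d prints each duplicated line once
dups=$(grep -E '^#include' /tmp/dup_sample.c | sort | uniq -d)
echo "$dups"
```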
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re: [PATCH] arch/arm/mm: Remove duplicate header
2019-01-07 17:28 [PATCH] arch/arm/mm: Remove duplicate header Souptick Joarder
@ 2019-01-17 11:23 ` Souptick Joarder
2019-01-17 11:28 ` Mike Rapoport
0 siblings, 1 reply; 1651+ messages in thread
From: Souptick Joarder @ 2019-01-17 11:23 UTC (permalink / raw)
To: Russell King - ARM Linux, Michal Hocko, rppt, Andrew Morton
Cc: Sabyasachi Gupta, linux-kernel, linux-arm-kernel, Brajeswar Ghosh
On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>
> Remove duplicate headers which are included twice.
>
> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Any comments on this patch?
> ---
> arch/arm/mm/mmu.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> index f5cc1cc..dde3032 100644
> --- a/arch/arm/mm/mmu.c
> +++ b/arch/arm/mm/mmu.c
> @@ -23,7 +23,6 @@
> #include <asm/sections.h>
> #include <asm/cachetype.h>
> #include <asm/fixmap.h>
> -#include <asm/sections.h>
> #include <asm/setup.h>
> #include <asm/smp_plat.h>
> #include <asm/tlb.h>
> @@ -36,7 +35,6 @@
> #include <asm/mach/arch.h>
> #include <asm/mach/map.h>
> #include <asm/mach/pci.h>
> -#include <asm/fixmap.h>
>
> #include "fault.h"
> #include "mm.h"
> --
> 1.9.1
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] arch/arm/mm: Remove duplicate header
2019-01-17 11:23 ` Souptick Joarder
@ 2019-01-17 11:28 ` Mike Rapoport
2019-01-31 5:54 ` Souptick Joarder
0 siblings, 1 reply; 1651+ messages in thread
From: Mike Rapoport @ 2019-01-17 11:28 UTC (permalink / raw)
To: Souptick Joarder
Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton,
linux-arm-kernel
On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> >
> > Remove duplicate headers which are included twice.
> >
> > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> Any comment on this patch ?
>
> > ---
> > arch/arm/mm/mmu.c | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > index f5cc1cc..dde3032 100644
> > --- a/arch/arm/mm/mmu.c
> > +++ b/arch/arm/mm/mmu.c
> > @@ -23,7 +23,6 @@
> > #include <asm/sections.h>
> > #include <asm/cachetype.h>
> > #include <asm/fixmap.h>
> > -#include <asm/sections.h>
> > #include <asm/setup.h>
> > #include <asm/smp_plat.h>
> > #include <asm/tlb.h>
> > @@ -36,7 +35,6 @@
> > #include <asm/mach/arch.h>
> > #include <asm/mach/map.h>
> > #include <asm/mach/pci.h>
> > -#include <asm/fixmap.h>
> >
> > #include "fault.h"
> > #include "mm.h"
> > --
> > 1.9.1
> >
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] arch/arm/mm: Remove duplicate header
2019-01-17 11:28 ` Mike Rapoport
@ 2019-01-31 5:54 ` Souptick Joarder
2019-01-31 12:58 ` Vladimir Murzin
0 siblings, 1 reply; 1651+ messages in thread
From: Souptick Joarder @ 2019-01-31 5:54 UTC (permalink / raw)
To: Mike Rapoport
Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton,
linux-arm-kernel
On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>
> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> > On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> > >
> > > Remove duplicate headers which are included twice.
> > >
> > > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
>
> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
>
> > Any comment on this patch ?
If there are no further comments, can we get this patch queued for 5.1?
> >
> > > ---
> > > arch/arm/mm/mmu.c | 2 --
> > > 1 file changed, 2 deletions(-)
> > >
> > > diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
> > > index f5cc1cc..dde3032 100644
> > > --- a/arch/arm/mm/mmu.c
> > > +++ b/arch/arm/mm/mmu.c
> > > @@ -23,7 +23,6 @@
> > > #include <asm/sections.h>
> > > #include <asm/cachetype.h>
> > > #include <asm/fixmap.h>
> > > -#include <asm/sections.h>
> > > #include <asm/setup.h>
> > > #include <asm/smp_plat.h>
> > > #include <asm/tlb.h>
> > > @@ -36,7 +35,6 @@
> > > #include <asm/mach/arch.h>
> > > #include <asm/mach/map.h>
> > > #include <asm/mach/pci.h>
> > > -#include <asm/fixmap.h>
> > >
> > > #include "fault.h"
> > > #include "mm.h"
> > > --
> > > 1.9.1
> > >
> >
>
> --
> Sincerely yours,
> Mike.
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-01-31 5:54 ` Souptick Joarder
@ 2019-01-31 12:58 ` Vladimir Murzin
2019-02-01 12:32 ` Re: Souptick Joarder
0 siblings, 1 reply; 1651+ messages in thread
From: Vladimir Murzin @ 2019-01-31 12:58 UTC (permalink / raw)
To: Souptick Joarder, Mike Rapoport
Cc: Michal Hocko, Sabyasachi Gupta, linux-kernel,
Russell King - ARM Linux, rppt, Brajeswar Ghosh, Andrew Morton,
linux-arm-kernel
Hi Souptick,
On 1/31/19 5:54 AM, Souptick Joarder wrote:
> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>>
>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>>>>
>>>> Remove duplicate headers which are included twice.
>>>>
>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
>>
>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
>>
>>> Any comment on this patch ?
>
> If no further comment, can we get this patch in queue for 5.1 ?
It'd be nice to use proper tags in the subject
line. I'd suggest
[PATCH] ARM: mm: Remove duplicate header
but you can get some inspiration from
git log --oneline --no-merges arch/arm/mm/
In case you want to route it via ARM tree you need to drop it into
Russell's patch system [1].
[1] https://www.armlinux.org.uk/developer/patches/
Cheers
Vladimir
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-01-31 12:58 ` Vladimir Murzin
@ 2019-02-01 12:32 ` Souptick Joarder
2019-02-01 12:36 ` Re: Vladimir Murzin
0 siblings, 1 reply; 1651+ messages in thread
From: Souptick Joarder @ 2019-02-01 12:32 UTC (permalink / raw)
To: Vladimir Murzin
Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton, Mike Rapoport,
linux-arm-kernel
On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>
> Hi Souptick,
>
> On 1/31/19 5:54 AM, Souptick Joarder wrote:
> > On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> >>
> >> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> >>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> >>>>
> >>>> Remove duplicate headers which are included twice.
> >>>>
> >>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> >>
> >> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> >>
> >>> Any comment on this patch ?
> >
> > If no further comment, can we get this patch in queue for 5.1 ?
>
> I'd be nice to use proper tags in subject
> line. I'd suggest
>
> [PATCH] ARM: mm: Remove duplicate header
>
> but you can get some inspiration form
>
> git log --oneline --no-merges arch/arm/mm/
>
> In case you want to route it via ARM tree you need to drop it into
> Russell's patch system [1].
How do I drop it into Russell's patch system other than by posting it
to the mailing list? I don't know.
>
> [1] https://www.armlinux.org.uk/developer/patches/
>
> Cheers
> Vladimir
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-02-01 12:32 ` Re: Souptick Joarder
@ 2019-02-01 12:36 ` Vladimir Murzin
2019-02-01 12:41 ` Re: Souptick Joarder
0 siblings, 1 reply; 1651+ messages in thread
From: Vladimir Murzin @ 2019-02-01 12:36 UTC (permalink / raw)
To: Souptick Joarder
Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton, Mike Rapoport,
linux-arm-kernel
On 2/1/19 12:32 PM, Souptick Joarder wrote:
> On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>>
>> Hi Souptick,
>>
>> On 1/31/19 5:54 AM, Souptick Joarder wrote:
>>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>>>>
>>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
>>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>>>>>>
>>>>>> Remove duplicate headers which are included twice.
>>>>>>
>>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
>>>>
>>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
>>>>
>>>>> Any comment on this patch ?
>>>
>>> If no further comment, can we get this patch in queue for 5.1 ?
>>
>> I'd be nice to use proper tags in subject
>> line. I'd suggest
>>
>> [PATCH] ARM: mm: Remove duplicate header
>>
>> but you can get some inspiration form
>>
>> git log --oneline --no-merges arch/arm/mm/
>>
>> In case you want to route it via ARM tree you need to drop it into
>> Russell's patch system [1].
>
> How to drop it to Russell's patch system other than posting it to
> mailing list ? I don't know.
https://www.armlinux.org.uk/developer/patches/info.php
Vladimir
>>
>> [1] https://www.armlinux.org.uk/developer/patches/
>>
>> Cheers
>> Vladimir
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-02-01 12:36 ` Re: Vladimir Murzin
@ 2019-02-01 12:41 ` Souptick Joarder
2019-02-01 13:02 ` Re: Vladimir Murzin
2019-02-01 15:15 ` Re: Russell King - ARM Linux admin
0 siblings, 2 replies; 1651+ messages in thread
From: Souptick Joarder @ 2019-02-01 12:41 UTC (permalink / raw)
To: Vladimir Murzin
Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton, Mike Rapoport,
linux-arm-kernel
On Fri, Feb 1, 2019 at 6:06 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>
> On 2/1/19 12:32 PM, Souptick Joarder wrote:
> > On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> >>
> >> Hi Souptick,
> >>
> >> On 1/31/19 5:54 AM, Souptick Joarder wrote:
> >>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> >>>>
> >>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> >>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> >>>>>>
> >>>>>> Remove duplicate headers which are included twice.
> >>>>>>
> >>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> >>>>
> >>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> >>>>
> >>>>> Any comment on this patch ?
> >>>
> >>> If no further comment, can we get this patch in queue for 5.1 ?
> >>
> >> I'd be nice to use proper tags in subject
> >> line. I'd suggest
> >>
> >> [PATCH] ARM: mm: Remove duplicate header
> >>
> >> but you can get some inspiration form
> >>
> >> git log --oneline --no-merges arch/arm/mm/
> >>
> >> In case you want to route it via ARM tree you need to drop it into
> >> Russell's patch system [1].
> >
> > How to drop it to Russell's patch system other than posting it to
> > mailing list ? I don't know.
>
> https://www.armlinux.org.uk/developer/patches/info.php
This link is not reachable.
>
> Vladimir
>
> >>
> >> [1] https://www.armlinux.org.uk/developer/patches/
> >>
> >> Cheers
> >> Vladimir
> >
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-02-01 12:41 ` Re: Souptick Joarder
@ 2019-02-01 13:02 ` Vladimir Murzin
2019-02-01 15:15 ` Re: Russell King - ARM Linux admin
1 sibling, 0 replies; 1651+ messages in thread
From: Vladimir Murzin @ 2019-02-01 13:02 UTC (permalink / raw)
To: Souptick Joarder
Cc: Michal Hocko, Sabyasachi Gupta, Russell King - ARM Linux,
linux-kernel, rppt, Brajeswar Ghosh, Andrew Morton, Mike Rapoport,
linux-arm-kernel
On 2/1/19 12:41 PM, Souptick Joarder wrote:
> On Fri, Feb 1, 2019 at 6:06 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>>
>> On 2/1/19 12:32 PM, Souptick Joarder wrote:
>>> On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
>>>>
>>>> Hi Souptick,
>>>>
>>>> On 1/31/19 5:54 AM, Souptick Joarder wrote:
>>>>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
>>>>>>
>>>>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
>>>>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Remove duplicate headers which are included twice.
>>>>>>>>
>>>>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
>>>>>>
>>>>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
>>>>>>
>>>>>>> Any comment on this patch ?
>>>>>
>>>>> If no further comment, can we get this patch in queue for 5.1 ?
>>>>
>>>> I'd be nice to use proper tags in subject
>>>> line. I'd suggest
>>>>
>>>> [PATCH] ARM: mm: Remove duplicate header
>>>>
>>>> but you can get some inspiration form
>>>>
>>>> git log --oneline --no-merges arch/arm/mm/
>>>>
>>>> In case you want to route it via ARM tree you need to drop it into
>>>> Russell's patch system [1].
>>>
>>> How to drop it to Russell's patch system other than posting it to
>>> mailing list ? I don't know.
>>
>> https://www.armlinux.org.uk/developer/patches/info.php
>
> This link is not reachable.
>
Bad luck :(
Vladimir
>>
>> Vladimir
>>
>>>>
>>>> [1] https://www.armlinux.org.uk/developer/patches/
>>>>
>>>> Cheers
>>>> Vladimir
>>>
>>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-02-01 12:41 ` Re: Souptick Joarder
2019-02-01 13:02 ` Re: Vladimir Murzin
@ 2019-02-01 15:15 ` Russell King - ARM Linux admin
2019-02-01 15:22 ` Re: Russell King - ARM Linux admin
1 sibling, 1 reply; 1651+ messages in thread
From: Russell King - ARM Linux admin @ 2019-02-01 15:15 UTC (permalink / raw)
To: Souptick Joarder
Cc: Vladimir Murzin, Sabyasachi Gupta, Michal Hocko, linux-kernel,
Mike Rapoport, rppt, Brajeswar Ghosh, Andrew Morton,
linux-arm-kernel
On Fri, Feb 01, 2019 at 06:11:21PM +0530, Souptick Joarder wrote:
> On Fri, Feb 1, 2019 at 6:06 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> >
> > On 2/1/19 12:32 PM, Souptick Joarder wrote:
> > > On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> > >>
> > >> Hi Souptick,
> > >>
> > >> On 1/31/19 5:54 AM, Souptick Joarder wrote:
> > >>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > >>>>
> > >>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> > >>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> > >>>>>>
> > >>>>>> Remove duplicate headers which are included twice.
> > >>>>>>
> > >>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> > >>>>
> > >>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> > >>>>
> > >>>>> Any comment on this patch ?
> > >>>
> > >>> If no further comment, can we get this patch in queue for 5.1 ?
> > >>
> > >> I'd be nice to use proper tags in subject
> > >> line. I'd suggest
> > >>
> > >> [PATCH] ARM: mm: Remove duplicate header
> > >>
> > >> but you can get some inspiration from
> > >>
> > >> git log --oneline --no-merges arch/arm/mm/
> > >>
> > >> In case you want to route it via ARM tree you need to drop it into
> > >> Russell's patch system [1].
> > >
> > > How do I drop it into Russell's patch system, other than posting it to the
> > > mailing list? I don't know.
> >
> > https://www.armlinux.org.uk/developer/patches/info.php
>
> This link is not reachable.
In what way? The site is certainly getting hits over ipv4 and ipv6.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-02-01 15:15 ` Re: Russell King - ARM Linux admin
@ 2019-02-01 15:22 ` Russell King - ARM Linux admin
0 siblings, 0 replies; 1651+ messages in thread
From: Russell King - ARM Linux admin @ 2019-02-01 15:22 UTC (permalink / raw)
To: Souptick Joarder
Cc: Vladimir Murzin, Sabyasachi Gupta, Michal Hocko, linux-kernel,
Mike Rapoport, rppt, Brajeswar Ghosh, Andrew Morton,
linux-arm-kernel
On Fri, Feb 01, 2019 at 03:15:11PM +0000, Russell King - ARM Linux admin wrote:
> On Fri, Feb 01, 2019 at 06:11:21PM +0530, Souptick Joarder wrote:
> > On Fri, Feb 1, 2019 at 6:06 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> > >
> > > On 2/1/19 12:32 PM, Souptick Joarder wrote:
> > > > On Thu, Jan 31, 2019 at 6:28 PM Vladimir Murzin <vladimir.murzin@arm.com> wrote:
> > > >>
> > > >> Hi Souptick,
> > > >>
> > > >> On 1/31/19 5:54 AM, Souptick Joarder wrote:
> > > >>> On Thu, Jan 17, 2019 at 4:58 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> > > >>>>
> > > >>>> On Thu, Jan 17, 2019 at 04:53:44PM +0530, Souptick Joarder wrote:
> > > >>>>> On Mon, Jan 7, 2019 at 10:54 PM Souptick Joarder <jrdr.linux@gmail.com> wrote:
> > > >>>>>>
> > > >>>>>> Remove duplicate headers which are included twice.
> > > >>>>>>
> > > >>>>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> > > >>>>
> > > >>>> Acked-by: Mike Rapoport <rppt@linux.ibm.com>
> > > >>>>
> > > >>>>> Any comment on this patch ?
> > > >>>
> > > >>> If no further comment, can we get this patch in queue for 5.1 ?
> > > >>
> > > >> It'd be nice to use proper tags in the subject
> > > >> line. I'd suggest
> > > >>
> > > >> [PATCH] ARM: mm: Remove duplicate header
> > > >>
> > > >> but you can get some inspiration from
> > > >>
> > > >> git log --oneline --no-merges arch/arm/mm/
> > > >>
> > > >> In case you want to route it via ARM tree you need to drop it into
> > > >> Russell's patch system [1].
> > > >
> > > > How do I drop it into Russell's patch system, other than posting it to the
> > > > mailing list? I don't know.
> > >
> > > https://www.armlinux.org.uk/developer/patches/info.php
> >
> > This link is not reachable.
>
> In what way? The site is certainly getting hits over ipv4 and ipv6.
Ah, I see - the site is accessible over IPv6 using port 80 only, but
port 443 is blocked. Problem is, I can't test IPv6 from "outside",
so I rely on people *reporting* when things stop working.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2019-01-23 10:50 Christopher Hagler
2019-01-23 14:16 ` Cody Kratzer
0 siblings, 1 reply; 1651+ messages in thread
From: Christopher Hagler @ 2019-01-23 10:50 UTC (permalink / raw)
To: git
Unsubscribe git
Sent from my iPhone
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-01-23 10:50 Christopher Hagler
@ 2019-01-23 14:16 ` Cody Kratzer
2019-01-23 14:25 ` Re: Thomas Braun
` (2 more replies)
0 siblings, 3 replies; 1651+ messages in thread
From: Cody Kratzer @ 2019-01-23 14:16 UTC (permalink / raw)
To: Christopher Hagler; +Cc: git
I've sent this same email 3 times. I don't think it works. I'm
researching this morning how to unsubscribe from this git group.
CODY KRATZER WEB DEVELOPMENT MANAGER
866-344-3875 x145
CODY@LIGHTINGNEWYORK.COM
M - F 9 - 5:30
On Wed, Jan 23, 2019 at 5:51 AM Christopher Hagler
<haglerchristopher@gmail.com> wrote:
>
> Unsubscribe git
>
> Sent from my iPhone
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-01-23 14:16 ` Cody Kratzer
@ 2019-01-23 14:25 ` Thomas Braun
2019-01-23 16:00 ` Re: Christopher Hagler
2019-01-24 17:11 ` Johannes Schindelin
2 siblings, 0 replies; 1651+ messages in thread
From: Thomas Braun @ 2019-01-23 14:25 UTC (permalink / raw)
To: Cody Kratzer; +Cc: GIT Mailing-list
Am 23.01.2019 um 15:16 schrieb Cody Kratzer:
> I've sent this same email 3 times. I don't think it works. I'm
> researching this morning how to unsubscribe from this git group.
Hi Cody,
https://git-scm.com/community says to subscribe you should send an email
with body content
subscribe git
to
majordomo@vger.kernel.org
so maybe
sending
unsubscribe git
to *that* address does unsubscribe you.
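The procedure Thomas describes can be sketched as a short shell snippet. This is a hedged illustration only: the addresses are the ones quoted in the thread, majordomo reads commands from the message *body*, and the commented-out send step assumes a configured local MTA/`mail` command.

```shell
# Compose the command majordomo expects in the message body;
# the mail goes to majordomo@vger.kernel.org, NOT to the list address itself.
list=git
body="unsubscribe $list"
echo "$body"
# To actually send it (assumes a working sendmail/mail setup on this host):
#   printf '%s\n' "$body" | mail majordomo@vger.kernel.org
```

Subscribing works the same way with `subscribe git` in the body, per https://git-scm.com/community.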
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-01-23 14:16 ` Cody Kratzer
2019-01-23 14:25 ` Re: Thomas Braun
@ 2019-01-23 16:00 ` Christopher Hagler
2019-01-23 16:35 ` Randall S. Becker
2019-01-24 17:11 ` Johannes Schindelin
2 siblings, 1 reply; 1651+ messages in thread
From: Christopher Hagler @ 2019-01-23 16:00 UTC (permalink / raw)
To: Cody Kratzer; +Cc: git
Send the email to this address
Majordomo@vger.kernel.org and it will work
Sent from my iPhone
> On Jan 23, 2019, at 8:16 AM, Cody Kratzer <cody@lightingnewyork.com> wrote:
>
> I've sent this same email 3 times. I don't think it works. I'm
> researching this morning how to unsubscribe from this git group.
>
> CODY KRATZER WEB DEVELOPMENT MANAGER
> 866-344-3875 x145
> CODY@LIGHTINGNEWYORK.COM
> M - F 9 - 5:30
>
>
> On Wed, Jan 23, 2019 at 5:51 AM Christopher Hagler
> <haglerchristopher@gmail.com> wrote:
>>
>> Unsubscribe git
>>
>> Sent from my iPhone
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
2019-01-23 16:00 ` Re: Christopher Hagler
@ 2019-01-23 16:35 ` Randall S. Becker
0 siblings, 0 replies; 1651+ messages in thread
From: Randall S. Becker @ 2019-01-23 16:35 UTC (permalink / raw)
To: 'Christopher Hagler', 'Cody Kratzer'; +Cc: git
On January 23, 2019 11:00, Christopher Hagler wrote:
> Send the email to this address
> Majordomo@vger.kernel.org and it will work
> > On Jan 23, 2019, at 8:16 AM, Cody Kratzer <cody@lightingnewyork.com>
> > I've sent this same email 3 times. I don't think it works. I'm
> > researching this morning how to unsubscribe from this git group.
> >
> > CODY KRATZER WEB DEVELOPMENT MANAGER
> > 866-344-3875 x145
> > CODY@LIGHTINGNEWYORK.COM
> > M - F 9 - 5:30
> > On Wed, Jan 23, 2019 at 5:51 AM Christopher Hagler
> > <haglerchristopher@gmail.com> wrote:
> >>
> >> Unsubscribe git
Reference information to the mailing lists are available here:
http://vger.kernel.org/vger-lists.html#git
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-01-23 14:16 ` Cody Kratzer
2019-01-23 14:25 ` Re: Thomas Braun
2019-01-23 16:00 ` Re: Christopher Hagler
@ 2019-01-24 17:11 ` Johannes Schindelin
2 siblings, 0 replies; 1651+ messages in thread
From: Johannes Schindelin @ 2019-01-24 17:11 UTC (permalink / raw)
To: Cody Kratzer; +Cc: Christopher Hagler, git
Hi,
don't send this to git@vger.kernel.org. Send it to
majordomo@vger.kernel.org instead.
Thanks,
Johannes
On Wed, 23 Jan 2019, Cody Kratzer wrote:
> I've sent this same email 3 times. I don't think it works. I'm
> researching this morning how to unsubscribe from this git group.
>
> CODY KRATZER WEB DEVELOPMENT MANAGER
> 866-344-3875 x145
> CODY@LIGHTINGNEWYORK.COM
> M - F 9 - 5:30
>
>
> On Wed, Jan 23, 2019 at 5:51 AM Christopher Hagler
> <haglerchristopher@gmail.com> wrote:
> >
> > Unsubscribe git
> >
> > Sent from my iPhone
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2019-02-16 0:08 Graham Loan Firm
0 siblings, 0 replies; 1651+ messages in thread
From: Graham Loan Firm @ 2019-02-16 0:08 UTC (permalink / raw)
* Do you need some kind of loan?
Do you urgently need credit to consolidate your debts?
Do you need a car, a business, or a loan to pay bills?
Contact us today and get the loan you need. *
* Contact email: grahamloanfirm01@gmail.com
BORROWER DETAILS
Name:
Address:
Phone number:
Amount needed:
Duration:
Age:
Sex: *
* Email: grahamloanfirm01@gmail.com
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re;
@ 2019-02-16 4:17 Richard Wahl
0 siblings, 0 replies; 1651+ messages in thread
From: Richard Wahl @ 2019-02-16 4:17 UTC (permalink / raw)
To: sparclinux
I am Richard Wahl $1,900,000.00USD Has Been Granted To You As A Donation, Kindly reply for claim.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2019-02-18 23:41 Pablo Mancilla
0 siblings, 0 replies; 1651+ messages in thread
From: Pablo Mancilla @ 2019-02-18 23:41 UTC (permalink / raw)
Good day,
I sent you an email about charity works and I
don't know if you got it. Please reach me for updates or let me know if I
should resend it.
Pablo Mancilla
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2019-02-19 2:20 Pablo Mancilla
0 siblings, 0 replies; 1651+ messages in thread
From: Pablo Mancilla @ 2019-02-19 2:20 UTC (permalink / raw)
To: sparclinux
Good day,
I sent you an email about charity works and I
don't know if you got it. Please reach me for updates or let me know if I
should resend it.
Pablo Mancilla
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [GSoC][PATCH v2 3/3] t3600: use helpers to replace test -d/f/e/s <path>
@ 2019-03-05 14:57 Eric Sunshine
2019-03-05 23:38 ` Rohit Ashiwal
0 siblings, 1 reply; 1651+ messages in thread
From: Eric Sunshine @ 2019-03-05 14:57 UTC (permalink / raw)
To: Rohit Ashiwal
Cc: Johannes Schindelin, Christian Couder, Git List, Junio C Hamano,
Thomas Gummerer
On Tue, Mar 5, 2019 at 9:22 AM Rohit Ashiwal <rohit.ashiwal265@gmail.com> wrote:
> I was asking if this patch is good enough to be added to the
> existing code? Does this patch look good?
I didn't review the patch with a critical-enough eye to be able to say
that every change maintains fidelity with the original code. As
mentioned in [1]:
[...] an important reason for limiting the scope of this change
[...] is to ease the burden on people who review your submission.
Large patch series tend to tax reviewers heavily, even (and often)
when repetitive and simple, like replacing `test -d` with
`test_path_is_dir()`. The shorter and more concise a patch series
is, the more likely that it will receive quality reviews.
This patch, due to its length and repetitive nature, falls under the
category of being tedious to review, which makes it all the more
likely that a reviewer will overlook a problem.
And, it's not always obvious at a glance that a change is correct. For
instance, taking a look at the final patch band:
- ! test -d submod &&
- ! test -d submod/subsubmod/.git &&
+ test_path_is_missing submod &&
+ test_path_is_missing submod/subsubmod/.git &&
Superficially, the transformation seems straightforward. However, that
doesn't mean it maintains fidelity with the original or even means the
same thing. To review this change properly requires understanding the
original intent of "! test -d".
The meaning of that expression can vary depending upon the context. Is
it checking that that path is not a directory (but it is okay if a
plain file exists there)? Or does it merely care about existence
(neither a directory nor any other type of entry)? If the latter, then
the transformation is probably correct, however, if the former, then
it likely isn't correct. So, understanding the overall context of the
test is important for judging if a particular change is correct, and
many (volunteer) reviewers simply don't have the time to delve that
deeply to make a proper judgment.
[1]: https://public-inbox.org/git/CAPig+cSZZaCT0G3hysmjn_tNvZmYGp=5cXpZHkdphbWXnONSVQ@mail.gmail.com/
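The semantic gap Eric describes can be demonstrated directly in the shell. This is an illustrative sketch (the `$tmp` sandbox and the result variables are mine, not from the thread); `test_path_is_missing` in git's test suite is roughly `! test -e` plus diagnostics:

```shell
# Create a path that is a plain file, not a directory.
tmp=$(mktemp -d)
touch "$tmp/submod"

# '! test -d' passes: the path is not a directory...
not_dir=fail; ! test -d "$tmp/submod" && not_dir=pass

# ...but a missing-path check ('! test -e') fails: *something* exists there.
missing=fail; ! test -e "$tmp/submod" && missing=pass

echo "not-a-directory check: $not_dir"
echo "path-is-missing check: $missing"
rm -rf "$tmp"
```

So the two forms agree only when nothing at all exists at the path, which is exactly why the reviewer must recover the original intent of each `! test -d` before approving the transformation.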
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-05 14:57 [GSoC][PATCH v2 3/3] t3600: use helpers to replace test -d/f/e/s <path> Eric Sunshine
@ 2019-03-05 23:38 ` Rohit Ashiwal
0 siblings, 0 replies; 1651+ messages in thread
From: Rohit Ashiwal @ 2019-03-05 23:38 UTC (permalink / raw)
To: sunshine
Cc: Johannes.Schindelin, christian.couder, git, gitster,
rohit.ashiwal265, t.gummerer
Hey Eric
On Tue, 5 Mar 2019 09:57:40 -0500 Eric Sunshine <sunshine@sunshineco.com> wrote:
> This patch, due to its length and repetitive nature, falls under the
> category of being tedious to review, which makes it all the more
> likely that a reviewer will overlook a problem.
Yes, I clearly understand that this patch has become too big to review.
It will require time to carefully review and reviewers are doing their
best to maintain the utmost quality of code.
> And, it's not always obvious at a glance that a change is correct. For
> instance, taking a look at the final patch band:
>
> - ! test -d submod &&
> - ! test -d submod/subsubmod/.git &&
> + test_path_is_missing submod &&
> + test_path_is_missing submod/subsubmod/.git &&
Duy actually confirmed that this transformation is correct in this[1] email.
(I know it was given only as an example, but I'll leave the link anyway.)
Thanks
Rohit
[1]: https://public-inbox.org/git/CACsJy8BYeLvB7BSM_Jt4vwfGsEBuhaCZfzGPOHe=B=7cvnRwrg@mail.gmail.com/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* No subject
@ 2019-03-19 14:41 Maxim Levitsky
2019-03-20 11:03 ` Felipe Franciosi
0 siblings, 1 reply; 1651+ messages in thread
From: Maxim Levitsky @ 2019-03-19 14:41 UTC (permalink / raw)
Date: Tue, 19 Mar 2019 14:45:45 +0200
Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
Hi everyone!
In this patch series, I would like to introduce my take on the problem of
virtualizing storage as fast as possible, with an emphasis on low latency.
In this patch series I implemented a kernel, VFIO-based mediated device that
allows the user to pass a partition and/or a whole namespace through to a guest.
The idea behind this driver is based on the paper you can find at
https://www.usenix.org/conference/atc18/presentation/peng,
although note that I started the development independently, prior to reading
this paper.
In addition, the implementation is not based on the code used in the paper, as
its source was not available to me at that time.
***Key points about the implementation:***
* A polling kernel thread is used. The polling is stopped after a
predefined timeout (1/2 sec by default).
Support for a fully interrupt-driven mode is planned, and it shows promising results.
* The guest sees a standard NVMe device - this allows running guests with
unmodified drivers, for example Windows guests.
* The NVMe device is shared between host and guest.
That means that even a single namespace can be split between host
and guest based on different partitions.
* Simple configuration
*** Performance ***
Performance was tested on an Intel DC P3700 with a Xeon E5-2620 v2,
and both latency and throughput are very similar to SPDK.
Soon I will test this on a better server and NVMe device and provide
more formal performance numbers.
Latency numbers:
~80µs - SPDK with the fio plugin on the host.
~84µs - nvme driver on the host
~87µs - mdev-nvme + nvme driver in the guest
Throughput followed a similar pattern as well.
* Configuration example
$ modprobe nvme mdev_queues=4
$ modprobe nvme-mdev
$ UUID=$(uuidgen)
$ DEVICE='device pci address'
$ echo $UUID > /sys/bus/pci/devices/$DEVICE/mdev_supported_types/nvme-2Q_V1/create
$ echo n1p3 > /sys/bus/mdev/devices/$UUID/namespaces/add_namespace #attach host namespace 1, partition 3
$ echo 11 > /sys/bus/mdev/devices/$UUID/settings/iothread_cpu #pin the io thread to cpu 11
Afterward boot qemu with
-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$UUID
Zero configuration on the guest.
*** FAQ ***
* Why do this in the kernel? Why is this better than SPDK?
-> Reuse the existing nvme kernel driver in the host. No new drivers in the guest.
-> Share the NVMe device between host and guest.
Even in fully virtualized configurations,
some partitions of nvme device could be used by guests as block devices
while others passed through with nvme-mdev to achieve balance between
all features of full IO stack emulation and performance.
-> NVME-MDEV is a bit faster because the in-kernel driver
can send interrupts to the guest directly, without a context
switch that can be expensive due to Meltdown mitigations.
-> It is able to utilize interrupts to get reasonable performance. This is
only implemented as a proof of concept and is not included in the patches,
but the interrupt-driven mode already shows reasonable results.
-> This is a framework that later can be used to support NVMe devices
with more of the IO virtualization built-in
(IOMMU with PASID support coupled with device that supports it)
* Why attach directly to the nvme-pci driver and not use block-layer IO?
-> The direct attachment allows for better performance, but I will
check the possibility of using block IO, especially for fabrics drivers.
*** Implementation notes ***
* All guest memory is mapped into the physical nvme device,
but not 1:1 as vfio-pci would do.
This allows very efficient DMA.
To support this, patch 2 adds ability for a mdev device to listen on
guest's memory map events.
Any such memory is immediately pinned and then DMA mapped.
(Support for fabric drivers, where this is not possible, exists too;
in that case the fabric driver will do its own DMA mapping.)
* nvme core driver is modified to announce the appearance
and disappearance of nvme controllers and namespaces,
to which the nvme-mdev driver is subscribed.
* nvme-pci driver is modified to expose raw interface of attaching to
and sending/polling the IO queues.
This allows the mdev driver very efficiently to submit/poll for the IO.
By default one host queue is used per each mediated device.
(support for other fabric based host drivers is planned)
* The nvme-mdev driver doesn't assume the presence of KVM, thus any VFIO user, including
SPDK, a qemu running with tcg, ... can use this virtual device.
*** Testing ***
The device was tested with stock QEMU 3.0 on the host,
with host was using 5.0 kernel with nvme-mdev added and the following hardware:
* QEMU nvme virtual device (with nested guest)
* Intel DC P3700 on Xeon E5-2620 v2 server
* Samsung SM981 (in a Thunderbolt enclosure, with my laptop)
* Lenovo NVME device found in my laptop
The guest was tested with kernels 4.16, 4.18, 4.20 and
the same custom-compiled 5.0 kernel.
Windows 10 guest was tested too with both Microsoft's inbox driver and
open source community NVME driver
(https://lists.openfabrics.org/pipermail/nvmewin/2016-December/001420.html)
Testing was mostly done on x86_64, but a 32-bit host/guest combination
was lightly tested too.
In addition to that, the virtual device was tested with a nested guest,
by passing the virtual device to it
using PCI passthrough, the qemu userspace nvme driver, and SPDK.
PS: I used to contribute to the kernel as a hobby using the
maximlevitsky at gmail.com address
Maxim Levitsky (9):
vfio/mdev: add .request callback
nvme/core: add some more values from the spec
nvme/core: add NVME_CTRL_SUSPENDED controller state
nvme/pci: use the NVME_CTRL_SUSPENDED state
nvme/pci: add known admin effects to augument admin effects log page
nvme/pci: init shadow doorbell after each reset
nvme/core: add mdev interfaces
nvme/core: add nvme-mdev core driver
nvme/pci: implement the mdev external queue allocation interface
MAINTAINERS | 5 +
drivers/nvme/Kconfig | 1 +
drivers/nvme/Makefile | 1 +
drivers/nvme/host/core.c | 149 +++++-
drivers/nvme/host/nvme.h | 55 ++-
drivers/nvme/host/pci.c | 385 ++++++++++++++-
drivers/nvme/mdev/Kconfig | 16 +
drivers/nvme/mdev/Makefile | 5 +
drivers/nvme/mdev/adm.c | 873 ++++++++++++++++++++++++++++++++++
drivers/nvme/mdev/events.c | 142 ++++++
drivers/nvme/mdev/host.c | 491 +++++++++++++++++++
drivers/nvme/mdev/instance.c | 802 +++++++++++++++++++++++++++++++
drivers/nvme/mdev/io.c | 563 ++++++++++++++++++++++
drivers/nvme/mdev/irq.c | 264 ++++++++++
drivers/nvme/mdev/mdev.h | 56 +++
drivers/nvme/mdev/mmio.c | 591 +++++++++++++++++++++++
drivers/nvme/mdev/pci.c | 247 ++++++++++
drivers/nvme/mdev/priv.h | 700 +++++++++++++++++++++++++++
drivers/nvme/mdev/udata.c | 390 +++++++++++++++
drivers/nvme/mdev/vcq.c | 207 ++++++++
drivers/nvme/mdev/vctrl.c | 514 ++++++++++++++++++++
drivers/nvme/mdev/viommu.c | 322 +++++++++++++
drivers/nvme/mdev/vns.c | 356 ++++++++++++++
drivers/nvme/mdev/vsq.c | 178 +++++++
drivers/vfio/mdev/vfio_mdev.c | 11 +
include/linux/mdev.h | 4 +
include/linux/nvme.h | 88 +++-
27 files changed, 7375 insertions(+), 41 deletions(-)
create mode 100644 drivers/nvme/mdev/Kconfig
create mode 100644 drivers/nvme/mdev/Makefile
create mode 100644 drivers/nvme/mdev/adm.c
create mode 100644 drivers/nvme/mdev/events.c
create mode 100644 drivers/nvme/mdev/host.c
create mode 100644 drivers/nvme/mdev/instance.c
create mode 100644 drivers/nvme/mdev/io.c
create mode 100644 drivers/nvme/mdev/irq.c
create mode 100644 drivers/nvme/mdev/mdev.h
create mode 100644 drivers/nvme/mdev/mmio.c
create mode 100644 drivers/nvme/mdev/pci.c
create mode 100644 drivers/nvme/mdev/priv.h
create mode 100644 drivers/nvme/mdev/udata.c
create mode 100644 drivers/nvme/mdev/vcq.c
create mode 100644 drivers/nvme/mdev/vctrl.c
create mode 100644 drivers/nvme/mdev/viommu.c
create mode 100644 drivers/nvme/mdev/vns.c
create mode 100644 drivers/nvme/mdev/vsq.c
--
2.17.2
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-19 14:41 No subject Maxim Levitsky
@ 2019-03-20 11:03 ` Felipe Franciosi
2019-03-20 19:08 ` Re: Maxim Levitsky
0 siblings, 1 reply; 1651+ messages in thread
From: Felipe Franciosi @ 2019-03-20 11:03 UTC (permalink / raw)
To: Maxim Levitsky
Cc: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Jens Axboe, Alex Williamson, Keith Busch,
Christoph Hellwig, Sagi Grimberg, Kirti Wankhede,
David S . Miller, Mauro Carvalho Chehab, Greg Kroah-Hartman,
Wolfram Sang, Nicolas Ferre, Paul E . McKenney, Paolo Bonzini,
Liang Cunming, Liu Changpeng <changpeng.
> On Mar 19, 2019, at 2:41 PM, Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> Date: Tue, 19 Mar 2019 14:45:45 +0200
> Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
>
> Hi everyone!
>
> In this patch series, I would like to introduce my take on the problem of doing
> as fast as possible virtualization of storage with emphasis on low latency.
>
> In this patch series I implemented a kernel vfio based, mediated device that
> allows the user to pass through a partition and/or whole namespace to a guest.
Hey Maxim!
I'm really excited to see this series, as it aligns to some extent with what we discussed in last year's KVM Forum VFIO BoF.
There's no arguing that we need a better story to efficiently virtualise NVMe devices. So far, for Qemu-based VMs, Changpeng's vhost-user-nvme is the best attempt at that. However, I seem to recall there was some pushback from qemu-devel in the sense that they would rather see investment in virtio-blk. I'm not sure what's the latest on that work and what are the next steps.
The pushback drove the discussion towards pursuing an mdev approach, which is why I'm excited to see your patches.
What I'm thinking is that passing through namespaces or partitions is very restrictive. It leaves no room to implement more elaborate virtualisation stacks like replicating data across multiple devices (local or remote), storage migration, software-managed thin provisioning, encryption, deduplication, compression, etc. In summary, anything that requires software intervention in the datapath. (Worth noting: vhost-user-nvme allows all of that to be easily done in SPDK's bdev layer.)
These complicated stacks should probably not be implemented in the kernel, though. So I'm wondering whether we could talk about mechanisms to allow efficient and performant userspace datapath intervention in your approach or pursue a mechanism to completely offload the device emulation to userspace (and align with what SPDK has to offer).
Thoughts welcome!
Felipe
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-20 11:03 ` Felipe Franciosi
@ 2019-03-20 19:08 ` Maxim Levitsky
2019-03-21 16:12 ` Re: Stefan Hajnoczi
0 siblings, 1 reply; 1651+ messages in thread
From: Maxim Levitsky @ 2019-03-20 19:08 UTC (permalink / raw)
To: Felipe Franciosi
Cc: Fam Zheng, kvm@vger.kernel.org, Wolfram Sang,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
Keith Busch, Kirti Wankhede, Mauro Carvalho Chehab,
Paul E . McKenney, Christoph Hellwig, Sagi Grimberg,
Harris, James R, Liang Cunming, Jens Axboe, Alex Williamson,
Stefan Hajnoczi, Thanos Makatos, John Ferlan
On Wed, 2019-03-20 at 11:03 +0000, Felipe Franciosi wrote:
> > On Mar 19, 2019, at 2:41 PM, Maxim Levitsky <mlevitsk@redhat.com> wrote:
> >
> > Date: Tue, 19 Mar 2019 14:45:45 +0200
> > Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
> >
> > Hi everyone!
> >
> > In this patch series, I would like to introduce my take on the problem of
> > doing
> > as fast as possible virtualization of storage with emphasis on low latency.
> >
> > In this patch series I implemented a kernel vfio based, mediated device
> > that
> > allows the user to pass through a partition and/or whole namespace to a
> > guest.
>
> Hey Maxim!
>
> I'm really excited to see this series, as it aligns to some extent with what
> we discussed in last year's KVM Forum VFIO BoF.
>
> There's no arguing that we need a better story to efficiently virtualise NVMe
> devices. So far, for Qemu-based VMs, Changpeng's vhost-user-nvme is the best
> attempt at that. However, I seem to recall there was some pushback from qemu-
> devel in the sense that they would rather see investment in virtio-blk. I'm
> not sure what's the latest on that work and what are the next steps.
I agree with that. All my benchmarks were against his vhost-user-nvme driver, and
I am able to get pretty much the same throughput and latency.
The SSD I tested on died just recently (Murphy's law), not due to a bug in my driver
but some internal fault (even though most of my tests were reads, plus
occasional 'nvme format's).
We are in the process of buying a replacement.
>
> The pushback drove the discussion towards pursuing an mdev approach, which is
> why I'm excited to see your patches.
>
> What I'm thinking is that passing through namespaces or partitions is very
> restrictive. It leaves no room to implement more elaborate virtualisation
> stacks like replicating data across multiple devices (local or remote),
> storage migration, software-managed thin provisioning, encryption,
> deduplication, compression, etc. In summary, anything that requires software
> intervention in the datapath. (Worth noting: vhost-user-nvme allows all of
> that to be easily done in SPDK's bdev layer.)
Hi Felipe!
I guess that my driver is not geared toward more complicated use cases like the ones you
mentioned; instead it is focused on getting the fastest possible performance for
the common case.
One thing that I can do which would solve several of the above problems is to
accept a map between virtual and real logical blocks, pretty much in exactly
the same way as EPT does it.
Then userspace can map any portion of the device anywhere, while still keeping
the dataplane in the kernel, with minimal overhead.
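The EPT-style block map Maxim describes can be illustrated with a toy extent table. This is entirely hypothetical: the extent format and the `translate` helper below are mine, not taken from the patches; they only show the idea of remapping guest-visible LBAs onto real device LBAs through extents.

```shell
# Each extent maps a run of guest-visible (virtual) LBAs onto real device LBAs:
#   "<virt_start> <real_start> <length>"
map="0 1000 100
100 5000 50"

translate() {  # translate <virtual_lba>: prints the real LBA, or nothing if unmapped
    v=$1
    echo "$map" | while read vs rs len; do
        if [ "$v" -ge "$vs" ] && [ "$v" -lt $((vs + len)) ]; then
            echo $((rs + v - vs))
            break
        fi
    done
}

translate 10    # lands in the first extent:  1000 + (10 - 0)
translate 120   # lands in the second extent: 5000 + (120 - 100)
```

With such a table in the kernel, userspace could stitch arbitrary slices of the device into the guest's view while the fast path stays in the kernel.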
On top of that, note that the direction of IO virtualization is to do the dataplane
in hardware, which will probably give you even worse partition granularity /
features but will be the fastest option available -
for instance SR-IOV, which already exists and only allows splitting by
namespaces, without any more fine-grained control.
Think of nvme-mdev as a very low-level driver, which currently uses polling but
eventually will use a PASID-based IOMMU to provide the guest with a raw PCI device.
The userspace / qemu can build on top of that with various software layers.
On top of that, I am thinking of solving the problem of migration in Qemu by
creating a 'vfio-nvme' driver which would bind vfio to the device exposed by
the kernel, and would pass all the doorbells and queues through to the guest
while intercepting the admin queue. I think such a driver can be made to support
migration while being able to run on top of an SR-IOV device, on top of my
nvme-mdev (albeit with double admin queue emulation - it's a bit ugly but won't
affect performance at all), and even on top of a regular NVMe device
VFIO-assigned to the guest.
Best regards,
Maxim Levitsky
>
> These complicated stacks should probably not be implemented in the kernel,
> though. So I'm wondering whether we could talk about mechanisms to allow
> efficient and performant userspace datapath intervention in your approach or
> pursue a mechanism to completely offload the device emulation to userspace
> (and align with what SPDK has to offer).
>
> Thoughts welcome!
> Felipe
> _______________________________________________
> Linux-nvme mailing list
> Linux-nvme@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-nvme
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-20 19:08 ` Re: Maxim Levitsky
@ 2019-03-21 16:12 ` Stefan Hajnoczi
2019-03-21 16:21 ` Re: Keith Busch
0 siblings, 1 reply; 1651+ messages in thread
From: Stefan Hajnoczi @ 2019-03-21 16:12 UTC (permalink / raw)
To: Maxim Levitsky
Cc: Felipe Franciosi, Fam Zheng, kvm@vger.kernel.org, Wolfram Sang,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
Keith Busch, Kirti Wankhede, Mauro Carvalho Chehab,
Paul E . McKenney, Christoph Hellwig, Sagi Grimberg,
Harris, James R, Liang Cunming, Jens Axboe, Alex Williamson,
Thanos Makatos, John Ferlan, Liu
[-- Attachment #1: Type: text/plain, Size: 4404 bytes --]
On Wed, Mar 20, 2019 at 09:08:37PM +0200, Maxim Levitsky wrote:
> On Wed, 2019-03-20 at 11:03 +0000, Felipe Franciosi wrote:
> > > On Mar 19, 2019, at 2:41 PM, Maxim Levitsky <mlevitsk@redhat.com> wrote:
> > >
> > > Date: Tue, 19 Mar 2019 14:45:45 +0200
> > > Subject: [PATCH 0/9] RFC: NVME VFIO mediated device
> > >
> > > Hi everyone!
> > >
> > > In this patch series, I would like to introduce my take on the problem of
> > > doing
> > > as fast as possible virtualization of storage with emphasis on low latency.
> > >
> > > In this patch series I implemented a kernel vfio based, mediated device
> > > that
> > > allows the user to pass through a partition and/or whole namespace to a
> > > guest.
> >
> > Hey Maxim!
> >
> > I'm really excited to see this series, as it aligns to some extent with what
> > we discussed in last year's KVM Forum VFIO BoF.
> >
> > There's no arguing that we need a better story to efficiently virtualise NVMe
> > devices. So far, for Qemu-based VMs, Changpeng's vhost-user-nvme is the best
> > attempt at that. However, I seem to recall there was some pushback from qemu-
> > devel in the sense that they would rather see investment in virtio-blk. I'm
> > not sure what's the latest on that work and what are the next steps.
> I agree with that. All my benchmarks were against his vhost-user-nvme driver, and
> I am able to get pretty much the same throughput and latency.
>
> The SSD I tested on died just recently (Murphy's law), not due to a bug in my driver
> but some internal fault (even though most of my tests were reads, plus
> occasional 'nvme format's).
> We are in the process of buying a replacement.
>
> >
> > The pushback drove the discussion towards pursuing an mdev approach, which is
> > why I'm excited to see your patches.
> >
> > What I'm thinking is that passing through namespaces or partitions is very
> > restrictive. It leaves no room to implement more elaborate virtualisation
> > stacks like replicating data across multiple devices (local or remote),
> > storage migration, software-managed thin provisioning, encryption,
> > deduplication, compression, etc. In summary, anything that requires software
> > intervention in the datapath. (Worth noting: vhost-user-nvme allows all of
> > that to be easily done in SPDK's bdev layer.)
>
> Hi Felipe!
>
> I guess that my driver is not geared toward more complicated use cases like the
> ones you mentioned; instead it is focused on getting the best possible
> performance for the common case.
>
> One thing that I can do, which would solve several of the above problems, is to
> accept a map between virtual and real logical blocks, pretty much the same way
> EPT does it.
> Then userspace can map any portion of the device anywhere, while still keeping
> the dataplane in the kernel and incurring minimal overhead.
>
> On top of that, note that the direction of IO virtualization is to do the
> dataplane in hardware, which will probably give you even worse partition
> granularity / features, but will be the fastest option available;
> for instance SR-IOV, which already exists and just allows splitting by
> namespaces without any more fine-grained control.
>
> Think of nvme-mdev as a very low-level driver, which currently uses polling but
> eventually will use a PASID-based IOMMU to provide the guest with a raw PCI
> device. Userspace / QEMU can build on top of that with various software layers.
>
> On top of that, I am thinking of solving the problem of migration in QEMU by
> creating a 'vfio-nvme' driver which would bind vfio to the device exposed by
> the kernel and would pass through all the doorbells and queues to the guest,
> while intercepting the admin queue. Such a driver, I think, can be made to
> support migration while being able to run on top of an SR-IOV device, on top of
> my vfio-nvme (albeit with double admin queue emulation, which is a bit ugly but
> won't affect performance at all), and even on top of a regular NVMe device
> vfio-assigned to the guest.
mdev-nvme seems like a duplication of SPDK. The performance is not
better and the features are more limited, so why focus on this approach?
One argument might be that the kernel NVMe subsystem wants to offer this
functionality and loading the kernel module is more convenient than
managing SPDK to some users.
Thoughts?
Stefan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-21 16:12 ` Re: Stefan Hajnoczi
@ 2019-03-21 16:21 ` Keith Busch
2019-03-21 16:41 ` Re: Felipe Franciosi
0 siblings, 1 reply; 1651+ messages in thread
From: Keith Busch @ 2019-03-21 16:21 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: Maxim Levitsky, Fam Zheng, kvm@vger.kernel.org, Wolfram Sang,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
Keith Busch, Kirti Wankhede, Mauro Carvalho Chehab,
Paul E . McKenney, Christoph Hellwig, Sagi Grimberg,
Harris, James R, Felipe Franciosi, Liang Cunming, Jens Axboe,
Alex Williamson, Thanos Makatos, Jo
On Thu, Mar 21, 2019 at 04:12:39PM +0000, Stefan Hajnoczi wrote:
> mdev-nvme seems like a duplication of SPDK. The performance is not
> better and the features are more limited, so why focus on this approach?
>
> One argument might be that the kernel NVMe subsystem wants to offer this
> functionality and loading the kernel module is more convenient than
> managing SPDK to some users.
>
> Thoughts?
Doesn't SPDK bind a controller to a single process? mdev binds to
namespaces (or their partitions), so you could have many mdev's assigned
to many VMs accessing a single controller.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-21 16:21 ` Re: Keith Busch
@ 2019-03-21 16:41 ` Felipe Franciosi
2019-03-21 17:04 ` Re: Maxim Levitsky
0 siblings, 1 reply; 1651+ messages in thread
From: Felipe Franciosi @ 2019-03-21 16:41 UTC (permalink / raw)
To: Keith Busch
Cc: Stefan Hajnoczi, Maxim Levitsky, Fam Zheng, kvm@vger.kernel.org,
Wolfram Sang, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org, Keith Busch, Kirti Wankhede,
Mauro Carvalho Chehab, Paul E . McKenney, Christoph Hellwig,
Sagi Grimberg, Harris, James R, Liang Cunming, Jens Axboe,
Alex Williamson, Thanos Makatos
> On Mar 21, 2019, at 4:21 PM, Keith Busch <kbusch@kernel.org> wrote:
>
> On Thu, Mar 21, 2019 at 04:12:39PM +0000, Stefan Hajnoczi wrote:
>> mdev-nvme seems like a duplication of SPDK. The performance is not
>> better and the features are more limited, so why focus on this approach?
>>
>> One argument might be that the kernel NVMe subsystem wants to offer this
>> functionality and loading the kernel module is more convenient than
>> managing SPDK to some users.
>>
>> Thoughts?
>
> Doesn't SPDK bind a controller to a single process? mdev binds to
> namespaces (or their partitions), so you could have many mdev's assigned
> to many VMs accessing a single controller.
Yes, it binds to a single process which can drive the datapath of multiple virtual controllers for multiple VMs (similar to what you described for mdev). You can therefore efficiently poll multiple VM submission queues (and multiple device completion queues) from a single physical CPU.
The same could be done in the kernel, but the code gets complicated as you add more functionality to it. As this is a direct interface with an untrusted front-end (the guest), it's also arguably safer to do in userspace.
Worth noting: you can eventually have a single physical core polling all sorts of virtual devices (eg. virtual storage or network controllers) very efficiently. And this is quite configurable, too. In the interest of fairness, performance or efficiency, you can choose to dynamically add or remove queues to the poll thread or spawn more threads and redistribute the work.
F.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-21 16:41 ` Re: Felipe Franciosi
@ 2019-03-21 17:04 ` Maxim Levitsky
2019-03-22 7:54 ` Re: Felipe Franciosi
0 siblings, 1 reply; 1651+ messages in thread
From: Maxim Levitsky @ 2019-03-21 17:04 UTC (permalink / raw)
To: Felipe Franciosi, Keith Busch
Cc: Stefan Hajnoczi, Fam Zheng, kvm@vger.kernel.org, Wolfram Sang,
linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org,
Keith Busch, Kirti Wankhede, Mauro Carvalho Chehab,
Paul E . McKenney, Christoph Hellwig, Sagi Grimberg,
Harris, James R, Liang Cunming, Jens Axboe, Alex Williamson,
Thanos Makatos, John Ferlan, Liu
On Thu, 2019-03-21 at 16:41 +0000, Felipe Franciosi wrote:
> > On Mar 21, 2019, at 4:21 PM, Keith Busch <kbusch@kernel.org> wrote:
> >
> > On Thu, Mar 21, 2019 at 04:12:39PM +0000, Stefan Hajnoczi wrote:
> > > mdev-nvme seems like a duplication of SPDK. The performance is not
> > > better and the features are more limited, so why focus on this approach?
> > >
> > > One argument might be that the kernel NVMe subsystem wants to offer this
> > > functionality and loading the kernel module is more convenient than
> > > managing SPDK to some users.
> > >
> > > Thoughts?
> >
> > Doesn't SPDK bind a controller to a single process? mdev binds to
> > namespaces (or their partitions), so you could have many mdev's assigned
> > to many VMs accessing a single controller.
>
> Yes, it binds to a single process which can drive the datapath of multiple
> virtual controllers for multiple VMs (similar to what you described for mdev).
> You can therefore efficiently poll multiple VM submission queues (and multiple
> device completion queues) from a single physical CPU.
>
> The same could be done in the kernel, but the code gets complicated as you add
> more functionality to it. As this is a direct interface with an untrusted
> front-end (the guest), it's also arguably safer to do in userspace.
>
> Worth noting: you can eventually have a single physical core polling all sorts
> of virtual devices (eg. virtual storage or network controllers) very
> efficiently. And this is quite configurable, too. In the interest of fairness,
> performance or efficiency, you can choose to dynamically add or remove queues
> to the poll thread or spawn more threads and redistribute the work.
>
> F.
Note though that SPDK doesn't support sharing the device between the host and
the guests: it takes over the NVMe device, making the kernel nvme driver unbind
from it.
My driver creates a polling thread per guest, but it's trivial to add an option
to use the same polling thread for many guests if there is a need for that.
Best regards,
Maxim Levitsky
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-21 17:04 ` Re: Maxim Levitsky
@ 2019-03-22 7:54 ` Felipe Franciosi
2019-03-22 10:32 ` Re: Maxim Levitsky
2019-03-22 15:30 ` Re: Keith Busch
0 siblings, 2 replies; 1651+ messages in thread
From: Felipe Franciosi @ 2019-03-22 7:54 UTC (permalink / raw)
To: Maxim Levitsky
Cc: Keith Busch, Stefan Hajnoczi, Fam Zheng, kvm@vger.kernel.org,
Wolfram Sang, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org, Keith Busch, Kirti Wankhede,
Mauro Carvalho Chehab, Paul E . McKenney, Christoph Hellwig,
Sagi Grimberg, Harris, James R, Liang Cunming, Jens Axboe,
Alex Williamson, Thanos Makatos
> On Mar 21, 2019, at 5:04 PM, Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> On Thu, 2019-03-21 at 16:41 +0000, Felipe Franciosi wrote:
>>> On Mar 21, 2019, at 4:21 PM, Keith Busch <kbusch@kernel.org> wrote:
>>>
>>> On Thu, Mar 21, 2019 at 04:12:39PM +0000, Stefan Hajnoczi wrote:
>>>> mdev-nvme seems like a duplication of SPDK. The performance is not
>>>> better and the features are more limited, so why focus on this approach?
>>>>
>>>> One argument might be that the kernel NVMe subsystem wants to offer this
>>>> functionality and loading the kernel module is more convenient than
>>>> managing SPDK to some users.
>>>>
>>>> Thoughts?
>>>
>>> Doesn't SPDK bind a controller to a single process? mdev binds to
>>> namespaces (or their partitions), so you could have many mdev's assigned
>>> to many VMs accessing a single controller.
>>
>> Yes, it binds to a single process which can drive the datapath of multiple
>> virtual controllers for multiple VMs (similar to what you described for mdev).
>> You can therefore efficiently poll multiple VM submission queues (and multiple
>> device completion queues) from a single physical CPU.
>>
>> The same could be done in the kernel, but the code gets complicated as you add
>> more functionality to it. As this is a direct interface with an untrusted
>> front-end (the guest), it's also arguably safer to do in userspace.
>>
>> Worth noting: you can eventually have a single physical core polling all sorts
>> of virtual devices (eg. virtual storage or network controllers) very
>> efficiently. And this is quite configurable, too. In the interest of fairness,
>> performance or efficiency, you can choose to dynamically add or remove queues
>> to the poll thread or spawn more threads and redistribute the work.
>>
>> F.
>
> Note though that SPDK doesn't support sharing the device between host and the
> guests, it takes over the nvme device, thus it makes the kernel nvme driver
> unbind from it.
That is absolutely true. However, I find it not to be a problem in practice.
Hypervisor products, especially those caring about performance, efficiency and fairness, will dedicate NVMe devices for a particular purpose (eg. vDisk storage, cache, metadata) and will not share these devices for other use cases. That's because these products want to deterministically control the performance aspects of the device, which you just cannot do if you are sharing the device with a subsystem you do not control.
For scenarios where the device must be shared and such fine-grained control is not required, it looks like using the kernel driver with io_uring offers very good performance with flexibility.
Cheers,
Felipe
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-22 7:54 ` Re: Felipe Franciosi
@ 2019-03-22 10:32 ` Maxim Levitsky
2019-03-22 15:30 ` Re: Keith Busch
1 sibling, 0 replies; 1651+ messages in thread
From: Maxim Levitsky @ 2019-03-22 10:32 UTC (permalink / raw)
To: Felipe Franciosi
Cc: Keith Busch, Stefan Hajnoczi, Fam Zheng, kvm@vger.kernel.org,
Wolfram Sang, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org, Keith Busch, Kirti Wankhede,
Mauro Carvalho Chehab, Paul E . McKenney, Christoph Hellwig,
Sagi Grimberg, Harris, James R, Liang Cunming, Jens Axboe,
Alex Williamson, Thanos Makatos
On Fri, 2019-03-22 at 07:54 +0000, Felipe Franciosi wrote:
> > On Mar 21, 2019, at 5:04 PM, Maxim Levitsky <mlevitsk@redhat.com> wrote:
> >
> > On Thu, 2019-03-21 at 16:41 +0000, Felipe Franciosi wrote:
> > > > On Mar 21, 2019, at 4:21 PM, Keith Busch <kbusch@kernel.org> wrote:
> > > >
> > > > On Thu, Mar 21, 2019 at 04:12:39PM +0000, Stefan Hajnoczi wrote:
> > > > > mdev-nvme seems like a duplication of SPDK. The performance is not
> > > > > better and the features are more limited, so why focus on this
> > > > > approach?
> > > > >
> > > > > One argument might be that the kernel NVMe subsystem wants to offer
> > > > > this
> > > > > functionality and loading the kernel module is more convenient than
> > > > > managing SPDK to some users.
> > > > >
> > > > > Thoughts?
> > > >
> > > > Doesn't SPDK bind a controller to a single process? mdev binds to
> > > > namespaces (or their partitions), so you could have many mdev's assigned
> > > > to many VMs accessing a single controller.
> > >
> > > Yes, it binds to a single process which can drive the datapath of multiple
> > > virtual controllers for multiple VMs (similar to what you described for
> > > mdev).
> > > You can therefore efficiently poll multiple VM submission queues (and
> > > multiple
> > > device completion queues) from a single physical CPU.
> > >
> > > The same could be done in the kernel, but the code gets complicated as you
> > > add
> > > more functionality to it. As this is a direct interface with an untrusted
> > > front-end (the guest), it's also arguably safer to do in userspace.
> > >
> > > Worth noting: you can eventually have a single physical core polling all
> > > sorts
> > > of virtual devices (eg. virtual storage or network controllers) very
> > > efficiently. And this is quite configurable, too. In the interest of
> > > fairness,
> > > performance or efficiency, you can choose to dynamically add or remove
> > > queues
> > > to the poll thread or spawn more threads and redistribute the work.
> > >
> > > F.
> >
> > Note though that SPDK doesn't support sharing the device between host and
> > the
> > guests, it takes over the nvme device, thus it makes the kernel nvme driver
> > unbind from it.
>
> That is absolutely true. However, I find it not to be a problem in practice.
>
> Hypervisor products, specially those caring about performance, efficiency and
> fairness, will dedicate NVMe devices for a particular purpose (eg. vDisk
> storage, cache, metadata) and will not share these devices for other use
> cases. That's because these products want to deterministically control the
> performance aspects of the device, which you just cannot do if you are sharing
> the device with a subsystem you do not control.
>
> For scenarios where the device must be shared and such fine grained control is
> not required, it looks like using the kernel driver with io_uring offers very
> good performance with flexibility
I see the host/guest partition in the following way:
The guest-assigned partitions are for guests that need the lowest possible
latency, and between these guests it is possible to guarantee a good enough
level of fairness in my driver.
For example, in the current implementation of my driver, each guest gets its own
host submission queue.
On the other hand, the host-assigned partitions are for significantly
higher-latency IO, with no guarantees, and/or for guests that need the more
advanced features of full IO virtualization, for instance snapshots, thin
provisioning, replication/backup over the network, etc.
io_uring can be used here to speed things up, but it won't reach nvme-mdev
levels of latency.
Furthermore, on NVMe drives that support WRRU, it's possible to let the queues
of guest-assigned partitions belong to the high-priority class and let the host
queues use the regular medium/low-priority class.
For drives that don't support WRRU, the IO throttling can be done in software on
the host queues.
Host-assigned partitions also don't need polling, allowing polling to be used
only for guests that actually need low-latency IO.
This reduces the number of cores that would otherwise be lost to polling:
the less work the polling core does, the less it contributes to overall latency,
so with fewer users you can use fewer cores to achieve the same levels of
latency.
For Stefan's argument, we can look at it in a slightly different way too:
while nvme-mdev can be seen as a duplication of SPDK, SPDK can also be seen as a
duplication of existing kernel functionality which nvme-mdev can reuse for free.
Best regards,
Maxim Levitsky
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-22 7:54 ` Re: Felipe Franciosi
2019-03-22 10:32 ` Re: Maxim Levitsky
@ 2019-03-22 15:30 ` Keith Busch
2019-03-25 15:44 ` Re: Felipe Franciosi
1 sibling, 1 reply; 1651+ messages in thread
From: Keith Busch @ 2019-03-22 15:30 UTC (permalink / raw)
To: Felipe Franciosi
Cc: Maxim Levitsky, Stefan Hajnoczi, Fam Zheng, kvm@vger.kernel.org,
Wolfram Sang, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org, Keith Busch, Kirti Wankhede,
Mauro Carvalho Chehab, Paul E . McKenney, Christoph Hellwig,
Sagi Grimberg, Harris, James R, Liang Cunming, Jens Axboe,
Alex Williamson, Thanos Makatos
On Fri, Mar 22, 2019 at 07:54:50AM +0000, Felipe Franciosi wrote:
> >
> > Note though that SPDK doesn't support sharing the device between host and the
> > guests, it takes over the nvme device, thus it makes the kernel nvme driver
> > unbind from it.
>
> That is absolutely true. However, I find it not to be a problem in practice.
>
> Hypervisor products, specially those caring about performance, efficiency and fairness, will dedicate NVMe devices for a particular purpose (eg. vDisk storage, cache, metadata) and will not share these devices for other use cases. That's because these products want to deterministically control the performance aspects of the device, which you just cannot do if you are sharing the device with a subsystem you do not control.
I don't know, it sounds like you've traded kernel syscalls for IPC,
and I don't think one performs better than the other.
> For scenarios where the device must be shared and such fine grained control is not required, it looks like using the kernel driver with io_uring offers very good performance with flexibility.
NVMe's IO Determinism features provide fine grained control for shared
devices. It's still uncommon to find hardware supporting that, though.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-03-22 15:30 ` Re: Keith Busch
@ 2019-03-25 15:44 ` Felipe Franciosi
0 siblings, 0 replies; 1651+ messages in thread
From: Felipe Franciosi @ 2019-03-25 15:44 UTC (permalink / raw)
To: Keith Busch
Cc: Maxim Levitsky, Stefan Hajnoczi, Fam Zheng, kvm@vger.kernel.org,
Wolfram Sang, linux-nvme@lists.infradead.org,
linux-kernel@vger.kernel.org, Keith Busch, Kirti Wankhede,
Mauro Carvalho Chehab, Paul E . McKenney, Christoph Hellwig,
Sagi Grimberg, Harris, James R, Liang Cunming, Jens Axboe,
Alex Williamson, Thanos Makatos
Hi Keith,
> On Mar 22, 2019, at 3:30 PM, Keith Busch <kbusch@kernel.org> wrote:
>
> On Fri, Mar 22, 2019 at 07:54:50AM +0000, Felipe Franciosi wrote:
>>>
>>> Note though that SPDK doesn't support sharing the device between host and the
>>> guests, it takes over the nvme device, thus it makes the kernel nvme driver
>>> unbind from it.
>>
>> That is absolutely true. However, I find it not to be a problem in practice.
>>
>> Hypervisor products, specially those caring about performance, efficiency and fairness, will dedicate NVMe devices for a particular purpose (eg. vDisk storage, cache, metadata) and will not share these devices for other use cases. That's because these products want to deterministically control the performance aspects of the device, which you just cannot do if you are sharing the device with a subsystem you do not control.
>
> I don't know, it sounds like you've traded kernel syscalls for IPC,
> and I don't think one performs better than the other.
Sorry, I'm not sure I understand. My point is that if you are packaging a distro to be a hypervisor and you want to use a storage device for VM data, you _most likely_ won't be using that device for anything else. To that end, driving the device directly from your application definitely gives you more deterministic control.
>
>> For scenarios where the device must be shared and such fine grained control is not required, it looks like using the kernel driver with io_uring offers very good performance with flexibility.
>
> NVMe's IO Determinism features provide fine grained control for shared
> devices. It's still uncommon to find hardware supporting that, though.
Sure, but then your hypervisor needs to certify devices that support that. This will limit your HCL. Moreover, unless the feature is solid, well-established and works reliably on all devices you support, it's arguably preferable to have an architecture which gives you that control in software.
Cheers,
Felipe
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v6 0/3] add new ima hook ima_kexec_cmdline to measure kexec boot cmdline args
@ 2019-05-21 0:06 Prakhar Srivastava
2019-05-21 0:06 ` [PATCH v6 2/3] add a new ima template field buf Prakhar Srivastava
0 siblings, 1 reply; 1651+ messages in thread
From: Prakhar Srivastava @ 2019-05-21 0:06 UTC (permalink / raw)
To: linux-integrity, linux-security-module, linux-kernel
Cc: mjg59, zohar, roberto.sassu, vgoyal, Prakhar Srivastava
The motive behind this patch series is to measure the boot cmdline args
used in the soft reboot/kexec case.
For secure boot attestation, it is necessary to measure the kernel
command line and the kernel version. For cold boot, the boot loader
can be enhanced to measure these parameters.
(https://mjg59.dreamwidth.org/48897.html)
However, for attestation across soft reboot boundary, these values
also need to be measured during kexec_file_load.
Currently, in the kexec (kexec_file_load)/soft reboot scenario, the boot cmdline
args for the next kernel are not measured. In the normal boot/hard reboot case,
the cmdline args are measured into the TPM.
The hash of boot command line is calculated and added to the current
running kernel's measurement list. On a soft reboot like kexec, the PCRs
are not reset to zero. Refer to commit 94c3aac567a9 ("ima: on soft
reboot, restore the measurement list") patch description.
To achieve the above, the patch series does the following:
- adds a new ima hook, ima_kexec_cmdline, which measures the cmdline args
into the ima log, behind a new ima policy entry KEXEC_CMDLINE.
- since the cmdline args cannot be appraised, a new template field (buf) is
added. The template field contains the buffer passed (the cmdline args), which
can be used to appraise/attest at a later stage.
- calls the ima_kexec_cmdline(...) hook from the kexec_file_load call.
The ima logs need to be carried over to the next kernel; this will be handled by
follow-up patchsets for x86_64 and arm64.
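Assuming the rule grammar follows the usual IMA runtime-policy format, enabling the new measurement might look like the fragment below. The exact syntax is an assumption drawn from the KEXEC_CMDLINE policy entry named above; the Documentation/ABI/testing/ima_policy change in the series is authoritative.

```
# Appended to the IMA runtime policy (hypothetical rule syntax):
measure func=KEXEC_CMDLINE
```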
Changelog:
V6:
-add a new ima hook and policy to measure the cmdline
args(ima_kexec_cmdline)
-add a new template field buf to contain the buffer measured.
[suggested by Mimi Zohar]
add new fields to ima_event_data to store/read buffer data.
[suggested by Roberto]
-call ima_kexec_cmdline from kexec_file_load path
v5:
-add a new ima hook and policy to measure the cmdline
args(ima_kexec_cmdline)
-add a new template field buf to contain the buffer measured.
[suggested by Mimi Zohar]
-call ima_kexec_cmdline from kexec_file_load path
v4:
- per feedback from LSM community, removed the LSM hook and renamed the
IMA policy to KEXEC_CMDLINE
v3: (rebase changes to next-general)
- Add policy checks for the buffer [suggested by Mimi Zohar]
- use the IMA_XATTR to add the buffer
- Add kexec_cmdline used for kexec file load
- Add an LSM hook to allow usage by other LSMs [suggested by Mimi Zohar]
v2:
- Add policy checks for the buffer [suggested by Mimi Zohar]
- Add an LSM hook to allow usage by other LSMs [suggested by Mimi Zohar]
- use the IMA_XATTR to add the buffer instead of the sig template
v1:
-Add kconfigs to control the ima_buffer_check
-measure the cmdline args suffixed with the kernel file name
-add the buffer to the template sig field.
Prakhar Srivastava (3):
Add a new ima hook ima_kexec_cmdline to measure cmdline args
add a new ima template field buf
call ima_kexec_cmdline to measure the cmdline args
Documentation/ABI/testing/ima_policy | 1 +
Documentation/security/IMA-templates.rst | 2 +-
include/linux/ima.h | 2 +
kernel/kexec_file.c | 8 ++-
security/integrity/ima/ima.h | 3 +
security/integrity/ima/ima_api.c | 5 +-
security/integrity/ima/ima_init.c | 2 +-
security/integrity/ima/ima_main.c | 80 +++++++++++++++++++++++
security/integrity/ima/ima_policy.c | 9 +++
security/integrity/ima/ima_template.c | 2 +
security/integrity/ima/ima_template_lib.c | 20 ++++++
security/integrity/ima/ima_template_lib.h | 4 ++
12 files changed, 131 insertions(+), 7 deletions(-)
--
2.17.1
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v6 2/3] add a new ima template field buf
2019-05-21 0:06 [PATCH v6 0/3] add new ima hook ima_kexec_cmdline to measure kexec boot cmdline args Prakhar Srivastava
@ 2019-05-21 0:06 ` Prakhar Srivastava
2019-05-24 15:12 ` Mimi Zohar
0 siblings, 1 reply; 1651+ messages in thread
From: Prakhar Srivastava @ 2019-05-21 0:06 UTC (permalink / raw)
To: linux-integrity, linux-security-module, linux-kernel
Cc: mjg59, zohar, roberto.sassu, vgoyal, Prakhar Srivastava
A buffer (cmdline args) measured into ima cannot be appraised
without already being aware of the buffer contents. Since we
don't know what cmdline args will be passed (or need to validate
what was passed), it is not possible to appraise it.
Since hashes are not reversible, the raw buffer is needed to
recompute the hash.
To regenerate the hash of the buffer and appraise it,
the contents of the buffer need to be available.
A new template field, buf, is added to the existing ima template
fields, which can be used to store/read the buffer itself.
Two new fields are added to ima_event_data to carry the
buf and buf_len whenever necessary.
Updated the process_buffer_measurement call to add the buf
to the ima_event_data.
process_buffer_measurement was added in "Add a new ima hook
ima_kexec_cmdline to measure cmdline args".
- Add a new template field 'buf' to be used to store/read
the buffer data.
- Added two new fields to ima_event_data to hold the buf and
buf_len [suggested by Roberto]
- Updated process_buffer_measurement to add the buffer to
ima_event_data
Signed-off-by: Prakhar Srivastava <prsriva02@gmail.com>
---
Documentation/security/IMA-templates.rst | 2 +-
security/integrity/ima/ima.h | 2 ++
security/integrity/ima/ima_api.c | 4 ++--
security/integrity/ima/ima_init.c | 2 +-
security/integrity/ima/ima_main.c | 4 +++-
security/integrity/ima/ima_template.c | 2 ++
security/integrity/ima/ima_template_lib.c | 20 ++++++++++++++++++++
security/integrity/ima/ima_template_lib.h | 4 ++++
8 files changed, 35 insertions(+), 5 deletions(-)
diff --git a/Documentation/security/IMA-templates.rst b/Documentation/security/IMA-templates.rst
index 2cd0e273cc9a..9cddb66727ee 100644
--- a/Documentation/security/IMA-templates.rst
+++ b/Documentation/security/IMA-templates.rst
@@ -70,7 +70,7 @@ descriptors by adding their identifier to the format string
prefix is shown only if the hash algorithm is not SHA1 or MD5);
- 'n-ng': the name of the event, without size limitations;
- 'sig': the file signature.
-
+ - 'buf': the buffer data that was used to generate the hash without size limitations.
Below, there is the list of defined template descriptors:
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 226a26d8de09..4a82541dc3b6 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -65,6 +65,8 @@ struct ima_event_data {
struct evm_ima_xattr_data *xattr_value;
int xattr_len;
const char *violation;
+ const void *buf;
+ int buf_len;
};
/* IMA template field data definition */
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index 800d965232e5..c12f1cd38f8f 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -134,7 +134,7 @@ void ima_add_violation(struct file *file, const unsigned char *filename,
struct ima_template_entry *entry;
struct inode *inode = file_inode(file);
struct ima_event_data event_data = {iint, file, filename, NULL, 0,
- cause};
+ cause, NULL, 0};
int violation = 1;
int result;
@@ -286,7 +286,7 @@ void ima_store_measurement(struct integrity_iint_cache *iint,
struct inode *inode = file_inode(file);
struct ima_template_entry *entry;
struct ima_event_data event_data = {iint, file, filename, xattr_value,
- xattr_len, NULL};
+ xattr_len, NULL, NULL, 0};
int violation = 0;
if (iint->measured_pcrs & (0x1 << pcr))
diff --git a/security/integrity/ima/ima_init.c b/security/integrity/ima/ima_init.c
index 6c9295449751..0c34d3100b5b 100644
--- a/security/integrity/ima/ima_init.c
+++ b/security/integrity/ima/ima_init.c
@@ -50,7 +50,7 @@ static int __init ima_add_boot_aggregate(void)
struct ima_template_entry *entry;
struct integrity_iint_cache tmp_iint, *iint = &tmp_iint;
struct ima_event_data event_data = {iint, NULL, boot_aggregate_name,
- NULL, 0, NULL};
+ NULL, 0, NULL, NULL, 0};
int result = -ENOMEM;
int violation = 0;
struct {
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index a88c28918a63..6c5691b65b84 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -594,7 +594,7 @@ static void process_buffer_measurement(const void *buf, int size,
struct ima_template_entry *entry = NULL;
struct integrity_iint_cache tmp_iint, *iint = &tmp_iint;
struct ima_event_data event_data = {iint, NULL, NULL,
- NULL, 0, NULL};
+ NULL, 0, NULL, NULL, 0};
struct {
struct ima_digest_data hdr;
char digest[IMA_MAX_DIGEST_SIZE];
@@ -611,6 +611,8 @@ static void process_buffer_measurement(const void *buf, int size,
memset(&hash, 0, sizeof(hash));
event_data.filename = eventname;
+ event_data.buf = buf;
+ event_data.buf_len = size;
iint->ima_hash = &hash.hdr;
iint->ima_hash->algo = ima_hash_algo;
diff --git a/security/integrity/ima/ima_template.c b/security/integrity/ima/ima_template.c
index b631b8bc7624..a76d1c04162a 100644
--- a/security/integrity/ima/ima_template.c
+++ b/security/integrity/ima/ima_template.c
@@ -43,6 +43,8 @@ static const struct ima_template_field supported_fields[] = {
.field_show = ima_show_template_string},
{.field_id = "sig", .field_init = ima_eventsig_init,
.field_show = ima_show_template_sig},
+ {.field_id = "buf", .field_init = ima_eventbuf_init,
+ .field_show = ima_show_template_buf},
};
#define MAX_TEMPLATE_NAME_LEN 15
diff --git a/security/integrity/ima/ima_template_lib.c b/security/integrity/ima/ima_template_lib.c
index 513b457ae900..43d1404141c1 100644
--- a/security/integrity/ima/ima_template_lib.c
+++ b/security/integrity/ima/ima_template_lib.c
@@ -162,6 +162,12 @@ void ima_show_template_sig(struct seq_file *m, enum ima_show_type show,
ima_show_template_field_data(m, show, DATA_FMT_HEX, field_data);
}
+void ima_show_template_buf(struct seq_file *m, enum ima_show_type show,
+ struct ima_field_data *field_data)
+{
+ ima_show_template_field_data(m, show, DATA_FMT_HEX, field_data);
+}
+
/**
* ima_parse_buf() - Parses lengths and data from an input buffer
* @bufstartp: Buffer start address.
@@ -389,3 +395,17 @@ int ima_eventsig_init(struct ima_event_data *event_data,
return ima_write_template_field_data(xattr_value, event_data->xattr_len,
DATA_FMT_HEX, field_data);
}
+
+/*
+ * ima_eventbuf_init - include the buffer (kexec-cmdline) as part of the
+ * template data.
+ */
+int ima_eventbuf_init(struct ima_event_data *event_data,
+ struct ima_field_data *field_data)
+{
+ if ((!event_data->buf) || (event_data->buf_len == 0))
+ return 0;
+
+ return ima_write_template_field_data(event_data->buf, event_data->buf_len,
+ DATA_FMT_HEX, field_data);
+}
diff --git a/security/integrity/ima/ima_template_lib.h b/security/integrity/ima/ima_template_lib.h
index 6a3d8b831deb..f0178bc60c55 100644
--- a/security/integrity/ima/ima_template_lib.h
+++ b/security/integrity/ima/ima_template_lib.h
@@ -29,6 +29,8 @@ void ima_show_template_string(struct seq_file *m, enum ima_show_type show,
struct ima_field_data *field_data);
void ima_show_template_sig(struct seq_file *m, enum ima_show_type show,
struct ima_field_data *field_data);
+void ima_show_template_buf(struct seq_file *m, enum ima_show_type show,
+ struct ima_field_data *field_data);
int ima_parse_buf(void *bufstartp, void *bufendp, void **bufcurp,
int maxfields, struct ima_field_data *fields, int *curfields,
unsigned long *len_mask, int enforce_mask, char *bufname);
@@ -42,4 +44,6 @@ int ima_eventname_ng_init(struct ima_event_data *event_data,
struct ima_field_data *field_data);
int ima_eventsig_init(struct ima_event_data *event_data,
struct ima_field_data *field_data);
+int ima_eventbuf_init(struct ima_event_data *event_data,
+ struct ima_field_data *field_data);
#endif /* __LINUX_IMA_TEMPLATE_LIB_H */
--
2.17.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re: [PATCH v6 2/3] add a new ima template field buf
2019-05-21 0:06 ` [PATCH v6 2/3] add a new ima template field buf Prakhar Srivastava
@ 2019-05-24 15:12 ` Mimi Zohar
2019-05-24 15:42 ` Roberto Sassu
0 siblings, 1 reply; 1651+ messages in thread
From: Mimi Zohar @ 2019-05-24 15:12 UTC (permalink / raw)
To: Prakhar Srivastava, linux-integrity, linux-security-module,
linux-kernel
Cc: mjg59, roberto.sassu, vgoyal
On Mon, 2019-05-20 at 17:06 -0700, Prakhar Srivastava wrote:
> A buffer (cmdline args) measured into ima cannot be appraised
> without already being aware of the buffer contents. Since we
> don't know what cmdline args will be passed (or need to validate
> what was passed) it is not possible to appraise it.
>
> Since hashes are non-reversible, the raw buffer is needed to
> recompute the hash. To regenerate the hash of the buffer and
> appraise it, the contents of the buffer need to be available.
>
> A new template field buf is added to the existing ima template
> fields, which can be used to store/read the buffer itself.
> Two new fields are added to the ima_event_data to carry the
> buf and buf_len whenever necessary.
>
> Updated the process_buffer_measurement call to add the buf
> to the ima_event_data.
> process_buffer_measurement added in "Add a new ima hook
> ima_kexec_cmdline to measure cmdline args"
>
> - Add a new template field 'buf' to be used to store/read
> the buffer data.
> - Added two new fields to ima_event_data to hold the buf and
> buf_len [Suggested by Roberto]
> - Updated process_buffer_measurement to add the buffer to
> ima_event_data
This patch description can be written more concisely.
Patch 1/3 in this series introduces measuring the kexec boot command
line. This patch defines a new template field for storing the kexec
boot command line in the measurement list in order for a remote
attestation server to verify.
As mentioned, the first patch description should include a shell
command for verifying the digest in the kexec boot command line
measurement list record against /proc/cmdline. This patch description
should include a shell command showing how to verify the digest based
on the new field. Should the new field in the ascii measurement list
be displayed as a string, not hex?
Mimi
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-05-24 15:12 ` Mimi Zohar
@ 2019-05-24 15:42 ` Roberto Sassu
2019-05-24 15:47 ` Re: Roberto Sassu
0 siblings, 1 reply; 1651+ messages in thread
From: Roberto Sassu @ 2019-05-24 15:42 UTC (permalink / raw)
To: Mimi Zohar, Prakhar Srivastava, linux-integrity,
linux-security-module, linux-kernel
Cc: mjg59, vgoyal
On 5/24/2019 5:12 PM, Mimi Zohar wrote:
> On Mon, 2019-05-20 at 17:06 -0700, Prakhar Srivastava wrote:
>> A buffer (cmdline args) measured into ima cannot be appraised
>> without already being aware of the buffer contents. Since we
>> don't know what cmdline args will be passed (or need to validate
>> what was passed) it is not possible to appraise it.
>>
>> Since hashes are non-reversible, the raw buffer is needed to
>> recompute the hash. To regenerate the hash of the buffer and
>> appraise it, the contents of the buffer need to be available.
>>
>> A new template field buf is added to the existing ima template
>> fields, which can be used to store/read the buffer itself.
>> Two new fields are added to the ima_event_data to carry the
>> buf and buf_len whenever necessary.
>>
>> Updated the process_buffer_measurement call to add the buf
>> to the ima_event_data.
>> process_buffer_measurement added in "Add a new ima hook
>> ima_kexec_cmdline to measure cmdline args"
>>
>> - Add a new template field 'buf' to be used to store/read
>> the buffer data.
>> - Added two new fields to ima_event_data to hold the buf and
>> buf_len [Suggested by Roberto]
>> - Updated process_buffer_measurement to add the buffer to
>> ima_event_data
>
> This patch description can be written more concisely.
>
> Patch 1/3 in this series introduces measuring the kexec boot command
> line. This patch defines a new template field for storing the kexec
> boot command line in the measurement list in order for a remote
> attestation server to verify.
>
> As mentioned, the first patch description should include a shell
> command for verifying the digest in the kexec boot command line
> measurement list record against /proc/cmdline. This patch description
> should include a shell command showing how to verify the digest based
> on the new field. Should the new field in the ascii measurement list
> be displayed as a string, not hex?
We should define a new type. If the type is DATA_FMT_STRING, spaces are
replaced with '_'.
Roberto
--
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-05-24 15:42 ` Roberto Sassu
@ 2019-05-24 15:47 ` Roberto Sassu
2019-05-24 18:09 ` Re: Mimi Zohar
0 siblings, 1 reply; 1651+ messages in thread
From: Roberto Sassu @ 2019-05-24 15:47 UTC (permalink / raw)
To: Mimi Zohar, Prakhar Srivastava, linux-integrity,
linux-security-module, linux-kernel
Cc: mjg59, vgoyal
On 5/24/2019 5:42 PM, Roberto Sassu wrote:
> On 5/24/2019 5:12 PM, Mimi Zohar wrote:
>> On Mon, 2019-05-20 at 17:06 -0700, Prakhar Srivastava wrote:
>>> A buffer (cmdline args) measured into ima cannot be appraised
>>> without already being aware of the buffer contents. Since we
>>> don't know what cmdline args will be passed (or need to validate
>>> what was passed) it is not possible to appraise it.
>>>
>>> Since hashes are non-reversible, the raw buffer is needed to
>>> recompute the hash. To regenerate the hash of the buffer and
>>> appraise it, the contents of the buffer need to be available.
>>>
>>> A new template field buf is added to the existing ima template
>>> fields, which can be used to store/read the buffer itself.
>>> Two new fields are added to the ima_event_data to carry the
>>> buf and buf_len whenever necessary.
>>>
>>> Updated the process_buffer_measurement call to add the buf
>>> to the ima_event_data.
>>> process_buffer_measurement added in "Add a new ima hook
>>> ima_kexec_cmdline to measure cmdline args"
>>>
>>> - Add a new template field 'buf' to be used to store/read
>>> the buffer data.
>>> - Added two new fields to ima_event_data to hold the buf and
>>> buf_len [Suggested by Roberto]
>>> - Updated process_buffer_measurement to add the buffer to
>>> ima_event_data
>>
>> This patch description can be written more concisely.
>>
>> Patch 1/3 in this series introduces measuring the kexec boot command
>> line. This patch defines a new template field for storing the kexec
>> boot command line in the measurement list in order for a remote
>> attestation server to verify.
>>
>> As mentioned, the first patch description should include a shell
>> command for verifying the digest in the kexec boot command line
>> measurement list record against /proc/cmdline. This patch description
>> should include a shell command showing how to verify the digest based
>> on the new field. Should the new field in the ascii measurement list
>> be displayed as a string, not hex?
>
> We should define a new type. If the type is DATA_FMT_STRING, spaces are
> replaced with '_'.
Or better. Leave it as hex, otherwise there would be a parsing problem
if there are spaces in the data for a field.
Roberto
--
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: Re:
2019-05-24 15:47 ` Re: Roberto Sassu
@ 2019-05-24 18:09 ` Mimi Zohar
2019-05-24 19:00 ` Re: prakhar srivastava
0 siblings, 1 reply; 1651+ messages in thread
From: Mimi Zohar @ 2019-05-24 18:09 UTC (permalink / raw)
To: Roberto Sassu, Prakhar Srivastava, linux-integrity,
linux-security-module, linux-kernel
Cc: mjg59, vgoyal
> >> As mentioned, the first patch description should include a shell
> >> command for verifying the digest in the kexec boot command line
> >> measurement list record against /proc/cmdline. This patch description
> >> should include a shell command showing how to verify the digest based
> >> on the new field. Should the new field in the ascii measurement list
> >> be displayed as a string, not hex?
> >
> > We should define a new type. If the type is DATA_FMT_STRING, spaces are
> > replaced with '_'.
>
> Or better. Leave it as hex, otherwise there would be a parsing problem
> if there are spaces in the data for a field.
After making a few changes, the measurement list contains the
following kexec-cmdline data:
10 edc32d1e3a5ba7272280a395b6fb56a5ef7c78c3 ima-buf
sha256:4f43b7db850e
88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b45275
kexec-cmdline
726f6f
743d2f6465762f7364613420726f2072642e6c756b732e757569643d6c756b73
2d6637
3633643737632d653236622d343431642d613734652d62363633636334643832
656120
696d615f706f6c6963793d7463627c61707072616973655f746362
There's probably a better shell command, but the following works to
verify the digest locally against the /proc/cmdline:
$ echo -n -e `cat /proc/cmdline | sed 's/^.*root=/root=/'` | sha256sum
4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b4527f65 -
If we leave the "buf" field as ascii-hex, what would the shell command
look like when verifying the digest based on the "buf" field?
Mimi
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: Re:
2019-05-24 18:09 ` Re: Mimi Zohar
@ 2019-05-24 19:00 ` prakhar srivastava
2019-05-24 19:15 ` Re: Mimi Zohar
0 siblings, 1 reply; 1651+ messages in thread
From: prakhar srivastava @ 2019-05-24 19:00 UTC (permalink / raw)
To: Mimi Zohar
Cc: Roberto Sassu, linux-integrity, linux-security-module,
linux-kernel, Matthew Garrett, vgoyal
On Fri, May 24, 2019 at 11:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
>
> > >> As mentioned, the first patch description should include a shell
> > >> command for verifying the digest in the kexec boot command line
> > >> measurement list record against /proc/cmdline. This patch description
> > >> should include a shell command showing how to verify the digest based
> > >> on the new field. Should the new field in the ascii measurement list
> > >> be displayed as a string, not hex?
> > >
> > > We should define a new type. If the type is DATA_FMT_STRING, spaces are
> > > replaced with '_'.
> >
> > Or better. Leave it as hex, otherwise there would be a parsing problem
> > if there are spaces in the data for a field.
>
> After making a few changes, the measurement list contains the
> following kexec-cmdline data:
>
> 10 edc32d1e3a5ba7272280a395b6fb56a5ef7c78c3 ima-buf
> sha256:4f43b7db850e
> 88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b45275
> kexec-cmdline
> 726f6f
> 743d2f6465762f7364613420726f2072642e6c756b732e757569643d6c756b73
> 2d6637
> 3633643737632d653236622d343431642d613734652d62363633636334643832
> 656120
> 696d615f706f6c6963793d7463627c61707072616973655f746362
>
> There's probably a better shell command, but the following works to
> verify the digest locally against the /proc/cmdline:
>
> $ echo -n -e `cat /proc/cmdline | sed 's/^.*root=/root=/'` | sha256sum
> 4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b4527f65 -
>
> If we leave the "buf" field as ascii-hex, what would the shell command
> look like when verifying the digest based on the "buf" field?
>
> Mimi
>
To quickly test the sha256 I used my /proc/cmdline:
ro quiet splash vt.handoff=1 ima_policy=tcb ima_appraise=fix
ima_template_fmt=n-ng|d-ng|sig|buf ima_hash=sha256
export VAL=
726f2071756965742073706c6173682076742e68616e646f66663d3120
696d615f706f6c6963793d74636220696d615f61707072616973653d666
97820696d615f74656d706c6174655f666d743d6e2d6e677c642d6e677c
7369677c62756620696d615f686173683d736861323536
echo -n -e $VAL | xxd -r -p | sha256sum
0d0b891bb730120d9593799cba1a7b3febf68f2bb81fb1304b0c963f95f6bc58 -
I will run it through the code as well, but the shell command should work.
Thanks,
Prakhar Srivastava
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: Re:
2019-05-24 19:00 ` Re: prakhar srivastava
@ 2019-05-24 19:15 ` Mimi Zohar
0 siblings, 0 replies; 1651+ messages in thread
From: Mimi Zohar @ 2019-05-24 19:15 UTC (permalink / raw)
To: prakhar srivastava
Cc: Roberto Sassu, linux-integrity, linux-security-module,
linux-kernel, Matthew Garrett, vgoyal
On Fri, 2019-05-24 at 12:00 -0700, prakhar srivastava wrote:
> On Fri, May 24, 2019 at 11:09 AM Mimi Zohar <zohar@linux.ibm.com> wrote:
> >
> > > >> As mentioned, the first patch description should include a shell
> > > >> command for verifying the digest in the kexec boot command line
> > > >> measurement list record against /proc/cmdline. This patch description
> > > >> should include a shell command showing how to verify the digest based
> > > >> on the new field. Should the new field in the ascii measurement list
> > > >> be displayed as a string, not hex?
> > > >
> > > > We should define a new type. If the type is DATA_FMT_STRING, spaces are
> > > > replaced with '_'.
> > >
> > > Or better. Leave it as hex, otherwise there would be a parsing problem
> > > if there are spaces in the data for a field.
> >
> > After making a few changes, the measurement list contains the
> > following kexec-cmdline data:
> >
> > 10 edc32d1e3a5ba7272280a395b6fb56a5ef7c78c3 ima-buf
> > sha256:4f43b7db850e
> > 88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b45275
> > kexec-cmdline
> > 726f6f
> > 743d2f6465762f7364613420726f2072642e6c756b732e757569643d6c756b73
> > 2d6637
> > 3633643737632d653236622d343431642d613734652d62363633636334643832
> > 656120
> > 696d615f706f6c6963793d7463627c61707072616973655f746362
> >
> > There's probably a better shell command, but the following works to
> > verify the digest locally against the /proc/cmdline:
> >
> > $ echo -n -e `cat /proc/cmdline | sed 's/^.*root=/root=/'` | sha256sum
> > 4f43b7db850e88c49dfeffd4b1eb4f021d78033dfb05b07e45eec8d0b4527f65 -
> >
> > If we leave the "buf" field as ascii-hex, what would the shell command
> > look like when verifying the digest based on the "buf" field?
> >
> > Mimi
> >
> To quickly test the sha256 I used my /proc/cmdline:
> ro quiet splash vt.handoff=1 ima_policy=tcb ima_appraise=fix
> ima_template_fmt=n-ng|d-ng|sig|buf ima_hash=sha256
>
> export VAL=
> 726f2071756965742073706c6173682076742e68616e646f66663d3120
> 696d615f706f6c6963793d74636220696d615f61707072616973653d666
> 97820696d615f74656d706c6174655f666d743d6e2d6e677c642d6e677c
> 7369677c62756620696d615f686173683d736861323536
>
> echo -n -e $VAL | xxd -r -p | sha256sum
> 0d0b891bb730120d9593799cba1a7b3febf68f2bb81fb1304b0c963f95f6bc58 -
>
> I will run it through the code as well, but the shell command should work.
Yes, that works.
sudo cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements
| grep kexec-cmdline | cut -d' ' -f 6 | xxd -r -p | sha256sum
Mimi
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <E1hUrZM-0007qA-Q8@sslproxy01.your-server.de>]
* Re:
[not found] <E1hUrZM-0007qA-Q8@sslproxy01.your-server.de>
@ 2019-05-29 19:54 ` Alex Williamson
0 siblings, 0 replies; 1651+ messages in thread
From: Alex Williamson @ 2019-05-29 19:54 UTC (permalink / raw)
To: Thomas Meyer; +Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
On Sun, 26 May 2019 13:44:04 +0200
"Thomas Meyer" <thomas@m3y3r.de> wrote:
> From thomas@m3y3r.de Sun May 26 00:13:26 2019
> Subject: [PATCH] vfio-pci/nvlink2: Use vma_pages function instead of explicit
> computation
> To: alex.williamson@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
> Content-Type: text/plain; charset="UTF-8"
> Mime-Version: 1.0
> Content-Transfer-Encoding: 8bit
> X-Patch: Cocci
> X-Mailer: DiffSplit
> Message-ID: <1558822461341-1674464153-1-diffsplit-thomas@m3y3r.de>
> References: <1558822461331-726613767-0-diffsplit-thomas@m3y3r.de>
> In-Reply-To: <1558822461331-726613767-0-diffsplit-thomas@m3y3r.de>
> X-Serial-No: 1
>
> Use vma_pages function on vma object instead of explicit computation.
>
> Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
> ---
>
> diff -u -p a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
> --- a/drivers/vfio/pci/vfio_pci_nvlink2.c
> +++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
> @@ -161,7 +161,7 @@ static int vfio_pci_nvgpu_mmap(struct vf
>
> atomic_inc(&data->mm->mm_count);
> ret = (int) mm_iommu_newdev(data->mm, data->useraddr,
> - (vma->vm_end - vma->vm_start) >> PAGE_SHIFT,
> + vma_pages(vma),
> data->gpu_hpa, &data->mem);
>
> trace_vfio_pci_nvgpu_mmap(vdev->pdev, data->gpu_hpa, data->useraddr,
Besides the formatting of this patch, there's already a pending patch
with this same change:
https://lkml.org/lkml/2019/5/16/658
I think the original must have bounced from lkml due the encoding, but
I'll use that one since it came first, is slightly cleaner in wrapping
the line following the change, and already has Alexey's R-b. Thanks,
Alex
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <DM5PR19MB165765D43BE979AB51A9897E9EEB0@DM5PR19MB1657.namprd19.prod.outlook.com>]
* Re:
[not found] <DM5PR19MB165765D43BE979AB51A9897E9EEB0@DM5PR19MB1657.namprd19.prod.outlook.com>
@ 2019-06-18 9:41 ` Enrico Weigelt, metux IT consult
0 siblings, 0 replies; 1651+ messages in thread
From: Enrico Weigelt, metux IT consult @ 2019-06-18 9:41 UTC (permalink / raw)
To: Grim, Dennis, linux-iio@vger.kernel.org
On 17.06.19 16:58, Grim, Dennis wrote:
> Is Industrial IO considered to be stable in kernel-3.6.0?
>
What exactly are you trying to achieve ?
3.6 is *very* old and completely unmaintained. And it's likely to miss
lots of things you'll probably want, sooner or later. And backporting
that far is anything but practical. (I recently had a client who asked
me to backport recent BT features onto some old 3.15 vendor kernel -
that would have taken years to get anything stable.)
Seriously, don't try to use such old code in production systems.
It's better to rebase your individual customizations onto recent
mainline releases.
--mtx
--
Enrico Weigelt, metux IT consult
Free software and Linux embedded engineering
info@metux.net -- +49-151-27565287
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20190703063132.GA27292@ls3530.dellerweb.de>]
* (no subject)
@ 2019-08-20 17:23 William Baker
2019-08-20 17:27 ` Yagnatinsky, Mark
0 siblings, 1 reply; 1651+ messages in thread
From: William Baker @ 2019-08-20 17:23 UTC (permalink / raw)
To: git
subscribe git
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
2019-08-20 17:23 William Baker
@ 2019-08-20 17:27 ` Yagnatinsky, Mark
0 siblings, 0 replies; 1651+ messages in thread
From: Yagnatinsky, Mark @ 2019-08-20 17:27 UTC (permalink / raw)
To: 'William Baker', git@vger.kernel.org
No, not like that. See here:
https://git.wiki.kernel.org/index.php/GitCommunity
The email address you send the "subscribe" message to is NOT the mailing list itself.
What you just did is send the words "subscribe git" to everyone already on the mailing list :)
-----Original Message-----
From: git-owner@vger.kernel.org [mailto:git-owner@vger.kernel.org] On Behalf Of William Baker
Sent: Tuesday, August 20, 2019 1:23 PM
To: git@vger.kernel.org
Subject:
subscribe git
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] Revert "asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro"
@ 2019-08-30 19:54 Arnd Bergmann
[not found] ` <20190830202959.3539-1-msuchanek@suse.de>
0 siblings, 1 reply; 1651+ messages in thread
From: Arnd Bergmann @ 2019-08-30 19:54 UTC (permalink / raw)
To: Michal Suchanek
Cc: Rich Felker, Linux-sh list, Benjamin Herrenschmidt,
Heiko Carstens, linux-mips, James E.J. Bottomley, Max Filippov,
Guo Ren, H. Peter Anvin, sparclinux, Vincenzo Frascino,
Will Deacon, linux-arch, linux-s390, Yoshinori Sato,
Michael Ellerman, Helge Deller, the arch/x86 maintainers,
Russell King, Christian Borntraeger, Ingo Molnar,
Geert Uytterhoeven
On Fri, Aug 30, 2019 at 9:46 PM Michal Suchanek <msuchanek@suse.de> wrote:
>
> This reverts commit caf6f9c8a326cffd1d4b3ff3f1cfba75d159d70b.
>
> Maybe it was needed after all.
>
> When CONFIG_COMPAT is disabled on ppc64 the kernel does not build.
>
> There is resistance to both removing the llseek syscall from the 64bit
> syscall tables and building the llseek interface unconditionally.
>
> Link: https://lore.kernel.org/lkml/20190828151552.GA16855@infradead.org/
> Link: https://lore.kernel.org/lkml/20190829214319.498c7de2@naga/
>
> Signed-off-by: Michal Suchanek <msuchanek@suse.de>
This seems like the right idea in principle.
> index 5bbf587f5bc1..2f3c4bb138c4 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -331,7 +331,7 @@ COMPAT_SYSCALL_DEFINE3(lseek, unsigned int, fd, compat_off_t, offset, unsigned i
> }
> #endif
>
> -#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT)
> +#ifdef __ARCH_WANT_SYS_LLSEEK
> SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
> unsigned long, offset_low, loff_t __user *, result,
> unsigned int, whence)
However, only reverting the patch will now break all newly added
32-bit architectures that don't define __ARCH_WANT_SYS_LLSEEK:
at least nds32 and riscv32 come to mind, not sure if there is another.
I think the easiest way however would be to combine the two checks
above and make it
#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT) ||
defined(__ARCH_WANT_SYS_LLSEEK)
and then only set __ARCH_WANT_SYS_LLSEEK for powerpc.
Arnd
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAGkTAxsV0zS_E64criQM-WtPKpSyW2PL=+fjACvnx2=m7piwXg@mail.gmail.com>]
* (no subject)
@ 2019-11-15 16:03 Martin Nicolay
2019-11-15 16:29 ` Martin Ågren
0 siblings, 1 reply; 1651+ messages in thread
From: Martin Nicolay @ 2019-11-15 16:03 UTC (permalink / raw)
To: git; +Cc: pclouds, mhagger, gitster, yuelinho777
From 7f5bb47a3b5b7a5443009078d936ab62dfa1fd85 Mon Sep 17 00:00:00 2001
From: Martin Nicolay <m.nicolay@osm-ag.de>
Date: Fri, 15 Nov 2019 16:41:20 +0100
Subject: [PATCH] enable a timeout for hold_lock_file_for_update
The new function get_files_lock_timeout_ms reads the config
core.fileslocktimeout, analogous to get_files_ref_lock_timeout_ms.
This value is used in hold_lock_file_for_update instead of the
fixed value 0.
---
While working with complex scripts invoking git multiple times my
editor detects the changes and calls "git status". This leads to
aborts in "git-stash". With this patch and an appropriate value
core.fileslocktimeout this problem goes away.
lockfile.c | 15 +++++++++++++++
lockfile.h | 4 +++-
2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/lockfile.c b/lockfile.c
index 8e8ab4f29f..ac19de8937 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -145,6 +145,21 @@ static int lock_file_timeout(struct lock_file *lk, const char *path,
}
}
+long get_files_lock_timeout_ms(void)
+{
+ static int configured = 0;
+
+ /* The default timeout is 100 ms: */
+ static int timeout_ms = 100;
+
+ if (!configured) {
+ git_config_get_int("core.fileslocktimeout", &timeout_ms);
+ configured = 1;
+ }
+
+ return timeout_ms;
+}
+
void unable_to_lock_message(const char *path, int err, struct strbuf *buf)
{
if (err == EEXIST) {
diff --git a/lockfile.h b/lockfile.h
index 9843053ce8..a0520e6a7b 100644
--- a/lockfile.h
+++ b/lockfile.h
@@ -163,6 +163,8 @@ int hold_lock_file_for_update_timeout(
struct lock_file *lk, const char *path,
int flags, long timeout_ms);
+long get_files_lock_timeout_ms(void);
+
/*
* Attempt to create a lockfile for the file at `path` and return a
* file descriptor for writing to it, or -1 on error. The flags
@@ -172,7 +174,7 @@ static inline int hold_lock_file_for_update(
struct lock_file *lk, const char *path,
int flags)
{
- return hold_lock_file_for_update_timeout(lk, path, flags, 0);
+ return hold_lock_file_for_update_timeout(lk, path, flags, get_files_lock_timeout_ms() );
}
/*
--
2.13.7
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2019-11-15 16:03 Martin Nicolay
@ 2019-11-15 16:29 ` Martin Ågren
2019-11-15 16:37 ` Re: Martin Ågren
0 siblings, 1 reply; 1651+ messages in thread
From: Martin Ågren @ 2019-11-15 16:29 UTC (permalink / raw)
To: Martin Nicolay
Cc: Git Mailing List, Nguyễn Thái Ngọc Duy,
Michael Haggerty, Junio C Hamano, yuelinho777
Hi Martin
On Fri, 15 Nov 2019 at 17:17, Martin Nicolay <martin@wsmn.osm-gmbh.de> wrote:
> While working with complex scripts invoking git multiple times my
> editor detects the changes and calls "git status". This leads to
> aborts in "git-stash". With this patch and an appropriate value
> core.fileslocktimeout this problem goes away.
Are you able to patch your editor to call
git --no-optional-locks status
instead? See the bottom of git-status(1) ("BACKGROUND REFRESH") for more
on this.
> +long get_files_lock_timeout_ms(void)
> +{
> + static int configured = 0;
> +
> + /* The default timeout is 100 ms: */
> + static int timeout_ms = 100;
> +
> + if (!configured) {
> + git_config_get_int("core.fileslocktimeout", &timeout_ms);
> + configured = 1;
> + }
> +
> + return timeout_ms;
> +}
> +
> @@ -172,7 +174,7 @@ static inline int hold_lock_file_for_update(
> struct lock_file *lk, const char *path,
> int flags)
> {
> - return hold_lock_file_for_update_timeout(lk, path, flags, 0);
> + return hold_lock_file_for_update_timeout(lk, path, flags, get_files_lock_timeout_ms() );
> }
This looks like it changes the default from 0 ("try exactly once") to
100ms. Maybe we should stick with 0 for those who don't jump onto this
new config knob?
Martin
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-11-15 16:29 ` Martin Ågren
@ 2019-11-15 16:37 ` Martin Ågren
0 siblings, 0 replies; 1651+ messages in thread
From: Martin Ågren @ 2019-11-15 16:37 UTC (permalink / raw)
To: Martin Nicolay
Cc: Git Mailing List, Nguyễn Thái Ngọc Duy,
Michael Haggerty, Junio C Hamano, yuelinho777
[Trying with another e-mail address for Martin Nicolay. Maybe the one
from the in-body From header works better. wsmn.osm-gmbh.de couldn't be
found.]
On Fri, 15 Nov 2019 at 17:29, Martin Ågren <martin.agren@gmail.com> wrote:
>
> Hi Martin
>
> On Fri, 15 Nov 2019 at 17:17, Martin Nicolay <martin@wsmn.osm-gmbh.de> wrote:
>
> > While working with complex scripts invoking git multiple times my
> > editor detects the changes and calls "git status". This leads to
> > aborts in "git-stash". With this patch and an appropriate value
> > core.fileslocktimeout this problem goes away.
>
> Are you able to patch your editor to call
> git --no-optional-locks status
> instead? See the bottom of git-status(1) ("BACKGROUND REFRESH") for more
> on this.
>
> > +long get_files_lock_timeout_ms(void)
> > +{
> > + static int configured = 0;
> > +
> > + /* The default timeout is 100 ms: */
> > + static int timeout_ms = 100;
> > +
> > + if (!configured) {
> > + git_config_get_int("core.fileslocktimeout", &timeout_ms);
> > + configured = 1;
> > + }
> > +
> > + return timeout_ms;
> > +}
> > +
>
> > @@ -172,7 +174,7 @@ static inline int hold_lock_file_for_update(
> > struct lock_file *lk, const char *path,
> > int flags)
> > {
> > - return hold_lock_file_for_update_timeout(lk, path, flags, 0);
> > + return hold_lock_file_for_update_timeout(lk, path, flags, get_files_lock_timeout_ms() );
> > }
>
> This looks like it changes the default from 0 ("try exactly once") to
> 100ms. Maybe we should stick with 0 for those who don't jump onto this
> new config knob?
>
> Martin
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20191205030032.GA26925@ray.huang@amd.com>]
* RE:
[not found] <20191205030032.GA26925@ray.huang@amd.com>
@ 2019-12-09 1:26 ` Quan, Evan
0 siblings, 0 replies; 1651+ messages in thread
From: Quan, Evan @ 2019-12-09 1:26 UTC (permalink / raw)
To: Huang, Ray, Wang, Kevin(Yang)
Cc: Deucher, Alexander, amd-gfx@lists.freedesktop.org
I actually do not see any problem with this change.
1. If smu_read_smc_arg() always returns 0, I do not see any meaning in keeping "return 0". Making it a "void" API is more reasonable.
2. Making "WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66, msg);" a separate API is ridiculous while "WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_90, 0);" and "WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82, param);" are not. Actually these three combined together make a real "message sending".
Anyway it's fine with me if you guys can live with the original poor code.
> -----Original Message-----
> From: Huang Rui <ray.huang@amd.com>
> Sent: Thursday, December 5, 2019 11:01 AM
> To: Wang, Kevin(Yang) <Kevin1.Wang@amd.com>
> Cc: Quan, Evan <Evan.Quan@amd.com>; amd-gfx@lists.freedesktop.org;
> Deucher, Alexander <Alexander.Deucher@amd.com>
> Subject:
>
> Bcc:
> Subject: Re: [PATCH 1/2] drm/amd/powerplay: drop unnecessary API wrapper
> and return value
> Reply-To:
> In-Reply-To:
> <MN2PR12MB32961EFFD79528A4EFF4BF5AA25D0@MN2PR12MB3296.nampr
> d12.prod.outlook.com>
>
> On Wed, Dec 04, 2019 at 08:41:00PM +0800, Wang, Kevin(Yang) wrote:
> > [AMD Official Use Only - Internal Distribution Only]
> >
> > this change doesn't make sense. If you really think the return
> > value is useless, it is more reasonable to return the value
> > directly rather than pass it back through an output parameter.
> > I think these two patches make the code look worse, unless there's a
> > bug in it.
> > Adding [1]@Huang, Ray to double check.
> > Best Regards,
> > Kevin
> >
> >
> ________________________________________________________________
> __
> >
> > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> on behalf of Evan
> > Quan <evan.quan@amd.com>
> > Sent: Wednesday, December 4, 2019 5:53 PM
> > To: amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
> > Cc: Quan, Evan <Evan.Quan@amd.com>
> > Subject: [PATCH 1/2] drm/amd/powerplay: drop unnecessary API wrapper
> > and return value
> >
> > Some minor cosmetic fixes.
> > Change-Id: I3ec217289f4cb491720430f2d0b0b4efe5e2b9aa
> > Signed-off-by: Evan Quan <evan.quan@amd.com>
> > ---
> > drivers/gpu/drm/amd/powerplay/amdgpu_smu.c | 12 ++----
> > .../gpu/drm/amd/powerplay/inc/amdgpu_smu.h | 2 +-
> > drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h | 2 +-
> > drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h | 2 +-
> > drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 39 +++++--------------
> > drivers/gpu/drm/amd/powerplay/smu_v12_0.c | 22 ++---------
> > 6 files changed, 19 insertions(+), 60 deletions(-)
> > diff --git a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> > b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> > index 2dd960e85a24..00a0df9b41c9 100644
> > --- a/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> > +++ b/drivers/gpu/drm/amd/powerplay/amdgpu_smu.c
> > @@ -198,9 +198,7 @@ int smu_get_smc_version(struct smu_context *smu,
> > uint32_t *if_version, uint32_t
> > if (ret)
> > return ret;
> >
> > - ret = smu_read_smc_arg(smu, if_version);
> > - if (ret)
> > - return ret;
> > + smu_read_smc_arg(smu, if_version);
> > }
> >
> > if (smu_version) {
> > @@ -208,9 +206,7 @@ int smu_get_smc_version(struct smu_context *smu,
> > uint32_t *if_version, uint32_t
> > if (ret)
> > return ret;
> >
> > - ret = smu_read_smc_arg(smu, smu_version);
> > - if (ret)
> > - return ret;
> > + smu_read_smc_arg(smu, smu_version);
> > }
> >
> > return ret;
> > @@ -339,9 +335,7 @@ int smu_get_dpm_freq_by_index(struct
> smu_context
> > *smu, enum smu_clk_type clk_typ
> > if (ret)
> > return ret;
> >
> > - ret = smu_read_smc_arg(smu, &param);
> > - if (ret)
> > - return ret;
> > + smu_read_smc_arg(smu, &param);
> >
> > /* BIT31: 0 - Fine grained DPM, 1 - Dicrete DPM
> > * now, we un-support it */
> > diff --git a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
> > b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
> > index ca3fdc6777cf..e7b18b209bc7 100644
> > --- a/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
> > +++ b/drivers/gpu/drm/amd/powerplay/inc/amdgpu_smu.h
> > @@ -502,7 +502,7 @@ struct pptable_funcs {
> > int (*system_features_control)(struct smu_context *smu, bool
> > en);
> > int (*send_smc_msg_with_param)(struct smu_context *smu,
> > enum smu_message_type msg,
> > uint32_t param);
> > - int (*read_smc_arg)(struct smu_context *smu, uint32_t *arg);
> > + void (*read_smc_arg)(struct smu_context *smu, uint32_t *arg);
> > int (*init_display_count)(struct smu_context *smu, uint32_t
> > count);
> > int (*set_allowed_mask)(struct smu_context *smu);
> > int (*get_enabled_mask)(struct smu_context *smu, uint32_t
> > *feature_mask, uint32_t num);
> > diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
> > b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
> > index 610e301a5fce..4160147a03f3 100644
> > --- a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
> > +++ b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0.h
> > @@ -183,7 +183,7 @@ smu_v11_0_send_msg_with_param(struct
> smu_context
> > *smu,
> > enum smu_message_type msg,
> > uint32_t param);
> >
> > -int smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg);
> > +void smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg);
> >
> > int smu_v11_0_init_display_count(struct smu_context *smu, uint32_t
> > count);
> >
> > diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h
> > b/drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h
> > index 922973b7e29f..710af2860a8f 100644
> > --- a/drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h
> > +++ b/drivers/gpu/drm/amd/powerplay/inc/smu_v12_0.h
> > @@ -40,7 +40,7 @@ struct smu_12_0_cmn2aisc_mapping {
> > int smu_v12_0_send_msg_without_waiting(struct smu_context *smu,
> > uint16_t msg);
> >
> > -int smu_v12_0_read_arg(struct smu_context *smu, uint32_t *arg);
> > +void smu_v12_0_read_arg(struct smu_context *smu, uint32_t *arg);
> >
> > int smu_v12_0_wait_for_response(struct smu_context *smu);
> >
> > diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
> > b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
> > index 8683e0678b56..325ec4864f90 100644
> > --- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
> > +++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
> > @@ -53,20 +53,11 @@ MODULE_FIRMWARE("amdgpu/navi12_smc.bin");
> >
> > #define SMU11_VOLTAGE_SCALE 4
> >
> > -static int smu_v11_0_send_msg_without_waiting(struct smu_context *smu,
> > - uint16_t msg)
> > -{
> > - struct amdgpu_device *adev = smu->adev;
> > - WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66, msg);
> > - return 0;
> > -}
> > -
> > -int smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg)
> > +void smu_v11_0_read_arg(struct smu_context *smu, uint32_t *arg)
> > {
> > struct amdgpu_device *adev = smu->adev;
> >
> > *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
> > - return 0;
> > }
> >
> > static int smu_v11_0_wait_for_response(struct smu_context *smu)
> > @@ -109,7 +100,7 @@ smu_v11_0_send_msg_with_param(struct
> smu_context
> > *smu,
> >
> > WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82, param);
> >
> > - smu_v11_0_send_msg_without_waiting(smu, (uint16_t)index);
> > + WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66, (uint16_t)index);
> >
> > ret = smu_v11_0_wait_for_response(smu);
> > if (ret)
> > @@ -843,16 +834,12 @@ int smu_v11_0_get_enabled_mask(struct
> smu_context
> > *smu,
> > ret = smu_send_smc_msg(smu,
> > SMU_MSG_GetEnabledSmuFeaturesHigh);
> > if (ret)
> > return ret;
> > - ret = smu_read_smc_arg(smu, &feature_mask_high);
> > - if (ret)
> > - return ret;
> > + smu_read_smc_arg(smu, &feature_mask_high);
> >
> > ret = smu_send_smc_msg(smu,
> SMU_MSG_GetEnabledSmuFeaturesLow);
> > if (ret)
> > return ret;
> > - ret = smu_read_smc_arg(smu, &feature_mask_low);
> > - if (ret)
> > - return ret;
> > + smu_read_smc_arg(smu, &feature_mask_low);
> >
> > feature_mask[0] = feature_mask_low;
> > feature_mask[1] = feature_mask_high;
> > @@ -924,9 +911,7 @@ smu_v11_0_get_max_sustainable_clock(struct
> > smu_context *smu, uint32_t *clock,
> > return ret;
> > }
> >
> > - ret = smu_read_smc_arg(smu, clock);
> > - if (ret)
> > - return ret;
> > + smu_read_smc_arg(smu, clock);
> >
> > if (*clock != 0)
> > return 0;
> > @@ -939,7 +924,7 @@ smu_v11_0_get_max_sustainable_clock(struct
> > smu_context *smu, uint32_t *clock,
> > return ret;
> > }
> >
> > - ret = smu_read_smc_arg(smu, clock);
> > + smu_read_smc_arg(smu, clock);
> >
> > return ret;
> > }
> > @@ -1107,9 +1092,7 @@ int smu_v11_0_get_current_clk_freq(struct
> > smu_context *smu,
> > if (ret)
> > return ret;
> >
> > - ret = smu_read_smc_arg(smu, &freq);
> > - if (ret)
> > - return ret;
> > + smu_read_smc_arg(smu, &freq);
> > }
> >
> > freq *= 100;
> > @@ -1749,18 +1732,14 @@ int smu_v11_0_get_dpm_ultimate_freq(struct
> > smu_context *smu, enum smu_clk_type c
> > ret = smu_send_smc_msg_with_param(smu,
> > SMU_MSG_GetMaxDpmFreq, param);
> > if (ret)
> > goto failed;
> > - ret = smu_read_smc_arg(smu, max);
> > - if (ret)
> > - goto failed;
> > + smu_read_smc_arg(smu, max);
> > }
> >
> > if (min) {
> > ret = smu_send_smc_msg_with_param(smu,
> > SMU_MSG_GetMinDpmFreq, param);
> > if (ret)
> > goto failed;
> > - ret = smu_read_smc_arg(smu, min);
> > - if (ret)
> > - goto failed;
> > + smu_read_smc_arg(smu, min);
> > }
> >
> > failed:
> > diff --git a/drivers/gpu/drm/amd/powerplay/smu_v12_0.c
> > b/drivers/gpu/drm/amd/powerplay/smu_v12_0.c
> > index 269a7d73b58d..7f5f7e12a41e 100644
> > --- a/drivers/gpu/drm/amd/powerplay/smu_v12_0.c
> > +++ b/drivers/gpu/drm/amd/powerplay/smu_v12_0.c
> > @@ -41,21 +41,11 @@
> > #define SMUIO_GFX_MISC_CNTL__PWR_GFXOFF_STATUS_MASK
> > 0x00000006L
> > #define SMUIO_GFX_MISC_CNTL__PWR_GFXOFF_STATUS__SHIFT 0x1
> >
> > -int smu_v12_0_send_msg_without_waiting(struct smu_context *smu,
> > - uint16_t msg)
> > -{
> > - struct amdgpu_device *adev = smu->adev;
> > -
> > - WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66, msg);
> > - return 0;
> > -}
> > -
> > -int smu_v12_0_read_arg(struct smu_context *smu, uint32_t *arg)
> > +void smu_v12_0_read_arg(struct smu_context *smu, uint32_t *arg)
> > {
> > struct amdgpu_device *adev = smu->adev;
> >
> > *arg = RREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82);
> > - return 0;
> > }
> >
> > int smu_v12_0_wait_for_response(struct smu_context *smu)
> > @@ -98,7 +88,7 @@ smu_v12_0_send_msg_with_param(struct
> smu_context
> > *smu,
> >
> > WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_82, param);
> >
> > - smu_v12_0_send_msg_without_waiting(smu, (uint16_t)index);
> > + WREG32_SOC15(MP1, 0, mmMP1_SMN_C2PMSG_66, (uint16_t)index);
>
> The smu_v12_0_send_msg_without_waiting() function is more readable
> than raw register programming.
>
> Thanks,
> Ray
>
> >
> > ret = smu_v12_0_wait_for_response(smu);
> > if (ret)
> > @@ -352,9 +342,7 @@ int smu_v12_0_get_dpm_ultimate_freq(struct
> > smu_context *smu, enum smu_clk_type c
> > pr_err("Attempt to get max GX
> > frequency from SMC Failed !\n");
> > goto failed;
> > }
> > - ret = smu_read_smc_arg(smu, max);
> > - if (ret)
> > - goto failed;
> > + smu_read_smc_arg(smu, max);
> > break;
> > case SMU_UCLK:
> > case SMU_FCLK:
> > @@ -383,9 +371,7 @@ int smu_v12_0_get_dpm_ultimate_freq(struct
> > smu_context *smu, enum smu_clk_type c
> > pr_err("Attempt to get min GX
> > frequency from SMC Failed !\n");
> > goto failed;
> > }
> > - ret = smu_read_smc_arg(smu, min);
> > - if (ret)
> > - goto failed;
> > + smu_read_smc_arg(smu, min);
> > break;
> > case SMU_UCLK:
> > case SMU_FCLK:
> > --
> > 2.24.0
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx@lists.freedesktop.org
> > [2]https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >
> > References
> >
> > 1. mailto:Ray.Huang@amd.com
> > 2. https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2019-12-19 12:31 liming wu
2019-12-20 1:13 ` Andreas Dilger
0 siblings, 1 reply; 1651+ messages in thread
From: liming wu @ 2019-12-19 12:31 UTC (permalink / raw)
To: linux-ext4
Hi,
Can anyone help analyze the following messages, or give me some advice?
I would appreciate it very much.
Dec 17 22:14:42 bdsitdb222 kernel: Buffer I/O error on device dm-7,
logical block 810449
Dec 17 22:14:42 bdsitdb222 kernel: lost page write due to I/O error on dm-7
Dec 17 22:14:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
logical block 283536
Dec 17 22:14:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
Dec 17 22:14:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
logical block 283537
Dec 17 22:14:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
Dec 17 22:14:48 bdsitdb222 kernel: JBD: Detected IO errors while
flushing file data on dm-7
Dec 17 22:15:42 bdsitdb222 kernel: Buffer I/O error on device dm-8,
logical block 127859
Dec 17 22:15:42 bdsitdb222 kernel: lost page write due to I/O error on dm-8
Dec 17 22:15:42 bdsitdb222 kernel: JBD: Detected IO errors while
flushing file data on dm-8
Dec 17 22:15:48 bdsitdb222 kernel: Aborting journal on device dm-7.
Dec 17 22:15:48 bdsitdb222 kernel: EXT3-fs (dm-7): error in
ext3_new_blocks: Journal has aborted
Dec 17 22:15:48 bdsitdb222 kernel: EXT3-fs (dm-7): error in
ext3_reserve_inode_write: Journal has aborted
Dec 17 22:16:42 bdsitdb222 kernel: Aborting journal on device dm-8.
Dec 17 22:16:42 bdsitdb222 kernel: EXT3-fs (dm-7): error:
ext3_journal_start_sb: Detected aborted journal
Dec 17 22:16:42 bdsitdb222 kernel: EXT3-fs (dm-7): error: remounting
filesystem read-only
Dec 17 22:16:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
logical block 23527938
Dec 17 22:16:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
Dec 17 22:16:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
logical block 0
Dec 17 22:16:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
Dec 17 22:16:48 bdsitdb222 kernel: JBD: I/O error detected when
updating journal superblock for dm-7.
Dec 17 22:17:05 bdsitdb222 kernel: EXT3-fs (dm-7): error in
ext3_orphan_add: Journal has aborted
Dec 17 22:17:05 bdsitdb222 kernel: __journal_remove_journal_head:
freeing b_committed_data
plus info:
it's KVM
# uname -a
Linux bdsitdb222 2.6.32-279.19.1.el6.62.x86_64 #6 SMP Mon Dec 3
22:54:25 CST 2018 x86_64 x86_64 x86_64 GNU/Linux
# cat /proc/mounts
rootfs / rootfs rw 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs
rw,nosuid,relatime,size=8157352k,nr_inodes=2039338,mode=755 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,relatime 0 0
/dev/mapper/systemvg-rootlv / ext4 rw,relatime,barrier=1,data=ordered 0 0
/proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
/dev/vda1 /boot ext4 rw,relatime,barrier=1,data=ordered 0 0
/dev/mapper/systemvg-homelv /home ext4 rw,relatime,barrier=1,data=ordered 0 0
/dev/mapper/systemvg-optlv /opt ext3
rw,relatime,errors=continue,barrier=1,data=ordered 0 0
/dev/mapper/systemvg-tmplv /tmp ext3
rw,relatime,errors=continue,barrier=1,data=ordered 0 0
/dev/mapper/systemvg-usrlv /usr ext4 rw,relatime,barrier=1,data=ordered 0 0
/dev/mapper/systemvg-varlv /var ext4 rw,relatime,barrier=1,data=ordered 0 0
/dev/mapper/datavg-datalv /mysql/data ext3
rw,relatime,errors=continue,barrier=1,data=ordered 0 0
/dev/mapper/datavg-binloglv /mysql/binlog ext3
rw,relatime,errors=continue,barrier=1,data=ordered 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
# ll /dev/mapper/
total 0
crw-rw---- 1 root root 10, 58 Dec 19 19:21 control
lrwxrwxrwx 1 root root 7 Dec 19 19:21 datavg-binloglv -> ../dm-3
lrwxrwxrwx 1 root root 7 Dec 19 19:21 datavg-datalv -> ../dm-2
lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-homelv -> ../dm-4
lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-optlv -> ../dm-7
lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-rootlv -> ../dm-1
lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-swaplv -> ../dm-0
lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-tmplv -> ../dm-6
lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-usrlv -> ../dm-8
lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-varlv -> ../dm-5
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2019-12-19 12:31 liming wu
@ 2019-12-20 1:13 ` Andreas Dilger
0 siblings, 0 replies; 1651+ messages in thread
From: Andreas Dilger @ 2019-12-20 1:13 UTC (permalink / raw)
To: liming wu; +Cc: Ext4 Developers List
[-- Attachment #1: Type: text/plain, Size: 4753 bytes --]
These messages indicate your storage is not working properly.
It doesn't have anything to do with ext3/ext4.
> On Dec 19, 2019, at 5:31 AM, liming wu <wu860403@gmail.com> wrote:
>
> Hi
>
>
> Can anyone help analyze the following messages, or give me some advice?
> I would appreciate it very much.
>
> Dec 17 22:14:42 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 810449
> Dec 17 22:14:42 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:14:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 283536
> Dec 17 22:14:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:14:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 283537
> Dec 17 22:14:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:14:48 bdsitdb222 kernel: JBD: Detected IO errors while
> flushing file data on dm-7
> Dec 17 22:15:42 bdsitdb222 kernel: Buffer I/O error on device dm-8,
> logical block 127859
> Dec 17 22:15:42 bdsitdb222 kernel: lost page write due to I/O error on dm-8
> Dec 17 22:15:42 bdsitdb222 kernel: JBD: Detected IO errors while
> flushing file data on dm-8
> Dec 17 22:15:48 bdsitdb222 kernel: Aborting journal on device dm-7.
> Dec 17 22:15:48 bdsitdb222 kernel: EXT3-fs (dm-7): error in
> ext3_new_blocks: Journal has aborted
> Dec 17 22:15:48 bdsitdb222 kernel: EXT3-fs (dm-7): error in
> ext3_reserve_inode_write: Journal has aborted
> Dec 17 22:16:42 bdsitdb222 kernel: Aborting journal on device dm-8.
> Dec 17 22:16:42 bdsitdb222 kernel: EXT3-fs (dm-7): error:
> ext3_journal_start_sb: Detected aborted journal
> Dec 17 22:16:42 bdsitdb222 kernel: EXT3-fs (dm-7): error: remounting
> filesystem read-only
> Dec 17 22:16:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 23527938
> Dec 17 22:16:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:16:48 bdsitdb222 kernel: Buffer I/O error on device dm-7,
> logical block 0
> Dec 17 22:16:48 bdsitdb222 kernel: lost page write due to I/O error on dm-7
> Dec 17 22:16:48 bdsitdb222 kernel: JBD: I/O error detected when
> updating journal superblock for dm-7.
> Dec 17 22:17:05 bdsitdb222 kernel: EXT3-fs (dm-7): error in
> ext3_orphan_add: Journal has aborted
> Dec 17 22:17:05 bdsitdb222 kernel: __journal_remove_journal_head:
> freeing b_committed_data
>
> plus info:
> it's KVM
> # uname -a
> Linux bdsitdb222 2.6.32-279.19.1.el6.62.x86_64 #6 SMP Mon Dec 3
> 22:54:25 CST 2018 x86_64 x86_64 x86_64 GNU/Linux
>
> # cat /proc/mounts
> rootfs / rootfs rw 0 0
> proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
> sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
> devtmpfs /dev devtmpfs
> rw,nosuid,relatime,size=8157352k,nr_inodes=2039338,mode=755 0 0
> devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
> tmpfs /dev/shm tmpfs rw,nosuid,nodev,relatime 0 0
> /dev/mapper/systemvg-rootlv / ext4 rw,relatime,barrier=1,data=ordered 0 0
> /proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
> /dev/vda1 /boot ext4 rw,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-homelv /home ext4 rw,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-optlv /opt ext3
> rw,relatime,errors=continue,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-tmplv /tmp ext3
> rw,relatime,errors=continue,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-usrlv /usr ext4 rw,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/systemvg-varlv /var ext4 rw,relatime,barrier=1,data=ordered 0 0
> /dev/mapper/datavg-datalv /mysql/data ext3
> rw,relatime,errors=continue,barrier=1,data=ordered 0 0
> /dev/mapper/datavg-binloglv /mysql/binlog ext3
> rw,relatime,errors=continue,barrier=1,data=ordered 0 0
> none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
> none /sys/kernel/debug debugfs rw,relatime 0 0
>
> # ll /dev/mapper/
> total 0
> crw-rw---- 1 root root 10, 58 Dec 19 19:21 control
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 datavg-binloglv -> ../dm-3
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 datavg-datalv -> ../dm-2
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-homelv -> ../dm-4
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-optlv -> ../dm-7
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-rootlv -> ../dm-1
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-swaplv -> ../dm-0
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-tmplv -> ../dm-6
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-usrlv -> ../dm-8
> lrwxrwxrwx 1 root root 7 Dec 19 19:21 systemvg-varlv -> ../dm-5
Cheers, Andreas
[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 873 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
@ 2019-12-20 17:06 Ben Dooks
2019-12-22 17:08 ` Dmitry Osipenko
0 siblings, 1 reply; 1651+ messages in thread
From: Ben Dooks @ 2019-12-20 17:06 UTC (permalink / raw)
To: Dmitry Osipenko, Jon Hunter, linux-tegra, alsa-devel,
Jaroslav Kysela, Takashi Iwai, Liam Girdwood, Mark Brown,
Thierry Reding
Cc: linux-kernel, Edward Cragg
On 20/12/2019 16:40, Dmitry Osipenko wrote:
> 20.12.2019 18:25, Ben Dooks wrote:
>> On 20/12/2019 15:02, Dmitry Osipenko wrote:
>>> 20.12.2019 17:56, Ben Dooks wrote:
>>>> On 20/12/2019 14:43, Dmitry Osipenko wrote:
>>>>> 20.12.2019 16:57, Jon Hunter wrote:
>>>>>>
>>>>>> On 20/12/2019 11:38, Ben Dooks wrote:
>>>>>>> On 20/12/2019 11:30, Jon Hunter wrote:
>>>>>>>>
>>>>>>>> On 25/11/2019 17:28, Dmitry Osipenko wrote:
>>>>>>>>> 25.11.2019 20:22, Dmitry Osipenko wrote:
>>>>>>>>>> 25.11.2019 13:37, Ben Dooks wrote:
>>>>>>>>>>> On 23/11/2019 21:09, Dmitry Osipenko wrote:
>>>>>>>>>>>> 18.10.2019 18:48, Ben Dooks wrote:
>>>>>>>>>>>>> From: Edward Cragg <edward.cragg@codethink.co.uk>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The tegra3 audio can support 24 and 32 bit sample sizes so add
>>>>>>>>>>>>> the
>>>>>>>>>>>>> option to the tegra30_i2s_hw_params to configure the S24_LE or
>>>>>>>>>>>>> S32_LE
>>>>>>>>>>>>> formats when requested.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Edward Cragg <edward.cragg@codethink.co.uk>
>>>>>>>>>>>>> [ben.dooks@codethink.co.uk: fixup merge of 24 and 32bit]
>>>>>>>>>>>>> [ben.dooks@codethink.co.uk: add pm calls around ytdm config]
>>>>>>>>>>>>> [ben.dooks@codethink.co.uk: drop debug printing to dev_dbg]
>>>>>>>>>>>>> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> squash 5aeca5a055fd ASoC: tegra: i2s: pm_runtime_get_sync() is
>>>>>>>>>>>>> needed
>>>>>>>>>>>>> in tdm code
>>>>>>>>>>>>>
>>>>>>>>>>>>> ASoC: tegra: i2s: pm_runtime_get_sync() is needed in tdm code
>>>>>>>>>>>>> ---
>>>>>>>>>>>>> sound/soc/tegra/tegra30_i2s.c | 25
>>>>>>>>>>>>> ++++++++++++++++++++-----
>>>>>>>>>>>>> 1 file changed, 20 insertions(+), 5 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/sound/soc/tegra/tegra30_i2s.c
>>>>>>>>>>>>> b/sound/soc/tegra/tegra30_i2s.c
>>>>>>>>>>>>> index 73f0dddeaef3..063f34c882af 100644
>>>>>>>>>>>>> --- a/sound/soc/tegra/tegra30_i2s.c
>>>>>>>>>>>>> +++ b/sound/soc/tegra/tegra30_i2s.c
>>>>>>>>>>>>> @@ -127,7 +127,7 @@ static int tegra30_i2s_hw_params(struct
>>>>>>>>>>>>> snd_pcm_substream *substream,
>>>>>>>>>>>>> struct device *dev = dai->dev;
>>>>>>>>>>>>> struct tegra30_i2s *i2s =
>>>>>>>>>>>>> snd_soc_dai_get_drvdata(dai);
>>>>>>>>>>>>> unsigned int mask, val, reg;
>>>>>>>>>>>>> - int ret, sample_size, srate, i2sclock, bitcnt;
>>>>>>>>>>>>> + int ret, sample_size, srate, i2sclock, bitcnt, audio_bits;
>>>>>>>>>>>>> struct tegra30_ahub_cif_conf cif_conf;
>>>>>>>>>>>>> if (params_channels(params) != 2)
>>>>>>>>>>>>> @@ -137,8 +137,19 @@ static int tegra30_i2s_hw_params(struct
>>>>>>>>>>>>> snd_pcm_substream *substream,
>>>>>>>>>>>>> switch (params_format(params)) {
>>>>>>>>>>>>> case SNDRV_PCM_FORMAT_S16_LE:
>>>>>>>>>>>>> val = TEGRA30_I2S_CTRL_BIT_SIZE_16;
>>>>>>>>>>>>> + audio_bits = TEGRA30_AUDIOCIF_BITS_16;
>>>>>>>>>>>>> sample_size = 16;
>>>>>>>>>>>>> break;
>>>>>>>>>>>>> + case SNDRV_PCM_FORMAT_S24_LE:
>>>>>>>>>>>>> + val = TEGRA30_I2S_CTRL_BIT_SIZE_24;
>>>>>>>>>>>>> + audio_bits = TEGRA30_AUDIOCIF_BITS_24;
>>>>>>>>>>>>> + sample_size = 24;
>>>>>>>>>>>>> + break;
>>>>>>>>>>>>> + case SNDRV_PCM_FORMAT_S32_LE:
>>>>>>>>>>>>> + val = TEGRA30_I2S_CTRL_BIT_SIZE_32;
>>>>>>>>>>>>> + audio_bits = TEGRA30_AUDIOCIF_BITS_32;
>>>>>>>>>>>>> + sample_size = 32;
>>>>>>>>>>>>> + break;
>>>>>>>>>>>>> default:
>>>>>>>>>>>>> return -EINVAL;
>>>>>>>>>>>>> }
>>>>>>>>>>>>> @@ -170,8 +181,8 @@ static int tegra30_i2s_hw_params(struct
>>>>>>>>>>>>> snd_pcm_substream *substream,
>>>>>>>>>>>>> cif_conf.threshold = 0;
>>>>>>>>>>>>> cif_conf.audio_channels = 2;
>>>>>>>>>>>>> cif_conf.client_channels = 2;
>>>>>>>>>>>>> - cif_conf.audio_bits = TEGRA30_AUDIOCIF_BITS_16;
>>>>>>>>>>>>> - cif_conf.client_bits = TEGRA30_AUDIOCIF_BITS_16;
>>>>>>>>>>>>> + cif_conf.audio_bits = audio_bits;
>>>>>>>>>>>>> + cif_conf.client_bits = audio_bits;
>>>>>>>>>>>>> cif_conf.expand = 0;
>>>>>>>>>>>>> cif_conf.stereo_conv = 0;
>>>>>>>>>>>>> cif_conf.replicate = 0;
>>>>>>>>>>>>> @@ -306,14 +317,18 @@ static const struct snd_soc_dai_driver
>>>>>>>>>>>>> tegra30_i2s_dai_template = {
>>>>>>>>>>>>> .channels_min = 2,
>>>>>>>>>>>>> .channels_max = 2,
>>>>>>>>>>>>> .rates = SNDRV_PCM_RATE_8000_96000,
>>>>>>>>>>>>> - .formats = SNDRV_PCM_FMTBIT_S16_LE,
>>>>>>>>>>>>> + .formats = SNDRV_PCM_FMTBIT_S32_LE |
>>>>>>>>>>>>> + SNDRV_PCM_FMTBIT_S24_LE |
>>>>>>>>>>>>> + SNDRV_PCM_FMTBIT_S16_LE,
>>>>>>>>>>>>> },
>>>>>>>>>>>>> .capture = {
>>>>>>>>>>>>> .stream_name = "Capture",
>>>>>>>>>>>>> .channels_min = 2,
>>>>>>>>>>>>> .channels_max = 2,
>>>>>>>>>>>>> .rates = SNDRV_PCM_RATE_8000_96000,
>>>>>>>>>>>>> - .formats = SNDRV_PCM_FMTBIT_S16_LE,
>>>>>>>>>>>>> + .formats = SNDRV_PCM_FMTBIT_S32_LE |
>>>>>>>>>>>>> + SNDRV_PCM_FMTBIT_S24_LE |
>>>>>>>>>>>>> + SNDRV_PCM_FMTBIT_S16_LE,
>>>>>>>>>>>>> },
>>>>>>>>>>>>> .ops = &tegra30_i2s_dai_ops,
>>>>>>>>>>>>> .symmetric_rates = 1,
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> This patch breaks audio on Tegra30. I don't see errors
>>>>>>>>>>>> anywhere, but
>>>>>>>>>>>> there is no audio and reverting this patch helps. Please fix it.
>>>>>>>>>>>
>>>>>>>>>>> What is the failure mode? I can try and take a look at this some
>>>>>>>>>>> time
>>>>>>>>>>> this week, but I am not sure if I have any boards with an actual
>>>>>>>>>>> useful
>>>>>>>>>>> audio output?
>>>>>>>>>>
>>>>>>>>>> The failure mode is that there no sound. I also noticed that video
>>>>>>>>>> playback stutters a lot if movie file has audio track, seems
>>>>>>>>>> something
>>>>>>>>>> times out during of the audio playback. For now I don't have any
>>>>>>>>>> more info.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Oh, I didn't say how to reproduce it.. for example simply playing
>>>>>>>>> big_buck_bunny_720p_h264.mov in MPV has the audio problem.
>>>>>>>>>
>>>>>>>>> https://download.blender.org/peach/bigbuckbunny_movies/big_buck_bunny_720p_h264.mov
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> Given that the audio drivers uses regmap, it could be good to
>>>>>>>> dump the
>>>>>>>> I2S/AHUB registers while the clip if playing with and without this
>>>>>>>> patch
>>>>>>>> to see the differences. I am curious if the audio is now being
>>>>>>>> played as
>>>>>>>> 24 or 32-bit instead of 16-bit now these are available.
>>>>>>>>
>>>>>>>> You could also dump the hw_params to see the format while playing as
>>>>>>>> well ...
>>>>>>>>
>>>>>>>> $ /proc/asound/<scard-name>/pcm0p/sub0/hw_params
>>>>>>>
>>>>>>> I suppose it is also possible that the codec isn't properly doing >16
>>>>>>> bits and the fact we now offer 24 and 32 could be an issue. I've not
>>>>>>> got anything with an audio output on it that would be easy to test.
>>>>>>
>>>>>> I thought I had tested on a Jetson TK1 (Tegra124) but it was sometime
>>>>>> back. However, admittedly I may have only done simple 16-bit testing
>>>>>> with speaker-test.
>>>>>>
>>>>>> We do verify that all soundcards are detected and registered as
>>>>>> expected
>>>>>> during daily testing, but at the moment we don't have anything that
>>>>>> verifies actual playback.
>>>>>
>>>>> Please take a look at the attached logs.
>>>>
>>>> I wonder if we are falling into FIFO configuration issues with the
>>>> non-16 bit case.
>>>>
>>>> Can you try adding the following two patches?
>>>
>>> It is much better now! There is no horrible noise and no stuttering, but
>>> audio still has a "robotic" effect, like freq isn't correct.
>>
>> I wonder if there's an issue with FIFO stocking? I should possibly
>> also add the "correctly stop FIFOs" patch as well.
>
> I'll be happy to try more patches.
>
> Meanwhile sound on v5.5+ is broken and thus the incomplete patches
> should be reverted.
Have you checked whether just removing the 24/32-bit entries from
.formats in the dai template makes the problem go away? I will try to
find the code that does the FIFO level checking and see whether the
problem is in the FIFO fill or somewhere else in the audio hub setup.
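For reference, the format-to-sample-size mapping that the patch under discussion adds to tegra30_i2s_hw_params() can be modeled as a small standalone C function. This is a hypothetical sketch — the enum values below are stand-ins, not the real SNDRV_PCM_FORMAT_* or TEGRA30_* constants — but it captures the behaviour being tested: three supported formats, everything else rejected.

```c
/* Stand-in format identifiers (not the real SNDRV_PCM_FORMAT_* values). */
enum pcm_format { FMT_S16_LE, FMT_S24_LE, FMT_S32_LE, FMT_OTHER };

/* Mirrors the hw_params switch: each supported ALSA sample format maps
 * to an I2S bit size; unsupported formats return -1 (the driver itself
 * returns -EINVAL). */
static int sample_size_for(enum pcm_format fmt)
{
	switch (fmt) {
	case FMT_S16_LE: return 16;
	case FMT_S24_LE: return 24;
	case FMT_S32_LE: return 32;
	default:         return -1;
	}
}
```

Dropping the 24/32-bit entries from .formats, as suggested above, would make userspace negotiate only the 16-bit case of this mapping, which is a quick way to isolate whether the new paths are at fault.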
--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius
https://www.codethink.co.uk/privacy.html
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2019-12-20 17:06 [alsa-devel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples Ben Dooks
@ 2019-12-22 17:08 ` Dmitry Osipenko
2020-01-05 0:04 ` Ben Dooks
0 siblings, 1 reply; 1651+ messages in thread
From: Dmitry Osipenko @ 2019-12-22 17:08 UTC (permalink / raw)
To: Ben Dooks, Jon Hunter, linux-tegra, alsa-devel, Jaroslav Kysela,
Takashi Iwai, Liam Girdwood, Mark Brown, Thierry Reding
Cc: linux-kernel, Edward Cragg
20.12.2019 20:06, Ben Dooks wrote:
> On 20/12/2019 16:40, Dmitry Osipenko wrote:
>> 20.12.2019 18:25, Ben Dooks wrote:
>>> On 20/12/2019 15:02, Dmitry Osipenko wrote:
>>>> 20.12.2019 17:56, Ben Dooks wrote:
>>>>> On 20/12/2019 14:43, Dmitry Osipenko wrote:
>>>>>> 20.12.2019 16:57, Jon Hunter wrote:
>>>>>>>
>>>>>>> On 20/12/2019 11:38, Ben Dooks wrote:
>>>>>>>> On 20/12/2019 11:30, Jon Hunter wrote:
>>>>>>>>>
>>>>>>>>> On 25/11/2019 17:28, Dmitry Osipenko wrote:
>>>>>>>>>> 25.11.2019 20:22, Dmitry Osipenko wrote:
>>>>>>>>>>> 25.11.2019 13:37, Ben Dooks wrote:
>>>>>>>>>>>> On 23/11/2019 21:09, Dmitry Osipenko wrote:
>>>>>>>>>>>>> 18.10.2019 18:48, Ben Dooks wrote:
>>>>>>>>>>>>>> From: Edward Cragg <edward.cragg@codethink.co.uk>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The tegra3 audio can support 24 and 32 bit sample sizes so
>>>>>>>>>>>>>> add
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> option to the tegra30_i2s_hw_params to configure the
>>>>>>>>>>>>>> S24_LE or
>>>>>>>>>>>>>> S32_LE
>>>>>>>>>>>>>> formats when requested.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Edward Cragg <edward.cragg@codethink.co.uk>
>>>>>>>>>>>>>> [ben.dooks@codethink.co.uk: fixup merge of 24 and 32bit]
>>>>>>>>>>>>>> [ben.dooks@codethink.co.uk: add pm calls around ytdm config]
>>>>>>>>>>>>>> [ben.dooks@codethink.co.uk: drop debug printing to dev_dbg]
>>>>>>>>>>>>>> Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>> squash 5aeca5a055fd ASoC: tegra: i2s:
>>>>>>>>>>>>>> pm_runtime_get_sync() is
>>>>>>>>>>>>>> needed
>>>>>>>>>>>>>> in tdm code
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ASoC: tegra: i2s: pm_runtime_get_sync() is needed in tdm code
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>> sound/soc/tegra/tegra30_i2s.c | 25
>>>>>>>>>>>>>> ++++++++++++++++++++-----
>>>>>>>>>>>>>> 1 file changed, 20 insertions(+), 5 deletions(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/sound/soc/tegra/tegra30_i2s.c
>>>>>>>>>>>>>> b/sound/soc/tegra/tegra30_i2s.c
>>>>>>>>>>>>>> index 73f0dddeaef3..063f34c882af 100644
>>>>>>>>>>>>>> --- a/sound/soc/tegra/tegra30_i2s.c
>>>>>>>>>>>>>> +++ b/sound/soc/tegra/tegra30_i2s.c
>>>>>>>>>>>>>> @@ -127,7 +127,7 @@ static int tegra30_i2s_hw_params(struct
>>>>>>>>>>>>>> snd_pcm_substream *substream,
>>>>>>>>>>>>>> struct device *dev = dai->dev;
>>>>>>>>>>>>>> struct tegra30_i2s *i2s =
>>>>>>>>>>>>>> snd_soc_dai_get_drvdata(dai);
>>>>>>>>>>>>>> unsigned int mask, val, reg;
>>>>>>>>>>>>>> - int ret, sample_size, srate, i2sclock, bitcnt;
>>>>>>>>>>>>>> + int ret, sample_size, srate, i2sclock, bitcnt,
>>>>>>>>>>>>>> audio_bits;
>>>>>>>>>>>>>> struct tegra30_ahub_cif_conf cif_conf;
>>>>>>>>>>>>>> if (params_channels(params) != 2)
>>>>>>>>>>>>>> @@ -137,8 +137,19 @@ static int tegra30_i2s_hw_params(struct
>>>>>>>>>>>>>> snd_pcm_substream *substream,
>>>>>>>>>>>>>> switch (params_format(params)) {
>>>>>>>>>>>>>> case SNDRV_PCM_FORMAT_S16_LE:
>>>>>>>>>>>>>> val = TEGRA30_I2S_CTRL_BIT_SIZE_16;
>>>>>>>>>>>>>> + audio_bits = TEGRA30_AUDIOCIF_BITS_16;
>>>>>>>>>>>>>> sample_size = 16;
>>>>>>>>>>>>>> break;
>>>>>>>>>>>>>> + case SNDRV_PCM_FORMAT_S24_LE:
>>>>>>>>>>>>>> + val = TEGRA30_I2S_CTRL_BIT_SIZE_24;
>>>>>>>>>>>>>> + audio_bits = TEGRA30_AUDIOCIF_BITS_24;
>>>>>>>>>>>>>> + sample_size = 24;
>>>>>>>>>>>>>> + break;
>>>>>>>>>>>>>> + case SNDRV_PCM_FORMAT_S32_LE:
>>>>>>>>>>>>>> + val = TEGRA30_I2S_CTRL_BIT_SIZE_32;
>>>>>>>>>>>>>> + audio_bits = TEGRA30_AUDIOCIF_BITS_32;
>>>>>>>>>>>>>> + sample_size = 32;
>>>>>>>>>>>>>> + break;
>>>>>>>>>>>>>> default:
>>>>>>>>>>>>>> return -EINVAL;
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>> @@ -170,8 +181,8 @@ static int tegra30_i2s_hw_params(struct
>>>>>>>>>>>>>> snd_pcm_substream *substream,
>>>>>>>>>>>>>> cif_conf.threshold = 0;
>>>>>>>>>>>>>> cif_conf.audio_channels = 2;
>>>>>>>>>>>>>> cif_conf.client_channels = 2;
>>>>>>>>>>>>>> - cif_conf.audio_bits = TEGRA30_AUDIOCIF_BITS_16;
>>>>>>>>>>>>>> - cif_conf.client_bits = TEGRA30_AUDIOCIF_BITS_16;
>>>>>>>>>>>>>> + cif_conf.audio_bits = audio_bits;
>>>>>>>>>>>>>> + cif_conf.client_bits = audio_bits;
>>>>>>>>>>>>>> cif_conf.expand = 0;
>>>>>>>>>>>>>> cif_conf.stereo_conv = 0;
>>>>>>>>>>>>>> cif_conf.replicate = 0;
>>>>>>>>>>>>>> @@ -306,14 +317,18 @@ static const struct snd_soc_dai_driver
>>>>>>>>>>>>>> tegra30_i2s_dai_template = {
>>>>>>>>>>>>>> .channels_min = 2,
>>>>>>>>>>>>>> .channels_max = 2,
>>>>>>>>>>>>>> .rates = SNDRV_PCM_RATE_8000_96000,
>>>>>>>>>>>>>> - .formats = SNDRV_PCM_FMTBIT_S16_LE,
>>>>>>>>>>>>>> + .formats = SNDRV_PCM_FMTBIT_S32_LE |
>>>>>>>>>>>>>> + SNDRV_PCM_FMTBIT_S24_LE |
>>>>>>>>>>>>>> + SNDRV_PCM_FMTBIT_S16_LE,
>>>>>>>>>>>>>> },
>>>>>>>>>>>>>> .capture = {
>>>>>>>>>>>>>> .stream_name = "Capture",
>>>>>>>>>>>>>> .channels_min = 2,
>>>>>>>>>>>>>> .channels_max = 2,
>>>>>>>>>>>>>> .rates = SNDRV_PCM_RATE_8000_96000,
>>>>>>>>>>>>>> - .formats = SNDRV_PCM_FMTBIT_S16_LE,
>>>>>>>>>>>>>> + .formats = SNDRV_PCM_FMTBIT_S32_LE |
>>>>>>>>>>>>>> + SNDRV_PCM_FMTBIT_S24_LE |
>>>>>>>>>>>>>> + SNDRV_PCM_FMTBIT_S16_LE,
>>>>>>>>>>>>>> },
>>>>>>>>>>>>>> .ops = &tegra30_i2s_dai_ops,
>>>>>>>>>>>>>> .symmetric_rates = 1,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> This patch breaks audio on Tegra30. I don't see errors
>>>>>>>>>>>>> anywhere, but
>>>>>>>>>>>>> there is no audio and reverting this patch helps. Please
>>>>>>>>>>>>> fix it.
>>>>>>>>>>>>
>>>>>>>>>>>> What is the failure mode? I can try and take a look at this
>>>>>>>>>>>> some
>>>>>>>>>>>> time
>>>>>>>>>>>> this week, but I am not sure if I have any boards with an
>>>>>>>>>>>> actual
>>>>>>>>>>>> useful
>>>>>>>>>>>> audio output?
>>>>>>>>>>>
>>>>>>>>>>> The failure mode is that there is no sound. I also noticed that
>>>>>>>>>>> video playback stutters a lot if a movie file has an audio track;
>>>>>>>>>>> it seems something times out during the audio playback. For now I
>>>>>>>>>>> don't have any more info.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Oh, I didn't say how to reproduce it... for example, simply playing
>>>>>>>>>> big_buck_bunny_720p_h264.mov in MPV shows the audio problem.
>>>>>>>>>>
>>>>>>>>>> https://download.blender.org/peach/bigbuckbunny_movies/big_buck_bunny_720p_h264.mov
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Given that the audio driver uses regmap, it could be good to dump
>>>>>>>>> the I2S/AHUB registers while the clip is playing, with and without
>>>>>>>>> this patch, to see the differences. I am curious if the audio is now
>>>>>>>>> being played as 24 or 32-bit instead of 16-bit now that these are
>>>>>>>>> available.
>>>>>>>>>
>>>>>>>>> You could also dump the hw_params to see the format while
>>>>>>>>> playing as
>>>>>>>>> well ...
>>>>>>>>>
>>>>>>>>> $ cat /proc/asound/<card-name>/pcm0p/sub0/hw_params
>>>>>>>>
>>>>>>>> I suppose it is also possible that the codec isn't properly
>>>>>>>> doing >16
>>>>>>>> bits and the fact we now offer 24 and 32 could be an issue. I've
>>>>>>>> not
>>>>>>>> got anything with an audio output on it that would be easy to test.
>>>>>>>
>>>>>>> I thought I had tested on a Jetson TK1 (Tegra124) but it was
>>>>>>> sometime
>>>>>>> back. However, admittedly I may have only done simple 16-bit testing
>>>>>>> with speaker-test.
>>>>>>>
>>>>>>> We do verify that all soundcards are detected and registered as
>>>>>>> expected
>>>>>>> during daily testing, but at the moment we don't have anything that
>>>>>>> verifies actual playback.
>>>>>>
>>>>>> Please take a look at the attached logs.
>>>>>
>>>>> I wonder if we are falling into FIFO configuration issues with the
>>>>> non-16 bit case.
>>>>>
>>>>> Can you try adding the following two patches?
>>>>
>>>> It is much better now! There is no horrible noise and no stuttering,
>>>> but
>>>> audio still has a "robotic" effect, like freq isn't correct.
>>>
>>> I wonder if there's an issue with FIFO stoking? I should also possibly
>>> add the patch that correctly stops the FIFOs as well.
>>
>> I'll be happy to try more patches.
>>
>> Meanwhile sound on v5.5+ is broken and thus the incomplete patches
>> should be reverted.
>
> Have you tried just removing the 24/32 bit formats from .formats in
> the dai template to see if the problem continues?
It works.
> I will try and
> see if I can find the code that does the fifo level checking and
> see if the problem is in the FIFO fill or something else in the
> audio hub setup.
Ok
_______________________________________________
Alsa-devel mailing list
Alsa-devel@alsa-project.org
https://mailman.alsa-project.org/mailman/listinfo/alsa-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2019-12-22 17:08 ` Dmitry Osipenko
@ 2020-01-05 0:04 ` Ben Dooks
2020-01-05 1:48 ` Dmitry Osipenko
0 siblings, 1 reply; 1651+ messages in thread
From: Ben Dooks @ 2020-01-05 0:04 UTC (permalink / raw)
To: Dmitry Osipenko, Jon Hunter, linux-tegra, alsa-devel,
Jaroslav Kysela, Takashi Iwai, Liam Girdwood, Mark Brown,
Thierry Reding
Cc: linux-kernel, Edward Cragg
[snip]
I've just gone through testing.
Some simple data tests show 16 and 32-bits work.
The 24 bit case seems to be weird, it looks like the 24-bit expects
24 bit samples in 32 bit words. I can't see any packing options to
do 24 bit in 24 bit, so we may have to remove 24 bit sample support
(which is a shame)
My preference is to remove the 24-bit support and keep the 32 bit in.
--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius
https://www.codethink.co.uk/privacy.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-05 0:04 ` Ben Dooks
@ 2020-01-05 1:48 ` Dmitry Osipenko
2020-01-05 10:53 ` Ben Dooks
0 siblings, 1 reply; 1651+ messages in thread
From: Dmitry Osipenko @ 2020-01-05 1:48 UTC (permalink / raw)
To: Ben Dooks, Jon Hunter, linux-tegra, alsa-devel, Jaroslav Kysela,
Takashi Iwai, Liam Girdwood, Mark Brown, Thierry Reding
Cc: linux-kernel, Edward Cragg
05.01.2020 03:04, Ben Dooks wrote:
> [snip]
>
> I've just gone through testing.
>
> Some simple data tests show 16 and 32-bits work.
>
> The 24 bit case seems to be weird, it looks like the 24-bit expects
> 24 bit samples in 32 bit words. I can't see any packing options to
> do 24 bit in 24 bit, so we may have to remove 24 bit sample support
> (which is a shame)
>
> My preference is to remove the 24-bit support and keep the 32 bit in.
>
Interesting.. Jon, could you please confirm that 24bit format isn't
usable on T30?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-05 1:48 ` Dmitry Osipenko
@ 2020-01-05 10:53 ` Ben Dooks
2020-01-06 19:00 ` [alsa-devel] [Linux-kernel] " Ben Dooks
0 siblings, 1 reply; 1651+ messages in thread
From: Ben Dooks @ 2020-01-05 10:53 UTC (permalink / raw)
To: Dmitry Osipenko
Cc: linux-kernel, alsa-devel, Takashi Iwai, Liam Girdwood, Mark Brown,
Thierry Reding, Edward Cragg, linux-tegra, Jon Hunter
On 2020-01-05 01:48, Dmitry Osipenko wrote:
> 05.01.2020 03:04, Ben Dooks wrote:
>> [snip]
>>
>> I've just gone through testing.
>>
>> Some simple data tests show 16 and 32-bits work.
>>
>> The 24 bit case seems to be weird, it looks like the 24-bit expects
>> 24 bit samples in 32 bit words. I can't see any packing options to
>> do 24 bit in 24 bit, so we may have to remove 24 bit sample support
>> (which is a shame)
>>
>> My preference is to remove the 24-bit support and keep the 32 bit in.
>>
>
> Interesting.. Jon, could you please confirm that 24bit format isn't
> usable on T30?
If there is an option of 24 packed into 32, then I think that would
work.
I can try testing that with raw data on Monday.
--
Ben
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-05 10:53 ` Ben Dooks
@ 2020-01-06 19:00 ` Ben Dooks
2020-01-07 1:39 ` Dmitry Osipenko
0 siblings, 1 reply; 1651+ messages in thread
From: Ben Dooks @ 2020-01-06 19:00 UTC (permalink / raw)
To: Dmitry Osipenko
Cc: linux-kernel, alsa-devel, Liam Girdwood, Takashi Iwai, Mark Brown,
Thierry Reding, Edward Cragg, linux-tegra, Jon Hunter
On 05/01/2020 10:53, Ben Dooks wrote:
>
>
> On 2020-01-05 01:48, Dmitry Osipenko wrote:
>> 05.01.2020 03:04, Ben Dooks wrote:
>>> [snip]
>>>
>>> I've just gone through testing.
>>>
>>> Some simple data tests show 16 and 32-bits work.
>>>
>>> The 24 bit case seems to be weird, it looks like the 24-bit expects
>>> 24 bit samples in 32 bit words. I can't see any packing options to
>>> do 24 bit in 24 bit, so we may have to remove 24 bit sample support
>>> (which is a shame)
>>>
>>> My preference is to remove the 24-bit support and keep the 32 bit in.
>>>
>>
>> Interesting.. Jon, could you please confirm that 24bit format isn't
>> usable on T30?
>
> If there is an option of 24 packed into 32, then I think that would work.
>
> I can try testing that with raw data on Monday.
I need to check some things, I assumed 24 was 24 packed bits, it looks
like the default is 24 in 32 bits so we may be ok. However I need to
re-write my test case which assumed it was 24bits in 3 bytes (S24_3LE).
I'll follow up later,
--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius
https://www.codethink.co.uk/privacy.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-06 19:00 ` [alsa-devel] [Linux-kernel] " Ben Dooks
@ 2020-01-07 1:39 ` Dmitry Osipenko
2020-01-23 19:38 ` Ben Dooks
0 siblings, 1 reply; 1651+ messages in thread
From: Dmitry Osipenko @ 2020-01-07 1:39 UTC (permalink / raw)
To: Ben Dooks
Cc: linux-kernel, alsa-devel, Liam Girdwood, Takashi Iwai, Mark Brown,
Thierry Reding, Edward Cragg, linux-tegra, Jon Hunter
06.01.2020 22:00, Ben Dooks wrote:
> On 05/01/2020 10:53, Ben Dooks wrote:
>>
>>
>> On 2020-01-05 01:48, Dmitry Osipenko wrote:
>>> 05.01.2020 03:04, Ben Dooks wrote:
>>>> [snip]
>>>>
>>>> I've just gone through testing.
>>>>
>>>> Some simple data tests show 16 and 32-bits work.
>>>>
>>>> The 24 bit case seems to be weird, it looks like the 24-bit expects
>>>> 24 bit samples in 32 bit words. I can't see any packing options to
>>>> do 24 bit in 24 bit, so we may have to remove 24 bit sample support
>>>> (which is a shame)
>>>>
>>>> My preference is to remove the 24-bit support and keep the 32 bit in.
>>>>
>>>
>>> Interesting.. Jon, could you please confirm that 24bit format isn't
>>> usable on T30?
>>
>> If there is an option of 24 packed into 32, then I think that would work.
>>
>> I can try testing that with raw data on Monday.
>
> I need to check some things, I assumed 24 was 24 packed bits, it looks
> like the default is 24 in 32 bits so we may be ok. However I need to
> re-write my test case which assumed it was 24bits in 3 bytes (S24_3LE).
>
> I'll follow up later,
Okay, the S24_3LE isn't supported by RT5640 codec in my case. I briefly
looked through the TRM doc and got impression that AHUB could re-pack
data stream into something that codec supports, but maybe it's a wrong
impression.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-07 1:39 ` Dmitry Osipenko
@ 2020-01-23 19:38 ` Ben Dooks
2020-01-24 16:50 ` Jon Hunter
0 siblings, 1 reply; 1651+ messages in thread
From: Ben Dooks @ 2020-01-23 19:38 UTC (permalink / raw)
To: Dmitry Osipenko
Cc: linux-kernel, alsa-devel, Liam Girdwood, Takashi Iwai, Mark Brown,
Thierry Reding, Edward Cragg, linux-tegra, Jon Hunter
On 07/01/2020 01:39, Dmitry Osipenko wrote:
> 06.01.2020 22:00, Ben Dooks wrote:
>> On 05/01/2020 10:53, Ben Dooks wrote:
>>>
>>>
>>> On 2020-01-05 01:48, Dmitry Osipenko wrote:
>>>> 05.01.2020 03:04, Ben Dooks wrote:
>>>>> [snip]
>>>>>
>>>>> I've just gone through testing.
>>>>>
>>>>> Some simple data tests show 16 and 32-bits work.
>>>>>
>>>>> The 24 bit case seems to be weird, it looks like the 24-bit expects
>>>>> 24 bit samples in 32 bit words. I can't see any packing options to
>>>>> do 24 bit in 24 bit, so we may have to remove 24 bit sample support
>>>>> (which is a shame)
>>>>>
>>>>> My preference is to remove the 24-bit support and keep the 32 bit in.
>>>>>
>>>>
>>>> Interesting.. Jon, could you please confirm that 24bit format isn't
>>>> usable on T30?
>>>
>>> If there is an option of 24 packed into 32, then I think that would work.
>>>
>>> I can try testing that with raw data on Monday.
>>
>> I need to check some things, I assumed 24 was 24 packed bits, it looks
>> like the default is 24 in 32 bits so we may be ok. However I need to
>> re-write my test case which assumed it was 24bits in 3 bytes (S24_3LE).
>>
>> I'll follow up later,
>
> Okay, the S24_3LE isn't supported by RT5640 codec in my case. I briefly
> looked through the TRM doc and got impression that AHUB could re-pack
> data stream into something that codec supports, but maybe it's a wrong
> impression.
I did a quick test with the following:
sox -n -b 16 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
sox -n -b 24 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
sox -n -b 32 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
The 16 and 32 work fine, the 24 is showing a playback output freq
of 440Hz instead of 500Hz... this suggests the clock is off, or there
is something else weird going on...
--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius
https://www.codethink.co.uk/privacy.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-23 19:38 ` Ben Dooks
@ 2020-01-24 16:50 ` Jon Hunter
2020-01-27 19:20 ` Dmitry Osipenko
0 siblings, 1 reply; 1651+ messages in thread
From: Jon Hunter @ 2020-01-24 16:50 UTC (permalink / raw)
To: Ben Dooks, Dmitry Osipenko
Cc: linux-kernel, alsa-devel, Liam Girdwood, Takashi Iwai, Mark Brown,
Thierry Reding, Edward Cragg, linux-tegra
On 23/01/2020 19:38, Ben Dooks wrote:
> On 07/01/2020 01:39, Dmitry Osipenko wrote:
>> 06.01.2020 22:00, Ben Dooks wrote:
>>> On 05/01/2020 10:53, Ben Dooks wrote:
>>>>
>>>>
>>>> On 2020-01-05 01:48, Dmitry Osipenko wrote:
>>>>> 05.01.2020 03:04, Ben Dooks wrote:
>>>>>> [snip]
>>>>>>
>>>>>> I've just gone through testing.
>>>>>>
>>>>>> Some simple data tests show 16 and 32-bits work.
>>>>>>
>>>>>> The 24 bit case seems to be weird, it looks like the 24-bit expects
>>>>>> 24 bit samples in 32 bit words. I can't see any packing options to
>>>>>> do 24 bit in 24 bit, so we may have to remove 24 bit sample support
>>>>>> (which is a shame)
>>>>>>
>>>>>> My preference is to remove the 24-bit support and keep the 32 bit in.
>>>>>>
>>>>>
>>>>> Interesting.. Jon, could you please confirm that 24bit format isn't
>>>>> usable on T30?
>>>>
>>>> If there is an option of 24 packed into 32, then I think that would
>>>> work.
>>>>
>>>> I can try testing that with raw data on Monday.
>>>
>>> I need to check some things, I assumed 24 was 24 packed bits, it looks
>>> like the default is 24 in 32 bits so we may be ok. However I need to
>>> re-write my test case which assumed it was 24bits in 3 bytes (S24_3LE).
>>>
>>> I'll follow up later,
>>
>> Okay, the S24_3LE isn't supported by RT5640 codec in my case. I briefly
>> looked through the TRM doc and got impression that AHUB could re-pack
>> data stream into something that codec supports, but maybe it's a wrong
>> impression.
>
> I did a quick test with the following:
>
> sox -n -b 16 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
> sox -n -b 24 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
> sox -n -b 32 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
>
> The 16 and 32 work fine, the 24 is showing a playback output freq
> of 440Hz instead of 500Hz... this suggests the clock is off, or there
> is something else weird going on...
I was looking at using sox to create such a file, but the above command
generates a S24_3LE file and not a S24_LE file. The codec on Jetson-TK1
supports S24_LE but does not support S24_3LE, so I cannot test this.
Anyway, we really need to test S24_LE and not S24_3LE because this is
the problem that Dmitry is having.
Ben, is S24_3LE what you really need to support?
Dmitry, does the following fix your problem?
diff --git a/sound/soc/tegra/tegra30_i2s.c b/sound/soc/tegra/tegra30_i2s.c
index dbed3c5408e7..92845c4b63f4 100644
--- a/sound/soc/tegra/tegra30_i2s.c
+++ b/sound/soc/tegra/tegra30_i2s.c
@@ -140,7 +140,7 @@ static int tegra30_i2s_hw_params(struct
snd_pcm_substream *substream,
audio_bits = TEGRA30_AUDIOCIF_BITS_16;
sample_size = 16;
break;
- case SNDRV_PCM_FORMAT_S24_LE:
+ case SNDRV_PCM_FORMAT_S24_3LE:
val = TEGRA30_I2S_CTRL_BIT_SIZE_24;
audio_bits = TEGRA30_AUDIOCIF_BITS_24;
sample_size = 24;
@@ -318,7 +318,7 @@ static const struct snd_soc_dai_driver
tegra30_i2s_dai_template = {
.channels_max = 2,
.rates = SNDRV_PCM_RATE_8000_96000,
.formats = SNDRV_PCM_FMTBIT_S32_LE |
- SNDRV_PCM_FMTBIT_S24_LE |
+ SNDRV_PCM_FMTBIT_S24_3LE |
SNDRV_PCM_FMTBIT_S16_LE,
},
.capture = {
@@ -327,7 +327,7 @@ static const struct snd_soc_dai_driver
tegra30_i2s_dai_template = {
.channels_max = 2,
.rates = SNDRV_PCM_RATE_8000_96000,
.formats = SNDRV_PCM_FMTBIT_S32_LE |
- SNDRV_PCM_FMTBIT_S24_LE |
+ SNDRV_PCM_FMTBIT_S24_3LE |
SNDRV_PCM_FMTBIT_S16_LE,
},
.ops = &tegra30_i2s_dai_ops,
Jon
--
nvpublic
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-24 16:50 ` Jon Hunter
@ 2020-01-27 19:20 ` Dmitry Osipenko
2020-01-28 12:13 ` Mark Brown
0 siblings, 1 reply; 1651+ messages in thread
From: Dmitry Osipenko @ 2020-01-27 19:20 UTC (permalink / raw)
To: Jon Hunter, Ben Dooks
Cc: linux-kernel, alsa-devel, Liam Girdwood, Takashi Iwai, Mark Brown,
Thierry Reding, Edward Cragg, linux-tegra
24.01.2020 19:50, Jon Hunter wrote:
>
> On 23/01/2020 19:38, Ben Dooks wrote:
>> On 07/01/2020 01:39, Dmitry Osipenko wrote:
>>> 06.01.2020 22:00, Ben Dooks wrote:
>>>> On 05/01/2020 10:53, Ben Dooks wrote:
>>>>>
>>>>>
>>>>> On 2020-01-05 01:48, Dmitry Osipenko wrote:
>>>>>> 05.01.2020 03:04, Ben Dooks wrote:
>>>>>>> [snip]
>>>>>>>
>>>>>>> I've just gone through testing.
>>>>>>>
>>>>>>> Some simple data tests show 16 and 32-bits work.
>>>>>>>
>>>>>>> The 24 bit case seems to be weird, it looks like the 24-bit expects
>>>>>>> 24 bit samples in 32 bit words. I can't see any packing options to
>>>>>>> do 24 bit in 24 bit, so we may have to remove 24 bit sample support
>>>>>>> (which is a shame)
>>>>>>>
>>>>>>> My preference is to remove the 24-bit support and keep the 32 bit in.
>>>>>>>
>>>>>>
>>>>>> Interesting.. Jon, could you please confirm that 24bit format isn't
>>>>>> usable on T30?
>>>>>
>>>>> If there is an option of 24 packed into 32, then I think that would
>>>>> work.
>>>>>
>>>>> I can try testing that with raw data on Monday.
>>>>
>>>> I need to check some things, I assumed 24 was 24 packed bits, it looks
>>>> like the default is 24 in 32 bits so we may be ok. However I need to
>>>> re-write my test case which assumed it was 24bits in 3 bytes (S24_3LE).
>>>>
>>>> I'll follow up later,
>>>
>>> Okay, the S24_3LE isn't supported by RT5640 codec in my case. I briefly
>>> looked through the TRM doc and got impression that AHUB could re-pack
>>> data stream into something that codec supports, but maybe it's a wrong
>>> impression.
>>
>> I did a quick test with the following:
>>
>> sox -n -b 16 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
>> sox -n -b 24 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
>> sox -n -b 32 -c 2 -r 44100 /tmp/tmp.wav synth sine 500 vol 0.5
>>
>> The 16 and 32 work fine, the 24 is showing a playback output freq
>> of 440Hz instead of 500Hz... this suggests the clock is off, or there
>> is something else weird going on...
>
> I was looking at using sox to create such a file, but the above command
> generates a S24_3LE file and not a S24_LE file. The codec on Jetson-TK1
> supports S24_LE but does not support S24_3LE, so I cannot test this.
> Anyway, we really need to test S24_LE and not S24_3LE because this is
> the problem that Dmitry is having.
>
> Ben, is S24_3LE what you really need to support?
>
> Dmitry, does the following fix your problem?
>
> diff --git a/sound/soc/tegra/tegra30_i2s.c b/sound/soc/tegra/tegra30_i2s.c
> index dbed3c5408e7..92845c4b63f4 100644
> --- a/sound/soc/tegra/tegra30_i2s.c
> +++ b/sound/soc/tegra/tegra30_i2s.c
> @@ -140,7 +140,7 @@ static int tegra30_i2s_hw_params(struct
> snd_pcm_substream *substream,
> audio_bits = TEGRA30_AUDIOCIF_BITS_16;
> sample_size = 16;
> break;
> - case SNDRV_PCM_FORMAT_S24_LE:
> + case SNDRV_PCM_FORMAT_S24_3LE:
> val = TEGRA30_I2S_CTRL_BIT_SIZE_24;
> audio_bits = TEGRA30_AUDIOCIF_BITS_24;
> sample_size = 24;
> @@ -318,7 +318,7 @@ static const struct snd_soc_dai_driver
> tegra30_i2s_dai_template = {
> .channels_max = 2,
> .rates = SNDRV_PCM_RATE_8000_96000,
> .formats = SNDRV_PCM_FMTBIT_S32_LE |
> - SNDRV_PCM_FMTBIT_S24_LE |
> + SNDRV_PCM_FMTBIT_S24_3LE |
> SNDRV_PCM_FMTBIT_S16_LE,
> },
> .capture = {
> @@ -327,7 +327,7 @@ static const struct snd_soc_dai_driver
> tegra30_i2s_dai_template = {
> .channels_max = 2,
> .rates = SNDRV_PCM_RATE_8000_96000,
> .formats = SNDRV_PCM_FMTBIT_S32_LE |
> - SNDRV_PCM_FMTBIT_S24_LE |
> + SNDRV_PCM_FMTBIT_S24_3LE |
> SNDRV_PCM_FMTBIT_S16_LE,
> },
> .ops = &tegra30_i2s_dai_ops,
>
> Jon
>
It should solve the problem in my particular case, but I'm not sure that
the solution is correct.
The v5.5 kernel is released now with the broken audio and apparently
getting 24bit to work won't be trivial (if possible at all). Ben, could
you please send a patch to fix v5.5 by removing the S24 support
advertisement from the driver?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-27 19:20 ` Dmitry Osipenko
@ 2020-01-28 12:13 ` Mark Brown
2020-01-28 17:42 ` Dmitry Osipenko
0 siblings, 1 reply; 1651+ messages in thread
From: Mark Brown @ 2020-01-28 12:13 UTC (permalink / raw)
To: Dmitry Osipenko
Cc: linux-kernel, alsa-devel, Takashi Iwai, Liam Girdwood, Ben Dooks,
Thierry Reding, Edward Cragg, linux-tegra, Jon Hunter
[-- Attachment #1.1: Type: text/plain, Size: 1209 bytes --]
On Mon, Jan 27, 2020 at 10:20:25PM +0300, Dmitry Osipenko wrote:
> 24.01.2020 19:50, Jon Hunter wrote:
> > .rates = SNDRV_PCM_RATE_8000_96000,
> > .formats = SNDRV_PCM_FMTBIT_S32_LE |
> > - SNDRV_PCM_FMTBIT_S24_LE |
> > + SNDRV_PCM_FMTBIT_S24_3LE |
> It should solve the problem in my particular case, but I'm not sure that
> the solution is correct.
If the format implemented by the driver is S24_3LE the driver should
advertise S24_3LE.
> The v5.5 kernel is released now with the broken audio and apparently
> getting 24bit to work won't be trivial (if possible at all). Ben, could
> you please send a patch to fix v5.5 by removing the S24 support
> advertisement from the driver?
Why is that the best fix rather than just advertising the format
implemented by the driver?
I really don't understand why this is all taking so long, this thread
just seems to be going round in interminable circles long after it
looked like the issue was understood. I have to admit I've not read
every single message in the thread but it's difficult to see why it
doesn't seem to be making any progress.
[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-28 12:13 ` Mark Brown
@ 2020-01-28 17:42 ` Dmitry Osipenko
2020-01-28 18:19 ` Jon Hunter
0 siblings, 1 reply; 1651+ messages in thread
From: Dmitry Osipenko @ 2020-01-28 17:42 UTC (permalink / raw)
To: Mark Brown
Cc: linux-kernel, alsa-devel, Takashi Iwai, Liam Girdwood, Ben Dooks,
Thierry Reding, Edward Cragg, linux-tegra, Jon Hunter
28.01.2020 15:13, Mark Brown wrote:
> On Mon, Jan 27, 2020 at 10:20:25PM +0300, Dmitry Osipenko wrote:
>> 24.01.2020 19:50, Jon Hunter wrote:
>
>>> .rates = SNDRV_PCM_RATE_8000_96000,
>>> .formats = SNDRV_PCM_FMTBIT_S32_LE |
>>> - SNDRV_PCM_FMTBIT_S24_LE |
>>> + SNDRV_PCM_FMTBIT_S24_3LE |
>
>> It should solve the problem in my particular case, but I'm not sure that
>> the solution is correct.
>
> If the format implemented by the driver is S24_3LE the driver should
> advertise S24_3LE.
It should be S24_LE, but it seems we still don't know for sure.
>> The v5.5 kernel is released now with the broken audio and apparently
>> getting 24bit to work won't be trivial (if possible at all). Ben, could
>> you please send a patch to fix v5.5 by removing the S24 support
>> advertisement from the driver?
>
> Why is that the best fix rather than just advertising the format
> implemented by the driver?
The currently supported format that is known to work well is S16_LE.
I'm suggesting to drop the S24_LE and S32_LE that were added by the
applied patches simply because this series wasn't tested properly before
it was sent out, and it turned out that it doesn't work well.
> I really don't understand why this is all taking so long, this thread
> just seems to be going round in interminable circles long after it
> looked like the issue was understood. I have to admit I've not read
> every single message in the thread but it's difficult to see why it
> doesn't seem to be making any progress.
Ben was trying to make a fix for the introduced problem, but it's not
as easy as we now see.
Perhaps the best solution would be to revert all three of the applied
patches and try again later, once the current problems are resolved.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-28 17:42 ` Dmitry Osipenko
@ 2020-01-28 18:19 ` Jon Hunter
2020-01-29 0:17 ` Dmitry Osipenko
0 siblings, 1 reply; 1651+ messages in thread
From: Jon Hunter @ 2020-01-28 18:19 UTC (permalink / raw)
To: Dmitry Osipenko, Mark Brown
Cc: linux-kernel, alsa-devel, Liam Girdwood, Takashi Iwai, Ben Dooks,
Thierry Reding, Edward Cragg, linux-tegra
On 28/01/2020 17:42, Dmitry Osipenko wrote:
> 28.01.2020 15:13, Mark Brown wrote:
>> On Mon, Jan 27, 2020 at 10:20:25PM +0300, Dmitry Osipenko wrote:
>>> 24.01.2020 19:50, Jon Hunter wrote:
>>
>>>> .rates = SNDRV_PCM_RATE_8000_96000,
>>>> .formats = SNDRV_PCM_FMTBIT_S32_LE |
>>>> - SNDRV_PCM_FMTBIT_S24_LE |
>>>> + SNDRV_PCM_FMTBIT_S24_3LE |
>>
>>> It should solve the problem in my particular case, but I'm not sure that
>>> the solution is correct.
>>
>> If the format implemented by the driver is S24_3LE the driver should
>> advertise S24_3LE.
>
> It should be S24_LE, but it seems we still don't know for sure.
Why?
>>> The v5.5 kernel is released now with the broken audio and apparently
>>> getting 24bit to work won't be trivial (if possible at all). Ben, could
>>> you please send a patch to fix v5.5 by removing the S24 support
>>> advertisement from the driver?
>>
>> Why is that the best fix rather than just advertising the format
>> implemented by the driver?
>
> The currently supported format that is known to work well is S16_LE.
>
> I'm suggesting to drop the S24_LE and S32_LE that were added by the
> applied patches simply because this series wasn't tested properly before
> it was sent out, and it turned out that it doesn't work well.
S32_LE should be fine, however, I do have some concerns about S24_LE.
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [alsa-devel] [Linux-kernel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples
2020-01-28 18:19 ` Jon Hunter
@ 2020-01-29 0:17 ` Dmitry Osipenko
[not found] ` <2ff97414-f0a5-7224-0e53-6cad2ed0ccd2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 1651+ messages in thread
From: Dmitry Osipenko @ 2020-01-29 0:17 UTC (permalink / raw)
To: Jon Hunter, Mark Brown
Cc: linux-kernel, alsa-devel, Liam Girdwood, Takashi Iwai, Ben Dooks,
Thierry Reding, Edward Cragg, linux-tegra
28.01.2020 21:19, Jon Hunter wrote:
>
> On 28/01/2020 17:42, Dmitry Osipenko wrote:
>> 28.01.2020 15:13, Mark Brown wrote:
>>> On Mon, Jan 27, 2020 at 10:20:25PM +0300, Dmitry Osipenko wrote:
>>>> 24.01.2020 19:50, Jon Hunter wrote:
>>>
>>>>> .rates = SNDRV_PCM_RATE_8000_96000,
>>>>> .formats = SNDRV_PCM_FMTBIT_S32_LE |
>>>>> - SNDRV_PCM_FMTBIT_S24_LE |
>>>>> + SNDRV_PCM_FMTBIT_S24_3LE |
>>>
>>>> It should solve the problem in my particular case, but I'm not sure that
>>>> the solution is correct.
>>>
>>> If the format implemented by the driver is S24_3LE the driver should
>>> advertise S24_3LE.
>>
>> It should be S24_LE, but seems we still don't know for sure.
>
> Why?
/I think/ the sound should be much more distorted if it were S24_3LE, but
maybe I'm wrong.
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <mailman.6.1579205674.8101.b.a.t.m.a.n@lists.open-mesh.org>]
* Re:
[not found] <mailman.6.1579205674.8101.b.a.t.m.a.n@lists.open-mesh.org>
@ 2020-01-17 7:44 ` Simon Wunderlich
0 siblings, 0 replies; 1651+ messages in thread
From: Simon Wunderlich @ 2020-01-17 7:44 UTC (permalink / raw)
To: b.a.t.m.a.n; +Cc: Martin, Jeremy J CIV USARMY CCDC C5ISR (USA)
Hi Jeremy,
On Thursday, January 16, 2020 9:06:50 PM CET Martin, Jeremy J CIV USARMY CCDC
C5ISR (USA) via B.A.T.M.A.N wrote:
> My/My Teams intent is to have 4 radios in total, 2 on one pc and two on
> another. Our plan is to have Batman take care of the switching between
> which radio to use in order to transmit data between these two PC's. One
> radio is high frequency radio (60 Ghz) and the other would be a lower
> frequency radio and the idea is to have batman switch between these radios
> once the higher frequency radio is dropping between a certain TQ.
BATMAN will switch when one link has a better TQ (towards the final
destination) than the other link, so I believe this should happen by default.
> My
> primary questions regarding this scenario would be, 1) Are there specific
> standards the radio chipsets would need to support in order for them to
> work in this scenario?.
Normally you would want IBSS mode or 802.11s mode to work. BATMAN can also work
in AP/STA mode, although the packet loss counting may be biased since
broadcast handling works a bit differently than in IBSS/11s. But for point-to-
point links it might just work.
> 2) Would Batman-adv be adequate enough to be able
> to handle a 1Gb/s data transmission and be able to swap accordingly to the
> lower frequency radio?
If your radio and CPU are powerful enough, batman-adv is able to handle it,
yes.
Cheers,
Simon
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2020-02-06 2:24 Viviane Jose Pereira
0 siblings, 0 replies; 1651+ messages in thread
From: Viviane Jose Pereira @ 2020-02-06 2:24 UTC (permalink / raw)
--
Hello, I apologize for intruding on your privacy. I am contacting you about an extremely urgent and confidential matter.
You have been offered a donation of 15,000,000.00 EUR. Contact: cristtom063@gmail.com for further information.
This is not a spam message, but an important notification for you. Reply to the above e-mail address to receive further information about the donation and the receipt of the funds.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2020-02-06 6:36 Viviane Jose Pereira
0 siblings, 0 replies; 1651+ messages in thread
From: Viviane Jose Pereira @ 2020-02-06 6:36 UTC (permalink / raw)
--
Hello, I apologize for intruding on your privacy. I am contacting you about an extremely urgent and confidential matter.
You have been offered a donation of 15,000,000.00 EUR. Contact: cristtom063@gmail.com for further information.
This is not a spam message, but an important notification for you. Reply to the above e-mail address to receive further information about the donation and the receipt of the funds.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown)
@ 2020-02-11 22:34 Rajat Jain
[not found] ` <20200211223400.107604-1-rajatja-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 1651+ messages in thread
From: Rajat Jain @ 2020-02-11 22:34 UTC (permalink / raw)
To: Daniel Mack, Haojian Zhuang, Robert Jarzmik, Mark Brown,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
linux-spi-u79uwXL29TY76Z2rM5mHXA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: Evan Green, rajatja-hpIqsD4AKlfQT0dZR+AlfA,
rajatxjain-Re5JQEeQqe8AvxtiuMwx3w, evgreen-hpIqsD4AKlfQT0dZR+AlfA,
shobhit.srivastava-ral2JQCrhuEAvxtiuMwx3w,
porselvan.muthukrishnan-ral2JQCrhuEAvxtiuMwx3w
From: Evan Green <evgreen-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Date: Wed, 29 Jan 2020 13:54:16 -0800
Subject: [PATCH] spi: pxa2xx: Add CS control clock quirk
In some circumstances on Intel LPSS controllers, toggling the LPSS
CS control register doesn't actually cause the CS line to toggle.
This seems to be a failure of dynamic clock gating that occurs after
going through a suspend/resume transition, where the controller
is sent through a reset transition. This ruins SPI transactions
that either rely on delay_usecs, or toggle the CS line without
sending data.
Whenever CS is toggled, momentarily set the clock gating register
to "Force On" to poke the controller into acting on CS.
Signed-off-by: Evan Green <evgreen-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
Signed-off-by: Rajat Jain <rajatja-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
---
drivers/spi/spi-pxa2xx.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/drivers/spi/spi-pxa2xx.c b/drivers/spi/spi-pxa2xx.c
index 4c7a71f0fb3e..2e318158fca9 100644
--- a/drivers/spi/spi-pxa2xx.c
+++ b/drivers/spi/spi-pxa2xx.c
@@ -70,6 +70,10 @@ MODULE_ALIAS("platform:pxa2xx-spi");
#define LPSS_CAPS_CS_EN_SHIFT 9
#define LPSS_CAPS_CS_EN_MASK (0xf << LPSS_CAPS_CS_EN_SHIFT)
+#define LPSS_PRIV_CLOCK_GATE 0x38
+#define LPSS_PRIV_CLOCK_GATE_CLK_CTL_MASK 0x3
+#define LPSS_PRIV_CLOCK_GATE_CLK_CTL_FORCE_ON 0x3
+
struct lpss_config {
/* LPSS offset from drv_data->ioaddr */
unsigned offset;
@@ -86,6 +90,8 @@ struct lpss_config {
unsigned cs_sel_shift;
unsigned cs_sel_mask;
unsigned cs_num;
+ /* Quirks */
+ unsigned cs_clk_stays_gated : 1;
};
/* Keep these sorted with enum pxa_ssp_type */
@@ -156,6 +162,7 @@ static const struct lpss_config lpss_platforms[] = {
.tx_threshold_hi = 56,
.cs_sel_shift = 8,
.cs_sel_mask = 3 << 8,
+ .cs_clk_stays_gated = true,
},
};
@@ -383,6 +390,22 @@ static void lpss_ssp_cs_control(struct spi_device *spi, bool enable)
else
value |= LPSS_CS_CONTROL_CS_HIGH;
__lpss_ssp_write_priv(drv_data, config->reg_cs_ctrl, value);
+ if (config->cs_clk_stays_gated) {
+ u32 clkgate;
+
+ /*
+ * Changing CS alone when dynamic clock gating is on won't
+ * actually flip CS at that time. This ruins SPI transfers
+ * that specify delays, or have no data. Toggle the clock mode
+ * to force on briefly to poke the CS pin to move.
+ */
+ clkgate = __lpss_ssp_read_priv(drv_data, LPSS_PRIV_CLOCK_GATE);
+ value = (clkgate & ~LPSS_PRIV_CLOCK_GATE_CLK_CTL_MASK) |
+ LPSS_PRIV_CLOCK_GATE_CLK_CTL_FORCE_ON;
+
+ __lpss_ssp_write_priv(drv_data, LPSS_PRIV_CLOCK_GATE, value);
+ __lpss_ssp_write_priv(drv_data, LPSS_PRIV_CLOCK_GATE, clkgate);
+ }
}
static void cs_assert(struct spi_device *spi)
--
2.25.0.225.g125e21ebc7-goog
^ permalink raw reply related [flat|nested] 1651+ messages in thread
[parent not found: <20200224173733.16323-1-axboe@kernel.dk>]
* Re:
[not found] <20200224173733.16323-1-axboe@kernel.dk>
@ 2020-02-24 17:38 ` Jens Axboe
0 siblings, 0 replies; 1651+ messages in thread
From: Jens Axboe @ 2020-02-24 17:38 UTC (permalink / raw)
To: io-uring
On 2/24/20 10:37 AM, Jens Axboe wrote:
> Here's v3 of the poll async retry patchset. Changes since v2:
>
> - Rebase on for-5.7/io_uring
> - Get rid of REQ_F_WORK bit
> - Improve the tracing additions
> - Fix linked_timeout case
> - Fully restore work from async task handler
> - Credentials now fixed
> - Fix task_works running from SQPOLL
> - Remove task cancellation stuff, we don't need it
> - fdinfo print improvements
>
> I think this is getting pretty close to mergeable, I haven't found
> any issues with the test cases.
Gah, wrong directory, resending it. Ignore this thread.
--
Jens Axboe
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-02-26 11:57 Ville Syrjälä
2020-02-26 12:08 ` Re: Linus Walleij
0 siblings, 1 reply; 1651+ messages in thread
From: Ville Syrjälä @ 2020-02-26 11:57 UTC (permalink / raw)
To: Linus Walleij
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
Guido Günther, open list:DRM PANEL DRIVERS,
Gustaf Lindström, Andrzej Hajda, Laurent Pinchart,
Philipp Zabel, Sam Ravnborg, Marian-Cristian Rotariu, Jagan Teki,
Thomas Hellstrom, Joonyoung Shim, Jonathan Marek,
Stefan Mavrodiev, Adam Ford, Jerry Han
Subject: Re: [PATCH 04/12] drm: Nuke mode->vrefresh
On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> On Tue, Feb 25, 2020 at 8:27 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
>
> > OK, so I went ahead a wrote a bit of cocci [1] to find the bad apples.
>
> That's impressive :D
>
> > Unfortunately it found a lot of strange stuff:
>
> I will answer for the weirdness I caused.
>
> I have long suspected that a whole bunch of the "simple" displays
> are not simple but contains a display controller and memory.
> That means that the speed over the link to the display and
> actual refresh rate on the actual display is asymmetric because
> well we are just updating a RAM, the resolution just limits how
> much data we are sending, the clock limits the speed on the
> bus over to the RAM on the other side.
IMO even in command mode mode->clock should probably be the actual
dotclock used by the display. If there's another clock for the bus
speed/etc. it should be stored somewhere else.
--
Ville Syrjälä
Intel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-02-26 11:57 (no subject) Ville Syrjälä
@ 2020-02-26 12:08 ` Linus Walleij
0 siblings, 0 replies; 1651+ messages in thread
From: Linus Walleij @ 2020-02-26 12:08 UTC (permalink / raw)
To: Ville Syrjälä
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
Guido Günther, Paul Kocialkowski,
open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
Thierry Reding, Laurent Pinchart, Philipp Zabel, Sam Ravnborg,
Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
intel-gfx, Maxime Ripard, Alexandre Courbot, Fabio Estevam,
open list:ARM/Amlogic Meson..., Vincent Abriou, Andreas Pretzsch,
Jernej Skrabec, Alex Gonzalez, Purism Kernel Team,
Boris Brezillon, Seung-Woo Kim, Christoph Fritz, Kyungmin Park,
Heiko Stuebner, Eugen Hristev, Giulio Benetti
On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> > I have long suspected that a whole bunch of the "simple" displays
> > are not simple but contains a display controller and memory.
> > That means that the speed over the link to the display and
> > actual refresh rate on the actual display is asymmetric because
> > well we are just updating a RAM, the resolution just limits how
> > much data we are sending, the clock limits the speed on the
> > bus over to the RAM on the other side.
>
> IMO even in command mode mode->clock should probably be the actual
> dotclock used by the display. If there's another clock for the bus
> speed/etc. it should be stored somewhere else.
Good point. For the DSI panels we have the field hs_rate
for the HS clock in struct mipi_dsi_device, which is based
on exactly this reasoning. And that is what I actually use for
setting the HS clock.
The problem, however, is that in many cases we have such
substandard documentation of these panels that we have
absolutely no idea about the dotclock. Maybe we should
just set it to 0 in these cases?
Yours,
Linus Walleij
_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2020-02-26 12:08 ` Linus Walleij
0 siblings, 0 replies; 1651+ messages in thread
From: Linus Walleij @ 2020-02-26 12:08 UTC (permalink / raw)
To: Ville Syrjälä
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, nouveau,
Guido Günther, Paul Kocialkowski,
open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
Thierry Reding, Laurent Pinchart, Sam Ravnborg,
Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
intel-gfx, Alexandre Courbot, open list:ARM/Amlogic Meson...,
Vincent Abriou, Andreas Pretzsch, Jernej Skrabec, Alex Gonzalez,
Purism Kernel Team, Boris Brezillon, Seung-Woo Kim,
Christoph Fritz, Kyungmin Park, Heiko Stuebner, Eugen Hristev,
Giulio Benetti
On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> > I have long suspected that a whole bunch of the "simple" displays
> > are not simple but contains a display controller and memory.
> > That means that the speed over the link to the display and
> > actual refresh rate on the actual display is asymmetric because
> > well we are just updating a RAM, the resolution just limits how
> > much data we are sending, the clock limits the speed on the
> > bus over to the RAM on the other side.
>
> IMO even in command mode mode->clock should probably be the actual
> dotclock used by the display. If there's another clock for the bus
> speed/etc. it should be stored somewhere else.
Good point. For the DSI panels we have the field hs_rate
for the HS clock in struct mipi_dsi_device, which is based
on exactly this reasoning. And that is what I actually use for
setting the HS clock.
The problem, however, is that in many cases we have such
substandard documentation of these panels that we have
absolutely no idea about the dotclock. Maybe we should
just set it to 0 in these cases?
Yours,
Linus Walleij
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-02-26 12:08 ` Re: Linus Walleij
@ 2020-02-26 14:34 ` Ville Syrjälä
-1 siblings, 0 replies; 1651+ messages in thread
From: Ville Syrjälä @ 2020-02-26 14:34 UTC (permalink / raw)
To: Linus Walleij
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
Guido Günther, Paul Kocialkowski,
open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
Thierry Reding, Laurent Pinchart, Philipp Zabel, Sam Ravnborg,
Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
intel-gfx, Maxime Ripard, Alexandre Courbot, Fabio Estevam,
open list:ARM/Amlogic Meson..., Vincent Abriou, Andreas Pretzsch,
Jernej Skrabec, Alex Gonzalez, Purism Kernel Team,
Boris Brezillon, Seung-Woo Kim, Christoph Fritz, Kyungmin Park,
Heiko Stuebner, Eugen Hristev, Giulio Benetti
On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
>
> > > I have long suspected that a whole bunch of the "simple" displays
> > > are not simple but contains a display controller and memory.
> > > That means that the speed over the link to the display and
> > > actual refresh rate on the actual display is asymmetric because
> > > well we are just updating a RAM, the resolution just limits how
> > > much data we are sending, the clock limits the speed on the
> > > bus over to the RAM on the other side.
> >
> > IMO even in command mode mode->clock should probably be the actual
> > dotclock used by the display. If there's another clock for the bus
> > speed/etc. it should be stored somewhere else.
>
> Good point. For the DSI panels we have the field hs_rate
> for the HS clock in struct mipi_dsi_device which is based
> on exactly this reasoning. And that is what I actually use for
> setting the HS clock.
>
> The problem is however that we in many cases have so
> substandard documentation of these panels that we have
> absolutely no idea about the dotclock. Maybe we should
> just set it to 0 in these cases?
Don't you always have a TE interrupt or something like that
available? Could just measure it from that if no better
information is available?
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2020-02-26 14:34 ` Ville Syrjälä
0 siblings, 0 replies; 1651+ messages in thread
From: Ville Syrjälä @ 2020-02-26 14:34 UTC (permalink / raw)
To: Linus Walleij
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, nouveau,
Guido Günther, Paul Kocialkowski,
open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
Thierry Reding, Laurent Pinchart, Sam Ravnborg,
Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
intel-gfx, Alexandre Courbot, open list:ARM/Amlogic Meson...,
Vincent Abriou, Andreas Pretzsch, Jernej Skrabec, Alex Gonzalez,
Purism Kernel Team, Boris Brezillon, Seung-Woo Kim,
Christoph Fritz, Kyungmin Park, Heiko Stuebner, Eugen Hristev,
Giulio Benetti
On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
>
> > > I have long suspected that a whole bunch of the "simple" displays
> > > are not simple but contains a display controller and memory.
> > > That means that the speed over the link to the display and
> > > actual refresh rate on the actual display is asymmetric because
> > > well we are just updating a RAM, the resolution just limits how
> > > much data we are sending, the clock limits the speed on the
> > > bus over to the RAM on the other side.
> >
> > IMO even in command mode mode->clock should probably be the actual
> > dotclock used by the display. If there's another clock for the bus
> > speed/etc. it should be stored somewhere else.
>
> Good point. For the DSI panels we have the field hs_rate
> for the HS clock in struct mipi_dsi_device which is based
> on exactly this reasoning. And that is what I actually use for
> setting the HS clock.
>
> The problem is however that we in many cases have so
> substandard documentation of these panels that we have
> absolutely no idea about the dotclock. Maybe we should
> just set it to 0 in these cases?
Don't you always have a TE interrupt or something like that
available? Could just measure it from that if no better
information is available?
--
Ville Syrjälä
Intel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-02-26 14:34 ` Re: Ville Syrjälä
@ 2020-02-26 14:56 ` Linus Walleij
-1 siblings, 0 replies; 1651+ messages in thread
From: Linus Walleij @ 2020-02-26 14:56 UTC (permalink / raw)
To: Ville Syrjälä
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
Guido Günther, Paul Kocialkowski,
open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
Thierry Reding, Laurent Pinchart, Philipp Zabel, Sam Ravnborg,
Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
intel-gfx, Maxime Ripard, Alexandre Courbot, Fabio Estevam,
open list:ARM/Amlogic Meson..., Vincent Abriou, Andreas Pretzsch,
Jernej Skrabec, Alex Gonzalez, Purism Kernel Team,
Boris Brezillon, Seung-Woo Kim, Christoph Fritz, Kyungmin Park,
Heiko Stuebner, Eugen Hristev, Giulio Benetti
On Wed, Feb 26, 2020 at 3:34 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> > On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> > <ville.syrjala@linux.intel.com> wrote:
> > > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> >
> > > > I have long suspected that a whole bunch of the "simple" displays
> > > > are not simple but contains a display controller and memory.
> > > > That means that the speed over the link to the display and
> > > > actual refresh rate on the actual display is asymmetric because
> > > > well we are just updating a RAM, the resolution just limits how
> > > > much data we are sending, the clock limits the speed on the
> > > > bus over to the RAM on the other side.
> > >
> > > IMO even in command mode mode->clock should probably be the actual
> > > dotclock used by the display. If there's another clock for the bus
> > > speed/etc. it should be stored somewhere else.
> >
> > Good point. For the DSI panels we have the field hs_rate
> > for the HS clock in struct mipi_dsi_device which is based
> > on exactly this reasoning. And that is what I actually use for
> > setting the HS clock.
> >
> > The problem is however that we in many cases have so
> > substandard documentation of these panels that we have
> > absolutely no idea about the dotclock. Maybe we should
> > just set it to 0 in these cases?
>
> Don't you always have a TE interrupt or something like that
> available? Could just measure it from that if no better
> information is available?
Yes and I did exactly that, so that is why this comment is in
the driver:
static const struct drm_display_mode sony_acx424akp_cmd_mode = {
(...)
/*
* Some desired refresh rate, experiments at the maximum "pixel"
* clock speed (HS clock 420 MHz) yields around 117Hz.
*/
.vrefresh = 60,
I got a review comment at the time saying 117 Hz was weird.
We didn't reach a proper conclusion on this:
https://lore.kernel.org/dri-devel/CACRpkdYW3YNPSNMY3A44GQn8DqK-n9TLvr7uipF7LM_DHZ5=Lg@mail.gmail.com/
Thierry wasn't sure if 60Hz was good or not, so I just had to
go with something.
We could calculate the resulting pixel clock for ~117 Hz with
this resolution and put that in the clock field, but ... I don't know
what is best.
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2020-02-26 14:56 ` Linus Walleij
0 siblings, 0 replies; 1651+ messages in thread
From: Linus Walleij @ 2020-02-26 14:56 UTC (permalink / raw)
To: Ville Syrjälä
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, nouveau,
Guido Günther, Paul Kocialkowski,
open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
Thierry Reding, Laurent Pinchart, Sam Ravnborg,
Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
intel-gfx, Alexandre Courbot, open list:ARM/Amlogic Meson...,
Vincent Abriou, Andreas Pretzsch, Jernej Skrabec, Alex Gonzalez,
Purism Kernel Team, Boris Brezillon, Seung-Woo Kim,
Christoph Fritz, Kyungmin Park, Heiko Stuebner, Eugen Hristev,
Giulio Benetti
On Wed, Feb 26, 2020 at 3:34 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
> On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> > On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> > <ville.syrjala@linux.intel.com> wrote:
> > > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> >
> > > > I have long suspected that a whole bunch of the "simple" displays
> > > > are not simple but contains a display controller and memory.
> > > > That means that the speed over the link to the display and
> > > > actual refresh rate on the actual display is asymmetric because
> > > > well we are just updating a RAM, the resolution just limits how
> > > > much data we are sending, the clock limits the speed on the
> > > > bus over to the RAM on the other side.
> > >
> > > IMO even in command mode mode->clock should probably be the actual
> > > dotclock used by the display. If there's another clock for the bus
> > > speed/etc. it should be stored somewhere else.
> >
> > Good point. For the DSI panels we have the field hs_rate
> > for the HS clock in struct mipi_dsi_device which is based
> > on exactly this reasoning. And that is what I actually use for
> > setting the HS clock.
> >
> > The problem is however that we in many cases have so
> > substandard documentation of these panels that we have
> > absolutely no idea about the dotclock. Maybe we should
> > just set it to 0 in these cases?
>
> Don't you always have a TE interrupt or something like that
> available? Could just measure it from that if no better
> information is available?
Yes and I did exactly that, so that is why this comment is in
the driver:
static const struct drm_display_mode sony_acx424akp_cmd_mode = {
(...)
/*
* Some desired refresh rate, experiments at the maximum "pixel"
* clock speed (HS clock 420 MHz) yields around 117Hz.
*/
.vrefresh = 60,
I got a review comment at the time saying 117 Hz was weird.
We didn't reach a proper conclusion on this:
https://lore.kernel.org/dri-devel/CACRpkdYW3YNPSNMY3A44GQn8DqK-n9TLvr7uipF7LM_DHZ5=Lg@mail.gmail.com/
Thierry wasn't sure if 60Hz was good or not, so I just had to
go with something.
We could calculate the resulting pixel clock for ~117 Hz with
this resolution and put that in the clock field, but ... I don't know
what is best.
Yours,
Linus Walleij
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-02-26 14:56 ` Re: Linus Walleij
@ 2020-02-26 15:08 ` Ville Syrjälä
-1 siblings, 0 replies; 1651+ messages in thread
From: Ville Syrjälä @ 2020-02-26 15:08 UTC (permalink / raw)
To: Linus Walleij
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, Eric Anholt, nouveau,
Guido Günther, Paul Kocialkowski,
open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
Thierry Reding, Laurent Pinchart, Philipp Zabel, Sam Ravnborg,
Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
intel-gfx, Maxime Ripard, Alexandre Courbot, Fabio Estevam,
open list:ARM/Amlogic Meson..., Vincent Abriou, Andreas Pretzsch,
Jernej Skrabec, Alex Gonzalez, Purism Kernel Team,
Boris Brezillon, Seung-Woo Kim, Christoph Fritz, Kyungmin Park,
Heiko Stuebner, Eugen Hristev, Giulio Benetti
On Wed, Feb 26, 2020 at 03:56:36PM +0100, Linus Walleij wrote:
> On Wed, Feb 26, 2020 at 3:34 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> > > On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> > > <ville.syrjala@linux.intel.com> wrote:
> > > > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> > >
> > > > > I have long suspected that a whole bunch of the "simple" displays
> > > > > are not simple but contains a display controller and memory.
> > > > > That means that the speed over the link to the display and
> > > > > actual refresh rate on the actual display is asymmetric because
> > > > > well we are just updating a RAM, the resolution just limits how
> > > > > much data we are sending, the clock limits the speed on the
> > > > > bus over to the RAM on the other side.
> > > >
> > > > IMO even in command mode mode->clock should probably be the actual
> > > > dotclock used by the display. If there's another clock for the bus
> > > > speed/etc. it should be stored somewhere else.
> > >
> > > Good point. For the DSI panels we have the field hs_rate
> > > for the HS clock in struct mipi_dsi_device which is based
> > > on exactly this reasoning. And that is what I actually use for
> > > setting the HS clock.
> > >
> > > The problem is however that we in many cases have so
> > > substandard documentation of these panels that we have
> > > absolutely no idea about the dotclock. Maybe we should
> > > just set it to 0 in these cases?
> >
> > Don't you always have a TE interrupt or something like that
> > available? Could just measure it from that if no better
> > information is available?
>
> Yes and I did exactly that, so that is why this comment is in
> the driver:
>
> static const struct drm_display_mode sony_acx424akp_cmd_mode = {
> (...)
> /*
> * Some desired refresh rate, experiments at the maximum "pixel"
> * clock speed (HS clock 420 MHz) yields around 117Hz.
> */
> .vrefresh = 60,
>
> I got a review comment at the time saying 117 Hz was weird.
> We didn't reach a proper conclusion on this:
> https://lore.kernel.org/dri-devel/CACRpkdYW3YNPSNMY3A44GQn8DqK-n9TLvr7uipF7LM_DHZ5=Lg@mail.gmail.com/
>
> Thierry wasn't sure if 60Hz was good or not, so I just had to
> go with something.
>
> We could calculate the resulting pixel clock for ~117 Hz with
> this resolution and put that in the clock field but ... don't know
> what is the best?
I would vote for that approach.
--
Ville Syrjälä
Intel
_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic
^ permalink raw reply	[flat|nested] 1651+ messages in thread

* Re:
@ 2020-02-26 15:08 ` Ville Syrjälä
0 siblings, 0 replies; 1651+ messages in thread
From: Ville Syrjälä @ 2020-02-26 15:08 UTC (permalink / raw)
To: Linus Walleij
Cc: Josh Wu, Bhuvanchandra DV, Neil Armstrong, nouveau,
Guido Günther, Paul Kocialkowski,
open list:DRM PANEL DRIVERS, Gustaf Lindström, Andrzej Hajda,
Thierry Reding, Laurent Pinchart, Sam Ravnborg,
Marian-Cristian Rotariu, Jagan Teki, Thomas Hellstrom,
Joonyoung Shim, Jonathan Marek, Stefan Mavrodiev, Adam Ford,
Jerry Han, VMware Graphics, Ben Skeggs, H. Nikolaus Schaller,
Robert Chiras, Heiko Schocher, Icenowy Zheng, Jonas Karlman,
intel-gfx, Alexandre Courbot, open list:ARM/Amlogic Meson...,
Vincent Abriou, Andreas Pretzsch, Jernej Skrabec, Alex Gonzalez,
Purism Kernel Team, Boris Brezillon, Seung-Woo Kim,
Christoph Fritz, Kyungmin Park, Heiko Stuebner, Eugen Hristev,
Giulio Benetti
On Wed, Feb 26, 2020 at 03:56:36PM +0100, Linus Walleij wrote:
> On Wed, Feb 26, 2020 at 3:34 PM Ville Syrjälä
> <ville.syrjala@linux.intel.com> wrote:
> > On Wed, Feb 26, 2020 at 01:08:06PM +0100, Linus Walleij wrote:
> > > On Wed, Feb 26, 2020 at 12:57 PM Ville Syrjälä
> > > <ville.syrjala@linux.intel.com> wrote:
> > > > On Tue, Feb 25, 2020 at 10:52:25PM +0100, Linus Walleij wrote:
> > >
> > > > > I have long suspected that a whole bunch of the "simple" displays
> > > > > are not simple but contain a display controller and memory.
> > > > > That means that the speed over the link to the display and the
> > > > > actual refresh rate on the actual display are asymmetric because,
> > > > > well, we are just updating a RAM: the resolution just limits how
> > > > > much data we are sending, and the clock limits the speed on the
> > > > > bus over to the RAM on the other side.
> > > >
> > > > IMO even in command mode mode->clock should probably be the actual
> > > > dotclock used by the display. If there's another clock for the bus
> > > > speed/etc. it should be stored somewhere else.
> > >
> > > Good point. For the DSI panels we have the field hs_rate
> > > for the HS clock in struct mipi_dsi_device which is based
> > > on exactly this reasoning. And that is what I actually use for
> > > setting the HS clock.
> > >
> > > The problem, however, is that in many cases the documentation of
> > > these panels is so substandard that we have
> > > absolutely no idea about the dotclock. Maybe we should
> > > just set it to 0 in these cases?
> >
> > Don't you always have a TE interrupt or something like that
> > available? Could just measure it from that if no better
> > information is available?
>
> Yes and I did exactly that, so that is why this comment is in
> the driver:
>
> static const struct drm_display_mode sony_acx424akp_cmd_mode = {
> (...)
> /*
> * Some desired refresh rate; experiments at the maximum "pixel"
> * clock speed (HS clock 420 MHz) yield around 117 Hz.
> */
> .vrefresh = 60,
>
> I got a review comment at the time saying 117 Hz was weird.
> We didn't reach a proper conclusion on this:
> https://lore.kernel.org/dri-devel/CACRpkdYW3YNPSNMY3A44GQn8DqK-n9TLvr7uipF7LM_DHZ5=Lg@mail.gmail.com/
>
> Thierry wasn't sure if 60Hz was good or not, so I just had to
> go with something.
>
> We could calculate the resulting pixel clock for ~117 Hz with
> this resolution and put that in the clock field but ... don't know
> what is the best?
I would vote for that approach.
--
Ville Syrjälä
Intel
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-03-03 15:27 Gene Chen
2020-03-04 14:56 ` Re: Matthias Brugger
0 siblings, 1 reply; 1651+ messages in thread
From: Gene Chen @ 2020-03-03 15:27 UTC (permalink / raw)
To: lee.jones, matthias.bgg
Cc: gene_chen, linux-kernel, cy_huang, linux-mediatek, Wilma.Wu,
linux-arm-kernel, shufan_lee
Add mfd driver for mt6360 pmic chip include
Battery Charger/USB_PD/Flash LED/RGB LED/LDO/Buck
Signed-off-by: Gene Chen <gene_chen@richtek.com>
---
drivers/mfd/Kconfig | 12 ++
drivers/mfd/Makefile | 1 +
drivers/mfd/mt6360-core.c | 425 +++++++++++++++++++++++++++++++++++++++++++++
include/linux/mfd/mt6360.h | 240 +++++++++++++++++++++++++
4 files changed, 678 insertions(+)
create mode 100644 drivers/mfd/mt6360-core.c
create mode 100644 include/linux/mfd/mt6360.h
changelogs between v1 & v2
- include missing header file
changelogs between v2 & v3
- add changelogs
changelogs between v3 & v4
- fix Kconfig description
- replace mt6360_pmu_info with mt6360_pmu_data
- replace probe with probe_new
- remove unnecessary irq_chip variable
- remove annotation
- replace MT6360_MFD_CELL with OF_MFD_CELL
changelogs between v4 & v5
- remove unnecessary parse dt function
- use devm_i2c_new_dummy_device
- add base-commit message
changelogs between v5 & v6
- review return value
- remove i2c id_table
- use GPL license v2
changelogs between v6 & v7
- add author description
- replace MT6360_REGMAP_IRQ_REG by REGMAP_IRQ_REG_LINE
- remove mt6360-private.h
changelogs between v7 & v8
- fix kbuild test robot error by including the interrupt header
diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 2b20329..0f8c341 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -857,6 +857,18 @@ config MFD_MAX8998
additional drivers must be enabled in order to use the functionality
of the device.
+config MFD_MT6360
+ tristate "MediaTek MT6360 SubPMIC"
+ select MFD_CORE
+ select REGMAP_I2C
+ select REGMAP_IRQ
+ depends on I2C
+ help
+ Say Y here to enable MT6360 PMU/PMIC/LDO functional support.
+ PMU part includes Charger, Flashlight, RGB LED
+ PMIC part includes 2-channel BUCKs and 2-channel LDOs
+ LDO part includes 4-channel LDOs
+
config MFD_MT6397
tristate "MediaTek MT6397 PMIC Support"
select MFD_CORE
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index b83f172..8c35816 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -238,6 +238,7 @@ obj-$(CONFIG_INTEL_SOC_PMIC) += intel-soc-pmic.o
obj-$(CONFIG_INTEL_SOC_PMIC_BXTWC) += intel_soc_pmic_bxtwc.o
obj-$(CONFIG_INTEL_SOC_PMIC_CHTWC) += intel_soc_pmic_chtwc.o
obj-$(CONFIG_INTEL_SOC_PMIC_CHTDC_TI) += intel_soc_pmic_chtdc_ti.o
+obj-$(CONFIG_MFD_MT6360) += mt6360-core.o
mt6397-objs := mt6397-core.o mt6397-irq.o
obj-$(CONFIG_MFD_MT6397) += mt6397.o
obj-$(CONFIG_INTEL_SOC_PMIC_MRFLD) += intel_soc_pmic_mrfld.o
diff --git a/drivers/mfd/mt6360-core.c b/drivers/mfd/mt6360-core.c
new file mode 100644
index 0000000..d1168f8
--- /dev/null
+++ b/drivers/mfd/mt6360-core.c
@@ -0,0 +1,425 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2019 MediaTek Inc.
+ *
+ * Author: Gene Chen <gene_chen@richtek.com>
+ */
+
+#include <linux/i2c.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/mfd/core.h>
+#include <linux/module.h>
+#include <linux/of_irq.h>
+#include <linux/of_platform.h>
+#include <linux/version.h>
+
+#include <linux/mfd/mt6360.h>
+
+/* reg 0 -> 0 ~ 7 */
+#define MT6360_CHG_TREG_EVT (4)
+#define MT6360_CHG_AICR_EVT (5)
+#define MT6360_CHG_MIVR_EVT (6)
+#define MT6360_PWR_RDY_EVT (7)
+/* REG 1 -> 8 ~ 15 */
+#define MT6360_CHG_BATSYSUV_EVT (9)
+#define MT6360_FLED_CHG_VINOVP_EVT (11)
+#define MT6360_CHG_VSYSUV_EVT (12)
+#define MT6360_CHG_VSYSOV_EVT (13)
+#define MT6360_CHG_VBATOV_EVT (14)
+#define MT6360_CHG_VBUSOV_EVT (15)
+/* REG 2 -> 16 ~ 23 */
+/* REG 3 -> 24 ~ 31 */
+#define MT6360_WD_PMU_DET (25)
+#define MT6360_WD_PMU_DONE (26)
+#define MT6360_CHG_TMRI (27)
+#define MT6360_CHG_ADPBADI (29)
+#define MT6360_CHG_RVPI (30)
+#define MT6360_OTPI (31)
+/* REG 4 -> 32 ~ 39 */
+#define MT6360_CHG_AICCMEASL (32)
+#define MT6360_CHGDET_DONEI (34)
+#define MT6360_WDTMRI (35)
+#define MT6360_SSFINISHI (36)
+#define MT6360_CHG_RECHGI (37)
+#define MT6360_CHG_TERMI (38)
+#define MT6360_CHG_IEOCI (39)
+/* REG 5 -> 40 ~ 47 */
+#define MT6360_PUMPX_DONEI (40)
+#define MT6360_BAT_OVP_ADC_EVT (41)
+#define MT6360_TYPEC_OTP_EVT (42)
+#define MT6360_ADC_WAKEUP_EVT (43)
+#define MT6360_ADC_DONEI (44)
+#define MT6360_BST_BATUVI (45)
+#define MT6360_BST_VBUSOVI (46)
+#define MT6360_BST_OLPI (47)
+/* REG 6 -> 48 ~ 55 */
+#define MT6360_ATTACH_I (48)
+#define MT6360_DETACH_I (49)
+#define MT6360_QC30_STPDONE (51)
+#define MT6360_QC_VBUSDET_DONE (52)
+#define MT6360_HVDCP_DET (53)
+#define MT6360_CHGDETI (54)
+#define MT6360_DCDTI (55)
+/* REG 7 -> 56 ~ 63 */
+#define MT6360_FOD_DONE_EVT (56)
+#define MT6360_FOD_OV_EVT (57)
+#define MT6360_CHRDET_UVP_EVT (58)
+#define MT6360_CHRDET_OVP_EVT (59)
+#define MT6360_CHRDET_EXT_EVT (60)
+#define MT6360_FOD_LR_EVT (61)
+#define MT6360_FOD_HR_EVT (62)
+#define MT6360_FOD_DISCHG_FAIL_EVT (63)
+/* REG 8 -> 64 ~ 71 */
+#define MT6360_USBID_EVT (64)
+#define MT6360_APWDTRST_EVT (65)
+#define MT6360_EN_EVT (66)
+#define MT6360_QONB_RST_EVT (67)
+#define MT6360_MRSTB_EVT (68)
+#define MT6360_OTP_EVT (69)
+#define MT6360_VDDAOV_EVT (70)
+#define MT6360_SYSUV_EVT (71)
+/* REG 9 -> 72 ~ 79 */
+#define MT6360_FLED_STRBPIN_EVT (72)
+#define MT6360_FLED_TORPIN_EVT (73)
+#define MT6360_FLED_TX_EVT (74)
+#define MT6360_FLED_LVF_EVT (75)
+#define MT6360_FLED2_SHORT_EVT (78)
+#define MT6360_FLED1_SHORT_EVT (79)
+/* REG 10 -> 80 ~ 87 */
+#define MT6360_FLED2_STRB_EVT (80)
+#define MT6360_FLED1_STRB_EVT (81)
+#define MT6360_FLED2_STRB_TO_EVT (82)
+#define MT6360_FLED1_STRB_TO_EVT (83)
+#define MT6360_FLED2_TOR_EVT (84)
+#define MT6360_FLED1_TOR_EVT (85)
+/* REG 11 -> 88 ~ 95 */
+/* REG 12 -> 96 ~ 103 */
+#define MT6360_BUCK1_PGB_EVT (96)
+#define MT6360_BUCK1_OC_EVT (100)
+#define MT6360_BUCK1_OV_EVT (101)
+#define MT6360_BUCK1_UV_EVT (102)
+/* REG 13 -> 104 ~ 111 */
+#define MT6360_BUCK2_PGB_EVT (104)
+#define MT6360_BUCK2_OC_EVT (108)
+#define MT6360_BUCK2_OV_EVT (109)
+#define MT6360_BUCK2_UV_EVT (110)
+/* REG 14 -> 112 ~ 119 */
+#define MT6360_LDO1_OC_EVT (113)
+#define MT6360_LDO2_OC_EVT (114)
+#define MT6360_LDO3_OC_EVT (115)
+#define MT6360_LDO5_OC_EVT (117)
+#define MT6360_LDO6_OC_EVT (118)
+#define MT6360_LDO7_OC_EVT (119)
+/* REG 15 -> 120 ~ 127 */
+#define MT6360_LDO1_PGB_EVT (121)
+#define MT6360_LDO2_PGB_EVT (122)
+#define MT6360_LDO3_PGB_EVT (123)
+#define MT6360_LDO5_PGB_EVT (125)
+#define MT6360_LDO6_PGB_EVT (126)
+#define MT6360_LDO7_PGB_EVT (127)
+
+static const struct regmap_irq mt6360_pmu_irqs[] = {
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_TREG_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_AICR_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_MIVR_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_PWR_RDY_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_BATSYSUV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED_CHG_VINOVP_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSUV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSOV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_VBATOV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_VBUSOV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DET, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DONE, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_TMRI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_ADPBADI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_RVPI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_OTPI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_AICCMEASL, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHGDET_DONEI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_WDTMRI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_SSFINISHI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_RECHGI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_TERMI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHG_IEOCI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_PUMPX_DONEI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BAT_OVP_ADC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_TYPEC_OTP_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_ADC_WAKEUP_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_ADC_DONEI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BST_BATUVI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BST_VBUSOVI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BST_OLPI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_ATTACH_I, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_DETACH_I, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_QC30_STPDONE, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_QC_VBUSDET_DONE, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_HVDCP_DET, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHGDETI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_DCDTI, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FOD_DONE_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FOD_OV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHRDET_UVP_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHRDET_OVP_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_CHRDET_EXT_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FOD_LR_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FOD_HR_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FOD_DISCHG_FAIL_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_USBID_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_APWDTRST_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_EN_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_QONB_RST_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_MRSTB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_OTP_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_VDDAOV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_SYSUV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED_STRBPIN_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED_TORPIN_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED_TX_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED_LVF_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED2_SHORT_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED1_SHORT_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_TO_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_TO_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED2_TOR_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_FLED1_TOR_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BUCK1_PGB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BUCK1_UV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BUCK2_PGB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_BUCK2_UV_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO1_OC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO2_OC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO3_OC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO5_OC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO6_OC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO7_OC_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO1_PGB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO2_PGB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO3_PGB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO5_PGB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO6_PGB_EVT, 8),
+ REGMAP_IRQ_REG_LINE(MT6360_LDO7_PGB_EVT, 8),
+};
+
+static int mt6360_pmu_handle_post_irq(void *irq_drv_data)
+{
+ struct mt6360_pmu_data *mpd = irq_drv_data;
+
+ return regmap_update_bits(mpd->regmap,
+ MT6360_PMU_IRQ_SET, MT6360_IRQ_RETRIG, MT6360_IRQ_RETRIG);
+}
+
+static struct regmap_irq_chip mt6360_pmu_irq_chip = {
+ .irqs = mt6360_pmu_irqs,
+ .num_irqs = ARRAY_SIZE(mt6360_pmu_irqs),
+ .num_regs = MT6360_PMU_IRQ_REGNUM,
+ .mask_base = MT6360_PMU_CHG_MASK1,
+ .status_base = MT6360_PMU_CHG_IRQ1,
+ .ack_base = MT6360_PMU_CHG_IRQ1,
+ .init_ack_masked = true,
+ .use_ack = true,
+ .handle_post_irq = mt6360_pmu_handle_post_irq,
+};
+
+static const struct regmap_config mt6360_pmu_regmap_config = {
+ .reg_bits = 8,
+ .val_bits = 8,
+ .max_register = MT6360_PMU_MAXREG,
+};
+
+static const struct resource mt6360_adc_resources[] = {
+ DEFINE_RES_IRQ_NAMED(MT6360_ADC_DONEI, "adc_donei"),
+};
+
+static const struct resource mt6360_chg_resources[] = {
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_TREG_EVT, "chg_treg_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_PWR_RDY_EVT, "pwr_rdy_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_BATSYSUV_EVT, "chg_batsysuv_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSUV_EVT, "chg_vsysuv_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSOV_EVT, "chg_vsysov_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBATOV_EVT, "chg_vbatov_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBUSOV_EVT, "chg_vbusov_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_AICCMEASL, "chg_aiccmeasl"),
+ DEFINE_RES_IRQ_NAMED(MT6360_WDTMRI, "wdtmri"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_RECHGI, "chg_rechgi"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_TERMI, "chg_termi"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHG_IEOCI, "chg_ieoci"),
+ DEFINE_RES_IRQ_NAMED(MT6360_PUMPX_DONEI, "pumpx_donei"),
+ DEFINE_RES_IRQ_NAMED(MT6360_ATTACH_I, "attach_i"),
+ DEFINE_RES_IRQ_NAMED(MT6360_CHRDET_EXT_EVT, "chrdet_ext_evt"),
+};
+
+static const struct resource mt6360_led_resources[] = {
+ DEFINE_RES_IRQ_NAMED(MT6360_FLED_CHG_VINOVP_EVT, "fled_chg_vinovp_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_FLED_LVF_EVT, "fled_lvf_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_FLED2_SHORT_EVT, "fled2_short_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_FLED1_SHORT_EVT, "fled1_short_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_FLED2_STRB_TO_EVT, "fled2_strb_to_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_FLED1_STRB_TO_EVT, "fled1_strb_to_evt"),
+};
+
+static const struct resource mt6360_pmic_resources[] = {
+ DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_PGB_EVT, "buck1_pgb_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OC_EVT, "buck1_oc_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OV_EVT, "buck1_ov_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_UV_EVT, "buck1_uv_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_PGB_EVT, "buck2_pgb_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OC_EVT, "buck2_oc_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OV_EVT, "buck2_ov_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_UV_EVT, "buck2_uv_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO6_OC_EVT, "ldo6_oc_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO7_OC_EVT, "ldo7_oc_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO6_PGB_EVT, "ldo6_pgb_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO7_PGB_EVT, "ldo7_pgb_evt"),
+};
+
+static const struct resource mt6360_ldo_resources[] = {
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO1_OC_EVT, "ldo1_oc_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO2_OC_EVT, "ldo2_oc_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO3_OC_EVT, "ldo3_oc_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO5_OC_EVT, "ldo5_oc_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO1_PGB_EVT, "ldo1_pgb_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO2_PGB_EVT, "ldo2_pgb_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO3_PGB_EVT, "ldo3_pgb_evt"),
+ DEFINE_RES_IRQ_NAMED(MT6360_LDO5_PGB_EVT, "ldo5_pgb_evt"),
+};
+
+static const struct mfd_cell mt6360_devs[] = {
+ OF_MFD_CELL("mt6360_adc", mt6360_adc_resources,
+ NULL, 0, 0, "mediatek,mt6360_adc"),
+ OF_MFD_CELL("mt6360_chg", mt6360_chg_resources,
+ NULL, 0, 0, "mediatek,mt6360_chg"),
+ OF_MFD_CELL("mt6360_led", mt6360_led_resources,
+ NULL, 0, 0, "mediatek,mt6360_led"),
+ OF_MFD_CELL("mt6360_pmic", mt6360_pmic_resources,
+ NULL, 0, 0, "mediatek,mt6360_pmic"),
+ OF_MFD_CELL("mt6360_ldo", mt6360_ldo_resources,
+ NULL, 0, 0, "mediatek,mt6360_ldo"),
+ OF_MFD_CELL("mt6360_tcpc", NULL,
+ NULL, 0, 0, "mediatek,mt6360_tcpc"),
+};
+
+static const unsigned short mt6360_slave_addr[MT6360_SLAVE_MAX] = {
+ MT6360_PMU_SLAVEID,
+ MT6360_PMIC_SLAVEID,
+ MT6360_LDO_SLAVEID,
+ MT6360_TCPC_SLAVEID,
+};
+
+static int mt6360_pmu_probe(struct i2c_client *client)
+{
+ struct mt6360_pmu_data *mpd;
+ unsigned int reg_data;
+ int i, ret;
+
+ mpd = devm_kzalloc(&client->dev, sizeof(*mpd), GFP_KERNEL);
+ if (!mpd)
+ return -ENOMEM;
+
+ mpd->dev = &client->dev;
+ i2c_set_clientdata(client, mpd);
+
+ mpd->regmap = devm_regmap_init_i2c(client, &mt6360_pmu_regmap_config);
+ if (IS_ERR(mpd->regmap)) {
+ dev_err(&client->dev, "Failed to register regmap\n");
+ return PTR_ERR(mpd->regmap);
+ }
+
+ ret = regmap_read(mpd->regmap, MT6360_PMU_DEV_INFO, &reg_data);
+ if (ret) {
+ dev_err(&client->dev, "Device not found\n");
+ return ret;
+ }
+
+ if ((reg_data & CHIP_VEN_MASK) != CHIP_VEN_MT6360) {
+ dev_err(&client->dev, "Device not supported\n");
+ return -ENODEV;
+ }
+ mpd->chip_rev = reg_data & CHIP_REV_MASK;
+
+ mt6360_pmu_irq_chip.irq_drv_data = mpd;
+ ret = devm_regmap_add_irq_chip(&client->dev, mpd->regmap, client->irq,
+ IRQF_TRIGGER_FALLING, 0,
+ &mt6360_pmu_irq_chip, &mpd->irq_data);
+ if (ret) {
+ dev_err(&client->dev, "Failed to add Regmap IRQ Chip\n");
+ return ret;
+ }
+
+ mpd->i2c[0] = client;
+ for (i = 1; i < MT6360_SLAVE_MAX; i++) {
+ mpd->i2c[i] = devm_i2c_new_dummy_device(&client->dev,
+ client->adapter,
+ mt6360_slave_addr[i]);
+ if (IS_ERR(mpd->i2c[i])) {
+ dev_err(&client->dev,
+ "Failed to get new dummy I2C device for address 0x%x",
+ mt6360_slave_addr[i]);
+ return PTR_ERR(mpd->i2c[i]);
+ }
+ i2c_set_clientdata(mpd->i2c[i], mpd);
+ }
+
+ ret = devm_mfd_add_devices(&client->dev, PLATFORM_DEVID_AUTO,
+ mt6360_devs, ARRAY_SIZE(mt6360_devs), NULL,
+ 0, regmap_irq_get_domain(mpd->irq_data));
+ if (ret) {
+ dev_err(&client->dev,
+ "Failed to register subordinate devices\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static int __maybe_unused mt6360_pmu_suspend(struct device *dev)
+{
+ struct i2c_client *i2c = to_i2c_client(dev);
+
+ if (device_may_wakeup(dev))
+ enable_irq_wake(i2c->irq);
+
+ return 0;
+}
+
+static int __maybe_unused mt6360_pmu_resume(struct device *dev)
+{
+ struct i2c_client *i2c = to_i2c_client(dev);
+
+ if (device_may_wakeup(dev))
+ disable_irq_wake(i2c->irq);
+
+ return 0;
+}
+
+static SIMPLE_DEV_PM_OPS(mt6360_pmu_pm_ops,
+ mt6360_pmu_suspend, mt6360_pmu_resume);
+
+static const struct of_device_id __maybe_unused mt6360_pmu_of_id[] = {
+ { .compatible = "mediatek,mt6360_pmu", },
+ {},
+};
+MODULE_DEVICE_TABLE(of, mt6360_pmu_of_id);
+
+static struct i2c_driver mt6360_pmu_driver = {
+ .driver = {
+ .pm = &mt6360_pmu_pm_ops,
+ .of_match_table = of_match_ptr(mt6360_pmu_of_id),
+ },
+ .probe_new = mt6360_pmu_probe,
+};
+module_i2c_driver(mt6360_pmu_driver);
+
+MODULE_AUTHOR("Gene Chen <gene_chen@richtek.com>");
+MODULE_DESCRIPTION("MT6360 PMU I2C Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/linux/mfd/mt6360.h b/include/linux/mfd/mt6360.h
new file mode 100644
index 0000000..c03e6d1
--- /dev/null
+++ b/include/linux/mfd/mt6360.h
@@ -0,0 +1,240 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2019 MediaTek Inc.
+ */
+
+#ifndef __MT6360_H__
+#define __MT6360_H__
+
+#include <linux/regmap.h>
+
+enum {
+ MT6360_SLAVE_PMU = 0,
+ MT6360_SLAVE_PMIC,
+ MT6360_SLAVE_LDO,
+ MT6360_SLAVE_TCPC,
+ MT6360_SLAVE_MAX,
+};
+
+#define MT6360_PMU_SLAVEID (0x34)
+#define MT6360_PMIC_SLAVEID (0x1A)
+#define MT6360_LDO_SLAVEID (0x64)
+#define MT6360_TCPC_SLAVEID (0x4E)
+
+struct mt6360_pmu_data {
+ struct i2c_client *i2c[MT6360_SLAVE_MAX];
+ struct device *dev;
+ struct regmap *regmap;
+ struct regmap_irq_chip_data *irq_data;
+ unsigned int chip_rev;
+};
+
+/* PMU register definitions */
+#define MT6360_PMU_DEV_INFO (0x00)
+#define MT6360_PMU_CORE_CTRL1 (0x01)
+#define MT6360_PMU_RST1 (0x02)
+#define MT6360_PMU_CRCEN (0x03)
+#define MT6360_PMU_RST_PAS_CODE1 (0x04)
+#define MT6360_PMU_RST_PAS_CODE2 (0x05)
+#define MT6360_PMU_CORE_CTRL2 (0x06)
+#define MT6360_PMU_TM_PAS_CODE1 (0x07)
+#define MT6360_PMU_TM_PAS_CODE2 (0x08)
+#define MT6360_PMU_TM_PAS_CODE3 (0x09)
+#define MT6360_PMU_TM_PAS_CODE4 (0x0A)
+#define MT6360_PMU_IRQ_IND (0x0B)
+#define MT6360_PMU_IRQ_MASK (0x0C)
+#define MT6360_PMU_IRQ_SET (0x0D)
+#define MT6360_PMU_SHDN_CTRL (0x0E)
+#define MT6360_PMU_TM_INF (0x0F)
+#define MT6360_PMU_I2C_CTRL (0x10)
+#define MT6360_PMU_CHG_CTRL1 (0x11)
+#define MT6360_PMU_CHG_CTRL2 (0x12)
+#define MT6360_PMU_CHG_CTRL3 (0x13)
+#define MT6360_PMU_CHG_CTRL4 (0x14)
+#define MT6360_PMU_CHG_CTRL5 (0x15)
+#define MT6360_PMU_CHG_CTRL6 (0x16)
+#define MT6360_PMU_CHG_CTRL7 (0x17)
+#define MT6360_PMU_CHG_CTRL8 (0x18)
+#define MT6360_PMU_CHG_CTRL9 (0x19)
+#define MT6360_PMU_CHG_CTRL10 (0x1A)
+#define MT6360_PMU_CHG_CTRL11 (0x1B)
+#define MT6360_PMU_CHG_CTRL12 (0x1C)
+#define MT6360_PMU_CHG_CTRL13 (0x1D)
+#define MT6360_PMU_CHG_CTRL14 (0x1E)
+#define MT6360_PMU_CHG_CTRL15 (0x1F)
+#define MT6360_PMU_CHG_CTRL16 (0x20)
+#define MT6360_PMU_CHG_AICC_RESULT (0x21)
+#define MT6360_PMU_DEVICE_TYPE (0x22)
+#define MT6360_PMU_QC_CONTROL1 (0x23)
+#define MT6360_PMU_QC_CONTROL2 (0x24)
+#define MT6360_PMU_QC30_CONTROL1 (0x25)
+#define MT6360_PMU_QC30_CONTROL2 (0x26)
+#define MT6360_PMU_USB_STATUS1 (0x27)
+#define MT6360_PMU_QC_STATUS1 (0x28)
+#define MT6360_PMU_QC_STATUS2 (0x29)
+#define MT6360_PMU_CHG_PUMP (0x2A)
+#define MT6360_PMU_CHG_CTRL17 (0x2B)
+#define MT6360_PMU_CHG_CTRL18 (0x2C)
+#define MT6360_PMU_CHRDET_CTRL1 (0x2D)
+#define MT6360_PMU_CHRDET_CTRL2 (0x2E)
+#define MT6360_PMU_DPDN_CTRL (0x2F)
+#define MT6360_PMU_CHG_HIDDEN_CTRL1 (0x30)
+#define MT6360_PMU_CHG_HIDDEN_CTRL2 (0x31)
+#define MT6360_PMU_CHG_HIDDEN_CTRL3 (0x32)
+#define MT6360_PMU_CHG_HIDDEN_CTRL4 (0x33)
+#define MT6360_PMU_CHG_HIDDEN_CTRL5 (0x34)
+#define MT6360_PMU_CHG_HIDDEN_CTRL6 (0x35)
+#define MT6360_PMU_CHG_HIDDEN_CTRL7 (0x36)
+#define MT6360_PMU_CHG_HIDDEN_CTRL8 (0x37)
+#define MT6360_PMU_CHG_HIDDEN_CTRL9 (0x38)
+#define MT6360_PMU_CHG_HIDDEN_CTRL10 (0x39)
+#define MT6360_PMU_CHG_HIDDEN_CTRL11 (0x3A)
+#define MT6360_PMU_CHG_HIDDEN_CTRL12 (0x3B)
+#define MT6360_PMU_CHG_HIDDEN_CTRL13 (0x3C)
+#define MT6360_PMU_CHG_HIDDEN_CTRL14 (0x3D)
+#define MT6360_PMU_CHG_HIDDEN_CTRL15 (0x3E)
+#define MT6360_PMU_CHG_HIDDEN_CTRL16 (0x3F)
+#define MT6360_PMU_CHG_HIDDEN_CTRL17 (0x40)
+#define MT6360_PMU_CHG_HIDDEN_CTRL18 (0x41)
+#define MT6360_PMU_CHG_HIDDEN_CTRL19 (0x42)
+#define MT6360_PMU_CHG_HIDDEN_CTRL20 (0x43)
+#define MT6360_PMU_CHG_HIDDEN_CTRL21 (0x44)
+#define MT6360_PMU_CHG_HIDDEN_CTRL22 (0x45)
+#define MT6360_PMU_CHG_HIDDEN_CTRL23 (0x46)
+#define MT6360_PMU_CHG_HIDDEN_CTRL24 (0x47)
+#define MT6360_PMU_CHG_HIDDEN_CTRL25 (0x48)
+#define MT6360_PMU_BC12_CTRL (0x49)
+#define MT6360_PMU_CHG_STAT (0x4A)
+#define MT6360_PMU_RESV1 (0x4B)
+#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEH (0x4E)
+#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEL (0x4F)
+#define MT6360_PMU_TYPEC_OTP_HYST_TH (0x50)
+#define MT6360_PMU_TYPEC_OTP_CTRL (0x51)
+#define MT6360_PMU_ADC_BAT_DATA_H (0x52)
+#define MT6360_PMU_ADC_BAT_DATA_L (0x53)
+#define MT6360_PMU_IMID_BACKBST_ON (0x54)
+#define MT6360_PMU_IMID_BACKBST_OFF (0x55)
+#define MT6360_PMU_ADC_CONFIG (0x56)
+#define MT6360_PMU_ADC_EN2 (0x57)
+#define MT6360_PMU_ADC_IDLE_T (0x58)
+#define MT6360_PMU_ADC_RPT_1 (0x5A)
+#define MT6360_PMU_ADC_RPT_2 (0x5B)
+#define MT6360_PMU_ADC_RPT_3 (0x5C)
+#define MT6360_PMU_ADC_RPT_ORG1 (0x5D)
+#define MT6360_PMU_ADC_RPT_ORG2 (0x5E)
+#define MT6360_PMU_BAT_OVP_TH_SEL_CODEH (0x5F)
+#define MT6360_PMU_BAT_OVP_TH_SEL_CODEL (0x60)
+#define MT6360_PMU_CHG_CTRL19 (0x61)
+#define MT6360_PMU_VDDASUPPLY (0x62)
+#define MT6360_PMU_BC12_MANUAL (0x63)
+#define MT6360_PMU_CHGDET_FUNC (0x64)
+#define MT6360_PMU_FOD_CTRL (0x65)
+#define MT6360_PMU_CHG_CTRL20 (0x66)
+#define MT6360_PMU_CHG_HIDDEN_CTRL26 (0x67)
+#define MT6360_PMU_CHG_HIDDEN_CTRL27 (0x68)
+#define MT6360_PMU_RESV2 (0x69)
+#define MT6360_PMU_USBID_CTRL1 (0x6D)
+#define MT6360_PMU_USBID_CTRL2 (0x6E)
+#define MT6360_PMU_USBID_CTRL3 (0x6F)
+#define MT6360_PMU_FLED_CFG (0x70)
+#define MT6360_PMU_RESV3 (0x71)
+#define MT6360_PMU_FLED1_CTRL (0x72)
+#define MT6360_PMU_FLED_STRB_CTRL (0x73)
+#define MT6360_PMU_FLED1_STRB_CTRL2 (0x74)
+#define MT6360_PMU_FLED1_TOR_CTRL (0x75)
+#define MT6360_PMU_FLED2_CTRL (0x76)
+#define MT6360_PMU_RESV4 (0x77)
+#define MT6360_PMU_FLED2_STRB_CTRL2 (0x78)
+#define MT6360_PMU_FLED2_TOR_CTRL (0x79)
+#define MT6360_PMU_FLED_VMIDTRK_CTRL1 (0x7A)
+#define MT6360_PMU_FLED_VMID_RTM (0x7B)
+#define MT6360_PMU_FLED_VMIDTRK_CTRL2 (0x7C)
+#define MT6360_PMU_FLED_PWSEL (0x7D)
+#define MT6360_PMU_FLED_EN (0x7E)
+#define MT6360_PMU_FLED_Hidden1 (0x7F)
+#define MT6360_PMU_RGB_EN (0x80)
+#define MT6360_PMU_RGB1_ISNK (0x81)
+#define MT6360_PMU_RGB2_ISNK (0x82)
+#define MT6360_PMU_RGB3_ISNK (0x83)
+#define MT6360_PMU_RGB_ML_ISNK (0x84)
+#define MT6360_PMU_RGB1_DIM (0x85)
+#define MT6360_PMU_RGB2_DIM (0x86)
+#define MT6360_PMU_RGB3_DIM (0x87)
+#define MT6360_PMU_RESV5 (0x88)
+#define MT6360_PMU_RGB12_Freq (0x89)
+#define MT6360_PMU_RGB34_Freq (0x8A)
+#define MT6360_PMU_RGB1_Tr (0x8B)
+#define MT6360_PMU_RGB1_Tf (0x8C)
+#define MT6360_PMU_RGB1_TON_TOFF (0x8D)
+#define MT6360_PMU_RGB2_Tr (0x8E)
+#define MT6360_PMU_RGB2_Tf (0x8F)
+#define MT6360_PMU_RGB2_TON_TOFF (0x90)
+#define MT6360_PMU_RGB3_Tr (0x91)
+#define MT6360_PMU_RGB3_Tf (0x92)
+#define MT6360_PMU_RGB3_TON_TOFF (0x93)
+#define MT6360_PMU_RGB_Hidden_CTRL1 (0x94)
+#define MT6360_PMU_RGB_Hidden_CTRL2 (0x95)
+#define MT6360_PMU_RESV6 (0x97)
+#define MT6360_PMU_SPARE1 (0x9A)
+#define MT6360_PMU_SPARE2 (0xA0)
+#define MT6360_PMU_SPARE3 (0xB0)
+#define MT6360_PMU_SPARE4 (0xC0)
+#define MT6360_PMU_CHG_IRQ1 (0xD0)
+#define MT6360_PMU_CHG_IRQ2 (0xD1)
+#define MT6360_PMU_CHG_IRQ3 (0xD2)
+#define MT6360_PMU_CHG_IRQ4 (0xD3)
+#define MT6360_PMU_CHG_IRQ5 (0xD4)
+#define MT6360_PMU_CHG_IRQ6 (0xD5)
+#define MT6360_PMU_QC_IRQ (0xD6)
+#define MT6360_PMU_FOD_IRQ (0xD7)
+#define MT6360_PMU_BASE_IRQ (0xD8)
+#define MT6360_PMU_FLED_IRQ1 (0xD9)
+#define MT6360_PMU_FLED_IRQ2 (0xDA)
+#define MT6360_PMU_RGB_IRQ (0xDB)
+#define MT6360_PMU_BUCK1_IRQ (0xDC)
+#define MT6360_PMU_BUCK2_IRQ (0xDD)
+#define MT6360_PMU_LDO_IRQ1 (0xDE)
+#define MT6360_PMU_LDO_IRQ2 (0xDF)
+#define MT6360_PMU_CHG_STAT1 (0xE0)
+#define MT6360_PMU_CHG_STAT2 (0xE1)
+#define MT6360_PMU_CHG_STAT3 (0xE2)
+#define MT6360_PMU_CHG_STAT4 (0xE3)
+#define MT6360_PMU_CHG_STAT5 (0xE4)
+#define MT6360_PMU_CHG_STAT6 (0xE5)
+#define MT6360_PMU_QC_STAT (0xE6)
+#define MT6360_PMU_FOD_STAT (0xE7)
+#define MT6360_PMU_BASE_STAT (0xE8)
+#define MT6360_PMU_FLED_STAT1 (0xE9)
+#define MT6360_PMU_FLED_STAT2 (0xEA)
+#define MT6360_PMU_RGB_STAT (0xEB)
+#define MT6360_PMU_BUCK1_STAT (0xEC)
+#define MT6360_PMU_BUCK2_STAT (0xED)
+#define MT6360_PMU_LDO_STAT1 (0xEE)
+#define MT6360_PMU_LDO_STAT2 (0xEF)
+#define MT6360_PMU_CHG_MASK1 (0xF0)
+#define MT6360_PMU_CHG_MASK2 (0xF1)
+#define MT6360_PMU_CHG_MASK3 (0xF2)
+#define MT6360_PMU_CHG_MASK4 (0xF3)
+#define MT6360_PMU_CHG_MASK5 (0xF4)
+#define MT6360_PMU_CHG_MASK6 (0xF5)
+#define MT6360_PMU_QC_MASK (0xF6)
+#define MT6360_PMU_FOD_MASK (0xF7)
+#define MT6360_PMU_BASE_MASK (0xF8)
+#define MT6360_PMU_FLED_MASK1 (0xF9)
+#define MT6360_PMU_FLED_MASK2 (0xFA)
+#define MT6360_PMU_FAULTB_MASK (0xFB)
+#define MT6360_PMU_BUCK1_MASK (0xFC)
+#define MT6360_PMU_BUCK2_MASK (0xFD)
+#define MT6360_PMU_LDO_MASK1 (0xFE)
+#define MT6360_PMU_LDO_MASK2 (0xFF)
+#define MT6360_PMU_MAXREG (MT6360_PMU_LDO_MASK2)
+
+/* MT6360_PMU_IRQ_SET */
+#define MT6360_PMU_IRQ_REGNUM (MT6360_PMU_LDO_IRQ2 - MT6360_PMU_CHG_IRQ1 + 1)
+#define MT6360_IRQ_RETRIG BIT(2)
+
+#define CHIP_VEN_MASK (0xF0)
+#define CHIP_VEN_MT6360 (0x50)
+#define CHIP_REV_MASK (0x0F)
+
+#endif /* __MT6360_H__ */
--
2.7.4
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek
^ permalink raw reply related	[flat|nested] 1651+ messages in thread

* Re:
2020-03-03 15:27 Gene Chen
@ 2020-03-04 14:56 ` Matthias Brugger
0 siblings, 0 replies; 1651+ messages in thread
From: Matthias Brugger @ 2020-03-04 14:56 UTC (permalink / raw)
To: Gene Chen, lee.jones
Cc: gene_chen, linux-kernel, cy_huang, linux-mediatek, Wilma.Wu,
linux-arm-kernel, shufan_lee
Please resend with an appropriate commit message.
On 03/03/2020 16:27, Gene Chen wrote:
> Add mfd driver for mt6360 pmic chip include
> Battery Charger/USB_PD/Flash LED/RGB LED/LDO/Buck
>
> Signed-off-by: Gene Chen <gene_chen@richtek.com>
> ---
> drivers/mfd/Kconfig | 12 ++
> drivers/mfd/Makefile | 1 +
> drivers/mfd/mt6360-core.c | 425 +++++++++++++++++++++++++++++++++++++++++++++
> include/linux/mfd/mt6360.h | 240 +++++++++++++++++++++++++
> 4 files changed, 678 insertions(+)
> create mode 100644 drivers/mfd/mt6360-core.c
> create mode 100644 include/linux/mfd/mt6360.h
>
> changelogs between v1 & v2
> - include missing header file
>
> changelogs between v2 & v3
> - add changelogs
>
> changelogs between v3 & v4
> - fix Kconfig description
> - replace mt6360_pmu_info with mt6360_pmu_data
> - replace probe with probe_new
> - remove unnecessary irq_chip variable
> - remove annotation
> - replace MT6360_MFD_CELL with OF_MFD_CELL
>
> changelogs between v4 & v5
> - remove unnecessary parse dt function
> - use devm_i2c_new_dummy_device
> - add base-commit message
>
> changelogs between v5 & v6
> - review return value
> - remove i2c id_table
> - use GPL license v2
>
> changelogs between v6 & v7
> - add author description
> - replace MT6360_REGMAP_IRQ_REG by REGMAP_IRQ_REG_LINE
> - remove mt6360-private.h
>
> changelogs between v7 & v8
> - fix kbuild robot build failure by including the interrupt header
>
> diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
> index 2b20329..0f8c341 100644
> --- a/drivers/mfd/Kconfig
> +++ b/drivers/mfd/Kconfig
> @@ -857,6 +857,18 @@ config MFD_MAX8998
> additional drivers must be enabled in order to use the functionality
> of the device.
>
> +config MFD_MT6360
> + tristate "MediaTek MT6360 SubPMIC"
> + select MFD_CORE
> + select REGMAP_I2C
> + select REGMAP_IRQ
> + depends on I2C
> + help
> + Say Y here to enable MT6360 PMU/PMIC/LDO functional support.
> + PMU part includes Charger, Flashlight, RGB LED
> + PMIC part includes 2-channel BUCKs and 2-channel LDOs
> + LDO part includes 4-channel LDOs
> +
> config MFD_MT6397
> tristate "MediaTek MT6397 PMIC Support"
> select MFD_CORE
> diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
> index b83f172..8c35816 100644
> --- a/drivers/mfd/Makefile
> +++ b/drivers/mfd/Makefile
> @@ -238,6 +238,7 @@ obj-$(CONFIG_INTEL_SOC_PMIC) += intel-soc-pmic.o
> obj-$(CONFIG_INTEL_SOC_PMIC_BXTWC) += intel_soc_pmic_bxtwc.o
> obj-$(CONFIG_INTEL_SOC_PMIC_CHTWC) += intel_soc_pmic_chtwc.o
> obj-$(CONFIG_INTEL_SOC_PMIC_CHTDC_TI) += intel_soc_pmic_chtdc_ti.o
> +obj-$(CONFIG_MFD_MT6360) += mt6360-core.o
> mt6397-objs := mt6397-core.o mt6397-irq.o
> obj-$(CONFIG_MFD_MT6397) += mt6397.o
> obj-$(CONFIG_INTEL_SOC_PMIC_MRFLD) += intel_soc_pmic_mrfld.o
> diff --git a/drivers/mfd/mt6360-core.c b/drivers/mfd/mt6360-core.c
> new file mode 100644
> index 0000000..d1168f8
> --- /dev/null
> +++ b/drivers/mfd/mt6360-core.c
> @@ -0,0 +1,425 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2019 MediaTek Inc.
> + *
> + * Author: Gene Chen <gene_chen@richtek.com>
> + */
> +
> +#include <linux/i2c.h>
> +#include <linux/init.h>
> +#include <linux/interrupt.h>
> +#include <linux/kernel.h>
> +#include <linux/mfd/core.h>
> +#include <linux/module.h>
> +#include <linux/of_irq.h>
> +#include <linux/of_platform.h>
> +#include <linux/version.h>
> +
> +#include <linux/mfd/mt6360.h>
> +
> +/* reg 0 -> 0 ~ 7 */
> +#define MT6360_CHG_TREG_EVT (4)
> +#define MT6360_CHG_AICR_EVT (5)
> +#define MT6360_CHG_MIVR_EVT (6)
> +#define MT6360_PWR_RDY_EVT (7)
> +/* REG 1 -> 8 ~ 15 */
> +#define MT6360_CHG_BATSYSUV_EVT (9)
> +#define MT6360_FLED_CHG_VINOVP_EVT (11)
> +#define MT6360_CHG_VSYSUV_EVT (12)
> +#define MT6360_CHG_VSYSOV_EVT (13)
> +#define MT6360_CHG_VBATOV_EVT (14)
> +#define MT6360_CHG_VBUSOV_EVT (15)
> +/* REG 2 -> 16 ~ 23 */
> +/* REG 3 -> 24 ~ 31 */
> +#define MT6360_WD_PMU_DET (25)
> +#define MT6360_WD_PMU_DONE (26)
> +#define MT6360_CHG_TMRI (27)
> +#define MT6360_CHG_ADPBADI (29)
> +#define MT6360_CHG_RVPI (30)
> +#define MT6360_OTPI (31)
> +/* REG 4 -> 32 ~ 39 */
> +#define MT6360_CHG_AICCMEASL (32)
> +#define MT6360_CHGDET_DONEI (34)
> +#define MT6360_WDTMRI (35)
> +#define MT6360_SSFINISHI (36)
> +#define MT6360_CHG_RECHGI (37)
> +#define MT6360_CHG_TERMI (38)
> +#define MT6360_CHG_IEOCI (39)
> +/* REG 5 -> 40 ~ 47 */
> +#define MT6360_PUMPX_DONEI (40)
> +#define MT6360_BAT_OVP_ADC_EVT (41)
> +#define MT6360_TYPEC_OTP_EVT (42)
> +#define MT6360_ADC_WAKEUP_EVT (43)
> +#define MT6360_ADC_DONEI (44)
> +#define MT6360_BST_BATUVI (45)
> +#define MT6360_BST_VBUSOVI (46)
> +#define MT6360_BST_OLPI (47)
> +/* REG 6 -> 48 ~ 55 */
> +#define MT6360_ATTACH_I (48)
> +#define MT6360_DETACH_I (49)
> +#define MT6360_QC30_STPDONE (51)
> +#define MT6360_QC_VBUSDET_DONE (52)
> +#define MT6360_HVDCP_DET (53)
> +#define MT6360_CHGDETI (54)
> +#define MT6360_DCDTI (55)
> +/* REG 7 -> 56 ~ 63 */
> +#define MT6360_FOD_DONE_EVT (56)
> +#define MT6360_FOD_OV_EVT (57)
> +#define MT6360_CHRDET_UVP_EVT (58)
> +#define MT6360_CHRDET_OVP_EVT (59)
> +#define MT6360_CHRDET_EXT_EVT (60)
> +#define MT6360_FOD_LR_EVT (61)
> +#define MT6360_FOD_HR_EVT (62)
> +#define MT6360_FOD_DISCHG_FAIL_EVT (63)
> +/* REG 8 -> 64 ~ 71 */
> +#define MT6360_USBID_EVT (64)
> +#define MT6360_APWDTRST_EVT (65)
> +#define MT6360_EN_EVT (66)
> +#define MT6360_QONB_RST_EVT (67)
> +#define MT6360_MRSTB_EVT (68)
> +#define MT6360_OTP_EVT (69)
> +#define MT6360_VDDAOV_EVT (70)
> +#define MT6360_SYSUV_EVT (71)
> +/* REG 9 -> 72 ~ 79 */
> +#define MT6360_FLED_STRBPIN_EVT (72)
> +#define MT6360_FLED_TORPIN_EVT (73)
> +#define MT6360_FLED_TX_EVT (74)
> +#define MT6360_FLED_LVF_EVT (75)
> +#define MT6360_FLED2_SHORT_EVT (78)
> +#define MT6360_FLED1_SHORT_EVT (79)
> +/* REG 10 -> 80 ~ 87 */
> +#define MT6360_FLED2_STRB_EVT (80)
> +#define MT6360_FLED1_STRB_EVT (81)
> +#define MT6360_FLED2_STRB_TO_EVT (82)
> +#define MT6360_FLED1_STRB_TO_EVT (83)
> +#define MT6360_FLED2_TOR_EVT (84)
> +#define MT6360_FLED1_TOR_EVT (85)
> +/* REG 11 -> 88 ~ 95 */
> +/* REG 12 -> 96 ~ 103 */
> +#define MT6360_BUCK1_PGB_EVT (96)
> +#define MT6360_BUCK1_OC_EVT (100)
> +#define MT6360_BUCK1_OV_EVT (101)
> +#define MT6360_BUCK1_UV_EVT (102)
> +/* REG 13 -> 104 ~ 111 */
> +#define MT6360_BUCK2_PGB_EVT (104)
> +#define MT6360_BUCK2_OC_EVT (108)
> +#define MT6360_BUCK2_OV_EVT (109)
> +#define MT6360_BUCK2_UV_EVT (110)
> +/* REG 14 -> 112 ~ 119 */
> +#define MT6360_LDO1_OC_EVT (113)
> +#define MT6360_LDO2_OC_EVT (114)
> +#define MT6360_LDO3_OC_EVT (115)
> +#define MT6360_LDO5_OC_EVT (117)
> +#define MT6360_LDO6_OC_EVT (118)
> +#define MT6360_LDO7_OC_EVT (119)
> +/* REG 15 -> 120 ~ 127 */
> +#define MT6360_LDO1_PGB_EVT (121)
> +#define MT6360_LDO2_PGB_EVT (122)
> +#define MT6360_LDO3_PGB_EVT (123)
> +#define MT6360_LDO5_PGB_EVT (125)
> +#define MT6360_LDO6_PGB_EVT (126)
> +#define MT6360_LDO7_PGB_EVT (127)
> +
> +static const struct regmap_irq mt6360_pmu_irqs[] = {
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_TREG_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_AICR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_MIVR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_PWR_RDY_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_BATSYSUV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_CHG_VINOVP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSUV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSOV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_VBATOV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_VBUSOV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DET, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DONE, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_TMRI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_ADPBADI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_RVPI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_OTPI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_AICCMEASL, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHGDET_DONEI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_WDTMRI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_SSFINISHI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_RECHGI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_TERMI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_IEOCI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_PUMPX_DONEI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BAT_OVP_ADC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_TYPEC_OTP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_ADC_WAKEUP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_ADC_DONEI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BST_BATUVI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BST_VBUSOVI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BST_OLPI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_ATTACH_I, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_DETACH_I, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_QC30_STPDONE, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_QC_VBUSDET_DONE, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_HVDCP_DET, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHGDETI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_DCDTI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_DONE_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_OV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHRDET_UVP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHRDET_OVP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHRDET_EXT_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_LR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_HR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_DISCHG_FAIL_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_USBID_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_APWDTRST_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_EN_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_QONB_RST_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_MRSTB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_OTP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_VDDAOV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_SYSUV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_STRBPIN_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_TORPIN_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_TX_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_LVF_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED2_SHORT_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED1_SHORT_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_TO_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_TO_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED2_TOR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED1_TOR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK1_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK1_UV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK2_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK2_UV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO1_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO2_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO3_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO5_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO6_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO7_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO1_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO2_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO3_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO5_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO6_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO7_PGB_EVT, 8),
> +};
> +
> +static int mt6360_pmu_handle_post_irq(void *irq_drv_data)
> +{
> + struct mt6360_pmu_data *mpd = irq_drv_data;
> +
> + return regmap_update_bits(mpd->regmap,
> + MT6360_PMU_IRQ_SET, MT6360_IRQ_RETRIG, MT6360_IRQ_RETRIG);
> +}
> +
> +static struct regmap_irq_chip mt6360_pmu_irq_chip = {
> + .irqs = mt6360_pmu_irqs,
> + .num_irqs = ARRAY_SIZE(mt6360_pmu_irqs),
> + .num_regs = MT6360_PMU_IRQ_REGNUM,
> + .mask_base = MT6360_PMU_CHG_MASK1,
> + .status_base = MT6360_PMU_CHG_IRQ1,
> + .ack_base = MT6360_PMU_CHG_IRQ1,
> + .init_ack_masked = true,
> + .use_ack = true,
> + .handle_post_irq = mt6360_pmu_handle_post_irq,
> +};
> +
> +static const struct regmap_config mt6360_pmu_regmap_config = {
> + .reg_bits = 8,
> + .val_bits = 8,
> + .max_register = MT6360_PMU_MAXREG,
> +};
> +
> +static const struct resource mt6360_adc_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_ADC_DONEI, "adc_donei"),
> +};
> +
> +static const struct resource mt6360_chg_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_TREG_EVT, "chg_treg_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_PWR_RDY_EVT, "pwr_rdy_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_BATSYSUV_EVT, "chg_batsysuv_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSUV_EVT, "chg_vsysuv_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSOV_EVT, "chg_vsysov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBATOV_EVT, "chg_vbatov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBUSOV_EVT, "chg_vbusov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_AICCMEASL, "chg_aiccmeasl"),
> + DEFINE_RES_IRQ_NAMED(MT6360_WDTMRI, "wdtmri"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_RECHGI, "chg_rechgi"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_TERMI, "chg_termi"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_IEOCI, "chg_ieoci"),
> + DEFINE_RES_IRQ_NAMED(MT6360_PUMPX_DONEI, "pumpx_donei"),
> + DEFINE_RES_IRQ_NAMED(MT6360_ATTACH_I, "attach_i"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHRDET_EXT_EVT, "chrdet_ext_evt"),
> +};
> +
> +static const struct resource mt6360_led_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED_CHG_VINOVP_EVT, "fled_chg_vinovp_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED_LVF_EVT, "fled_lvf_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED2_SHORT_EVT, "fled2_short_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED1_SHORT_EVT, "fled1_short_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED2_STRB_TO_EVT, "fled2_strb_to_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED1_STRB_TO_EVT, "fled1_strb_to_evt"),
> +};
> +
> +static const struct resource mt6360_pmic_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_PGB_EVT, "buck1_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OC_EVT, "buck1_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OV_EVT, "buck1_ov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_UV_EVT, "buck1_uv_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_PGB_EVT, "buck2_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OC_EVT, "buck2_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OV_EVT, "buck2_ov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_UV_EVT, "buck2_uv_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO6_OC_EVT, "ldo6_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO7_OC_EVT, "ldo7_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO6_PGB_EVT, "ldo6_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO7_PGB_EVT, "ldo7_pgb_evt"),
> +};
> +
> +static const struct resource mt6360_ldo_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO1_OC_EVT, "ldo1_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO2_OC_EVT, "ldo2_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO3_OC_EVT, "ldo3_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO5_OC_EVT, "ldo5_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO1_PGB_EVT, "ldo1_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO2_PGB_EVT, "ldo2_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO3_PGB_EVT, "ldo3_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO5_PGB_EVT, "ldo5_pgb_evt"),
> +};
> +
> +static const struct mfd_cell mt6360_devs[] = {
> + OF_MFD_CELL("mt6360_adc", mt6360_adc_resources,
> + NULL, 0, 0, "mediatek,mt6360_adc"),
> + OF_MFD_CELL("mt6360_chg", mt6360_chg_resources,
> + NULL, 0, 0, "mediatek,mt6360_chg"),
> + OF_MFD_CELL("mt6360_led", mt6360_led_resources,
> + NULL, 0, 0, "mediatek,mt6360_led"),
> + OF_MFD_CELL("mt6360_pmic", mt6360_pmic_resources,
> + NULL, 0, 0, "mediatek,mt6360_pmic"),
> + OF_MFD_CELL("mt6360_ldo", mt6360_ldo_resources,
> + NULL, 0, 0, "mediatek,mt6360_ldo"),
> + OF_MFD_CELL("mt6360_tcpc", NULL,
> + NULL, 0, 0, "mediatek,mt6360_tcpc"),
> +};
> +
> +static const unsigned short mt6360_slave_addr[MT6360_SLAVE_MAX] = {
> + MT6360_PMU_SLAVEID,
> + MT6360_PMIC_SLAVEID,
> + MT6360_LDO_SLAVEID,
> + MT6360_TCPC_SLAVEID,
> +};
> +
> +static int mt6360_pmu_probe(struct i2c_client *client)
> +{
> + struct mt6360_pmu_data *mpd;
> + unsigned int reg_data;
> + int i, ret;
> +
> + mpd = devm_kzalloc(&client->dev, sizeof(*mpd), GFP_KERNEL);
> + if (!mpd)
> + return -ENOMEM;
> +
> + mpd->dev = &client->dev;
> + i2c_set_clientdata(client, mpd);
> +
> + mpd->regmap = devm_regmap_init_i2c(client, &mt6360_pmu_regmap_config);
> + if (IS_ERR(mpd->regmap)) {
> + dev_err(&client->dev, "Failed to register regmap\n");
> + return PTR_ERR(mpd->regmap);
> + }
> +
> + ret = regmap_read(mpd->regmap, MT6360_PMU_DEV_INFO, &reg_data);
> + if (ret) {
> + dev_err(&client->dev, "Device not found\n");
> + return ret;
> + }
> +
> + mpd->chip_rev = reg_data & CHIP_REV_MASK;
> + if ((reg_data & CHIP_VEN_MASK) != CHIP_VEN_MT6360) {
> + dev_err(&client->dev, "Device not supported\n");
> + return -ENODEV;
> + }
> +
> + mt6360_pmu_irq_chip.irq_drv_data = mpd;
> + ret = devm_regmap_add_irq_chip(&client->dev, mpd->regmap, client->irq,
> + IRQF_TRIGGER_FALLING, 0,
> + &mt6360_pmu_irq_chip, &mpd->irq_data);
> + if (ret) {
> + dev_err(&client->dev, "Failed to add Regmap IRQ Chip\n");
> + return ret;
> + }
> +
> + mpd->i2c[0] = client;
> + for (i = 1; i < MT6360_SLAVE_MAX; i++) {
> + mpd->i2c[i] = devm_i2c_new_dummy_device(&client->dev,
> + client->adapter,
> + mt6360_slave_addr[i]);
> + if (IS_ERR(mpd->i2c[i])) {
> + dev_err(&client->dev,
> + "Failed to get new dummy I2C device for address 0x%x",
> + mt6360_slave_addr[i]);
> + return PTR_ERR(mpd->i2c[i]);
> + }
> + i2c_set_clientdata(mpd->i2c[i], mpd);
> + }
> +
> + ret = devm_mfd_add_devices(&client->dev, PLATFORM_DEVID_AUTO,
> + mt6360_devs, ARRAY_SIZE(mt6360_devs), NULL,
> + 0, regmap_irq_get_domain(mpd->irq_data));
> + if (ret) {
> + dev_err(&client->dev,
> + "Failed to register subordinate devices\n");
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int __maybe_unused mt6360_pmu_suspend(struct device *dev)
> +{
> + struct i2c_client *i2c = to_i2c_client(dev);
> +
> + if (device_may_wakeup(dev))
> + enable_irq_wake(i2c->irq);
> +
> + return 0;
> +}
> +
> +static int __maybe_unused mt6360_pmu_resume(struct device *dev)
> +{
> + struct i2c_client *i2c = to_i2c_client(dev);
> +
> + if (device_may_wakeup(dev))
> + disable_irq_wake(i2c->irq);
> +
> + return 0;
> +}
> +
> +static SIMPLE_DEV_PM_OPS(mt6360_pmu_pm_ops,
> + mt6360_pmu_suspend, mt6360_pmu_resume);
> +
> +static const struct of_device_id __maybe_unused mt6360_pmu_of_id[] = {
> + { .compatible = "mediatek,mt6360_pmu", },
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, mt6360_pmu_of_id);
> +
> +static struct i2c_driver mt6360_pmu_driver = {
> + .driver = {
> + .pm = &mt6360_pmu_pm_ops,
> + .of_match_table = of_match_ptr(mt6360_pmu_of_id),
> + },
> + .probe_new = mt6360_pmu_probe,
> +};
> +module_i2c_driver(mt6360_pmu_driver);
> +
> +MODULE_AUTHOR("Gene Chen <gene_chen@richtek.com>");
> +MODULE_DESCRIPTION("MT6360 PMU I2C Driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/mfd/mt6360.h b/include/linux/mfd/mt6360.h
> new file mode 100644
> index 0000000..c03e6d1
> --- /dev/null
> +++ b/include/linux/mfd/mt6360.h
> @@ -0,0 +1,240 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2019 MediaTek Inc.
> + */
> +
> +#ifndef __MT6360_H__
> +#define __MT6360_H__
> +
> +#include <linux/regmap.h>
> +
> +enum {
> + MT6360_SLAVE_PMU = 0,
> + MT6360_SLAVE_PMIC,
> + MT6360_SLAVE_LDO,
> + MT6360_SLAVE_TCPC,
> + MT6360_SLAVE_MAX,
> +};
> +
> +#define MT6360_PMU_SLAVEID (0x34)
> +#define MT6360_PMIC_SLAVEID (0x1A)
> +#define MT6360_LDO_SLAVEID (0x64)
> +#define MT6360_TCPC_SLAVEID (0x4E)
> +
> +struct mt6360_pmu_data {
> + struct i2c_client *i2c[MT6360_SLAVE_MAX];
> + struct device *dev;
> + struct regmap *regmap;
> + struct regmap_irq_chip_data *irq_data;
> + unsigned int chip_rev;
> +};
> +
> +/* PMU register definition */
> +#define MT6360_PMU_DEV_INFO (0x00)
> +#define MT6360_PMU_CORE_CTRL1 (0x01)
> +#define MT6360_PMU_RST1 (0x02)
> +#define MT6360_PMU_CRCEN (0x03)
> +#define MT6360_PMU_RST_PAS_CODE1 (0x04)
> +#define MT6360_PMU_RST_PAS_CODE2 (0x05)
> +#define MT6360_PMU_CORE_CTRL2 (0x06)
> +#define MT6360_PMU_TM_PAS_CODE1 (0x07)
> +#define MT6360_PMU_TM_PAS_CODE2 (0x08)
> +#define MT6360_PMU_TM_PAS_CODE3 (0x09)
> +#define MT6360_PMU_TM_PAS_CODE4 (0x0A)
> +#define MT6360_PMU_IRQ_IND (0x0B)
> +#define MT6360_PMU_IRQ_MASK (0x0C)
> +#define MT6360_PMU_IRQ_SET (0x0D)
> +#define MT6360_PMU_SHDN_CTRL (0x0E)
> +#define MT6360_PMU_TM_INF (0x0F)
> +#define MT6360_PMU_I2C_CTRL (0x10)
> +#define MT6360_PMU_CHG_CTRL1 (0x11)
> +#define MT6360_PMU_CHG_CTRL2 (0x12)
> +#define MT6360_PMU_CHG_CTRL3 (0x13)
> +#define MT6360_PMU_CHG_CTRL4 (0x14)
> +#define MT6360_PMU_CHG_CTRL5 (0x15)
> +#define MT6360_PMU_CHG_CTRL6 (0x16)
> +#define MT6360_PMU_CHG_CTRL7 (0x17)
> +#define MT6360_PMU_CHG_CTRL8 (0x18)
> +#define MT6360_PMU_CHG_CTRL9 (0x19)
> +#define MT6360_PMU_CHG_CTRL10 (0x1A)
> +#define MT6360_PMU_CHG_CTRL11 (0x1B)
> +#define MT6360_PMU_CHG_CTRL12 (0x1C)
> +#define MT6360_PMU_CHG_CTRL13 (0x1D)
> +#define MT6360_PMU_CHG_CTRL14 (0x1E)
> +#define MT6360_PMU_CHG_CTRL15 (0x1F)
> +#define MT6360_PMU_CHG_CTRL16 (0x20)
> +#define MT6360_PMU_CHG_AICC_RESULT (0x21)
> +#define MT6360_PMU_DEVICE_TYPE (0x22)
> +#define MT6360_PMU_QC_CONTROL1 (0x23)
> +#define MT6360_PMU_QC_CONTROL2 (0x24)
> +#define MT6360_PMU_QC30_CONTROL1 (0x25)
> +#define MT6360_PMU_QC30_CONTROL2 (0x26)
> +#define MT6360_PMU_USB_STATUS1 (0x27)
> +#define MT6360_PMU_QC_STATUS1 (0x28)
> +#define MT6360_PMU_QC_STATUS2 (0x29)
> +#define MT6360_PMU_CHG_PUMP (0x2A)
> +#define MT6360_PMU_CHG_CTRL17 (0x2B)
> +#define MT6360_PMU_CHG_CTRL18 (0x2C)
> +#define MT6360_PMU_CHRDET_CTRL1 (0x2D)
> +#define MT6360_PMU_CHRDET_CTRL2 (0x2E)
> +#define MT6360_PMU_DPDN_CTRL (0x2F)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL1 (0x30)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL2 (0x31)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL3 (0x32)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL4 (0x33)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL5 (0x34)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL6 (0x35)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL7 (0x36)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL8 (0x37)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL9 (0x38)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL10 (0x39)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL11 (0x3A)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL12 (0x3B)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL13 (0x3C)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL14 (0x3D)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL15 (0x3E)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL16 (0x3F)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL17 (0x40)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL18 (0x41)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL19 (0x42)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL20 (0x43)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL21 (0x44)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL22 (0x45)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL23 (0x46)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL24 (0x47)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL25 (0x48)
> +#define MT6360_PMU_BC12_CTRL (0x49)
> +#define MT6360_PMU_CHG_STAT (0x4A)
> +#define MT6360_PMU_RESV1 (0x4B)
> +#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEH (0x4E)
> +#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEL (0x4F)
> +#define MT6360_PMU_TYPEC_OTP_HYST_TH (0x50)
> +#define MT6360_PMU_TYPEC_OTP_CTRL (0x51)
> +#define MT6360_PMU_ADC_BAT_DATA_H (0x52)
> +#define MT6360_PMU_ADC_BAT_DATA_L (0x53)
> +#define MT6360_PMU_IMID_BACKBST_ON (0x54)
> +#define MT6360_PMU_IMID_BACKBST_OFF (0x55)
> +#define MT6360_PMU_ADC_CONFIG (0x56)
> +#define MT6360_PMU_ADC_EN2 (0x57)
> +#define MT6360_PMU_ADC_IDLE_T (0x58)
> +#define MT6360_PMU_ADC_RPT_1 (0x5A)
> +#define MT6360_PMU_ADC_RPT_2 (0x5B)
> +#define MT6360_PMU_ADC_RPT_3 (0x5C)
> +#define MT6360_PMU_ADC_RPT_ORG1 (0x5D)
> +#define MT6360_PMU_ADC_RPT_ORG2 (0x5E)
> +#define MT6360_PMU_BAT_OVP_TH_SEL_CODEH (0x5F)
> +#define MT6360_PMU_BAT_OVP_TH_SEL_CODEL (0x60)
> +#define MT6360_PMU_CHG_CTRL19 (0x61)
> +#define MT6360_PMU_VDDASUPPLY (0x62)
> +#define MT6360_PMU_BC12_MANUAL (0x63)
> +#define MT6360_PMU_CHGDET_FUNC (0x64)
> +#define MT6360_PMU_FOD_CTRL (0x65)
> +#define MT6360_PMU_CHG_CTRL20 (0x66)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL26 (0x67)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL27 (0x68)
> +#define MT6360_PMU_RESV2 (0x69)
> +#define MT6360_PMU_USBID_CTRL1 (0x6D)
> +#define MT6360_PMU_USBID_CTRL2 (0x6E)
> +#define MT6360_PMU_USBID_CTRL3 (0x6F)
> +#define MT6360_PMU_FLED_CFG (0x70)
> +#define MT6360_PMU_RESV3 (0x71)
> +#define MT6360_PMU_FLED1_CTRL (0x72)
> +#define MT6360_PMU_FLED_STRB_CTRL (0x73)
> +#define MT6360_PMU_FLED1_STRB_CTRL2 (0x74)
> +#define MT6360_PMU_FLED1_TOR_CTRL (0x75)
> +#define MT6360_PMU_FLED2_CTRL (0x76)
> +#define MT6360_PMU_RESV4 (0x77)
> +#define MT6360_PMU_FLED2_STRB_CTRL2 (0x78)
> +#define MT6360_PMU_FLED2_TOR_CTRL (0x79)
> +#define MT6360_PMU_FLED_VMIDTRK_CTRL1 (0x7A)
> +#define MT6360_PMU_FLED_VMID_RTM (0x7B)
> +#define MT6360_PMU_FLED_VMIDTRK_CTRL2 (0x7C)
> +#define MT6360_PMU_FLED_PWSEL (0x7D)
> +#define MT6360_PMU_FLED_EN (0x7E)
> +#define MT6360_PMU_FLED_Hidden1 (0x7F)
> +#define MT6360_PMU_RGB_EN (0x80)
> +#define MT6360_PMU_RGB1_ISNK (0x81)
> +#define MT6360_PMU_RGB2_ISNK (0x82)
> +#define MT6360_PMU_RGB3_ISNK (0x83)
> +#define MT6360_PMU_RGB_ML_ISNK (0x84)
> +#define MT6360_PMU_RGB1_DIM (0x85)
> +#define MT6360_PMU_RGB2_DIM (0x86)
> +#define MT6360_PMU_RGB3_DIM (0x87)
> +#define MT6360_PMU_RESV5 (0x88)
> +#define MT6360_PMU_RGB12_Freq (0x89)
> +#define MT6360_PMU_RGB34_Freq (0x8A)
> +#define MT6360_PMU_RGB1_Tr (0x8B)
> +#define MT6360_PMU_RGB1_Tf (0x8C)
> +#define MT6360_PMU_RGB1_TON_TOFF (0x8D)
> +#define MT6360_PMU_RGB2_Tr (0x8E)
> +#define MT6360_PMU_RGB2_Tf (0x8F)
> +#define MT6360_PMU_RGB2_TON_TOFF (0x90)
> +#define MT6360_PMU_RGB3_Tr (0x91)
> +#define MT6360_PMU_RGB3_Tf (0x92)
> +#define MT6360_PMU_RGB3_TON_TOFF (0x93)
> +#define MT6360_PMU_RGB_Hidden_CTRL1 (0x94)
> +#define MT6360_PMU_RGB_Hidden_CTRL2 (0x95)
> +#define MT6360_PMU_RESV6 (0x97)
> +#define MT6360_PMU_SPARE1 (0x9A)
> +#define MT6360_PMU_SPARE2 (0xA0)
> +#define MT6360_PMU_SPARE3 (0xB0)
> +#define MT6360_PMU_SPARE4 (0xC0)
> +#define MT6360_PMU_CHG_IRQ1 (0xD0)
> +#define MT6360_PMU_CHG_IRQ2 (0xD1)
> +#define MT6360_PMU_CHG_IRQ3 (0xD2)
> +#define MT6360_PMU_CHG_IRQ4 (0xD3)
> +#define MT6360_PMU_CHG_IRQ5 (0xD4)
> +#define MT6360_PMU_CHG_IRQ6 (0xD5)
> +#define MT6360_PMU_QC_IRQ (0xD6)
> +#define MT6360_PMU_FOD_IRQ (0xD7)
> +#define MT6360_PMU_BASE_IRQ (0xD8)
> +#define MT6360_PMU_FLED_IRQ1 (0xD9)
> +#define MT6360_PMU_FLED_IRQ2 (0xDA)
> +#define MT6360_PMU_RGB_IRQ (0xDB)
> +#define MT6360_PMU_BUCK1_IRQ (0xDC)
> +#define MT6360_PMU_BUCK2_IRQ (0xDD)
> +#define MT6360_PMU_LDO_IRQ1 (0xDE)
> +#define MT6360_PMU_LDO_IRQ2 (0xDF)
> +#define MT6360_PMU_CHG_STAT1 (0xE0)
> +#define MT6360_PMU_CHG_STAT2 (0xE1)
> +#define MT6360_PMU_CHG_STAT3 (0xE2)
> +#define MT6360_PMU_CHG_STAT4 (0xE3)
> +#define MT6360_PMU_CHG_STAT5 (0xE4)
> +#define MT6360_PMU_CHG_STAT6 (0xE5)
> +#define MT6360_PMU_QC_STAT (0xE6)
> +#define MT6360_PMU_FOD_STAT (0xE7)
> +#define MT6360_PMU_BASE_STAT (0xE8)
> +#define MT6360_PMU_FLED_STAT1 (0xE9)
> +#define MT6360_PMU_FLED_STAT2 (0xEA)
> +#define MT6360_PMU_RGB_STAT (0xEB)
> +#define MT6360_PMU_BUCK1_STAT (0xEC)
> +#define MT6360_PMU_BUCK2_STAT (0xED)
> +#define MT6360_PMU_LDO_STAT1 (0xEE)
> +#define MT6360_PMU_LDO_STAT2 (0xEF)
> +#define MT6360_PMU_CHG_MASK1 (0xF0)
> +#define MT6360_PMU_CHG_MASK2 (0xF1)
> +#define MT6360_PMU_CHG_MASK3 (0xF2)
> +#define MT6360_PMU_CHG_MASK4 (0xF3)
> +#define MT6360_PMU_CHG_MASK5 (0xF4)
> +#define MT6360_PMU_CHG_MASK6 (0xF5)
> +#define MT6360_PMU_QC_MASK (0xF6)
> +#define MT6360_PMU_FOD_MASK (0xF7)
> +#define MT6360_PMU_BASE_MASK (0xF8)
> +#define MT6360_PMU_FLED_MASK1 (0xF9)
> +#define MT6360_PMU_FLED_MASK2 (0xFA)
> +#define MT6360_PMU_FAULTB_MASK (0xFB)
> +#define MT6360_PMU_BUCK1_MASK (0xFC)
> +#define MT6360_PMU_BUCK2_MASK (0xFD)
> +#define MT6360_PMU_LDO_MASK1 (0xFE)
> +#define MT6360_PMU_LDO_MASK2 (0xFF)
> +#define MT6360_PMU_MAXREG (MT6360_PMU_LDO_MASK2)
> +
> +/* MT6360_PMU_IRQ_SET */
> +#define MT6360_PMU_IRQ_REGNUM (MT6360_PMU_LDO_IRQ2 - MT6360_PMU_CHG_IRQ1 + 1)
> +#define MT6360_IRQ_RETRIG BIT(2)
> +
> +#define CHIP_VEN_MASK (0xF0)
> +#define CHIP_VEN_MT6360 (0x50)
> +#define CHIP_REV_MASK (0x0F)
> +
> +#endif /* __MT6360_H__ */
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
> +#define MT6360_BUCK1_UV_EVT (102)
> +/* REG 13 -> 104 ~ 111 */
> +#define MT6360_BUCK2_PGB_EVT (104)
> +#define MT6360_BUCK2_OC_EVT (108)
> +#define MT6360_BUCK2_OV_EVT (109)
> +#define MT6360_BUCK2_UV_EVT (110)
> +/* REG 14 -> 112 ~ 119 */
> +#define MT6360_LDO1_OC_EVT (113)
> +#define MT6360_LDO2_OC_EVT (114)
> +#define MT6360_LDO3_OC_EVT (115)
> +#define MT6360_LDO5_OC_EVT (117)
> +#define MT6360_LDO6_OC_EVT (118)
> +#define MT6360_LDO7_OC_EVT (119)
> +/* REG 15 -> 120 ~ 127 */
> +#define MT6360_LDO1_PGB_EVT (121)
> +#define MT6360_LDO2_PGB_EVT (122)
> +#define MT6360_LDO3_PGB_EVT (123)
> +#define MT6360_LDO5_PGB_EVT (125)
> +#define MT6360_LDO6_PGB_EVT (126)
> +#define MT6360_LDO7_PGB_EVT (127)
> +
> +static const struct regmap_irq mt6360_pmu_irqs[] = {
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_TREG_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_AICR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_MIVR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_PWR_RDY_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_BATSYSUV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_CHG_VINOVP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSUV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_VSYSOV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_VBATOV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_VBUSOV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DET, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_WD_PMU_DONE, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_TMRI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_ADPBADI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_RVPI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_OTPI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_AICCMEASL, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHGDET_DONEI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_WDTMRI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_SSFINISHI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_RECHGI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_TERMI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_IEOCI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_PUMPX_DONEI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHG_TREG_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BAT_OVP_ADC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_TYPEC_OTP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_ADC_WAKEUP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_ADC_DONEI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BST_BATUVI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BST_VBUSOVI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BST_OLPI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_ATTACH_I, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_DETACH_I, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_QC30_STPDONE, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_QC_VBUSDET_DONE, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_HVDCP_DET, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHGDETI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_DCDTI, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_DONE_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_OV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHRDET_UVP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHRDET_OVP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_CHRDET_EXT_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_LR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_HR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FOD_DISCHG_FAIL_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_USBID_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_APWDTRST_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_EN_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_QONB_RST_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_MRSTB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_OTP_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_VDDAOV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_SYSUV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_STRBPIN_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_TORPIN_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_TX_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED_LVF_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED2_SHORT_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED1_SHORT_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED2_STRB_TO_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED1_STRB_TO_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED2_TOR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_FLED1_TOR_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK1_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK1_OV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK1_UV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK2_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK2_OV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_BUCK2_UV_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO1_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO2_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO3_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO5_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO6_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO7_OC_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO1_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO2_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO3_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO5_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO6_PGB_EVT, 8),
> + REGMAP_IRQ_REG_LINE(MT6360_LDO7_PGB_EVT, 8),
> +};
> +
> +static int mt6360_pmu_handle_post_irq(void *irq_drv_data)
> +{
> + struct mt6360_pmu_data *mpd = irq_drv_data;
> +
> + return regmap_update_bits(mpd->regmap,
> + MT6360_PMU_IRQ_SET, MT6360_IRQ_RETRIG, MT6360_IRQ_RETRIG);
> +}
> +
> +static struct regmap_irq_chip mt6360_pmu_irq_chip = {
> + .irqs = mt6360_pmu_irqs,
> + .num_irqs = ARRAY_SIZE(mt6360_pmu_irqs),
> + .num_regs = MT6360_PMU_IRQ_REGNUM,
> + .mask_base = MT6360_PMU_CHG_MASK1,
> + .status_base = MT6360_PMU_CHG_IRQ1,
> + .ack_base = MT6360_PMU_CHG_IRQ1,
> + .init_ack_masked = true,
> + .use_ack = true,
> + .handle_post_irq = mt6360_pmu_handle_post_irq,
> +};
> +
> +static const struct regmap_config mt6360_pmu_regmap_config = {
> + .reg_bits = 8,
> + .val_bits = 8,
> + .max_register = MT6360_PMU_MAXREG,
> +};
> +
> +static const struct resource mt6360_adc_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_ADC_DONEI, "adc_donei"),
> +};
> +
> +static const struct resource mt6360_chg_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_TREG_EVT, "chg_treg_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_PWR_RDY_EVT, "pwr_rdy_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_BATSYSUV_EVT, "chg_batsysuv_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSUV_EVT, "chg_vsysuv_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_VSYSOV_EVT, "chg_vsysov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBATOV_EVT, "chg_vbatov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_VBUSOV_EVT, "chg_vbusov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_AICCMEASL, "chg_aiccmeasl"),
> + DEFINE_RES_IRQ_NAMED(MT6360_WDTMRI, "wdtmri"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_RECHGI, "chg_rechgi"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_TERMI, "chg_termi"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHG_IEOCI, "chg_ieoci"),
> + DEFINE_RES_IRQ_NAMED(MT6360_PUMPX_DONEI, "pumpx_donei"),
> + DEFINE_RES_IRQ_NAMED(MT6360_ATTACH_I, "attach_i"),
> + DEFINE_RES_IRQ_NAMED(MT6360_CHRDET_EXT_EVT, "chrdet_ext_evt"),
> +};
> +
> +static const struct resource mt6360_led_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED_CHG_VINOVP_EVT, "fled_chg_vinovp_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED_LVF_EVT, "fled_lvf_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED2_SHORT_EVT, "fled2_short_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED1_SHORT_EVT, "fled1_short_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED2_STRB_TO_EVT, "fled2_strb_to_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_FLED1_STRB_TO_EVT, "fled1_strb_to_evt"),
> +};
> +
> +static const struct resource mt6360_pmic_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_PGB_EVT, "buck1_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OC_EVT, "buck1_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_OV_EVT, "buck1_ov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK1_UV_EVT, "buck1_uv_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_PGB_EVT, "buck2_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OC_EVT, "buck2_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_OV_EVT, "buck2_ov_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_BUCK2_UV_EVT, "buck2_uv_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO6_OC_EVT, "ldo6_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO7_OC_EVT, "ldo7_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO6_PGB_EVT, "ldo6_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO7_PGB_EVT, "ldo7_pgb_evt"),
> +};
> +
> +static const struct resource mt6360_ldo_resources[] = {
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO1_OC_EVT, "ldo1_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO2_OC_EVT, "ldo2_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO3_OC_EVT, "ldo3_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO5_OC_EVT, "ldo5_oc_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO1_PGB_EVT, "ldo1_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO2_PGB_EVT, "ldo2_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO3_PGB_EVT, "ldo3_pgb_evt"),
> + DEFINE_RES_IRQ_NAMED(MT6360_LDO5_PGB_EVT, "ldo5_pgb_evt"),
> +};
> +
> +static const struct mfd_cell mt6360_devs[] = {
> + OF_MFD_CELL("mt6360_adc", mt6360_adc_resources,
> + NULL, 0, 0, "mediatek,mt6360_adc"),
> + OF_MFD_CELL("mt6360_chg", mt6360_chg_resources,
> + NULL, 0, 0, "mediatek,mt6360_chg"),
> + OF_MFD_CELL("mt6360_led", mt6360_led_resources,
> + NULL, 0, 0, "mediatek,mt6360_led"),
> + OF_MFD_CELL("mt6360_pmic", mt6360_pmic_resources,
> + NULL, 0, 0, "mediatek,mt6360_pmic"),
> + OF_MFD_CELL("mt6360_ldo", mt6360_ldo_resources,
> + NULL, 0, 0, "mediatek,mt6360_ldo"),
> + OF_MFD_CELL("mt6360_tcpc", NULL,
> + NULL, 0, 0, "mediatek,mt6360_tcpc"),
> +};
> +
> +static const unsigned short mt6360_slave_addr[MT6360_SLAVE_MAX] = {
> + MT6360_PMU_SLAVEID,
> + MT6360_PMIC_SLAVEID,
> + MT6360_LDO_SLAVEID,
> + MT6360_TCPC_SLAVEID,
> +};
> +
> +static int mt6360_pmu_probe(struct i2c_client *client)
> +{
> + struct mt6360_pmu_data *mpd;
> + unsigned int reg_data;
> + int i, ret;
> +
> + mpd = devm_kzalloc(&client->dev, sizeof(*mpd), GFP_KERNEL);
> + if (!mpd)
> + return -ENOMEM;
> +
> + mpd->dev = &client->dev;
> + i2c_set_clientdata(client, mpd);
> +
> + mpd->regmap = devm_regmap_init_i2c(client, &mt6360_pmu_regmap_config);
> + if (IS_ERR(mpd->regmap)) {
> + dev_err(&client->dev, "Failed to register regmap\n");
> + return PTR_ERR(mpd->regmap);
> + }
> +
> + ret = regmap_read(mpd->regmap, MT6360_PMU_DEV_INFO, &reg_data);
> + if (ret) {
> + dev_err(&client->dev, "Device not found\n");
> + return ret;
> + }
> +
> + mpd->chip_rev = reg_data & CHIP_REV_MASK;
> + if (mpd->chip_rev != CHIP_VEN_MT6360) {
> + dev_err(&client->dev, "Device not supported\n");
> + return -ENODEV;
> + }
> +
> + mt6360_pmu_irq_chip.irq_drv_data = mpd;
> + ret = devm_regmap_add_irq_chip(&client->dev, mpd->regmap, client->irq,
> + IRQF_TRIGGER_FALLING, 0,
> + &mt6360_pmu_irq_chip, &mpd->irq_data);
> + if (ret) {
> + dev_err(&client->dev, "Failed to add Regmap IRQ Chip\n");
> + return ret;
> + }
> +
> + mpd->i2c[0] = client;
> + for (i = 1; i < MT6360_SLAVE_MAX; i++) {
> + mpd->i2c[i] = devm_i2c_new_dummy_device(&client->dev,
> + client->adapter,
> + mt6360_slave_addr[i]);
> + if (IS_ERR(mpd->i2c[i])) {
> + dev_err(&client->dev,
> + "Failed to get new dummy I2C device for address 0x%x",
> + mt6360_slave_addr[i]);
> + return PTR_ERR(mpd->i2c[i]);
> + }
> + i2c_set_clientdata(mpd->i2c[i], mpd);
> + }
> +
> + ret = devm_mfd_add_devices(&client->dev, PLATFORM_DEVID_AUTO,
> + mt6360_devs, ARRAY_SIZE(mt6360_devs), NULL,
> + 0, regmap_irq_get_domain(mpd->irq_data));
> + if (ret) {
> + dev_err(&client->dev,
> + "Failed to register subordinate devices\n");
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static int __maybe_unused mt6360_pmu_suspend(struct device *dev)
> +{
> + struct i2c_client *i2c = to_i2c_client(dev);
> +
> + if (device_may_wakeup(dev))
> + enable_irq_wake(i2c->irq);
> +
> + return 0;
> +}
> +
> +static int __maybe_unused mt6360_pmu_resume(struct device *dev)
> +{
> + struct i2c_client *i2c = to_i2c_client(dev);
> +
> + if (device_may_wakeup(dev))
> + disable_irq_wake(i2c->irq);
> +
> + return 0;
> +}
> +
> +static SIMPLE_DEV_PM_OPS(mt6360_pmu_pm_ops,
> + mt6360_pmu_suspend, mt6360_pmu_resume);
> +
> +static const struct of_device_id __maybe_unused mt6360_pmu_of_id[] = {
> + { .compatible = "mediatek,mt6360_pmu", },
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, mt6360_pmu_of_id);
> +
> +static struct i2c_driver mt6360_pmu_driver = {
> + .driver = {
> + .pm = &mt6360_pmu_pm_ops,
> + .of_match_table = of_match_ptr(mt6360_pmu_of_id),
> + },
> + .probe_new = mt6360_pmu_probe,
> +};
> +module_i2c_driver(mt6360_pmu_driver);
> +
> +MODULE_AUTHOR("Gene Chen <gene_chen@richtek.com>");
> +MODULE_DESCRIPTION("MT6360 PMU I2C Driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/mfd/mt6360.h b/include/linux/mfd/mt6360.h
> new file mode 100644
> index 0000000..c03e6d1
> --- /dev/null
> +++ b/include/linux/mfd/mt6360.h
> @@ -0,0 +1,240 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2019 MediaTek Inc.
> + */
> +
> +#ifndef __MT6360_H__
> +#define __MT6360_H__
> +
> +#include <linux/regmap.h>
> +
> +enum {
> + MT6360_SLAVE_PMU = 0,
> + MT6360_SLAVE_PMIC,
> + MT6360_SLAVE_LDO,
> + MT6360_SLAVE_TCPC,
> + MT6360_SLAVE_MAX,
> +};
> +
> +#define MT6360_PMU_SLAVEID (0x34)
> +#define MT6360_PMIC_SLAVEID (0x1A)
> +#define MT6360_LDO_SLAVEID (0x64)
> +#define MT6360_TCPC_SLAVEID (0x4E)
> +
> +struct mt6360_pmu_data {
> + struct i2c_client *i2c[MT6360_SLAVE_MAX];
> + struct device *dev;
> + struct regmap *regmap;
> + struct regmap_irq_chip_data *irq_data;
> + unsigned int chip_rev;
> +};
> +
> +/* PMU register definition */
> +#define MT6360_PMU_DEV_INFO (0x00)
> +#define MT6360_PMU_CORE_CTRL1 (0x01)
> +#define MT6360_PMU_RST1 (0x02)
> +#define MT6360_PMU_CRCEN (0x03)
> +#define MT6360_PMU_RST_PAS_CODE1 (0x04)
> +#define MT6360_PMU_RST_PAS_CODE2 (0x05)
> +#define MT6360_PMU_CORE_CTRL2 (0x06)
> +#define MT6360_PMU_TM_PAS_CODE1 (0x07)
> +#define MT6360_PMU_TM_PAS_CODE2 (0x08)
> +#define MT6360_PMU_TM_PAS_CODE3 (0x09)
> +#define MT6360_PMU_TM_PAS_CODE4 (0x0A)
> +#define MT6360_PMU_IRQ_IND (0x0B)
> +#define MT6360_PMU_IRQ_MASK (0x0C)
> +#define MT6360_PMU_IRQ_SET (0x0D)
> +#define MT6360_PMU_SHDN_CTRL (0x0E)
> +#define MT6360_PMU_TM_INF (0x0F)
> +#define MT6360_PMU_I2C_CTRL (0x10)
> +#define MT6360_PMU_CHG_CTRL1 (0x11)
> +#define MT6360_PMU_CHG_CTRL2 (0x12)
> +#define MT6360_PMU_CHG_CTRL3 (0x13)
> +#define MT6360_PMU_CHG_CTRL4 (0x14)
> +#define MT6360_PMU_CHG_CTRL5 (0x15)
> +#define MT6360_PMU_CHG_CTRL6 (0x16)
> +#define MT6360_PMU_CHG_CTRL7 (0x17)
> +#define MT6360_PMU_CHG_CTRL8 (0x18)
> +#define MT6360_PMU_CHG_CTRL9 (0x19)
> +#define MT6360_PMU_CHG_CTRL10 (0x1A)
> +#define MT6360_PMU_CHG_CTRL11 (0x1B)
> +#define MT6360_PMU_CHG_CTRL12 (0x1C)
> +#define MT6360_PMU_CHG_CTRL13 (0x1D)
> +#define MT6360_PMU_CHG_CTRL14 (0x1E)
> +#define MT6360_PMU_CHG_CTRL15 (0x1F)
> +#define MT6360_PMU_CHG_CTRL16 (0x20)
> +#define MT6360_PMU_CHG_AICC_RESULT (0x21)
> +#define MT6360_PMU_DEVICE_TYPE (0x22)
> +#define MT6360_PMU_QC_CONTROL1 (0x23)
> +#define MT6360_PMU_QC_CONTROL2 (0x24)
> +#define MT6360_PMU_QC30_CONTROL1 (0x25)
> +#define MT6360_PMU_QC30_CONTROL2 (0x26)
> +#define MT6360_PMU_USB_STATUS1 (0x27)
> +#define MT6360_PMU_QC_STATUS1 (0x28)
> +#define MT6360_PMU_QC_STATUS2 (0x29)
> +#define MT6360_PMU_CHG_PUMP (0x2A)
> +#define MT6360_PMU_CHG_CTRL17 (0x2B)
> +#define MT6360_PMU_CHG_CTRL18 (0x2C)
> +#define MT6360_PMU_CHRDET_CTRL1 (0x2D)
> +#define MT6360_PMU_CHRDET_CTRL2 (0x2E)
> +#define MT6360_PMU_DPDN_CTRL (0x2F)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL1 (0x30)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL2 (0x31)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL3 (0x32)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL4 (0x33)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL5 (0x34)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL6 (0x35)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL7 (0x36)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL8 (0x37)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL9 (0x38)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL10 (0x39)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL11 (0x3A)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL12 (0x3B)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL13 (0x3C)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL14 (0x3D)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL15 (0x3E)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL16 (0x3F)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL17 (0x40)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL18 (0x41)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL19 (0x42)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL20 (0x43)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL21 (0x44)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL22 (0x45)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL23 (0x46)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL24 (0x47)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL25 (0x48)
> +#define MT6360_PMU_BC12_CTRL (0x49)
> +#define MT6360_PMU_CHG_STAT (0x4A)
> +#define MT6360_PMU_RESV1 (0x4B)
> +#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEH (0x4E)
> +#define MT6360_PMU_TYPEC_OTP_TH_SEL_CODEL (0x4F)
> +#define MT6360_PMU_TYPEC_OTP_HYST_TH (0x50)
> +#define MT6360_PMU_TYPEC_OTP_CTRL (0x51)
> +#define MT6360_PMU_ADC_BAT_DATA_H (0x52)
> +#define MT6360_PMU_ADC_BAT_DATA_L (0x53)
> +#define MT6360_PMU_IMID_BACKBST_ON (0x54)
> +#define MT6360_PMU_IMID_BACKBST_OFF (0x55)
> +#define MT6360_PMU_ADC_CONFIG (0x56)
> +#define MT6360_PMU_ADC_EN2 (0x57)
> +#define MT6360_PMU_ADC_IDLE_T (0x58)
> +#define MT6360_PMU_ADC_RPT_1 (0x5A)
> +#define MT6360_PMU_ADC_RPT_2 (0x5B)
> +#define MT6360_PMU_ADC_RPT_3 (0x5C)
> +#define MT6360_PMU_ADC_RPT_ORG1 (0x5D)
> +#define MT6360_PMU_ADC_RPT_ORG2 (0x5E)
> +#define MT6360_PMU_BAT_OVP_TH_SEL_CODEH (0x5F)
> +#define MT6360_PMU_BAT_OVP_TH_SEL_CODEL (0x60)
> +#define MT6360_PMU_CHG_CTRL19 (0x61)
> +#define MT6360_PMU_VDDASUPPLY (0x62)
> +#define MT6360_PMU_BC12_MANUAL (0x63)
> +#define MT6360_PMU_CHGDET_FUNC (0x64)
> +#define MT6360_PMU_FOD_CTRL (0x65)
> +#define MT6360_PMU_CHG_CTRL20 (0x66)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL26 (0x67)
> +#define MT6360_PMU_CHG_HIDDEN_CTRL27 (0x68)
> +#define MT6360_PMU_RESV2 (0x69)
> +#define MT6360_PMU_USBID_CTRL1 (0x6D)
> +#define MT6360_PMU_USBID_CTRL2 (0x6E)
> +#define MT6360_PMU_USBID_CTRL3 (0x6F)
> +#define MT6360_PMU_FLED_CFG (0x70)
> +#define MT6360_PMU_RESV3 (0x71)
> +#define MT6360_PMU_FLED1_CTRL (0x72)
> +#define MT6360_PMU_FLED_STRB_CTRL (0x73)
> +#define MT6360_PMU_FLED1_STRB_CTRL2 (0x74)
> +#define MT6360_PMU_FLED1_TOR_CTRL (0x75)
> +#define MT6360_PMU_FLED2_CTRL (0x76)
> +#define MT6360_PMU_RESV4 (0x77)
> +#define MT6360_PMU_FLED2_STRB_CTRL2 (0x78)
> +#define MT6360_PMU_FLED2_TOR_CTRL (0x79)
> +#define MT6360_PMU_FLED_VMIDTRK_CTRL1 (0x7A)
> +#define MT6360_PMU_FLED_VMID_RTM (0x7B)
> +#define MT6360_PMU_FLED_VMIDTRK_CTRL2 (0x7C)
> +#define MT6360_PMU_FLED_PWSEL (0x7D)
> +#define MT6360_PMU_FLED_EN (0x7E)
> +#define MT6360_PMU_FLED_Hidden1 (0x7F)
> +#define MT6360_PMU_RGB_EN (0x80)
> +#define MT6360_PMU_RGB1_ISNK (0x81)
> +#define MT6360_PMU_RGB2_ISNK (0x82)
> +#define MT6360_PMU_RGB3_ISNK (0x83)
> +#define MT6360_PMU_RGB_ML_ISNK (0x84)
> +#define MT6360_PMU_RGB1_DIM (0x85)
> +#define MT6360_PMU_RGB2_DIM (0x86)
> +#define MT6360_PMU_RGB3_DIM (0x87)
> +#define MT6360_PMU_RESV5 (0x88)
> +#define MT6360_PMU_RGB12_Freq (0x89)
> +#define MT6360_PMU_RGB34_Freq (0x8A)
> +#define MT6360_PMU_RGB1_Tr (0x8B)
> +#define MT6360_PMU_RGB1_Tf (0x8C)
> +#define MT6360_PMU_RGB1_TON_TOFF (0x8D)
> +#define MT6360_PMU_RGB2_Tr (0x8E)
> +#define MT6360_PMU_RGB2_Tf (0x8F)
> +#define MT6360_PMU_RGB2_TON_TOFF (0x90)
> +#define MT6360_PMU_RGB3_Tr (0x91)
> +#define MT6360_PMU_RGB3_Tf (0x92)
> +#define MT6360_PMU_RGB3_TON_TOFF (0x93)
> +#define MT6360_PMU_RGB_Hidden_CTRL1 (0x94)
> +#define MT6360_PMU_RGB_Hidden_CTRL2 (0x95)
> +#define MT6360_PMU_RESV6 (0x97)
> +#define MT6360_PMU_SPARE1 (0x9A)
> +#define MT6360_PMU_SPARE2 (0xA0)
> +#define MT6360_PMU_SPARE3 (0xB0)
> +#define MT6360_PMU_SPARE4 (0xC0)
> +#define MT6360_PMU_CHG_IRQ1 (0xD0)
> +#define MT6360_PMU_CHG_IRQ2 (0xD1)
> +#define MT6360_PMU_CHG_IRQ3 (0xD2)
> +#define MT6360_PMU_CHG_IRQ4 (0xD3)
> +#define MT6360_PMU_CHG_IRQ5 (0xD4)
> +#define MT6360_PMU_CHG_IRQ6 (0xD5)
> +#define MT6360_PMU_QC_IRQ (0xD6)
> +#define MT6360_PMU_FOD_IRQ (0xD7)
> +#define MT6360_PMU_BASE_IRQ (0xD8)
> +#define MT6360_PMU_FLED_IRQ1 (0xD9)
> +#define MT6360_PMU_FLED_IRQ2 (0xDA)
> +#define MT6360_PMU_RGB_IRQ (0xDB)
> +#define MT6360_PMU_BUCK1_IRQ (0xDC)
> +#define MT6360_PMU_BUCK2_IRQ (0xDD)
> +#define MT6360_PMU_LDO_IRQ1 (0xDE)
> +#define MT6360_PMU_LDO_IRQ2 (0xDF)
> +#define MT6360_PMU_CHG_STAT1 (0xE0)
> +#define MT6360_PMU_CHG_STAT2 (0xE1)
> +#define MT6360_PMU_CHG_STAT3 (0xE2)
> +#define MT6360_PMU_CHG_STAT4 (0xE3)
> +#define MT6360_PMU_CHG_STAT5 (0xE4)
> +#define MT6360_PMU_CHG_STAT6 (0xE5)
> +#define MT6360_PMU_QC_STAT (0xE6)
> +#define MT6360_PMU_FOD_STAT (0xE7)
> +#define MT6360_PMU_BASE_STAT (0xE8)
> +#define MT6360_PMU_FLED_STAT1 (0xE9)
> +#define MT6360_PMU_FLED_STAT2 (0xEA)
> +#define MT6360_PMU_RGB_STAT (0xEB)
> +#define MT6360_PMU_BUCK1_STAT (0xEC)
> +#define MT6360_PMU_BUCK2_STAT (0xED)
> +#define MT6360_PMU_LDO_STAT1 (0xEE)
> +#define MT6360_PMU_LDO_STAT2 (0xEF)
> +#define MT6360_PMU_CHG_MASK1 (0xF0)
> +#define MT6360_PMU_CHG_MASK2 (0xF1)
> +#define MT6360_PMU_CHG_MASK3 (0xF2)
> +#define MT6360_PMU_CHG_MASK4 (0xF3)
> +#define MT6360_PMU_CHG_MASK5 (0xF4)
> +#define MT6360_PMU_CHG_MASK6 (0xF5)
> +#define MT6360_PMU_QC_MASK (0xF6)
> +#define MT6360_PMU_FOD_MASK (0xF7)
> +#define MT6360_PMU_BASE_MASK (0xF8)
> +#define MT6360_PMU_FLED_MASK1 (0xF9)
> +#define MT6360_PMU_FLED_MASK2 (0xFA)
> +#define MT6360_PMU_FAULTB_MASK (0xFB)
> +#define MT6360_PMU_BUCK1_MASK (0xFC)
> +#define MT6360_PMU_BUCK2_MASK (0xFD)
> +#define MT6360_PMU_LDO_MASK1 (0xFE)
> +#define MT6360_PMU_LDO_MASK2 (0xFF)
> +#define MT6360_PMU_MAXREG (MT6360_PMU_LDO_MASK2)
> +
> +/* MT6360_PMU_IRQ_SET */
> +#define MT6360_PMU_IRQ_REGNUM (MT6360_PMU_LDO_IRQ2 - MT6360_PMU_CHG_IRQ1 + 1)
> +#define MT6360_IRQ_RETRIG BIT(2)
> +
> +#define CHIP_VEN_MASK (0xF0)
> +#define CHIP_VEN_MT6360 (0x50)
> +#define CHIP_REV_MASK (0x0F)
> +
> +#endif /* __MT6360_H__ */
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-03-04 14:56 ` Re: Matthias Brugger
@ 2020-03-04 15:15 ` Lee Jones
-1 siblings, 0 replies; 1651+ messages in thread
From: Lee Jones @ 2020-03-04 15:15 UTC (permalink / raw)
To: Matthias Brugger
Cc: gene_chen, linux-kernel, cy_huang, linux-mediatek, Gene Chen,
Wilma.Wu, linux-arm-kernel, shufan_lee
On Wed, 04 Mar 2020, Matthias Brugger wrote:
> Please resend with an appropriate commit message.
Please refrain from top-posting and don't forget to snip.
> On 03/03/2020 16:27, Gene Chen wrote:
> > Add mfd driver for mt6360 pmic chip include
Looks like your formatting is off.
How was this patch sent?
Best practice is to use `git send-email`.
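For readers new to the workflow, a minimal illustrative sketch (repository contents and commit subject are hypothetical, not from this thread) of producing a patch whose mail formatting stays intact, which `git send-email` can then deliver unmodified:

```shell
# Illustrative only: build a throwaway repo and emit a patch file whose
# headers (From:, Subject:, Signed-off-by: trailer) survive verbatim,
# unlike patches pasted into a mail client that rewraps lines.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Example Author"
git config user.email author@example.com
echo demo > demo.txt
git add demo.txt
git commit -qm "mfd: demo: add example file" \
           -m "Body of the commit message." --signoff
# Emit the most recent commit as a mail-formatted patch.
git format-patch -1 --stdout > 0001-demo.patch
# Count matching Subject: lines; prints 1 for a well-formed patch.
grep -c "^Subject: \[PATCH\] mfd: demo: add example file" 0001-demo.patch
# git send-email 0001-demo.patch would then deliver it verbatim.
```

Running the patch file through `scripts/checkpatch.pl` before sending catches most of the formatting problems reviewers otherwise have to point out by hand.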
> > Battery Charger/USB_PD/Flash LED/RGB LED/LDO/Buck
> >
> > Signed-off-by: Gene Chen <gene_chen@richtek.com
> > ---
> > drivers/mfd/Kconfig | 12 ++
> > drivers/mfd/Makefile | 1 +
> > drivers/mfd/mt6360-core.c | 425 +++++++++++++++++++++++++++++++++++++++++++++
> > include/linux/mfd/mt6360.h | 240 +++++++++++++++++++++++++
> > 4 files changed, 678 insertions(+)
> > create mode 100644 drivers/mfd/mt6360-core.c
> > create mode 100644 include/linux/mfd/mt6360.h
--
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-03-04 15:15 ` Re: Lee Jones
@ 2020-03-04 18:00 ` Matthias Brugger
-1 siblings, 0 replies; 1651+ messages in thread
From: Matthias Brugger @ 2020-03-04 18:00 UTC (permalink / raw)
To: Lee Jones
Cc: gene_chen, linux-kernel, cy_huang, linux-mediatek, Gene Chen,
Wilma.Wu, linux-arm-kernel, shufan_lee
On 04/03/2020 16:15, Lee Jones wrote:
> On Wed, 04 Mar 2020, Matthias Brugger wrote:
>
>> Please resend with an appropriate commit message.
>
> Please refrain from top-posting and don't forget to snip.
>
It's difficult to write something below a missing subject line without
top-posting. ;)
Sorry for forgetting to snip.
Regards,
Matthias
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-03-16 23:07 Sankalp Bhardwaj
2020-03-17 9:13 ` Valdis Klētnieks
0 siblings, 1 reply; 1651+ messages in thread
From: Sankalp Bhardwaj @ 2020-03-16 23:07 UTC (permalink / raw)
To: kernelnewbies
Where to get started?? I am interested in understanding how the
kernel works but have no prior knowledge... Please help!!
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-03-16 23:07 Sankalp Bhardwaj
@ 2020-03-17 9:13 ` Valdis Klētnieks
2020-03-17 10:10 ` Re: suvrojit
0 siblings, 1 reply; 1651+ messages in thread
From: Valdis Klētnieks @ 2020-03-17 9:13 UTC (permalink / raw)
To: Sankalp Bhardwaj; +Cc: kernelnewbies
On Tue, 17 Mar 2020 04:37:58 +0530, Sankalp Bhardwaj said:
> Where to get started?? I am interested in understanding how the
> kernel works but have no prior knowledge... Please help!!
A good place to start is to realize that the answers often depend on what the
question is - and there's usually a difference between the question that is
asked, and the question that the person needs the answer for. You probably
want to read this:
https://lists.kernelnewbies.org/pipermail/kernelnewbies/2017-April/017765.html
Something that you'll need is a good understanding of operating system
concepts. Almost all modern computer systems have some idea of basic concepts
such as processes, files, a directory structure, security and permissions,
scheduling, locking, and so on. And for most of these, there is more than one
way to accomplish the goal.
So two books that are useful to read for a compare-and-contrast view are Bach's
book on the System V kernel, and McKusick's book on the BSD kernel - both go
into details of *why* some things are done the way they are. It's really helpful to
see stuff like "We need to lock this inode while we do X, because otherwise
another thread could concurrently do Y, and then Bad Thing Z will happen".
Of course, a Linux filesystem that does things differently won't have the same
exact issues, but understanding the *sort* of things that break when you screw
up your locking is quite the useful info, especially if most of your coding has
been in userspace where single-threaded is common and libraries did their own
locking when needed.
I admit that I also learned a bunch from Tanenbaum's "Modern Operating
Systems", but that was a long long time ago in a galaxy far far away, and I
have no idea what the cool kids are reading instead these days...
[-- Attachment #1.2: Type: application/pgp-signature, Size: 832 bytes --]
[-- Attachment #2: Type: text/plain, Size: 170 bytes --]
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-03-17 9:13 ` Valdis Klētnieks
@ 2020-03-17 10:10 ` suvrojit
0 siblings, 0 replies; 1651+ messages in thread
From: suvrojit @ 2020-03-17 10:10 UTC (permalink / raw)
To: Valdis Klētnieks; +Cc: kernelnewbies, Sankalp Bhardwaj
[-- Attachment #1.1: Type: text/plain, Size: 2247 bytes --]
ULK (Understanding the Linux Kernel) by Bovet and Cesati is the book you should start reading, Sankalp
On Tue, Mar 17, 2020, 2:44 PM Valdis Klētnieks <valdis.kletnieks@vt.edu>
wrote:
> On Tue, 17 Mar 2020 04:37:58 +0530, Sankalp Bhardwaj said:
>
> > Where to get started?? I am interested in understanding how the
> > kernel works but have no prior knowledge... Please help!!
>
> A good place to start is to realize that the answers often depend on what
> the
> question is - and there's usually a difference between the question that is
> asked, and the question that the person needs the answer for. You probably
> want to read this:
>
>
> https://lists.kernelnewbies.org/pipermail/kernelnewbies/2017-April/017765.html
>
> Something that you'll need is a good understanding of operating system
> concepts. Almost all modern computer systems have some idea of basic
> concepts
> such as processes, files, a directory structure, security and permissions,
> scheduling, locking, and so on. And for most of these, there is more than
> one
> way to accomplish the goal.
>
> So two books that are useful to read for a compare-and-contrast view are
> Bach's
> book on the System V kernel, and McKusick's book on the BSD kernel - both go
> into details of *why* some things are done the way they are. It's really helpful
> to
> see stuff like "We need to lock this inode while we do X, because otherwise
> another thread could concurrently do Y, and then Bad Thing Z will happen".
>
> Of course, a Linux filesystem that does things differently won't have the
> same
> exact issues, but understanding the *sort* of things that break when you
> screw
> up your locking is quite the useful info, especially if most of your
> coding has
> been in userspace where single-threaded is common and libraries did their
> own
> locking when needed.
>
> I admit that I also learned a bunch from Tanenbaum's "Modern Operating
> Systems", but that was a long long time ago in a galaxy far far away, and I
> have no idea what the cool kids are reading instead these days...
>
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies@kernelnewbies.org
> https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>
[-- Attachment #1.2: Type: text/html, Size: 2959 bytes --]
[-- Attachment #2: Type: text/plain, Size: 170 bytes --]
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CALeDE9OeBx6v6nGVjeydgF1vpfX1Bus319h3M1=49PMETdaCtw@mail.gmail.com>]
* (unknown)
@ 2020-03-27 8:36 chenanqing
2020-03-27 8:59 ` Ilya Dryomov
0 siblings, 1 reply; 1651+ messages in thread
From: chenanqing @ 2020-03-27 8:36 UTC (permalink / raw)
To: chenanqing, linux-kernel, netdev, ceph-devel, kuba, sage, jlayton,
idryomov
From: Chen Anqing <chenanqing@oppo.com>
To: Ilya Dryomov <idryomov@gmail.com>
Cc: Jeff Layton <jlayton@kernel.org>,
Sage Weil <sage@redhat.com>,
Jakub Kicinski <kuba@kernel.org>,
ceph-devel@vger.kernel.org,
netdev@vger.kernel.org,
linux-kernel@vger.kernel.org,
chenanqing@oppo.com
Subject: [PATCH] libceph: we should take compound page into account also
Date: Fri, 27 Mar 2020 04:36:30 -0400
Message-Id: <20200327083630.36296-1-chenanqing@oppo.com>
X-Mailer: git-send-email 2.18.2
This patch addresses a real crash in which a slab object came from a
compound page, so we need to take compound pages into account as well.
Fixes: 7e241f647dc7 ("libceph: fall back to sendmsg for slab pages")
Signed-off-by: Chen Anqing <chenanqing@oppo.com>
---
net/ceph/messenger.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index f8ca5edc5f2c..e08c1c334cd9 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -582,7 +582,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
* coalescing neighboring slab objects into a single frag which
* triggers one of hardened usercopy checks.
*/
- if (page_count(page) >= 1 && !PageSlab(page))
+ if (page_count(page) >= 1 && !PageSlab(compound_head(page)))
sendpage = sock->ops->sendpage;
else
sendpage = sock_no_sendpage;
--
2.18.2
________________________________
OPPO
This e-mail and its attachments contain confidential information from OPPO, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2020-03-27 8:36 (unknown) chenanqing
@ 2020-03-27 8:59 ` Ilya Dryomov
0 siblings, 0 replies; 1651+ messages in thread
From: Ilya Dryomov @ 2020-03-27 8:59 UTC (permalink / raw)
To: chenanqing; +Cc: LKML, netdev, Ceph Development, kuba, Sage Weil, Jeff Layton
On Fri, Mar 27, 2020 at 9:36 AM <chenanqing@oppo.com> wrote:
>
> From: Chen Anqing <chenanqing@oppo.com>
> To: Ilya Dryomov <idryomov@gmail.com>
> Cc: Jeff Layton <jlayton@kernel.org>,
> Sage Weil <sage@redhat.com>,
> Jakub Kicinski <kuba@kernel.org>,
> ceph-devel@vger.kernel.org,
> netdev@vger.kernel.org,
> linux-kernel@vger.kernel.org,
> chenanqing@oppo.com
> Subject: [PATCH] libceph: we should take compound page into account also
> Date: Fri, 27 Mar 2020 04:36:30 -0400
> Message-Id: <20200327083630.36296-1-chenanqing@oppo.com>
> X-Mailer: git-send-email 2.18.2
>
> This patch addresses a real crash in which a slab object came from a
> compound page, so we need to take compound pages into account as well.
> Fixes: 7e241f647dc7 ("libceph: fall back to sendmsg for slab pages")
>
> Signed-off-by: Chen Anqing <chenanqing@oppo.com>
> ---
> net/ceph/messenger.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index f8ca5edc5f2c..e08c1c334cd9 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -582,7 +582,7 @@ static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
> * coalescing neighboring slab objects into a single frag which
> * triggers one of hardened usercopy checks.
> */
> - if (page_count(page) >= 1 && !PageSlab(page))
> + if (page_count(page) >= 1 && !PageSlab(compound_head(page)))
> sendpage = sock->ops->sendpage;
> else
> sendpage = sock_no_sendpage;
Hi Chen,
AFAICT compound pages should already be taken into account, because
PageSlab is defined as:
__PAGEFLAG(Slab, slab, PF_NO_TAIL)
#define __PAGEFLAG(uname, lname, policy) \
TESTPAGEFLAG(uname, lname, policy) \
__SETPAGEFLAG(uname, lname, policy) \
__CLEARPAGEFLAG(uname, lname, policy)
#define TESTPAGEFLAG(uname, lname, policy) \
static __always_inline int Page##uname(struct page *page) \
{ return test_bit(PG_##lname, &policy(page, 0)->flags); }
and PF_NO_TAIL policy is defined as:
#define PF_NO_TAIL(page, enforce) ({ \
VM_BUG_ON_PGFLAGS(enforce && PageTail(page), page); \
PF_POISONED_CHECK(compound_head(page)); })
So compound_head() is called behind the scenes.
Could you please explain what crash did you observe in more detail?
Perhaps you backported this patch to an older kernel?
Thanks,
Ilya
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown)
@ 2020-03-27 9:20 chenanqing
[not found] ` <5e7dc543.vYG3wru8B/me1sOV%chenanqing-Oq79sGaMObY@public.gmane.org>
0 siblings, 1 reply; 1651+ messages in thread
From: chenanqing @ 2020-03-27 9:20 UTC (permalink / raw)
To: chenanqing, linux-kernel, linux-scsi, open-iscsi, ceph-devel,
martin.petersen, jejb, cleech, lduncan
From: Chen Anqing <chenanqing@oppo.com>
To: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>,
"James E . J . Bottomley" <jejb@linux.ibm.com>,
"Martin K . Petersen" <martin.petersen@oracle.com>,
ceph-devel@vger.kernel.org,
open-iscsi@googlegroups.com,
linux-scsi@vger.kernel.org,
linux-kernel@vger.kernel.org,
chenanqing@oppo.com
Subject: [PATCH] scsi: libiscsi: we should take compound page into account also
Date: Fri, 27 Mar 2020 05:20:01 -0400
Message-Id: <20200327092001.56879-1-chenanqing@oppo.com>
X-Mailer: git-send-email 2.18.2
This patch addresses a real crash in which a slab object came from a
compound page, so we need to take compound pages into account as well.
Fixes: 08b11eaccfcf ("scsi: libiscsi: fall back to sendmsg for slab pages")
Signed-off-by: Chen Anqing <chenanqing@oppo.com>
---
drivers/scsi/libiscsi_tcp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
index 6ef93c7af954..98304e5e1f6f 100644
--- a/drivers/scsi/libiscsi_tcp.c
+++ b/drivers/scsi/libiscsi_tcp.c
@@ -128,7 +128,8 @@ static void iscsi_tcp_segment_map(struct iscsi_segment *segment, int recv)
* coalescing neighboring slab objects into a single frag which
* triggers one of hardened usercopy checks.
*/
- if (!recv && page_count(sg_page(sg)) >= 1 && !PageSlab(sg_page(sg)))
+ if (!recv && page_count(sg_page(sg)) >= 1 &&
+ !PageSlab(compound_head(sg_page(sg))))
return;
if (recv) {
--
2.18.2
^ permalink raw reply related [flat|nested] 1651+ messages in thread
[parent not found: <CAPXXXSDVGeEK_NCSkDMwTpuvVxYkWGdQk=L=bz+RN4XLiGZmcg@mail.gmail.com>]
* Re:
@ 2020-04-18 12:26 Levi Brown
0 siblings, 0 replies; 1651+ messages in thread
From: Levi Brown @ 2020-04-18 12:26 UTC (permalink / raw)
To: linux-sh
--
I want to talk with you. Did you receive my previous e-mail?
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-05-06 5:52 Jiaxun Yang
2020-05-06 17:17 ` Nick Desaulniers
0 siblings, 1 reply; 1651+ messages in thread
From: Jiaxun Yang @ 2020-05-06 5:52 UTC (permalink / raw)
To: linux-mips
Cc: Jiaxun Yang, clang-built-linux, Maciej W . Rozycki, Fangrui Song,
Kees Cook, Nathan Chancellor, Thomas Bogendoerfer, Paul Burton,
Masahiro Yamada, Jouni Hogander, Kevin Darbyshire-Bryant,
Borislav Petkov, Heiko Carstens, linux-kernel
Subject: [PATCH v6] MIPS: Truncate link address into 32bit for 32bit kernel
In-Reply-To: <20200413062651.3992652-1-jiaxun.yang@flygoat.com>
LLD fails to link vmlinux when a 64-bit load address is used for a 32-bit
ELF, while BFD silently truncates the 64-bit address to 32 bits.
To fix the LLD build, we should truncate the load address provided by the
platform to 32 bits for 32-bit kernels.
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/786
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25784
Reviewed-by: Fangrui Song <maskray@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: Maciej W. Rozycki <macro@linux-mips.org>
---
V2: Take MaskRay's shell magic.
V3: After spending an hour dealing with a special-character issue in
Makefile, I gave up on shell hacks and wrote a util in C instead.
Thanks Maciej for pointing out the Makefile variable problem.
v4: Finally we managed to find a Makefile method to do it properly
thanks to Kees. As it's too far from the initial version, I removed the
Review & Test tags from Nick and Fangrui and Cc'd them instead.
v5: Handle vmlinuz as well.
v6: Rename to LINKER_LOAD_ADDRESS
---
arch/mips/Makefile | 13 ++++++++++++-
arch/mips/boot/compressed/Makefile | 2 +-
arch/mips/kernel/vmlinux.lds.S | 2 +-
3 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index e1c44aed8156..68c0f22fefc0 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -288,12 +288,23 @@ ifdef CONFIG_64BIT
endif
endif
+# When linking a 32-bit executable the LLVM linker cannot cope with a
+# 32-bit load address that has been sign-extended to 64 bits. Simply
+# remove the upper 32 bits then, as it is safe to do so with other
+# linkers.
+ifdef CONFIG_64BIT
+ load-ld = $(load-y)
+else
+ load-ld = $(subst 0xffffffff,0x,$(load-y))
+endif
+
KBUILD_AFLAGS += $(cflags-y)
KBUILD_CFLAGS += $(cflags-y)
-KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y)
+KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y) -DLINKER_LOAD_ADDRESS=$(load-ld)
KBUILD_CPPFLAGS += -DDATAOFFSET=$(if $(dataoffset-y),$(dataoffset-y),0)
bootvars-y = VMLINUX_LOAD_ADDRESS=$(load-y) \
+ LINKER_LOAD_ADDRESS=$(load-ld) \
VMLINUX_ENTRY_ADDRESS=$(entry-y) \
PLATFORM="$(platform-y)" \
ITS_INPUTS="$(its-y)"
diff --git a/arch/mips/boot/compressed/Makefile b/arch/mips/boot/compressed/Makefile
index 0df0ee8a298d..3d391256ab7e 100644
--- a/arch/mips/boot/compressed/Makefile
+++ b/arch/mips/boot/compressed/Makefile
@@ -90,7 +90,7 @@ ifneq ($(zload-y),)
VMLINUZ_LOAD_ADDRESS := $(zload-y)
else
VMLINUZ_LOAD_ADDRESS = $(shell $(obj)/calc_vmlinuz_load_addr \
- $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
+ $(obj)/vmlinux.bin $(LINKER_LOAD_ADDRESS))
endif
UIMAGE_LOADADDR = $(VMLINUZ_LOAD_ADDRESS)
diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
index a5f00ec73ea6..5226cd8e4bee 100644
--- a/arch/mips/kernel/vmlinux.lds.S
+++ b/arch/mips/kernel/vmlinux.lds.S
@@ -55,7 +55,7 @@ SECTIONS
/* . = 0xa800000000300000; */
. = 0xffffffff80300000;
#endif
- . = VMLINUX_LOAD_ADDRESS;
+ . = LINKER_LOAD_ADDRESS;
/* read-only */
_text = .; /* Text and read-only data */
.text : {
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2020-05-06 5:52 Jiaxun Yang
@ 2020-05-06 17:17 ` Nick Desaulniers
0 siblings, 0 replies; 1651+ messages in thread
From: Nick Desaulniers @ 2020-05-06 17:17 UTC (permalink / raw)
To: Jiaxun Yang
Cc: linux-mips, clang-built-linux, Maciej W . Rozycki, Fangrui Song,
Kees Cook, Nathan Chancellor, Thomas Bogendoerfer, Paul Burton,
Masahiro Yamada, Jouni Hogander, Kevin Darbyshire-Bryant,
Borislav Petkov, Heiko Carstens, LKML
On Tue, May 5, 2020 at 10:52 PM Jiaxun Yang <jiaxun.yang@flygoat.com> wrote:
>
> Subject: [PATCH v6] MIPS: Truncate link address into 32bit for 32bit kernel
> In-Reply-To: <20200413062651.3992652-1-jiaxun.yang@flygoat.com>
>
> LLD fails to link vmlinux when a 64-bit load address is used for a 32-bit
> ELF, while BFD silently truncates the 64-bit address to 32 bits.
> To fix the LLD build, we should truncate the load address provided by the
> platform to 32 bits for 32-bit kernels.
>
> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
> Link: https://github.com/ClangBuiltLinux/linux/issues/786
> Link: https://sourceware.org/bugzilla/show_bug.cgi?id=25784
> Reviewed-by: Fangrui Song <maskray@google.com>
> Reviewed-by: Kees Cook <keescook@chromium.org>
> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
> Cc: Maciej W. Rozycki <macro@linux-mips.org>
Cool, this revision looks a bit simpler. Thanks for chasing this.
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
> V2: Take MaskRay's shell magic.
>
> V3: After spending an hour dealing with a special-character issue in
> Makefile, I gave up on shell hacks and wrote a util in C instead.
> Thanks Maciej for pointing out the Makefile variable problem.
>
> v4: Finally we managed to find a Makefile method to do it properly
> thanks to Kees. As it's too far from the initial version, I removed the
> Review & Test tags from Nick and Fangrui and Cc'd them instead.
>
> v5: Handle vmlinuz as well.
>
> v6: Rename to LINKER_LOAD_ADDRESS
> ---
> arch/mips/Makefile | 13 ++++++++++++-
> arch/mips/boot/compressed/Makefile | 2 +-
> arch/mips/kernel/vmlinux.lds.S | 2 +-
> 3 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/Makefile b/arch/mips/Makefile
> index e1c44aed8156..68c0f22fefc0 100644
> --- a/arch/mips/Makefile
> +++ b/arch/mips/Makefile
> @@ -288,12 +288,23 @@ ifdef CONFIG_64BIT
> endif
> endif
>
> +# When linking a 32-bit executable the LLVM linker cannot cope with a
> +# 32-bit load address that has been sign-extended to 64 bits. Simply
> +# remove the upper 32 bits then, as it is safe to do so with other
> +# linkers.
> +ifdef CONFIG_64BIT
> + load-ld = $(load-y)
> +else
> + load-ld = $(subst 0xffffffff,0x,$(load-y))
> +endif
> +
> KBUILD_AFLAGS += $(cflags-y)
> KBUILD_CFLAGS += $(cflags-y)
> -KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y)
> +KBUILD_CPPFLAGS += -DVMLINUX_LOAD_ADDRESS=$(load-y) -DLINKER_LOAD_ADDRESS=$(load-ld)
> KBUILD_CPPFLAGS += -DDATAOFFSET=$(if $(dataoffset-y),$(dataoffset-y),0)
>
> bootvars-y = VMLINUX_LOAD_ADDRESS=$(load-y) \
> + LINKER_LOAD_ADDRESS=$(load-ld) \
> VMLINUX_ENTRY_ADDRESS=$(entry-y) \
> PLATFORM="$(platform-y)" \
> ITS_INPUTS="$(its-y)"
> diff --git a/arch/mips/boot/compressed/Makefile b/arch/mips/boot/compressed/Makefile
> index 0df0ee8a298d..3d391256ab7e 100644
> --- a/arch/mips/boot/compressed/Makefile
> +++ b/arch/mips/boot/compressed/Makefile
> @@ -90,7 +90,7 @@ ifneq ($(zload-y),)
> VMLINUZ_LOAD_ADDRESS := $(zload-y)
> else
> VMLINUZ_LOAD_ADDRESS = $(shell $(obj)/calc_vmlinuz_load_addr \
> - $(obj)/vmlinux.bin $(VMLINUX_LOAD_ADDRESS))
> + $(obj)/vmlinux.bin $(LINKER_LOAD_ADDRESS))
> endif
> UIMAGE_LOADADDR = $(VMLINUZ_LOAD_ADDRESS)
>
> diff --git a/arch/mips/kernel/vmlinux.lds.S b/arch/mips/kernel/vmlinux.lds.S
> index a5f00ec73ea6..5226cd8e4bee 100644
> --- a/arch/mips/kernel/vmlinux.lds.S
> +++ b/arch/mips/kernel/vmlinux.lds.S
> @@ -55,7 +55,7 @@ SECTIONS
> /* . = 0xa800000000300000; */
> . = 0xffffffff80300000;
> #endif
> - . = VMLINUX_LOAD_ADDRESS;
> + . = LINKER_LOAD_ADDRESS;
> /* read-only */
> _text = .; /* Text and read-only data */
> .text : {
>
> --
--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-05-14 8:17 Maksim Iushchenko
2020-05-14 10:29 ` fboehm
0 siblings, 1 reply; 1651+ messages in thread
From: Maksim Iushchenko @ 2020-05-14 8:17 UTC (permalink / raw)
To: b.a.t.m.a.n
Hello,
I am creating a Wi-Fi ad-hoc network based on batman-adv. I read that
batman-adv is able to work with any types of interfaces, but I still
have a question related to ad-hoc networking. Will Wi-Fi ad-hoc
network (based on batman-adv) work if Wi-Fi chip does not support
802.11s standard?
Unfortunately, there is no mention of ad-hoc mode support in
documentation of many Wi-Fi chips.
How to check if a Wi-Fi chip is suited to be used to create a Wi-Fi
ad-hoc network based on batman-adv?
For example, is ATWILC3000-MR110CA an appropriate chip to build a
Wi-Fi ad-hoc network based on batman-adv? Or maybe you could suggest
any another Wi-Fi chips?
Thanks in advance
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-05-14 8:17 Maksim Iushchenko
@ 2020-05-14 10:29 ` fboehm
0 siblings, 0 replies; 1651+ messages in thread
From: fboehm @ 2020-05-14 10:29 UTC (permalink / raw)
To: b.a.t.m.a.n
Maksim, for clarification:
ATWILC3000 is a Wifi-Module for low-end embedded systems. This module
consists of a Wifi-Chip + a small processor. The processor does stuff
like authentication/registration with the Wifi network, WPA-Encryption
and this kind of things. A typical use-case would be to add a Wifi
interface to some sort of IoT device or some sort of computer peripheral
device (like a Wifi-enabled printer or a smart-speaker).
Looking at the driver code it might not be impossible but it's just very
unlikely that you will be happy to use it in combination with Batman.
You would first of all need to connect the module to a much more
powerful processor that runs Linux and Batman. But assuming you need
such a powerful processor for your application anyway, then you have a good
chance that you can use a real Wifi-Adapter (with a USB or PCIe interface)
instead of such a Wifi-Module.
Regards,
Franz
Am 14.05.20 um 10:17 schrieb Maksim Iushchenko:
> Hello,
> I am creating a Wi-Fi ad-hoc network based on batman-adv. I read that
> batman-adv is able to work with any types of interfaces, but I still
> have a question related to ad-hoc networking. Will Wi-Fi ad-hoc
> network (based on batman-adv) work if Wi-Fi chip does not support
> 802.11s standard?
> Unfortunately, there is no mention of ad-hoc mode support in
> documentation of many Wi-Fi chips.
>
> How to check if a Wi-Fi chip is suited to be used to create a Wi-Fi
> ad-hoc network based on batman-adv?
>
> For example, is ATWILC3000-MR110CA an appropriate chip to build a
> Wi-Fi ad-hoc network based on batman-adv? Or maybe you could suggest
> any another Wi-Fi chips?
>
> Thanks in advance
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2020-05-21 0:22 STOREBRAND
0 siblings, 0 replies; 1651+ messages in thread
From: STOREBRAND @ 2020-05-21 0:22 UTC (permalink / raw)
To: linux-m68k
Hello,
I am Harald Hauge, an Investment Manager from Norway. I wish to solicit your interest in an investment project that is currently ongoing in my company (Storebrand); it is a short-term investment with good returns. Simply reply for me to confirm the validity of your email, and I shall give you comprehensive details about the project.
Best Regards,
Harald Hauge
Business Consultant
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re;
@ 2020-06-24 13:54 test02
0 siblings, 0 replies; 1651+ messages in thread
From: test02 @ 2020-06-24 13:54 UTC (permalink / raw)
To: Recipients
Congratulations!!!
As part of my humanitarian individual support during this hard times of fighting the Corona Virus (Convid-19); your email account was selected for a Donation of $1,000,000.00 USD for charity and community medical support in your area.
Please contact us for more information on charles_jackson001@yahoo.com.com
Send Your Response To: charles_jackson001@yahoo.com
Best Regards,
Charles .W. Jackson Jr
--
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (unknown)
@ 2020-06-30 17:56 Vasiliy Kupriakov
2020-07-10 20:36 ` Andy Shevchenko
0 siblings, 1 reply; 1651+ messages in thread
From: Vasiliy Kupriakov @ 2020-06-30 17:56 UTC (permalink / raw)
To: Corentin Chary, Darren Hart, Andy Shevchenko
Cc: Vasiliy Kupriakov,
open list:ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS,
open list:ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS,
open list
Subject: [PATCH] platform/x86: asus-wmi: allow BAT1 battery name
The battery on my laptop ASUS TUF Gaming FX706II is named BAT1.
This patch allows battery extension to load.
Signed-off-by: Vasiliy Kupriakov <rublag-ns@yandex.ru>
---
drivers/platform/x86/asus-wmi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
index 877aade19497..8f4acdc06b13 100644
--- a/drivers/platform/x86/asus-wmi.c
+++ b/drivers/platform/x86/asus-wmi.c
@@ -441,6 +441,7 @@ static int asus_wmi_battery_add(struct power_supply *battery)
* battery is named BATT.
*/
if (strcmp(battery->desc->name, "BAT0") != 0 &&
+ strcmp(battery->desc->name, "BAT1") != 0 &&
strcmp(battery->desc->name, "BATT") != 0)
return -ENODEV;
--
2.27.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2020-06-30 17:56 (unknown) Vasiliy Kupriakov
@ 2020-07-10 20:36 ` Andy Shevchenko
0 siblings, 0 replies; 1651+ messages in thread
From: Andy Shevchenko @ 2020-07-10 20:36 UTC (permalink / raw)
To: Vasiliy Kupriakov
Cc: Corentin Chary, Darren Hart, Andy Shevchenko,
open list:ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS,
open list:ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS,
open list
On Tue, Jun 30, 2020 at 8:57 PM Vasiliy Kupriakov <rublag-ns@yandex.ru> wrote:
>
> Subject: [PATCH] platform/x86: asus-wmi: allow BAT1 battery name
>
> The battery on my laptop ASUS TUF Gaming FX706II is named BAT1.
> This patch allows battery extension to load.
>
Pushed to my review and testing queue, thanks!
> Signed-off-by: Vasiliy Kupriakov <rublag-ns@yandex.ru>
> ---
> drivers/platform/x86/asus-wmi.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
> index 877aade19497..8f4acdc06b13 100644
> --- a/drivers/platform/x86/asus-wmi.c
> +++ b/drivers/platform/x86/asus-wmi.c
> @@ -441,6 +441,7 @@ static int asus_wmi_battery_add(struct power_supply *battery)
> * battery is named BATT.
> */
> if (strcmp(battery->desc->name, "BAT0") != 0 &&
> + strcmp(battery->desc->name, "BAT1") != 0 &&
> strcmp(battery->desc->name, "BATT") != 0)
> return -ENODEV;
>
> --
> 2.27.0
>
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-07-16 21:22 Mauro Rossi
2020-07-20 9:00 ` Christian König
0 siblings, 1 reply; 1651+ messages in thread
From: Mauro Rossi @ 2020-07-16 21:22 UTC (permalink / raw)
To: amd-gfx; +Cc: alexander.deucher, Mauro Rossi, harry.wentland
The series adds SI support to AMD DC
Changelog:
[RFC]
Preliminary proof of concept, with DCE8 headers still used in dce60_resources.c
[PATCH v2]
Rebase on amd-staging-drm-next dated 17-Oct-2018
[PATCH v3]
Add support for DCE6 specific headers,
ad hoc DCE6 macros, functions and fixes,
rebase on current amd-staging-drm-next
Commits [01/27]..[08/27] SI support added in various DC components
[PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
[PATCH v3 02/27] drm/amd/display: add asics info for SI parts
[PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
[PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
[PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
[PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
[PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
[PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
Commits [09/27]..[24/27] DCE6 specific code adaptions
[PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
[PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
[PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
[PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
[PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
[PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
[PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
[PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
[PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
[PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
[PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
[PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
[PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
[PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
[PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
[PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
Commits [25/27]..[27/27] SI support final enablements
[PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonaire and later
[PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
[PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-07-16 21:22 Mauro Rossi
@ 2020-07-20 9:00 ` Christian König
2020-07-20 9:59 ` Re: Mauro Rossi
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2020-07-20 9:00 UTC (permalink / raw)
To: Mauro Rossi, amd-gfx; +Cc: alexander.deucher, harry.wentland
Hi Mauro,
I'm not deep into the whole DC design, so just some general high level
comments on the cover letter:
1. Please add a subject line to the cover letter, my spam filter thinks
that this is suspicious otherwise.
2. Then you should probably note how well (or how badly) this is tested. Since
you noted proof of concept it might not even work.
3. How feature complete (HDMI audio?, Freesync?) is it?
Apart from that it looks like a rather impressive piece of work :)
Cheers,
Christian.
Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> The series adds SI support to AMD DC
>
> Changelog:
>
> [RFC]
> Preliminary proof of concept, with DCE8 headers still used in dce60_resources.c
>
> [PATCH v2]
> Rebase on amd-staging-drm-next dated 17-Oct-2018
>
> [PATCH v3]
> Add support for DCE6 specific headers,
> ad hoc DCE6 macros, functions and fixes,
> rebase on current amd-staging-drm-next
>
>
> Commits [01/27]..[08/27] SI support added in various DC components
>
> [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
> [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
> [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
> [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
> [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
>
> Commits [09/27]..[24/27] DCE6 specific code adaptions
>
> [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
> [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
> [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
> [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
> [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
> [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
> [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
> [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
> [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
> [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
> [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
> [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
> [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
> [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
> [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
>
>
> Commits [25/27]..[27/27] SI support final enablements
>
> [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonaire and later
> [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
>
>
> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
>
* Re:
2020-07-20 9:00 ` Christian König
@ 2020-07-20 9:59 ` Mauro Rossi
2020-07-22 2:51 ` Re: Alex Deucher
0 siblings, 1 reply; 1651+ messages in thread
From: Mauro Rossi @ 2020-07-20 9:59 UTC (permalink / raw)
To: christian.koenig; +Cc: Deucher, Alexander, Harry Wentland, amd-gfx
Hi Christian,
On Mon, Jul 20, 2020 at 11:00 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Hi Mauro,
>
> I'm not deep into the whole DC design, so just some general high level
> comments on the cover letter:
>
> 1. Please add a subject line to the cover letter, my spam filter thinks
> that this is suspicious otherwise.
My mistake in the editing of the cover letter with git send-email;
I may have forgotten to keep the Subject line at the top.
>
> 2. Then you should probably note how well (badly?) is that tested. Since
> you noted proof of concept it might not even work.
The Changelog is to be read as follows:
[RFC] was the initial proof of concept, and [PATCH v2] was
just a rebase onto amd-staging-drm-next.
This series [PATCH v3] has all the known changes required for DCE6 specificity,
based on a long offline thread with Alexander Deucher and past
dri-devel chats with Harry Wentland.
It was tested to the extent of my testing possibilities, with an HD7750
and an HD7950: I checked the dmesg output for the absence of
"missing registers/masks" kernel WARNINGs,
and built the kernel on Ubuntu 20.04 and with android-x86.
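For anyone who wants to repeat that check, a log scan along these lines would flag the warnings (a hedged sketch only; the exact message wording is an assumption, not the driver's literal output):

```shell
# Sketch: scan a captured kernel log for DC "missing register/mask"
# warnings. The grep pattern is an assumed approximation of the
# message text, not the driver's exact wording.
check_dc_warnings() {
    log="$1"   # path to a saved kernel log, e.g. from `dmesg > boot.log`
    if grep -Ei 'warn.*missing (register|mask)' "$log"; then
        echo "DCE6 register/mask warnings found" >&2
        return 1
    fi
    return 0
}
```

On a clean boot the function stays silent and returns 0; on a log containing such a warning it prints the matching lines and returns 1.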
The proposal I made to Alex is that AMD testing systems be used
for further regression testing,
as part of the review and validation for eligibility into amd-staging-drm-next.
>
> 3. How feature complete (HDMI audio?, Freesync?) is it?
All the changes in DC impacting DCE8 (the dc/dce80 path) were ported to
DCE6 (the dc/dce60 path) in the two years since the initial submission.
>
> Apart from that it looks like a rather impressive piece of work :)
>
> Cheers,
> Christian.
Thanks;
please consider that most of the latest DCE6-specific parts were
possible thanks to Alex's recent support in getting the correct DCE6
headers, along with his suggestions and continuous feedback.
I would suggest that Alex comments on the proposed next steps to follow.
Mauro
>
> Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> > The series adds SI support to AMD DC
> >
> > Changelog:
> >
> > [RFC]
> > Preliminar Proof Of Concept, with DCE8 headers still used in dce60_resources.c
> >
> > [PATCH v2]
> > Rebase on amd-staging-drm-next dated 17-Oct-2018
> >
> > [PATCH v3]
> > Add support for DCE6 specific headers,
> > ad hoc DCE6 macros, funtions and fixes,
> > rebase on current amd-staging-drm-next
> >
> >
> > Commits [01/27]..[08/27] SI support added in various DC components
> >
> > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
> > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
> > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
> > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
> > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
> >
> > Commits [09/27]..[24/27] DCE6 specific code adaptions
> >
> > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
> > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
> > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
> > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
> > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
> > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
> > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
> > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
> > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
> > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
> > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
> > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
> > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
> > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
> > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
> >
> >
> > Commits [25/27]..[27/27] SI support final enablements
> >
> > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonarie and later
> > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
> >
> >
> > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> >
>
* Re:
2020-07-20 9:59 ` Re: Mauro Rossi
@ 2020-07-22 2:51 ` Alex Deucher
2020-07-22 7:56 ` Re: Mauro Rossi
0 siblings, 1 reply; 1651+ messages in thread
From: Alex Deucher @ 2020-07-22 2:51 UTC (permalink / raw)
To: Mauro Rossi
Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
amd-gfx list
On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>
> Hi Christian,
>
> On Mon, Jul 20, 2020 at 11:00 AM Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
> >
> > Hi Mauro,
> >
> > I'm not deep into the whole DC design, so just some general high level
> > comments on the cover letter:
> >
> > 1. Please add a subject line to the cover letter, my spam filter thinks
> > that this is suspicious otherwise.
>
> My mistake in the editing of covert letter with git send-email,
> I may have forgot to keep the Subject at the top
>
> >
> > 2. Then you should probably note how well (badly?) is that tested. Since
> > you noted proof of concept it might not even work.
>
> The Changelog is to be read as:
>
> [RFC] was the initial Proof of concept was the RFC and [PATCH v2] was
> just a rebase onto amd-staging-drm-next
>
> this series [PATCH v3] has all the known changes required for DCE6 specificity
> and based on a long offline thread with Alexander Deutcher and past
> dri-devel chats with Harry Wentland.
>
> It was tested for my possibilities of testing with HD7750 and HD7950,
> with checks in dmesg output for not getting "missing registers/masks"
> kernel WARNING
> and with kernel build on Ubuntu 20.04 and with android-x86
>
> The proposal I made to Alex is that AMD testing systems will be used
> for further regression testing,
> as part of review and validation for eligibility to amd-staging-drm-next
>
We will certainly test it once it lands, but presumably this is
working on the SI cards you have access to?
> >
> > 3. How feature complete (HDMI audio?, Freesync?) is it?
>
> All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
> DCE6 (dc/dce60 path) in the last two years from initial submission
>
> >
> > Apart from that it looks like a rather impressive piece of work :)
> >
> > Cheers,
> > Christian.
>
> Thanks,
> please consider that most of the latest DCE6 specific parts were
> possible due to recent Alex support in getting the correct DCE6
> headers,
> his suggestions and continuous feedback.
>
> I would suggest that Alex comments on the proposed next steps to follow.
The code looks pretty good to me. I'd like to get some feedback from
the display team to see if they have any concerns, but beyond that I
think we can pull it into the tree and continue improving it there.
Do you have a link to a git tree I can pull directly that contains
these patches? Is this the right branch?
https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
Thanks!
Alex
>
> Mauro
>
> >
> > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> > > The series adds SI support to AMD DC
> > >
> > > Changelog:
> > >
> > > [RFC]
> > > Preliminar Proof Of Concept, with DCE8 headers still used in dce60_resources.c
> > >
> > > [PATCH v2]
> > > Rebase on amd-staging-drm-next dated 17-Oct-2018
> > >
> > > [PATCH v3]
> > > Add support for DCE6 specific headers,
> > > ad hoc DCE6 macros, funtions and fixes,
> > > rebase on current amd-staging-drm-next
> > >
> > >
> > > Commits [01/27]..[08/27] SI support added in various DC components
> > >
> > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
> > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
> > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
> > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
> > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
> > >
> > > Commits [09/27]..[24/27] DCE6 specific code adaptions
> > >
> > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
> > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
> > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
> > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
> > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
> > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
> > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
> > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
> > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
> > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
> > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
> > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
> > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
> > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
> > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
> > >
> > >
> > > Commits [25/27]..[27/27] SI support final enablements
> > >
> > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonarie and later
> > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
> > >
> > >
> > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> > >
> >
* Re:
2020-07-22 2:51 ` Re: Alex Deucher
@ 2020-07-22 7:56 ` Mauro Rossi
2020-07-24 18:31 ` Re: Alex Deucher
0 siblings, 1 reply; 1651+ messages in thread
From: Mauro Rossi @ 2020-07-22 7:56 UTC (permalink / raw)
To: Alex Deucher
Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
amd-gfx list
[-- Attachment #1.1: Type: text/plain, Size: 6915 bytes --]
Hello,
re-sending and copying full DL
On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com> wrote:
> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
> >
> > Hi Christian,
> >
> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
> > <ckoenig.leichtzumerken@gmail.com> wrote:
> > >
> > > Hi Mauro,
> > >
> > > I'm not deep into the whole DC design, so just some general high level
> > > comments on the cover letter:
> > >
> > > 1. Please add a subject line to the cover letter, my spam filter thinks
> > > that this is suspicious otherwise.
> >
> > My mistake in the editing of covert letter with git send-email,
> > I may have forgot to keep the Subject at the top
> >
> > >
> > > 2. Then you should probably note how well (badly?) is that tested.
> Since
> > > you noted proof of concept it might not even work.
> >
> > The Changelog is to be read as:
> >
> > [RFC] was the initial Proof of concept was the RFC and [PATCH v2] was
> > just a rebase onto amd-staging-drm-next
> >
> > this series [PATCH v3] has all the known changes required for DCE6
> specificity
> > and based on a long offline thread with Alexander Deutcher and past
> > dri-devel chats with Harry Wentland.
> >
> > It was tested for my possibilities of testing with HD7750 and HD7950,
> > with checks in dmesg output for not getting "missing registers/masks"
> > kernel WARNING
> > and with kernel build on Ubuntu 20.04 and with android-x86
> >
> > The proposal I made to Alex is that AMD testing systems will be used
> > for further regression testing,
> > as part of review and validation for eligibility to amd-staging-drm-next
> >
>
> We will certainly test it once it lands, but presumably this is
> working on the SI cards you have access to?
>
Yes, most of my testing was done with the android-x86 Android CTS (EGL,
GLES2, GLES3, VK).
I am also in contact with a person with a FirePro W5130M who is running a
piglit session.
I had bought an HD7850 to test with Pitcairn, but it arrived defective,
so I could not test with Pitcairn.
> > >
> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
> >
> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
> > DCE6 (dc/dce60 path) in the last two years from initial submission
> >
> > >
> > > Apart from that it looks like a rather impressive piece of work :)
> > >
> > > Cheers,
> > > Christian.
> >
> > Thanks,
> > please consider that most of the latest DCE6 specific parts were
> > possible due to recent Alex support in getting the correct DCE6
> > headers,
> > his suggestions and continuous feedback.
> >
> > I would suggest that Alex comments on the proposed next steps to follow.
>
> The code looks pretty good to me. I'd like to get some feedback from
> the display team to see if they have any concerns, but beyond that I
> think we can pull it into the tree and continue improving it there.
> Do you have a link to a git tree I can pull directly that contains
> these patches? Is this the right branch?
> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
>
> Thanks!
>
> Alex
>
The following branch was pushed with the series on top of
amd-staging-drm-next
https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
>
> >
> > Mauro
> >
> > >
> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> > > > The series adds SI support to AMD DC
> > > >
> > > > Changelog:
> > > >
> > > > [RFC]
> > > > Preliminar Proof Of Concept, with DCE8 headers still used in
> dce60_resources.c
> > > >
> > > > [PATCH v2]
> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
> > > >
> > > > [PATCH v3]
> > > > Add support for DCE6 specific headers,
> > > > ad hoc DCE6 macros, funtions and fixes,
> > > > rebase on current amd-staging-drm-next
> > > >
> > > >
> > > > Commits [01/27]..[08/27] SI support added in various DC components
> > > >
> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support
> (v9b)
> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
> > > >
> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
> > > >
> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI
> parts (v2)
> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific
> macros,functions
> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific
> macros,functions
> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific
> macros,functions
> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6
> specific macros,functions
> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific
> macros,functions
> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific
> macros,functions
> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific
> macros,functions
> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling
> Horizontal Filter Init
> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> macros,functions
> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> specific .cursor_lock
> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6
> specific functions
> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
> > > >
> > > >
> > > > Commits [25/27]..[27/27] SI support final enablements
> > > >
> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for
> Bonarie and later
> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig
> (v2)
> > > >
> > > >
> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> > > >
> > >
>
* Re:
2020-07-22 7:56 ` Re: Mauro Rossi
@ 2020-07-24 18:31 ` Alex Deucher
2020-07-26 15:31 ` Re: Mauro Rossi
0 siblings, 1 reply; 1651+ messages in thread
From: Alex Deucher @ 2020-07-24 18:31 UTC (permalink / raw)
To: Mauro Rossi
Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
amd-gfx list
[-- Attachment #1: Type: text/plain, Size: 7470 bytes --]
On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>
> Hello,
> re-sending and copying full DL
>
> On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>>
>> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >
>> > Hi Christian,
>> >
>> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
>> > <ckoenig.leichtzumerken@gmail.com> wrote:
>> > >
>> > > Hi Mauro,
>> > >
>> > > I'm not deep into the whole DC design, so just some general high level
>> > > comments on the cover letter:
>> > >
>> > > 1. Please add a subject line to the cover letter, my spam filter thinks
>> > > that this is suspicious otherwise.
>> >
>> > My mistake in the editing of covert letter with git send-email,
>> > I may have forgot to keep the Subject at the top
>> >
>> > >
>> > > 2. Then you should probably note how well (badly?) is that tested. Since
>> > > you noted proof of concept it might not even work.
>> >
>> > The Changelog is to be read as:
>> >
>> > [RFC] was the initial Proof of concept was the RFC and [PATCH v2] was
>> > just a rebase onto amd-staging-drm-next
>> >
>> > this series [PATCH v3] has all the known changes required for DCE6 specificity
>> > and based on a long offline thread with Alexander Deutcher and past
>> > dri-devel chats with Harry Wentland.
>> >
>> > It was tested for my possibilities of testing with HD7750 and HD7950,
>> > with checks in dmesg output for not getting "missing registers/masks"
>> > kernel WARNING
>> > and with kernel build on Ubuntu 20.04 and with android-x86
>> >
>> > The proposal I made to Alex is that AMD testing systems will be used
>> > for further regression testing,
>> > as part of review and validation for eligibility to amd-staging-drm-next
>> >
>>
>> We will certainly test it once it lands, but presumably this is
>> working on the SI cards you have access to?
>
>
> Yes, most of my testing was done with android-x86 Android CTS (EGL, GLES2, GLES3, VK)
>
> I am also in contact with a person with Firepro W5130M who is running a piglit session
>
> I had bought an HD7850 to test with Pitcairn, but it arrived as defective so I could not test with Pitcair
>
>
>>
>> > >
>> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
>> >
>> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
>> > DCE6 (dc/dce60 path) in the last two years from initial submission
>> >
>> > >
>> > > Apart from that it looks like a rather impressive piece of work :)
>> > >
>> > > Cheers,
>> > > Christian.
>> >
>> > Thanks,
>> > please consider that most of the latest DCE6 specific parts were
>> > possible due to recent Alex support in getting the correct DCE6
>> > headers,
>> > his suggestions and continuous feedback.
>> >
>> > I would suggest that Alex comments on the proposed next steps to follow.
>>
>> The code looks pretty good to me. I'd like to get some feedback from
>> the display team to see if they have any concerns, but beyond that I
>> think we can pull it into the tree and continue improving it there.
>> Do you have a link to a git tree I can pull directly that contains
>> these patches? Is this the right branch?
>> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
>>
>> Thanks!
>>
>> Alex
>
>
> The following branch was pushed with the series on top of amd-staging-drm-next
>
> https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
I gave this a quick test on all of the SI asics and the various
monitors I had available and it looks good. A few minor patches I
noticed are attached. If they look good to you, I'll squash them into
the series when I commit it. I've pushed it to my fdo tree as well:
https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
Thanks!
Alex
>
>>
>>
>> >
>> > Mauro
>> >
>> > >
>> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
>> > > > The series adds SI support to AMD DC
>> > > >
>> > > > Changelog:
>> > > >
>> > > > [RFC]
>> > > > Preliminar Proof Of Concept, with DCE8 headers still used in dce60_resources.c
>> > > >
>> > > > [PATCH v2]
>> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
>> > > >
>> > > > [PATCH v3]
>> > > > Add support for DCE6 specific headers,
>> > > > ad hoc DCE6 macros, funtions and fixes,
>> > > > rebase on current amd-staging-drm-next
>> > > >
>> > > >
>> > > > Commits [01/27]..[08/27] SI support added in various DC components
>> > > >
>> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
>> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
>> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
>> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
>> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
>> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
>> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
>> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
>> > > >
>> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
>> > > >
>> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
>> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
>> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
>> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
>> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
>> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
>> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
>> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
>> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
>> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
>> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
>> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
>> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
>> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
>> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
>> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
>> > > >
>> > > >
>> > > > Commits [25/27]..[27/27] SI support final enablements
>> > > >
>> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonarie and later
>> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
>> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
>> > > >
>> > > >
>> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
>> > > >
>> > >
[-- Attachment #2: 0002-drm-amdgpu-display-addming-return-type-for-dce60_pro.patch --]
[-- Type: text/x-patch, Size: 982 bytes --]
From 782fea4387d22686856c87b8ac0491a43a4d944c Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 23 Jul 2020 21:05:41 -0400
Subject: [PATCH 2/3] drm/amdgpu/display: add missing return type for
dce60_program_front_end_for_pipe
Probably a copy/paste typo.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c
index 66e5a1ba2a58..920c7ae29d53 100644
--- a/drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dce60/dce60_hw_sequencer.c
@@ -266,7 +266,7 @@ static void dce60_program_scaler(const struct dc *dc,
&pipe_ctx->plane_res.scl_data);
}
-
+static void
dce60_program_front_end_for_pipe(
struct dc *dc, struct pipe_ctx *pipe_ctx)
{
--
2.25.4
[-- Attachment #3: 0003-drm-amdgpu-display-Fix-up-PLL-handling-for-DCE6.patch --]
[-- Type: text/x-patch, Size: 1855 bytes --]
From 2b18098918717d9ee4c69a47be3527d1cc812b7f Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Fri, 24 Jul 2020 11:41:31 -0400
Subject: [PATCH 3/3] drm/amdgpu/display: Fix up PLL handling for DCE6
DCE6.0 supports 2 PLLs. DCE6.1 supports 3 PLLs.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c b/drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c
index 261333afc936..5a5a9cb77acb 100644
--- a/drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dce60/dce60_resource.c
@@ -379,7 +379,7 @@ static const struct resource_caps res_cap_61 = {
.num_timing_generator = 4,
.num_audio = 6,
.num_stream_encoder = 6,
- .num_pll = 2,
+ .num_pll = 3,
.num_ddc = 6,
};
@@ -983,9 +983,7 @@ static bool dce60_construct(
dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL0, &clk_src_regs[0], false);
pool->base.clock_sources[1] =
dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL1, &clk_src_regs[1], false);
- pool->base.clock_sources[2] =
- dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL2, &clk_src_regs[2], false);
- pool->base.clk_src_count = 3;
+ pool->base.clk_src_count = 2;
} else {
pool->base.dp_clock_source =
@@ -993,9 +991,7 @@ static bool dce60_construct(
pool->base.clock_sources[0] =
dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL1, &clk_src_regs[1], false);
- pool->base.clock_sources[1] =
- dce60_clock_source_create(ctx, bp, CLOCK_SOURCE_ID_PLL2, &clk_src_regs[2], false);
- pool->base.clk_src_count = 2;
+ pool->base.clk_src_count = 1;
}
if (pool->base.dp_clock_source == NULL) {
--
2.25.4
[-- Attachment #4: 0001-drm-amdgpu-display-remove-unused-variable-in-dce60_c.patch --]
[-- Type: text/x-patch, Size: 1084 bytes --]
From 2ced8e528937051e4d8536718c6dc776e0b46314 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 23 Jul 2020 21:02:14 -0400
Subject: [PATCH 1/3] drm/amdgpu/display: remove unused variable in
dce60_configure_crc
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c b/drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c
index 4a5b7a0940c6..fc1af0ff0ca4 100644
--- a/drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c
+++ b/drivers/gpu/drm/amd/display/dc/dce60/dce60_timing_generator.c
@@ -192,8 +192,6 @@ static bool dce60_is_tg_enabled(struct timing_generator *tg)
bool dce60_configure_crc(struct timing_generator *tg,
const struct crc_params *params)
{
- struct dce110_timing_generator *tg110 = DCE110TG_FROM_TG(tg);
-
/* Cannot configure crc on a CRTC that is disabled */
if (!dce60_is_tg_enabled(tg))
return false;
--
2.25.4
[-- Attachment #5: Type: text/plain, Size: 154 bytes --]
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2020-07-24 18:31 ` Re: Alex Deucher
@ 2020-07-26 15:31 ` Mauro Rossi
2020-07-27 18:31 ` Re: Alex Deucher
0 siblings, 1 reply; 1651+ messages in thread
From: Mauro Rossi @ 2020-07-26 15:31 UTC (permalink / raw)
To: Alex Deucher
Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
amd-gfx list
[-- Attachment #1.1: Type: text/plain, Size: 9396 bytes --]
Hello,
On Fri, Jul 24, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
> >
> > Hello,
> > re-sending and copying full DL
> >
> > On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com>
> wrote:
> >>
> >> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com>
> wrote:
> >> >
> >> > Hi Christian,
> >> >
> >> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
> >> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> > >
> >> > > Hi Mauro,
> >> > >
> >> > > I'm not deep into the whole DC design, so just some general high
> level
> >> > > comments on the cover letter:
> >> > >
> >> > > 1. Please add a subject line to the cover letter, my spam filter
> thinks
> >> > > that this is suspicious otherwise.
> >> >
> >> > My mistake in the editing of the cover letter with git send-email;
> >> > I may have forgotten to keep the Subject at the top
> >> >
> >> > >
> >> > > 2. Then you should probably note how well (badly?) is that tested.
> Since
> >> > > you noted proof of concept it might not even work.
> >> >
> >> > The Changelog is to be read as:
> >> >
> >> > [RFC] was the initial proof of concept and [PATCH v2] was
> >> > just a rebase onto amd-staging-drm-next
> >> >
> >> > this series [PATCH v3] has all the known changes required for DCE6
> specificity
> >> > and based on a long offline thread with Alexander Deucher and past
> >> > dri-devel chats with Harry Wentland.
> >> >
> >> > It was tested for my possibilities of testing with HD7750 and HD7950,
> >> > with checks in dmesg output for not getting "missing registers/masks"
> >> > kernel WARNING
> >> > and with kernel build on Ubuntu 20.04 and with android-x86
> >> >
> >> > The proposal I made to Alex is that AMD testing systems will be used
> >> > for further regression testing,
> >> > as part of review and validation for eligibility to
> amd-staging-drm-next
> >> >
> >>
> >> We will certainly test it once it lands, but presumably this is
> >> working on the SI cards you have access to?
> >
> >
> > Yes, most of my testing was done with android-x86 Android CTS (EGL,
> GLES2, GLES3, VK)
> >
> > I am also in contact with a person with Firepro W5130M who is running a
> piglit session
> >
> > I had bought an HD7850 to test with Pitcairn, but it arrived as
> defective so I could not test with Pitcairn
> >
> >
> >>
> >> > >
> >> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
> >> >
> >> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
> >> > DCE6 (dc/dce60 path) in the last two years from initial submission
> >> >
> >> > >
> >> > > Apart from that it looks like a rather impressive piece of work :)
> >> > >
> >> > > Cheers,
> >> > > Christian.
> >> >
> >> > Thanks,
> >> > please consider that most of the latest DCE6 specific parts were
> >> > possible due to recent Alex support in getting the correct DCE6
> >> > headers,
> >> > his suggestions and continuous feedback.
> >> >
> >> > I would suggest that Alex comments on the proposed next steps to
> follow.
> >>
> >> The code looks pretty good to me. I'd like to get some feedback from
> >> the display team to see if they have any concerns, but beyond that I
> >> think we can pull it into the tree and continue improving it there.
> >> Do you have a link to a git tree I can pull directly that contains
> >> these patches? Is this the right branch?
> >> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
> >>
> >> Thanks!
> >>
> >> Alex
> >
> >
> > The following branch was pushed with the series on top of
> amd-staging-drm-next
> >
> > https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
>
> I gave this a quick test on all of the SI asics and the various
> monitors I had available and it looks good. A few minor patches I
> noticed are attached. If they look good to you, I'll squash them into
> the series when I commit it. I've pushed it to my fdo tree as well:
> https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
>
> Thanks!
>
> Alex
>
The new patches are OK and, with the following information about the piglit
tests, the series may be good to go.
I have performed piglit tests on Tahiti HD7950 on kernel 5.8.0-rc6 with AMD
DC support for SI
and comparison with vanilla kernel 5.8.0-rc6
Results are the following
[piglit gpu tests with kernel 5.8.0-rc6-amddcsi]
utente@utente-desktop:~/piglit$ ./piglit run gpu .
[26714/26714] skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11
Thank you for running Piglit!
Results have been written to /home/utente/piglit
[piglit gpu tests with vanilla 5.8.0-rc6]
utente@utente-desktop:~/piglit$ ./piglit run gpu .
[26714/26714] skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14
Thank you for running Piglit!
Results have been written to /home/utente/piglit
The attachment contains the comparison of "5.8.0-rc6-amddcsi" vs vanilla
"5.8.0-rc6" and vice versa; I see no significant regression, and in the delta
of failed tests I don't recognize any DC-related test cases, but you may also
have a look.
The dmesg for "5.8.0-rc6-amddcsi" is also provided to check the crashes.
Regarding the other user testing the series with a FirePro W5130M:
he found a pre-existing issue with amdgpu si_support=1 which is
independent of my series and matches a problem already reported. [1]
Mauro
[1] https://bbs.archlinux.org/viewtopic.php?id=249097
>
> >
> >>
> >>
> >> >
> >> > Mauro
> >> >
> >> > >
> >> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> >> > > > The series adds SI support to AMD DC
> >> > > >
> >> > > > Changelog:
> >> > > >
> >> > > > [RFC]
> >> > > > Preliminar Proof Of Concept, with DCE8 headers still used in
> dce60_resources.c
> >> > > >
> >> > > > [PATCH v2]
> >> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
> >> > > >
> >> > > > [PATCH v3]
> >> > > > Add support for DCE6 specific headers,
> >> > > > ad hoc DCE6 macros, functions and fixes,
> >> > > > rebase on current amd-staging-drm-next
> >> > > >
> >> > > >
> >> > > > Commits [01/27]..[08/27] SI support added in various DC components
> >> > > >
> >> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
> >> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> >> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6
> support (v9b)
> >> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support
> (v2)
> >> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> >> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6
> (v2)
> >> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6
> (v4)
> >> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
> >> > > >
> >> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
> >> > > >
> >> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI
> parts (v2)
> >> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size
> to 64
> >> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific
> macros,functions
> >> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific
> macros
> >> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific
> macros,functions
> >> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific
> macros,functions
> >> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6
> specific macros,functions
> >> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6
> specific macros,functions
> >> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific
> macros,functions
> >> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6
> specific macros,functions
> >> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
> >> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling
> Horizontal Filter Init
> >> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> macros,functions
> >> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> specific .cursor_lock
> >> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add
> DCE6 specific functions
> >> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
> >> > > >
> >> > > >
> >> > > > Commits [25/27]..[27/27] SI support final enablements
> >> > > >
> >> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property
> for Bonaire and later
> >> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
> >> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the
> Kconfig (v2)
> >> > > >
> >> > > >
> >> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> >> > > >
> >> > >
>
[-- Attachment #1.2: Type: text/html, Size: 13512 bytes --]
[-- Attachment #2: dmesg_kernel-5.8.0-rc6_amddcsi.txt --]
[-- Type: text/plain, Size: 87504 bytes --]
[ 0.000000] microcode: microcode updated early to revision 0x21, date = 2019-02-13
[ 0.000000] Linux version 5.8.0-050800rc6-generic (kernel@kathleen) (gcc (Ubuntu 9.3.0-13ubuntu1) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34.90.20200716) #202007192331 SMP Sun Jul 19 23:33:45 UTC 2020
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.8.0-050800rc6-generic root=UUID=833ac3c7-4d08-47b5-807f-9a8ddeb3a8d2 ro quiet splash radeon.si_support=0 amdgpu.si_support=1 vt.handoff=7
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Hygon HygonGenuine
[ 0.000000] Centaur CentaurHauls
[ 0.000000] zhaoxin Shanghai
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000dd907fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000dd908000-0x00000000de08cfff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000de08d000-0x00000000de116fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000de117000-0x00000000de1b6fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000de1b7000-0x00000000de9a5fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000de9a6000-0x00000000de9a6fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000de9a7000-0x00000000de9e9fff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000de9ea000-0x00000000df407fff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000df408000-0x00000000df7f0fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000df7f1000-0x00000000df7fffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000021effffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.7 present.
[ 0.000000] DMI: To Be Filled By O.E.M. To Be Filled By O.E.M./H77 Pro4/MVP, BIOS P1.70 08/07/2013
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] tsc: Detected 3392.425 MHz processor
[ 0.000891] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[ 0.000892] e820: remove [mem 0x000a0000-0x000fffff] usable
[ 0.000897] last_pfn = 0x21f000 max_arch_pfn = 0x400000000
[ 0.000901] MTRR default type: uncachable
[ 0.000901] MTRR fixed ranges enabled:
[ 0.000902] 00000-9FFFF write-back
[ 0.000903] A0000-BFFFF uncachable
[ 0.000903] C0000-CFFFF write-protect
[ 0.000904] D0000-E7FFF uncachable
[ 0.000904] E8000-FFFFF write-protect
[ 0.000905] MTRR variable ranges enabled:
[ 0.000906] 0 base 000000000 mask E00000000 write-back
[ 0.000907] 1 base 200000000 mask FF0000000 write-back
[ 0.000907] 2 base 210000000 mask FF8000000 write-back
[ 0.000908] 3 base 218000000 mask FFC000000 write-back
[ 0.000908] 4 base 21C000000 mask FFE000000 write-back
[ 0.000909] 5 base 21E000000 mask FFF000000 write-back
[ 0.000910] 6 base 0E0000000 mask FE0000000 uncachable
[ 0.000910] 7 disabled
[ 0.000910] 8 disabled
[ 0.000911] 9 disabled
[ 0.001158] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.001265] total RAM covered: 8176M
[ 0.001638] Found optimal setting for mtrr clean up
[ 0.001639] gran_size: 64K chunk_size: 32M num_reg: 6 lose cover RAM: 0G
[ 0.001863] e820: update [mem 0xe0000000-0xffffffff] usable ==> reserved
[ 0.001866] last_pfn = 0xdf800 max_arch_pfn = 0x400000000
[ 0.008892] found SMP MP-table at [mem 0x000fd8d0-0x000fd8df]
[ 0.030371] check: Scanning 1 areas for low memory corruption
[ 0.030771] RAMDISK: [mem 0x3172d000-0x34b8dfff]
[ 0.030777] ACPI: Early table checksum verification disabled
[ 0.030780] ACPI: RSDP 0x00000000000F0490 000024 (v02 ALASKA)
[ 0.030782] ACPI: XSDT 0x00000000DE19B080 00007C (v01 ALASKA A M I 01072009 AMI 00010013)
[ 0.030787] ACPI: FACP 0x00000000DE1A4DC0 00010C (v05 ALASKA A M I 01072009 AMI 00010013)
[ 0.030791] ACPI: DSDT 0x00000000DE19B190 009C2D (v02 ALASKA A M I 00000022 INTL 20051117)
[ 0.030794] ACPI: FACS 0x00000000DE1B5080 000040
[ 0.030796] ACPI: APIC 0x00000000DE1A4ED0 000072 (v03 ALASKA A M I 01072009 AMI 00010013)
[ 0.030798] ACPI: FPDT 0x00000000DE1A4F48 000044 (v01 ALASKA A M I 01072009 AMI 00010013)
[ 0.030800] ACPI: MCFG 0x00000000DE1A4F90 00003C (v01 ALASKA A M I 01072009 MSFT 00000097)
[ 0.030802] ACPI: SSDT 0x00000000DE1A4FD0 0007E1 (v01 Intel_ AoacTabl 00001000 INTL 20091112)
[ 0.030804] ACPI: AAFT 0x00000000DE1A57B8 000112 (v01 ALASKA OEMAAFT 01072009 MSFT 00000097)
[ 0.030806] ACPI: HPET 0x00000000DE1A58D0 000038 (v01 ALASKA A M I 01072009 AMI. 00000005)
[ 0.030808] ACPI: SSDT 0x00000000DE1A5908 00036D (v01 SataRe SataTabl 00001000 INTL 20091112)
[ 0.030811] ACPI: SSDT 0x00000000DE1A5C78 0009AA (v01 PmRef Cpu0Ist 00003000 INTL 20051117)
[ 0.030813] ACPI: SSDT 0x00000000DE1A6628 000A92 (v01 PmRef CpuPm 00003000 INTL 20051117)
[ 0.030815] ACPI: BGRT 0x00000000DE1A70C0 000038 (v00 ALASKA A M I 01072009 AMI 00010013)
[ 0.030822] ACPI: Local APIC address 0xfee00000
[ 0.030892] No NUMA configuration found
[ 0.030893] Faking a node at [mem 0x0000000000000000-0x000000021effffff]
[ 0.030901] NODE_DATA(0) allocated [mem 0x21efd1000-0x21effafff]
[ 0.031211] Zone ranges:
[ 0.031211] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.031212] DMA32 [mem 0x0000000001000000-0x00000000ffffffff]
[ 0.031213] Normal [mem 0x0000000100000000-0x000000021effffff]
[ 0.031214] Device empty
[ 0.031214] Movable zone start for each node
[ 0.031217] Early memory node ranges
[ 0.031217] node 0: [mem 0x0000000000001000-0x000000000009cfff]
[ 0.031218] node 0: [mem 0x0000000000100000-0x00000000dd907fff]
[ 0.031219] node 0: [mem 0x00000000de08d000-0x00000000de116fff]
[ 0.031219] node 0: [mem 0x00000000de9a6000-0x00000000de9a6fff]
[ 0.031220] node 0: [mem 0x00000000de9ea000-0x00000000df407fff]
[ 0.031220] node 0: [mem 0x00000000df7f1000-0x00000000df7fffff]
[ 0.031221] node 0: [mem 0x0000000100000000-0x000000021effffff]
[ 0.031310] Zeroed struct page in unavailable ranges: 11428 pages
[ 0.031311] Initmem setup node 0 [mem 0x0000000000001000-0x000000021effffff]
[ 0.031312] On node 0 totalpages: 2085724
[ 0.031313] DMA zone: 64 pages used for memmap
[ 0.031313] DMA zone: 21 pages reserved
[ 0.031314] DMA zone: 3996 pages, LIFO batch:0
[ 0.031341] DMA32 zone: 14159 pages used for memmap
[ 0.031341] DMA32 zone: 906176 pages, LIFO batch:63
[ 0.042718] Normal zone: 18368 pages used for memmap
[ 0.042720] Normal zone: 1175552 pages, LIFO batch:63
[ 0.058107] ACPI: PM-Timer IO Port: 0x408
[ 0.058109] ACPI: Local APIC address 0xfee00000
[ 0.058116] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
[ 0.058126] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
[ 0.058128] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.058129] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.058130] ACPI: IRQ0 used by override.
[ 0.058131] ACPI: IRQ9 used by override.
[ 0.058133] Using ACPI (MADT) for SMP configuration information
[ 0.058134] ACPI: HPET id: 0x8086a701 base: 0xfed00000
[ 0.058139] TSC deadline timer available
[ 0.058140] smpboot: Allowing 4 CPUs, 0 hotplug CPUs
[ 0.058155] PM: hibernation: Registered nosave memory: [mem 0x00000000-0x00000fff]
[ 0.058157] PM: hibernation: Registered nosave memory: [mem 0x0009d000-0x0009dfff]
[ 0.058157] PM: hibernation: Registered nosave memory: [mem 0x0009e000-0x0009ffff]
[ 0.058157] PM: hibernation: Registered nosave memory: [mem 0x000a0000-0x000dffff]
[ 0.058158] PM: hibernation: Registered nosave memory: [mem 0x000e0000-0x000fffff]
[ 0.058159] PM: hibernation: Registered nosave memory: [mem 0xdd908000-0xde08cfff]
[ 0.058160] PM: hibernation: Registered nosave memory: [mem 0xde117000-0xde1b6fff]
[ 0.058161] PM: hibernation: Registered nosave memory: [mem 0xde1b7000-0xde9a5fff]
[ 0.058162] PM: hibernation: Registered nosave memory: [mem 0xde9a7000-0xde9e9fff]
[ 0.058163] PM: hibernation: Registered nosave memory: [mem 0xdf408000-0xdf7f0fff]
[ 0.058164] PM: hibernation: Registered nosave memory: [mem 0xdf800000-0xf7ffffff]
[ 0.058164] PM: hibernation: Registered nosave memory: [mem 0xf8000000-0xfbffffff]
[ 0.058165] PM: hibernation: Registered nosave memory: [mem 0xfc000000-0xfebfffff]
[ 0.058165] PM: hibernation: Registered nosave memory: [mem 0xfec00000-0xfec00fff]
[ 0.058166] PM: hibernation: Registered nosave memory: [mem 0xfec01000-0xfecfffff]
[ 0.058166] PM: hibernation: Registered nosave memory: [mem 0xfed00000-0xfed03fff]
[ 0.058166] PM: hibernation: Registered nosave memory: [mem 0xfed04000-0xfed1bfff]
[ 0.058167] PM: hibernation: Registered nosave memory: [mem 0xfed1c000-0xfed1ffff]
[ 0.058167] PM: hibernation: Registered nosave memory: [mem 0xfed20000-0xfedfffff]
[ 0.058168] PM: hibernation: Registered nosave memory: [mem 0xfee00000-0xfee00fff]
[ 0.058168] PM: hibernation: Registered nosave memory: [mem 0xfee01000-0xfeffffff]
[ 0.058168] PM: hibernation: Registered nosave memory: [mem 0xff000000-0xffffffff]
[ 0.058170] [mem 0xdf800000-0xf7ffffff] available for PCI devices
[ 0.058171] Booting paravirtualized kernel on bare hardware
[ 0.058173] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.058179] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1
[ 0.058466] percpu: Embedded 56 pages/cpu s192512 r8192 d28672 u524288
[ 0.058471] pcpu-alloc: s192512 r8192 d28672 u524288 alloc=1*2097152
[ 0.058471] pcpu-alloc: [0] 0 1 2 3
[ 0.058495] Built 1 zonelists, mobility grouping on. Total pages: 2053112
[ 0.058495] Policy zone: Normal
[ 0.058497] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.8.0-050800rc6-generic root=UUID=833ac3c7-4d08-47b5-807f-9a8ddeb3a8d2 ro quiet splash radeon.si_support=0 amdgpu.si_support=1 vt.handoff=7
[ 0.059394] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, linear)
[ 0.059799] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, linear)
[ 0.059841] mem auto-init: stack:off, heap alloc:on, heap free:off
[ 0.098003] Memory: 8041840K/8342896K available (14339K kernel code, 2555K rwdata, 8736K rodata, 2632K init, 4912K bss, 301056K reserved, 0K cma-reserved)
[ 0.098010] random: get_random_u64 called from kmem_cache_open+0x2d/0x410 with crng_init=0
[ 0.098116] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[ 0.098128] Kernel/User page tables isolation: enabled
[ 0.098144] ftrace: allocating 46071 entries in 180 pages
[ 0.111578] ftrace: allocated 180 pages with 4 groups
[ 0.111684] rcu: Hierarchical RCU implementation.
[ 0.111685] rcu: RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=4.
[ 0.111686] Trampoline variant of Tasks RCU enabled.
[ 0.111686] Rude variant of Tasks RCU enabled.
[ 0.111687] Tracing variant of Tasks RCU enabled.
[ 0.111687] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
[ 0.111688] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
[ 0.114400] NR_IRQS: 524544, nr_irqs: 456, preallocated irqs: 16
[ 0.114611] random: crng done (trusting CPU's manufacturer)
[ 0.114630] Console: colour dummy device 80x25
[ 0.114634] printk: console [tty0] enabled
[ 0.114648] ACPI: Core revision 20200528
[ 0.114745] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 133484882848 ns
[ 0.114755] APIC: Switch to symmetric I/O mode setup
[ 0.114825] x2apic: IRQ remapping doesn't support X2APIC mode
[ 0.115236] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.134755] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x30e65a81c66, max_idle_ns: 440795263477 ns
[ 0.134758] Calibrating delay loop (skipped), value calculated using timer frequency.. 6784.85 BogoMIPS (lpj=13569700)
[ 0.134760] pid_max: default: 32768 minimum: 301
[ 0.134781] LSM: Security Framework initializing
[ 0.134788] Yama: becoming mindful.
[ 0.134810] AppArmor: AppArmor initialized
[ 0.134854] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[ 0.134874] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[ 0.135089] mce: CPU0: Thermal monitoring enabled (TM1)
[ 0.135099] process: using mwait in idle threads
[ 0.135101] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8
[ 0.135101] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[ 0.135103] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
[ 0.135104] Spectre V2 : Mitigation: Full generic retpoline
[ 0.135105] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[ 0.135105] Spectre V2 : Enabling Restricted Speculation for firmware calls
[ 0.135106] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
[ 0.135107] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
[ 0.135109] SRBDS: Vulnerable: No microcode
[ 0.135110] MDS: Mitigation: Clear CPU buffers
[ 0.135279] Freeing SMP alternatives memory: 40K
[ 0.138821] smpboot: CPU0: Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz (family: 0x6, model: 0x3a, stepping: 0x9)
[ 0.138915] Performance Events: PEBS fmt1+, IvyBridge events, 16-deep LBR, full-width counters, Intel PMU driver.
[ 0.138921] ... version: 3
[ 0.138921] ... bit width: 48
[ 0.138922] ... generic registers: 8
[ 0.138922] ... value mask: 0000ffffffffffff
[ 0.138922] ... max period: 00007fffffffffff
[ 0.138923] ... fixed-purpose events: 3
[ 0.138923] ... event mask: 00000007000000ff
[ 0.138953] rcu: Hierarchical SRCU implementation.
[ 0.139601] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[ 0.139647] smp: Bringing up secondary CPUs ...
[ 0.139724] x86: Booting SMP configuration:
[ 0.139725] .... node #0, CPUs: #1 #2 #3
[ 0.146807] smp: Brought up 1 node, 4 CPUs
[ 0.146807] smpboot: Max logical packages: 1
[ 0.146807] smpboot: Total of 4 processors activated (27139.40 BogoMIPS)
[ 0.147882] devtmpfs: initialized
[ 0.147882] x86/mm: Memory block size: 128MB
[ 0.147882] PM: Registering ACPI NVS region [mem 0xde117000-0xde1b6fff] (655360 bytes)
[ 0.147882] PM: Registering ACPI NVS region [mem 0xde9a7000-0xde9e9fff] (274432 bytes)
[ 0.147882] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.147882] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[ 0.147882] pinctrl core: initialized pinctrl subsystem
[ 0.147882] PM: RTC time: 12:11:09, date: 2020-07-26
[ 0.147882] thermal_sys: Registered thermal governor 'fair_share'
[ 0.147882] thermal_sys: Registered thermal governor 'bang_bang'
[ 0.147882] thermal_sys: Registered thermal governor 'step_wise'
[ 0.147882] thermal_sys: Registered thermal governor 'user_space'
[ 0.147882] thermal_sys: Registered thermal governor 'power_allocator'
[ 0.147882] NET: Registered protocol family 16
[ 0.147882] DMA: preallocated 1024 KiB GFP_KERNEL pool for atomic allocations
[ 0.147882] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[ 0.147882] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[ 0.147882] audit: initializing netlink subsys (disabled)
[ 0.147882] audit: type=2000 audit(1595765468.032:1): state=initialized audit_enabled=0 res=1
[ 0.147882] EISA bus registered
[ 0.147882] cpuidle: using governor ladder
[ 0.147882] cpuidle: using governor menu
[ 0.147882] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
[ 0.147882] ACPI: bus type PCI registered
[ 0.147882] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[ 0.147882] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf8000000-0xfbffffff] (base 0xf8000000)
[ 0.147882] PCI: MMCONFIG at [mem 0xf8000000-0xfbffffff] reserved in E820
[ 0.147882] PCI: Using configuration type 1 for base access
[ 0.147882] core: PMU erratum BJ122, BV98, HSD29 workaround disabled, HT off
[ 0.147882] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[ 0.150780] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 0.150835] ACPI: Added _OSI(Module Device)
[ 0.150835] ACPI: Added _OSI(Processor Device)
[ 0.150836] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 0.150837] ACPI: Added _OSI(Processor Aggregator Device)
[ 0.150837] ACPI: Added _OSI(Linux-Dell-Video)
[ 0.150838] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[ 0.150839] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[ 0.157639] ACPI: 5 ACPI AML tables successfully acquired and loaded
[ 0.159283] ACPI: Dynamic OEM Table Load:
[ 0.159288] ACPI: SSDT 0xFFFF9176154C7000 00083B (v01 PmRef Cpu0Cst 00003001 INTL 20051117)
[ 0.160055] ACPI: Dynamic OEM Table Load:
[ 0.160059] ACPI: SSDT 0xFFFF9176154BE000 000303 (v01 PmRef ApIst 00003000 INTL 20051117)
[ 0.160633] ACPI: Dynamic OEM Table Load:
[ 0.160636] ACPI: SSDT 0xFFFF917615082400 000119 (v01 PmRef ApCst 00003000 INTL 20051117)
[ 0.161917] ACPI: Interpreter enabled
[ 0.161936] ACPI: (supports S0 S3 S4 S5)
[ 0.161937] ACPI: Using IOAPIC for interrupt routing
[ 0.162004] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 0.162243] ACPI: Enabled 16 GPEs in block 00 to 3F
[ 0.167082] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3e])
[ 0.167087] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[ 0.167379] acpi PNP0A08:00: _OSC: platform does not support [PCIeHotplug SHPCHotplug PME LTR]
[ 0.167574] acpi PNP0A08:00: _OSC: OS now controls [AER PCIeCapability]
[ 0.167574] acpi PNP0A08:00: FADT indicates ASPM is unsupported, using BIOS configuration
[ 0.168078] PCI host bridge to bus 0000:00
[ 0.168080] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
[ 0.168081] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window]
[ 0.168082] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[ 0.168083] pci_bus 0000:00: root bus resource [mem 0x000d0000-0x000d3fff window]
[ 0.168083] pci_bus 0000:00: root bus resource [mem 0x000d4000-0x000d7fff window]
[ 0.168084] pci_bus 0000:00: root bus resource [mem 0x000d8000-0x000dbfff window]
[ 0.168085] pci_bus 0000:00: root bus resource [mem 0x000dc000-0x000dffff window]
[ 0.168086] pci_bus 0000:00: root bus resource [mem 0x000e0000-0x000e3fff window]
[ 0.168087] pci_bus 0000:00: root bus resource [mem 0x000e4000-0x000e7fff window]
[ 0.168087] pci_bus 0000:00: root bus resource [mem 0xe0000000-0xfeafffff window]
[ 0.168088] pci_bus 0000:00: root bus resource [bus 00-3e]
[ 0.168096] pci 0000:00:00.0: [8086:0150] type 00 class 0x060000
[ 0.168183] pci 0000:00:01.0: [8086:0151] type 01 class 0x060400
[ 0.168215] pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
[ 0.168321] pci 0000:00:14.0: [8086:1e31] type 00 class 0x0c0330
[ 0.168343] pci 0000:00:14.0: reg 0x10: [mem 0xf7f00000-0xf7f0ffff 64bit]
[ 0.168408] pci 0000:00:14.0: PME# supported from D3hot D3cold
[ 0.168494] pci 0000:00:16.0: [8086:1e3a] type 00 class 0x078000
[ 0.168517] pci 0000:00:16.0: reg 0x10: [mem 0xf7f1a000-0xf7f1a00f 64bit]
[ 0.168585] pci 0000:00:16.0: PME# supported from D0 D3hot D3cold
[ 0.168668] pci 0000:00:1a.0: [8086:1e2d] type 00 class 0x0c0320
[ 0.168688] pci 0000:00:1a.0: reg 0x10: [mem 0xf7f18000-0xf7f183ff]
[ 0.168767] pci 0000:00:1a.0: PME# supported from D0 D3hot D3cold
[ 0.168853] pci 0000:00:1b.0: [8086:1e20] type 00 class 0x040300
[ 0.168872] pci 0000:00:1b.0: reg 0x10: [mem 0xf7f10000-0xf7f13fff 64bit]
[ 0.168944] pci 0000:00:1b.0: PME# supported from D0 D3hot D3cold
[ 0.169037] pci 0000:00:1c.0: [8086:1e10] type 01 class 0x060400
[ 0.169192] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
[ 0.169312] pci 0000:00:1c.4: [8086:244e] type 01 class 0x060401
[ 0.169401] pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
[ 0.169496] pci 0000:00:1c.5: [8086:1e1a] type 01 class 0x060400
[ 0.169586] pci 0000:00:1c.5: PME# supported from D0 D3hot D3cold
[ 0.169683] pci 0000:00:1c.7: [8086:1e1e] type 01 class 0x060400
[ 0.169773] pci 0000:00:1c.7: PME# supported from D0 D3hot D3cold
[ 0.169870] pci 0000:00:1d.0: [8086:1e26] type 00 class 0x0c0320
[ 0.169890] pci 0000:00:1d.0: reg 0x10: [mem 0xf7f17000-0xf7f173ff]
[ 0.169969] pci 0000:00:1d.0: PME# supported from D0 D3hot D3cold
[ 0.170057] pci 0000:00:1f.0: [8086:1e4a] type 00 class 0x060100
[ 0.170234] pci 0000:00:1f.2: [8086:1e02] type 00 class 0x010601
[ 0.170250] pci 0000:00:1f.2: reg 0x10: [io 0xf070-0xf077]
[ 0.170256] pci 0000:00:1f.2: reg 0x14: [io 0xf060-0xf063]
[ 0.170263] pci 0000:00:1f.2: reg 0x18: [io 0xf050-0xf057]
[ 0.170269] pci 0000:00:1f.2: reg 0x1c: [io 0xf040-0xf043]
[ 0.170275] pci 0000:00:1f.2: reg 0x20: [io 0xf020-0xf03f]
[ 0.170281] pci 0000:00:1f.2: reg 0x24: [mem 0xf7f16000-0xf7f167ff]
[ 0.170318] pci 0000:00:1f.2: PME# supported from D3hot
[ 0.170397] pci 0000:00:1f.3: [8086:1e22] type 00 class 0x0c0500
[ 0.170413] pci 0000:00:1f.3: reg 0x10: [mem 0xf7f15000-0xf7f150ff 64bit]
[ 0.170431] pci 0000:00:1f.3: reg 0x20: [io 0xf000-0xf01f]
[ 0.170547] pci 0000:01:00.0: [1002:679a] type 00 class 0x030000
[ 0.170558] pci 0000:01:00.0: reg 0x10: [mem 0xe0000000-0xefffffff 64bit pref]
[ 0.170563] pci 0000:01:00.0: reg 0x18: [mem 0xf7e00000-0xf7e3ffff 64bit]
[ 0.170567] pci 0000:01:00.0: reg 0x20: [io 0xe000-0xe0ff]
[ 0.170573] pci 0000:01:00.0: reg 0x30: [mem 0xf7e40000-0xf7e5ffff pref]
[ 0.170576] pci 0000:01:00.0: enabling Extended Tags
[ 0.170602] pci 0000:01:00.0: supports D1 D2
[ 0.170603] pci 0000:01:00.0: PME# supported from D1 D2 D3hot
[ 0.170651] pci 0000:01:00.1: [1002:aaa0] type 00 class 0x040300
[ 0.170661] pci 0000:01:00.1: reg 0x10: [mem 0xf7e60000-0xf7e63fff 64bit]
[ 0.170677] pci 0000:01:00.1: enabling Extended Tags
[ 0.170699] pci 0000:01:00.1: supports D1 D2
[ 0.170743] pci 0000:00:01.0: PCI bridge to [bus 01]
[ 0.170744] pci 0000:00:01.0: bridge window [io 0xe000-0xefff]
[ 0.170746] pci 0000:00:01.0: bridge window [mem 0xf7e00000-0xf7efffff]
[ 0.170748] pci 0000:00:01.0: bridge window [mem 0xe0000000-0xefffffff 64bit pref]
[ 0.174800] pci 0000:00:1c.0: PCI bridge to [bus 02]
[ 0.174868] pci 0000:03:00.0: [1b21:1080] type 01 class 0x060401
[ 0.175050] pci 0000:00:1c.4: PCI bridge to [bus 03-04] (subtractive decode)
[ 0.175059] pci 0000:00:1c.4: bridge window [io 0x0000-0x0cf7 window] (subtractive decode)
[ 0.175060] pci 0000:00:1c.4: bridge window [io 0x0d00-0xffff window] (subtractive decode)
[ 0.175061] pci 0000:00:1c.4: bridge window [mem 0x000a0000-0x000bffff window] (subtractive decode)
[ 0.175062] pci 0000:00:1c.4: bridge window [mem 0x000d0000-0x000d3fff window] (subtractive decode)
[ 0.175063] pci 0000:00:1c.4: bridge window [mem 0x000d4000-0x000d7fff window] (subtractive decode)
[ 0.175063] pci 0000:00:1c.4: bridge window [mem 0x000d8000-0x000dbfff window] (subtractive decode)
[ 0.175065] pci 0000:00:1c.4: bridge window [mem 0x000dc000-0x000dffff window] (subtractive decode)
[ 0.175066] pci 0000:00:1c.4: bridge window [mem 0x000e0000-0x000e3fff window] (subtractive decode)
[ 0.175067] pci 0000:00:1c.4: bridge window [mem 0x000e4000-0x000e7fff window] (subtractive decode)
[ 0.175068] pci 0000:00:1c.4: bridge window [mem 0xe0000000-0xfeafffff window] (subtractive decode)
[ 0.175102] pci_bus 0000:04: extended config space not accessible
[ 0.175181] pci 0000:03:00.0: PCI bridge to [bus 04] (subtractive decode)
[ 0.175201] pci 0000:03:00.0: bridge window [io 0x0000-0x0cf7 window] (subtractive decode)
[ 0.175202] pci 0000:03:00.0: bridge window [io 0x0d00-0xffff window] (subtractive decode)
[ 0.175203] pci 0000:03:00.0: bridge window [mem 0x000a0000-0x000bffff window] (subtractive decode)
[ 0.175204] pci 0000:03:00.0: bridge window [mem 0x000d0000-0x000d3fff window] (subtractive decode)
[ 0.175204] pci 0000:03:00.0: bridge window [mem 0x000d4000-0x000d7fff window] (subtractive decode)
[ 0.175205] pci 0000:03:00.0: bridge window [mem 0x000d8000-0x000dbfff window] (subtractive decode)
[ 0.175206] pci 0000:03:00.0: bridge window [mem 0x000dc000-0x000dffff window] (subtractive decode)
[ 0.175207] pci 0000:03:00.0: bridge window [mem 0x000e0000-0x000e3fff window] (subtractive decode)
[ 0.175208] pci 0000:03:00.0: bridge window [mem 0x000e4000-0x000e7fff window] (subtractive decode)
[ 0.175208] pci 0000:03:00.0: bridge window [mem 0xe0000000-0xfeafffff window] (subtractive decode)
[ 0.175270] pci 0000:05:00.0: [10ec:8168] type 00 class 0x020000
[ 0.175303] pci 0000:05:00.0: reg 0x10: [io 0xd000-0xd0ff]
[ 0.175335] pci 0000:05:00.0: reg 0x18: [mem 0xf0004000-0xf0004fff 64bit pref]
[ 0.175354] pci 0000:05:00.0: reg 0x20: [mem 0xf0000000-0xf0003fff 64bit pref]
[ 0.175479] pci 0000:05:00.0: supports D1 D2
[ 0.175480] pci 0000:05:00.0: PME# supported from D0 D1 D2 D3hot D3cold
[ 0.175606] pci 0000:00:1c.5: PCI bridge to [bus 05]
[ 0.175609] pci 0000:00:1c.5: bridge window [io 0xd000-0xdfff]
[ 0.175616] pci 0000:00:1c.5: bridge window [mem 0xf0000000-0xf00fffff 64bit pref]
[ 0.175666] pci 0000:06:00.0: [1b21:0612] type 00 class 0x010601
[ 0.175694] pci 0000:06:00.0: reg 0x10: [io 0xc050-0xc057]
[ 0.175707] pci 0000:06:00.0: reg 0x14: [io 0xc040-0xc043]
[ 0.175719] pci 0000:06:00.0: reg 0x18: [io 0xc030-0xc037]
[ 0.175731] pci 0000:06:00.0: reg 0x1c: [io 0xc020-0xc023]
[ 0.175743] pci 0000:06:00.0: reg 0x20: [io 0xc000-0xc01f]
[ 0.175756] pci 0000:06:00.0: reg 0x24: [mem 0xf7d00000-0xf7d001ff]
[ 0.175931] pci 0000:00:1c.7: PCI bridge to [bus 06]
[ 0.175933] pci 0000:00:1c.7: bridge window [io 0xc000-0xcfff]
[ 0.175936] pci 0000:00:1c.7: bridge window [mem 0xf7d00000-0xf7dfffff]
[ 0.176585] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 10 *11 12 14 15)
[ 0.176646] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 *10 11 12 14 15)
[ 0.176705] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 *10 11 12 14 15)
[ 0.176763] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 10 11 12 14 15)
[ 0.176822] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[ 0.176880] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[ 0.176938] ACPI: PCI Interrupt Link [LNKG] (IRQs *3 4 5 6 10 11 12 14 15)
[ 0.176998] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 10 *11 12 14 15)
[ 0.177169] iommu: Default domain type: Translated
[ 0.177169] pci 0000:01:00.0: vgaarb: setting as boot VGA device
[ 0.177169] pci 0000:01:00.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[ 0.177169] pci 0000:01:00.0: vgaarb: bridge control possible
[ 0.177169] vgaarb: loaded
[ 0.177169] SCSI subsystem initialized
[ 0.177169] libata version 3.00 loaded.
[ 0.177169] ACPI: bus type USB registered
[ 0.177169] usbcore: registered new interface driver usbfs
[ 0.177169] usbcore: registered new interface driver hub
[ 0.177169] usbcore: registered new device driver usb
[ 0.177169] pps_core: LinuxPPS API ver. 1 registered
[ 0.177169] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[ 0.177169] PTP clock support registered
[ 0.177169] EDAC MC: Ver: 3.0.0
[ 0.177169] NetLabel: Initializing
[ 0.177169] NetLabel: domain hash size = 128
[ 0.177169] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 0.177169] NetLabel: unlabeled traffic allowed by default
[ 0.177169] PCI: Using ACPI for IRQ routing
[ 0.179063] PCI: pci_cache_line_size set to 64 bytes
[ 0.179109] e820: reserve RAM buffer [mem 0x0009d800-0x0009ffff]
[ 0.179109] e820: reserve RAM buffer [mem 0xdd908000-0xdfffffff]
[ 0.179110] e820: reserve RAM buffer [mem 0xde117000-0xdfffffff]
[ 0.179111] e820: reserve RAM buffer [mem 0xde9a7000-0xdfffffff]
[ 0.179112] e820: reserve RAM buffer [mem 0xdf408000-0xdfffffff]
[ 0.179112] e820: reserve RAM buffer [mem 0xdf800000-0xdfffffff]
[ 0.179113] e820: reserve RAM buffer [mem 0x21f000000-0x21fffffff]
[ 0.179338] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0, 0, 0, 0, 0
[ 0.179340] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
[ 0.181360] clocksource: Switched to clocksource tsc-early
[ 0.190211] VFS: Disk quotas dquot_6.6.0
[ 0.190224] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.190309] AppArmor: AppArmor Filesystem Enabled
[ 0.190331] pnp: PnP ACPI init
[ 0.190448] system 00:00: [mem 0xfed40000-0xfed44fff] has been reserved
[ 0.190452] system 00:00: Plug and Play ACPI device, IDs PNP0c01 (active)
[ 0.190536] system 00:01: [io 0x0680-0x069f] has been reserved
[ 0.190537] system 00:01: [io 0x1000-0x100f] has been reserved
[ 0.190538] system 00:01: [io 0xffff] has been reserved
[ 0.190539] system 00:01: [io 0xffff] has been reserved
[ 0.190540] system 00:01: [io 0x0400-0x0453] has been reserved
[ 0.190541] system 00:01: [io 0x0458-0x047f] has been reserved
[ 0.190543] system 00:01: [io 0x0500-0x057f] has been reserved
[ 0.190544] system 00:01: [io 0x164e-0x164f] has been reserved
[ 0.190546] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 0.190567] pnp 00:02: Plug and Play ACPI device, IDs PNP0b00 (active)
[ 0.190613] system 00:03: [io 0x0454-0x0457] has been reserved
[ 0.190616] system 00:03: Plug and Play ACPI device, IDs INT3f0d PNP0c02 (active)
[ 0.190699] system 00:04: [io 0x0290-0x029f] has been reserved
[ 0.190701] system 00:04: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 0.190931] system 00:05: [io 0x04d0-0x04d1] has been reserved
[ 0.190934] system 00:05: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 0.190972] pnp 00:06: Plug and Play ACPI device, IDs PNP0303 PNP030b (active)
[ 0.191125] pnp 00:07: [dma 0 disabled]
[ 0.191160] pnp 00:07: Plug and Play ACPI device, IDs PNP0501 (active)
[ 0.191391] system 00:08: [mem 0xfed1c000-0xfed1ffff] has been reserved
[ 0.191392] system 00:08: [mem 0xfed10000-0xfed17fff] has been reserved
[ 0.191393] system 00:08: [mem 0xfed18000-0xfed18fff] has been reserved
[ 0.191394] system 00:08: [mem 0xfed19000-0xfed19fff] has been reserved
[ 0.191395] system 00:08: [mem 0xf8000000-0xfbffffff] has been reserved
[ 0.191396] system 00:08: [mem 0xfed20000-0xfed3ffff] has been reserved
[ 0.191397] system 00:08: [mem 0xfed90000-0xfed93fff] has been reserved
[ 0.191398] system 00:08: [mem 0xfed45000-0xfed8ffff] has been reserved
[ 0.191399] system 00:08: [mem 0xff000000-0xffffffff] has been reserved
[ 0.191400] system 00:08: [mem 0xfee00000-0xfeefffff] could not be reserved
[ 0.191401] system 00:08: [mem 0xf0100000-0xf0100fff] has been reserved
[ 0.191403] system 00:08: Plug and Play ACPI device, IDs PNP0c02 (active)
[ 0.191561] pnp: PnP ACPI: found 9 devices
[ 0.197011] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[ 0.197057] NET: Registered protocol family 2
[ 0.197198] tcp_listen_portaddr_hash hash table entries: 4096 (order: 4, 65536 bytes, linear)
[ 0.197256] TCP established hash table entries: 65536 (order: 7, 524288 bytes, linear)
[ 0.197413] TCP bind hash table entries: 65536 (order: 8, 1048576 bytes, linear)
[ 0.197478] TCP: Hash tables configured (established 65536 bind 65536)
[ 0.197550] UDP hash table entries: 4096 (order: 5, 131072 bytes, linear)
[ 0.197573] UDP-Lite hash table entries: 4096 (order: 5, 131072 bytes, linear)
[ 0.197615] NET: Registered protocol family 1
[ 0.197619] NET: Registered protocol family 44
[ 0.197629] pci 0000:00:01.0: PCI bridge to [bus 01]
[ 0.197631] pci 0000:00:01.0: bridge window [io 0xe000-0xefff]
[ 0.197633] pci 0000:00:01.0: bridge window [mem 0xf7e00000-0xf7efffff]
[ 0.197635] pci 0000:00:01.0: bridge window [mem 0xe0000000-0xefffffff 64bit pref]
[ 0.197637] pci 0000:00:1c.0: PCI bridge to [bus 02]
[ 0.197654] pci 0000:03:00.0: PCI bridge to [bus 04]
[ 0.197672] pci 0000:00:1c.4: PCI bridge to [bus 03-04]
[ 0.197682] pci 0000:00:1c.5: PCI bridge to [bus 05]
[ 0.197683] pci 0000:00:1c.5: bridge window [io 0xd000-0xdfff]
[ 0.197689] pci 0000:00:1c.5: bridge window [mem 0xf0000000-0xf00fffff 64bit pref]
[ 0.197694] pci 0000:00:1c.7: PCI bridge to [bus 06]
[ 0.197695] pci 0000:00:1c.7: bridge window [io 0xc000-0xcfff]
[ 0.197699] pci 0000:00:1c.7: bridge window [mem 0xf7d00000-0xf7dfffff]
[ 0.197706] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window]
[ 0.197707] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window]
[ 0.197708] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[ 0.197709] pci_bus 0000:00: resource 7 [mem 0x000d0000-0x000d3fff window]
[ 0.197710] pci_bus 0000:00: resource 8 [mem 0x000d4000-0x000d7fff window]
[ 0.197711] pci_bus 0000:00: resource 9 [mem 0x000d8000-0x000dbfff window]
[ 0.197712] pci_bus 0000:00: resource 10 [mem 0x000dc000-0x000dffff window]
[ 0.197712] pci_bus 0000:00: resource 11 [mem 0x000e0000-0x000e3fff window]
[ 0.197713] pci_bus 0000:00: resource 12 [mem 0x000e4000-0x000e7fff window]
[ 0.197714] pci_bus 0000:00: resource 13 [mem 0xe0000000-0xfeafffff window]
[ 0.197715] pci_bus 0000:01: resource 0 [io 0xe000-0xefff]
[ 0.197716] pci_bus 0000:01: resource 1 [mem 0xf7e00000-0xf7efffff]
[ 0.197716] pci_bus 0000:01: resource 2 [mem 0xe0000000-0xefffffff 64bit pref]
[ 0.197718] pci_bus 0000:03: resource 4 [io 0x0000-0x0cf7 window]
[ 0.197718] pci_bus 0000:03: resource 5 [io 0x0d00-0xffff window]
[ 0.197719] pci_bus 0000:03: resource 6 [mem 0x000a0000-0x000bffff window]
[ 0.197720] pci_bus 0000:03: resource 7 [mem 0x000d0000-0x000d3fff window]
[ 0.197721] pci_bus 0000:03: resource 8 [mem 0x000d4000-0x000d7fff window]
[ 0.197722] pci_bus 0000:03: resource 9 [mem 0x000d8000-0x000dbfff window]
[ 0.197722] pci_bus 0000:03: resource 10 [mem 0x000dc000-0x000dffff window]
[ 0.197723] pci_bus 0000:03: resource 11 [mem 0x000e0000-0x000e3fff window]
[ 0.197724] pci_bus 0000:03: resource 12 [mem 0x000e4000-0x000e7fff window]
[ 0.197725] pci_bus 0000:03: resource 13 [mem 0xe0000000-0xfeafffff window]
[ 0.197726] pci_bus 0000:04: resource 4 [io 0x0000-0x0cf7 window]
[ 0.197726] pci_bus 0000:04: resource 5 [io 0x0d00-0xffff window]
[ 0.197727] pci_bus 0000:04: resource 6 [mem 0x000a0000-0x000bffff window]
[ 0.197728] pci_bus 0000:04: resource 7 [mem 0x000d0000-0x000d3fff window]
[ 0.197729] pci_bus 0000:04: resource 8 [mem 0x000d4000-0x000d7fff window]
[ 0.197730] pci_bus 0000:04: resource 9 [mem 0x000d8000-0x000dbfff window]
[ 0.197730] pci_bus 0000:04: resource 10 [mem 0x000dc000-0x000dffff window]
[ 0.197731] pci_bus 0000:04: resource 11 [mem 0x000e0000-0x000e3fff window]
[ 0.197732] pci_bus 0000:04: resource 12 [mem 0x000e4000-0x000e7fff window]
[ 0.197733] pci_bus 0000:04: resource 13 [mem 0xe0000000-0xfeafffff window]
[ 0.197734] pci_bus 0000:05: resource 0 [io 0xd000-0xdfff]
[ 0.197735] pci_bus 0000:05: resource 2 [mem 0xf0000000-0xf00fffff 64bit pref]
[ 0.197735] pci_bus 0000:06: resource 0 [io 0xc000-0xcfff]
[ 0.197736] pci_bus 0000:06: resource 1 [mem 0xf7d00000-0xf7dfffff]
[ 0.222890] pci 0000:00:1a.0: quirk_usb_early_handoff+0x0/0x662 took 24314 usecs
[ 0.246886] pci 0000:00:1d.0: quirk_usb_early_handoff+0x0/0x662 took 23419 usecs
[ 0.246898] pci 0000:01:00.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[ 0.246903] pci 0000:01:00.1: D0 power state depends on 0000:01:00.0
[ 0.246910] pci 0000:03:00.0: CLS mismatch (64 != 32), using 64 bytes
[ 0.246965] Trying to unpack rootfs image as initramfs...
[ 0.364777] Freeing initrd memory: 53636K
[ 0.364812] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[ 0.364814] software IO TLB: mapped [mem 0xd6600000-0xda600000] (64MB)
[ 0.365043] check: Scanning for low memory corruption every 60 seconds
[ 0.365380] Initialise system trusted keyrings
[ 0.365388] Key type blacklist registered
[ 0.365410] workingset: timestamp_bits=36 max_order=21 bucket_order=0
[ 0.366383] zbud: loaded
[ 0.366576] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[ 0.366695] fuse: init (API version 7.31)
[ 0.366801] integrity: Platform Keyring initialized
[ 0.375417] Key type asymmetric registered
[ 0.375418] Asymmetric key parser 'x509' registered
[ 0.375424] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 244)
[ 0.375457] io scheduler mq-deadline registered
[ 0.376279] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[ 0.376316] vesafb: mode is 1280x1024x32, linelength=5120, pages=0
[ 0.376317] vesafb: scrolling: redraw
[ 0.376318] vesafb: Truecolor: size=0:8:8:8, shift=0:16:8:0
[ 0.376332] vesafb: framebuffer at 0xe0000000, mapped to 0x0000000076879528, using 5120k, total 5120k
[ 0.376360] fbcon: Deferring console take-over
[ 0.376361] fb0: VESA VGA frame buffer device
[ 0.376369] intel_idle: MWAIT substates: 0x1120
[ 0.376370] intel_idle: v0.5.1 model 0x3A
[ 0.376490] intel_idle: Local APIC timer is reliable in all C-states
[ 0.376584] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
[ 0.376600] ACPI: Power Button [PWRB]
[ 0.376625] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
[ 0.376649] ACPI: Power Button [PWRF]
[ 0.376964] Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
[ 0.397473] 00:07: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 0.399205] Linux agpgart interface v0.103
[ 0.401124] loop: module loaded
[ 0.401322] libphy: Fixed MDIO Bus: probed
[ 0.401323] tun: Universal TUN/TAP device driver, 1.6
[ 0.401342] PPP generic driver version 2.4.2
[ 0.401375] VFIO - User Level meta-driver version: 0.3
[ 0.401445] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 0.401447] ehci-pci: EHCI PCI platform driver
[ 0.401544] ehci-pci 0000:00:1a.0: EHCI Host Controller
[ 0.401548] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 1
[ 0.401558] ehci-pci 0000:00:1a.0: debug port 2
[ 0.405470] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
[ 0.405481] ehci-pci 0000:00:1a.0: irq 16, io mem 0xf7f18000
[ 0.418782] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
[ 0.418858] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[ 0.418859] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 0.418860] usb usb1: Product: EHCI Host Controller
[ 0.418861] usb usb1: Manufacturer: Linux 5.8.0-050800rc6-generic ehci_hcd
[ 0.418861] usb usb1: SerialNumber: 0000:00:1a.0
[ 0.419022] hub 1-0:1.0: USB hub found
[ 0.419029] hub 1-0:1.0: 2 ports detected
[ 0.419235] ehci-pci 0000:00:1d.0: EHCI Host Controller
[ 0.419238] ehci-pci 0000:00:1d.0: new USB bus registered, assigned bus number 2
[ 0.419247] ehci-pci 0000:00:1d.0: debug port 2
[ 0.423140] ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
[ 0.423147] ehci-pci 0000:00:1d.0: irq 23, io mem 0xf7f17000
[ 0.438781] ehci-pci 0000:00:1d.0: USB 2.0 started, EHCI 1.00
[ 0.438850] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[ 0.438851] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 0.438852] usb usb2: Product: EHCI Host Controller
[ 0.438853] usb usb2: Manufacturer: Linux 5.8.0-050800rc6-generic ehci_hcd
[ 0.438854] usb usb2: SerialNumber: 0000:00:1d.0
[ 0.439011] hub 2-0:1.0: USB hub found
[ 0.439017] hub 2-0:1.0: 2 ports detected
[ 0.439137] ehci-platform: EHCI generic platform driver
[ 0.439143] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 0.439146] ohci-pci: OHCI PCI platform driver
[ 0.439153] ohci-platform: OHCI generic platform driver
[ 0.439157] uhci_hcd: USB Universal Host Controller Interface driver
[ 0.439197] i8042: PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
[ 0.439197] i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
[ 0.439680] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 0.439888] mousedev: PS/2 mouse device common for all mice
[ 0.440215] rtc_cmos 00:02: RTC can wake from S4
[ 0.440407] rtc_cmos 00:02: registered as rtc0
[ 0.440466] rtc_cmos 00:02: setting system clock to 2020-07-26T12:11:09 UTC (1595765469)
[ 0.440478] rtc_cmos 00:02: alarms up to one month, y3k, 242 bytes nvram, hpet irqs
[ 0.440483] i2c /dev entries driver
[ 0.440513] device-mapper: uevent: version 1.0.3
[ 0.440556] device-mapper: ioctl: 4.42.0-ioctl (2020-02-27) initialised: dm-devel@redhat.com
[ 0.440571] platform eisa.0: Probing EISA bus 0
[ 0.440572] platform eisa.0: EISA: Cannot allocate resource for mainboard
[ 0.440573] platform eisa.0: Cannot allocate resource for EISA slot 1
[ 0.440574] platform eisa.0: Cannot allocate resource for EISA slot 2
[ 0.440574] platform eisa.0: Cannot allocate resource for EISA slot 3
[ 0.440575] platform eisa.0: Cannot allocate resource for EISA slot 4
[ 0.440576] platform eisa.0: Cannot allocate resource for EISA slot 5
[ 0.440577] platform eisa.0: Cannot allocate resource for EISA slot 6
[ 0.440577] platform eisa.0: Cannot allocate resource for EISA slot 7
[ 0.440578] platform eisa.0: Cannot allocate resource for EISA slot 8
[ 0.440579] platform eisa.0: EISA: Detected 0 cards
[ 0.440583] intel_pstate: Intel P-state driver initializing
[ 0.440811] ledtrig-cpu: registered to indicate activity on CPUs
[ 0.440850] drop_monitor: Initializing network drop monitor service
[ 0.440988] NET: Registered protocol family 10
[ 0.446148] Segment Routing with IPv6
[ 0.446163] NET: Registered protocol family 17
[ 0.446230] Key type dns_resolver registered
[ 0.446481] microcode: sig=0x306a9, pf=0x2, revision=0x21
[ 0.446529] microcode: Microcode Update Driver: v2.2.
[ 0.446532] IPI shorthand broadcast: enabled
[ 0.446537] sched_clock: Marking stable (446362121, 164421)->(452023091, -5496549)
[ 0.446596] registered taskstats version 1
[ 0.446605] Loading compiled-in X.509 certificates
[ 0.447191] Loaded X.509 cert 'Build time autogenerated kernel key: f5ed095bb538b9d2a07de73aa8b3b326e45d53f0'
[ 0.447219] zswap: loaded using pool lzo/zbud
[ 0.447327] Key type ._fscrypt registered
[ 0.447328] Key type .fscrypt registered
[ 0.447328] Key type fscrypt-provisioning registered
[ 0.449435] Key type encrypted registered
[ 0.449437] AppArmor: AppArmor sha1 policy hashing enabled
[ 0.449442] ima: No TPM chip found, activating TPM-bypass!
[ 0.449445] ima: Allocated hash algorithm: sha1
[ 0.449452] ima: No architecture policies found
[ 0.449462] evm: Initialising EVM extended attributes:
[ 0.449462] evm: security.selinux
[ 0.449463] evm: security.SMACK64
[ 0.449463] evm: security.SMACK64EXEC
[ 0.449463] evm: security.SMACK64TRANSMUTE
[ 0.449464] evm: security.SMACK64MMAP
[ 0.449464] evm: security.apparmor
[ 0.449464] evm: security.ima
[ 0.449464] evm: security.capability
[ 0.449465] evm: HMAC attrs: 0x1
[ 0.449711] PM: Magic number: 12:847:178
[ 0.449746] acpi device:0e: hash matches
[ 0.449762] platform: hash matches
[ 0.449851] RAS: Correctable Errors collector initialized.
[ 0.450788] Freeing unused decrypted memory: 2040K
[ 0.451226] Freeing unused kernel image (initmem) memory: 2632K
[ 0.464247] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input2
[ 0.470785] Write protecting the kernel read-only data: 26624k
[ 0.471421] Freeing unused kernel image (text/rodata gap) memory: 2044K
[ 0.471711] Freeing unused kernel image (rodata/data gap) memory: 1504K
[ 0.511328] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 0.511329] x86/mm: Checking user space page tables
[ 0.550008] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 0.550011] Run /init as init process
[ 0.550012] with arguments:
[ 0.550012] /init
[ 0.550013] splash
[ 0.550013] with environment:
[ 0.550014] HOME=/
[ 0.550014] TERM=linux
[ 0.550014] BOOT_IMAGE=/boot/vmlinuz-5.8.0-050800rc6-generic
[ 0.616201] xhci_hcd 0000:00:14.0: xHCI Host Controller
[ 0.616206] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 3
[ 0.617408] xhci_hcd 0000:00:14.0: hcc params 0x20007181 hci version 0x100 quirks 0x000000000000b930
[ 0.617412] xhci_hcd 0000:00:14.0: cache line size of 64 is not supported
[ 0.617453] ACPI Warning: SystemIO range 0x0000000000000428-0x000000000000042F conflicts with OpRegion 0x0000000000000400-0x000000000000047F (\PMIO) (20200528/utaddress-204)
[ 0.617458] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 0.617460] ACPI Warning: SystemIO range 0x0000000000000540-0x000000000000054F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\GPR2) (20200528/utaddress-204)
[ 0.617463] ACPI Warning: SystemIO range 0x0000000000000540-0x000000000000054F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20200528/utaddress-204)
[ 0.617465] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 0.617465] ACPI Warning: SystemIO range 0x0000000000000530-0x000000000000053F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\GPR2) (20200528/utaddress-204)
[ 0.617467] ACPI Warning: SystemIO range 0x0000000000000530-0x000000000000053F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20200528/utaddress-204)
[ 0.617469] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 0.617469] ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052F conflicts with OpRegion 0x0000000000000500-0x000000000000057F (\GPR2) (20200528/utaddress-204)
[ 0.617471] ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20200528/utaddress-204)
[ 0.617473] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 0.617473] lpc_ich: Resource conflict(s) found affecting gpio_ich
[ 0.617550] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[ 0.617551] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 0.617551] usb usb3: Product: xHCI Host Controller
[ 0.617552] usb usb3: Manufacturer: Linux 5.8.0-050800rc6-generic xhci-hcd
[ 0.617553] usb usb3: SerialNumber: 0000:00:14.0
[ 0.619611] ahci 0000:00:1f.2: version 3.0
[ 0.619698] r8169 0000:05:00.0: can't disable ASPM; OS doesn't have ASPM control
[ 0.619813] hub 3-0:1.0: USB hub found
[ 0.620778] hub 3-0:1.0: 4 ports detected
[ 0.630937] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x3f impl SATA mode
[ 0.630939] ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part ems apst
[ 0.636087] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[ 0.636977] xhci_hcd 0000:00:14.0: xHCI Host Controller
[ 0.636980] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 4
[ 0.636982] xhci_hcd 0000:00:14.0: Host supports USB 3.0 SuperSpeed
[ 0.637007] i2c i2c-0: 2/4 memory slots populated (from DMI)
[ 0.637019] usb usb4: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.08
[ 0.637020] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 0.637021] usb usb4: Product: xHCI Host Controller
[ 0.637021] usb usb4: Manufacturer: Linux 5.8.0-050800rc6-generic xhci-hcd
[ 0.637022] usb usb4: SerialNumber: 0000:00:14.0
[ 0.637102] hub 4-0:1.0: USB hub found
[ 0.637109] hub 4-0:1.0: 4 ports detected
[ 0.637356] i2c i2c-0: Successfully instantiated SPD at 0x50
[ 0.637656] i2c i2c-0: Successfully instantiated SPD at 0x51
[ 0.650843] libphy: r8169: probed
[ 0.659022] r8169 0000:05:00.0 eth0: RTL8168evl/8111evl, bc:5f:f4:99:82:b4, XID 2c9, IRQ 31
[ 0.659023] r8169 0000:05:00.0 eth0: jumbo features [frames: 9194 bytes, tx checksumming: ko]
[ 0.695313] scsi host0: ahci
[ 0.695501] scsi host1: ahci
[ 0.695605] scsi host2: ahci
[ 0.695702] scsi host3: ahci
[ 0.695832] scsi host4: ahci
[ 0.695947] scsi host5: ahci
[ 0.695978] ata1: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16100 irq 30
[ 0.695979] ata2: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16180 irq 30
[ 0.695981] ata3: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16200 irq 30
[ 0.695982] ata4: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16280 irq 30
[ 0.695983] ata5: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16300 irq 30
[ 0.695984] ata6: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16380 irq 30
[ 0.696142] ahci 0000:06:00.0: SSS flag set, parallel bus scan disabled
[ 0.696180] ahci 0000:06:00.0: AHCI 0001.0200 32 slots 2 ports 6 Gbps 0x3 impl SATA mode
[ 0.696181] ahci 0000:06:00.0: flags: 64bit ncq sntf stag led clo pmp pio slum part ccc sxs
[ 0.696361] scsi host6: ahci
[ 0.696415] scsi host7: ahci
[ 0.696446] ata7: SATA max UDMA/133 abar m512@0xf7d00000 port 0xf7d00100 irq 32
[ 0.696448] ata8: SATA max UDMA/133 abar m512@0xf7d00000 port 0xf7d00180 irq 32
[ 0.754782] usb 1-1: new high-speed USB device number 2 using ehci-pci
[ 0.774790] usb 2-1: new high-speed USB device number 2 using ehci-pci
[ 0.911507] usb 1-1: New USB device found, idVendor=8087, idProduct=0024, bcdDevice= 0.00
[ 0.911508] usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 0.911849] hub 1-1:1.0: USB hub found
[ 0.912053] hub 1-1:1.0: 6 ports detected
[ 0.931162] usb 2-1: New USB device found, idVendor=8087, idProduct=0024, bcdDevice= 0.00
[ 0.931165] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 0.931557] hub 2-1:1.0: USB hub found
[ 0.931651] hub 2-1:1.0: 8 ports detected
[ 1.010804] ata7: SATA link down (SStatus 0 SControl 300)
[ 1.010808] ata6: SATA link down (SStatus 0 SControl 300)
[ 1.010836] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1.010857] ata5: SATA link down (SStatus 0 SControl 300)
[ 1.010883] ata2: SATA link down (SStatus 0 SControl 300)
[ 1.010895] ata1: SATA link down (SStatus 0 SControl 300)
[ 1.010908] ata4: SATA link down (SStatus 0 SControl 300)
[ 1.012014] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[ 1.012018] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[ 1.012020] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[ 1.047436] ata3.00: ATA-7: ST3360320AS, 3.AAM, max UDMA/133
[ 1.047437] ata3.00: 703282608 sectors, multi 16: LBA48 NCQ (depth 32)
[ 1.073177] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[ 1.073180] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[ 1.073183] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[ 1.105743] ata3.00: configured for UDMA/133
[ 1.105861] scsi 2:0:0:0: Direct-Access ATA ST3360320AS M PQ: 0 ANSI: 5
[ 1.106002] sd 2:0:0:0: Attached scsi generic sg0 type 0
[ 1.106029] sd 2:0:0:0: [sda] 703282608 512-byte logical blocks: (360 GB/335 GiB)
[ 1.106036] sd 2:0:0:0: [sda] Write Protect is off
[ 1.106037] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 1.106050] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.173751] sda: sda1 sda2
[ 1.174077] sd 2:0:0:0: [sda] Attached SCSI disk
[ 1.178771] usb 1-1.5: new low-speed USB device number 3 using ehci-pci
[ 1.302266] usb 1-1.5: New USB device found, idVendor=045e, idProduct=0040, bcdDevice= 1.21
[ 1.302269] usb 1-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1.302279] usb 1-1.5: Product: Microsoft Wheel Mouse Optical®
[ 1.302280] usb 1-1.5: Manufacturer: Microsoft
[ 1.306529] hid: raw HID events driver (C) Jiri Kosina
[ 1.313170] usbcore: registered new interface driver usbhid
[ 1.313170] usbhid: USB HID core driver
[ 1.315148] input: Microsoft Microsoft Wheel Mouse Optical® as /devices/pci0000:00/0000:00:1a.0/usb1/1-1/1-1.5/1-1.5:1.0/0003:045E:0040.0001/input/input3
[ 1.315224] hid-generic 0003:045E:0040.0001: input,hidraw0: USB HID v1.00 Mouse [Microsoft Microsoft Wheel Mouse Optical®] on usb-0000:00:1a.0-1.5/input0
[ 1.366782] tsc: Refined TSC clocksource calibration: 3392.293 MHz
[ 1.366789] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x30e5de2a436, max_idle_ns: 440795285127 ns
[ 1.366901] clocksource: Switched to clocksource tsc
[ 1.382775] usb 1-1.6: new high-speed USB device number 4 using ehci-pci
[ 1.405193] ata8: SATA link down (SStatus 0 SControl 300)
[ 1.493243] usb 1-1.6: New USB device found, idVendor=05e3, idProduct=0605, bcdDevice= 6.0b
[ 1.493244] usb 1-1.6: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[ 1.493245] usb 1-1.6: Product: USB2.0 Hub
[ 1.493691] hub 1-1.6:1.0: USB hub found
[ 1.494115] hub 1-1.6:1.0: 4 ports detected
[ 2.119687] fbcon: Taking over console
[ 2.119758] Console: switching to colour frame buffer device 160x64
[ 2.192425] EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
[ 4.153346] systemd[1]: Inserted module 'autofs4'
[ 4.317155] systemd[1]: systemd 245.4-4ubuntu3.2 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
[ 4.334876] systemd[1]: Detected architecture x86-64.
[ 4.360873] systemd[1]: Set hostname to <utente-desktop>.
[ 7.546847] systemd[1]: /lib/systemd/system/dbus.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/dbus/system_bus_socket → /run/dbus/system_bus_socket; please update the unit file accordingly.
[ 8.593193] systemd[1]: Created slice Virtual Machine and Container Slice.
[ 8.593492] systemd[1]: Created slice system-modprobe.slice.
[ 8.593642] systemd[1]: Created slice User and Session Slice.
[ 8.593690] systemd[1]: Started Forward Password Requests to Wall Directory Watch.
[ 8.593823] systemd[1]: Set up automount Arbitrary Executable File Formats File System Automount Point.
[ 8.593856] systemd[1]: Reached target User and Group Name Lookups.
[ 8.593866] systemd[1]: Reached target Remote File Systems.
[ 8.593872] systemd[1]: Reached target Slices.
[ 8.593888] systemd[1]: Reached target Libvirt guests shutdown.
[ 8.593938] systemd[1]: Listening on Device-mapper event daemon FIFOs.
[ 8.594003] systemd[1]: Listening on LVM2 poll daemon socket.
[ 8.605075] systemd[1]: Listening on Syslog Socket.
[ 8.605141] systemd[1]: Listening on fsck to fsckd communication Socket.
[ 8.605180] systemd[1]: Listening on initctl Compatibility Named Pipe.
[ 8.605314] systemd[1]: Listening on Journal Audit Socket.
[ 8.605367] systemd[1]: Listening on Journal Socket (/dev/log).
[ 8.605436] systemd[1]: Listening on Journal Socket.
[ 8.605529] systemd[1]: Listening on Network Service Netlink Socket.
[ 8.605591] systemd[1]: Listening on udev Control Socket.
[ 8.605632] systemd[1]: Listening on udev Kernel Socket.
[ 8.606314] systemd[1]: Mounting Huge Pages File System...
[ 8.607032] systemd[1]: Mounting POSIX Message Queue File System...
[ 8.607828] systemd[1]: Mounting Kernel Debug File System...
[ 8.608560] systemd[1]: Mounting Kernel Trace File System...
[ 8.609756] systemd[1]: Starting Journal Service...
[ 8.610486] systemd[1]: Starting Availability of block devices...
[ 8.611470] systemd[1]: Starting Set the console keyboard layout...
[ 8.612340] systemd[1]: Starting Create list of static device nodes for the current kernel...
[ 8.613086] systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
[ 8.613818] systemd[1]: Starting Load Kernel Module drm...
[ 8.755368] systemd[1]: Condition check resulted in Set Up Additional Binary Formats being skipped.
[ 8.755416] systemd[1]: Condition check resulted in File System Check on Root Device being skipped.
[ 8.834411] systemd[1]: Starting Load Kernel Modules...
[ 8.835159] systemd[1]: Starting Remount Root and Kernel File Systems...
[ 8.835857] systemd[1]: Starting udev Coldplug all Devices...
[ 8.836525] systemd[1]: Starting Uncomplicated firewall...
[ 8.837906] systemd[1]: Mounted Huge Pages File System.
[ 8.838007] systemd[1]: Mounted POSIX Message Queue File System.
[ 8.838088] systemd[1]: Mounted Kernel Debug File System.
[ 8.838167] systemd[1]: Mounted Kernel Trace File System.
[ 8.838502] systemd[1]: Finished Availability of block devices.
[ 8.846510] systemd[1]: Finished Create list of static device nodes for the current kernel.
[ 9.003539] systemd[1]: Started Journal Service.
[ 9.039225] EXT4-fs (sda1): re-mounted. Opts: errors=remount-ro
[ 9.207828] systemd-journald[295]: Received client request to flush runtime journal.
[ 9.534583] lp: driver loaded but no devices found
[ 9.675407] ppdev: user-space parallel port driver
[ 13.179050] audit: type=1400 audit(1595765482.234:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=388 comm="apparmor_parser"
[ 13.179061] audit: type=1400 audit(1595765482.234:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=388 comm="apparmor_parser"
[ 13.179063] audit: type=1400 audit(1595765482.234:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=388 comm="apparmor_parser"
[ 13.228910] audit: type=1400 audit(1595765482.282:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=390 comm="apparmor_parser"
[ 13.321052] audit: type=1400 audit(1595765482.374:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="unity8-dash" pid=392 comm="apparmor_parser"
[ 13.327188] audit: type=1400 audit(1595765482.382:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="content-hub-peer-picker" pid=391 comm="apparmor_parser"
[ 13.391780] audit: type=1400 audit(1595765482.446:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/tcpdump" pid=387 comm="apparmor_parser"
[ 13.470023] audit: type=1400 audit(1595765482.522:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cups-browsed" pid=393 comm="apparmor_parser"
[ 13.493912] audit: type=1400 audit(1595765482.546:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/cups/backend/cups-pdf" pid=389 comm="apparmor_parser"
[ 13.493923] audit: type=1400 audit(1595765482.546:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/sbin/cupsd" pid=389 comm="apparmor_parser"
[ 16.818546] at24 0-0050: supply vcc not found, using dummy regulator
[ 16.819139] at24 0-0050: 256 byte spd EEPROM, read-only
[ 16.819164] at24 0-0051: supply vcc not found, using dummy regulator
[ 16.819730] at24 0-0051: 256 byte spd EEPROM, read-only
[ 20.037329] RAPL PMU: API unit is 2^-32 Joules, 2 fixed counters, 163840 ms ovfl timer
[ 20.037330] RAPL PMU: hw unit of domain pp0-core 2^-16 Joules
[ 20.037331] RAPL PMU: hw unit of domain package 2^-16 Joules
[ 21.044402] [drm] radeon kernel modesetting enabled.
[ 21.044450] radeon 0000:01:00.0: SI support disabled by module param
[ 21.048448] cryptd: max_cpu_qlen set to 1000
[ 21.477046] AVX version of gcm_enc/dec engaged.
[ 21.477048] AES CTR mode by8 optimization enabled
[ 21.618260] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[ 21.618262] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[ 22.281348] [drm] amdgpu kernel modesetting enabled.
[ 22.281415] CRAT table not found
[ 22.281418] Virtual CRAT table created for CPU
[ 22.281432] amdgpu: Topology: Add CPU node
[ 22.281502] checking generic (e0000000 500000) vs hw (e0000000 10000000)
[ 22.281503] fb0: switching to amdgpudrmfb from VESA VGA
[ 22.281577] Console: switching to colour dummy device 80x25
[ 22.281606] amdgpu 0000:01:00.0: vgaarb: deactivate vga console
[ 22.281726] [drm] initializing kernel modesetting (TAHITI 0x1002:0x679A 0x174B:0xE207 0x00).
[ 22.281728] amdgpu 0000:01:00.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
[ 22.281734] [drm] register mmio base: 0xF7E00000
[ 22.281734] [drm] register mmio size: 262144
[ 22.281735] [drm] PCIE atomic ops is not supported
[ 22.281739] [drm] add ip block number 0 <si_common>
[ 22.281739] [drm] add ip block number 1 <gmc_v6_0>
[ 22.281740] [drm] add ip block number 2 <si_ih>
[ 22.281740] [drm] add ip block number 3 <gfx_v6_0>
[ 22.281741] [drm] add ip block number 4 <si_dma>
[ 22.281741] [drm] add ip block number 5 <si_dpm>
[ 22.281742] [drm] add ip block number 6 <dce_v6_0>
[ 22.281743] kfd kfd: TAHITI not supported in kfd
[ 22.288950] [drm] BIOS signature incorrect 0 0
[ 22.288955] resource sanity check: requesting [mem 0x000c0000-0x000dffff], which spans more than PCI Bus 0000:00 [mem 0x000d0000-0x000d3fff window]
[ 22.288958] caller pci_map_rom+0x71/0x18c mapping multiple BARs
[ 22.288975] amdgpu: ATOM BIOS: 113-1E207200SA-T47
[ 22.289285] [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[ 22.380020] snd_hda_intel 0000:01:00.1: Force to non-snoop mode
[ 22.490933] input: HDA ATI HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input4
[ 22.490969] input: HDA ATI HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input5
[ 22.490998] input: HDA ATI HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input6
[ 22.491027] input: HDA ATI HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input7
[ 22.491058] input: HDA ATI HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input8
[ 22.491087] input: HDA ATI HDMI HDMI/DP,pcm=11 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input9
[ 22.813179] amdgpu 0000:01:00.0: amdgpu: VRAM: 3072M 0x000000F400000000 - 0x000000F4BFFFFFFF (3072M used)
[ 22.813181] amdgpu 0000:01:00.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
[ 22.813190] [drm] Detected VRAM RAM=3072M, BAR=256M
[ 22.813190] [drm] RAM width 384bits GDDR5
[ 22.813279] [TTM] Zone kernel: Available graphics memory: 4051868 KiB
[ 22.813280] [TTM] Zone dma32: Available graphics memory: 2097152 KiB
[ 22.813280] [TTM] Initializing pool allocator
[ 22.813283] [TTM] Initializing DMA pool allocator
[ 22.813315] [drm] amdgpu: 3072M of VRAM memory ready
[ 22.813317] [drm] amdgpu: 3072M of GTT memory ready.
[ 22.813320] [drm] GART: num cpu pages 262144, num gpu pages 262144
[ 22.813765] amdgpu 0000:01:00.0: amdgpu: PCIE GART of 1024M enabled (table at 0x000000F400500000).
[ 22.813811] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 22.828189] intel_rapl_common: Found RAPL domain package
[ 22.828190] intel_rapl_common: Found RAPL domain core
[ 23.047397] snd_hda_codec_realtek hdaudioC0D0: autoconfig for ALC892: line_outs=3 (0x14/0x15/0x16/0x0/0x0) type:line
[ 23.047399] snd_hda_codec_realtek hdaudioC0D0: speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
[ 23.047400] snd_hda_codec_realtek hdaudioC0D0: hp_outs=1 (0x1b/0x0/0x0/0x0/0x0)
[ 23.047400] snd_hda_codec_realtek hdaudioC0D0: mono: mono_out=0x0
[ 23.047401] snd_hda_codec_realtek hdaudioC0D0: dig-out=0x1e/0x0
[ 23.047401] snd_hda_codec_realtek hdaudioC0D0: inputs:
[ 23.047403] snd_hda_codec_realtek hdaudioC0D0: Front Mic=0x19
[ 23.047404] snd_hda_codec_realtek hdaudioC0D0: Rear Mic=0x18
[ 23.047404] snd_hda_codec_realtek hdaudioC0D0: Line=0x1a
[ 23.060290] input: HDA Intel PCH Rear Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input10
[ 23.060326] input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1b.0/sound/card0/input11
[ 23.060356] input: HDA Intel PCH Line Out Front as /devices/pci0000:00/0000:00:1b.0/sound/card0/input12
[ 23.060386] input: HDA Intel PCH Line Out Surround as /devices/pci0000:00/0000:00:1b.0/sound/card0/input13
[ 23.060424] input: HDA Intel PCH Line Out CLFE as /devices/pci0000:00/0000:00:1b.0/sound/card0/input14
[ 23.132188] [drm] Internal thermal controller with fan control
[ 23.132195] [drm] amdgpu: dpm initialized
[ 23.132231] [drm] AMDGPU Display Connectors
[ 23.132231] [drm] Connector 0:
[ 23.132232] [drm] DP-1
[ 23.132232] [drm] HPD5
[ 23.132233] [drm] DDC: 0x194c 0x194c 0x194d 0x194d 0x194e 0x194e 0x194f 0x194f
[ 23.132233] [drm] Encoders:
[ 23.132234] [drm] DFP1: INTERNAL_UNIPHY2
[ 23.132234] [drm] Connector 1:
[ 23.132234] [drm] DP-2
[ 23.132235] [drm] HPD4
[ 23.132235] [drm] DDC: 0x1950 0x1950 0x1951 0x1951 0x1952 0x1952 0x1953 0x1953
[ 23.132236] [drm] Encoders:
[ 23.132236] [drm] DFP2: INTERNAL_UNIPHY2
[ 23.132236] [drm] Connector 2:
[ 23.132236] [drm] HDMI-A-1
[ 23.132237] [drm] HPD1
[ 23.132237] [drm] DDC: 0x1954 0x1954 0x1955 0x1955 0x1956 0x1956 0x1957 0x1957
[ 23.132238] [drm] Encoders:
[ 23.132238] [drm] DFP3: INTERNAL_UNIPHY1
[ 23.132238] [drm] Connector 3:
[ 23.132238] [drm] DVI-I-1
[ 23.132239] [drm] HPD3
[ 23.132239] [drm] DDC: 0x1960 0x1960 0x1961 0x1961 0x1962 0x1962 0x1963 0x1963
[ 23.132240] [drm] Encoders:
[ 23.132240] [drm] DFP4: INTERNAL_UNIPHY
[ 23.132240] [drm] CRT1: INTERNAL_KLDSCP_DAC1
[ 23.132527] [drm] PCIE gen 3 link speeds already enabled
[ 23.274921] amdgpu 0000:01:00.0: amdgpu: SE 2, SH per SE 2, CU per SH 8, active_cu_number 28
[ 23.364927] [drm] fb mappable at 0xE0703000
[ 23.364928] [drm] vram apper at 0xE0000000
[ 23.364929] [drm] size 5242880
[ 23.364929] [drm] fb depth is 24
[ 23.364929] [drm] pitch is 5120
[ 23.365091] fbcon: amdgpudrmfb (fb0) is primary device
[ 23.463699] Console: switching to colour frame buffer device 160x64
[ 23.465607] amdgpu 0000:01:00.0: fb0: amdgpudrmfb frame buffer device
[ 23.736585] [drm] Initialized amdgpu 3.38.0 20150101 for 0000:01:00.0 on minor 0
...
[ 7723.674495] arb_gpu_shader5[114877]: segfault at 7fbb937fe9d0 ip 00007fbbbaad8aab sp 00007fff47d256a0 error 4 in libpthread-2.31.so[7fbbbaad5000+11000]
[ 7723.674502] Code: Bad RIP value.
[ 7758.485659] arb_enhanced_la[124954]: segfault at 290001 ip 00007f73e6c3ad5a sp 00007ffdbe5d4aa8 error 4 in libc-2.31.so[7f73e6bab000+178000]
[ 7758.485664] Code: Bad RIP value.
[ 7759.173405] arb_enhanced_la[125230]: segfault at 290001 ip 00007f5ad9fa7d5a sp 00007fffd9aaa1e8 error 4 in libc-2.31.so[7f5ad9f18000+178000]
[ 7759.173411] Code: Bad RIP value.
[ 7805.053360] amdgpu 0000:01:00.0: amdgpu: GPU fault detected: 146 0x0006880c
[ 7805.053364] amdgpu 0000:01:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 7805.053365] amdgpu 0000:01:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0608800C
[ 7805.053367] amdgpu 0000:01:00.0: amdgpu: VM fault (0x0c, vmid 3) at page 0, read from '' (0x00000000) (136)
[ 7813.142358] [TTM] Failed to find memory space for buffer 0x00000000812205b0 eviction
[ 7813.142371] [TTM] No space for 00000000812205b0 (524288 pages, 2097152K, 2048M)
[ 7813.142373] [TTM] placement[0]=0x00060002 (1)
[ 7813.142374] [TTM] has_type: 1
[ 7813.142374] [TTM] use_type: 1
[ 7813.142375] [TTM] flags: 0x0000000A
[ 7813.142376] [TTM] gpu_offset: 0xFF00000000
[ 7813.142376] [TTM] size: 786432
[ 7813.142377] [TTM] available_caching: 0x00070000
[ 7813.142377] [TTM] default_caching: 0x00010000
[ 7813.142379] [TTM] 0x0000000000000400-0x0000000000000402: 2: used
[ 7813.142380] [TTM] 0x0000000000000402-0x0000000000000412: 16: used
[ 7813.142381] [TTM] 0x0000000000000412-0x0000000000000414: 2: used
[ 7813.142382] [TTM] 0x0000000000000414-0x0000000000000416: 2: used
[ 7813.142383] [TTM] 0x0000000000000416-0x0000000000000418: 2: used
[ 7813.142384] [TTM] 0x0000000000000418-0x000000000000041a: 2: used
[ 7813.142384] [TTM] 0x000000000000041a-0x000000000000041c: 2: used
[ 7813.142385] [TTM] 0x000000000000041c-0x000000000000051c: 256: used
[ 7813.142386] [TTM] 0x000000000000051c-0x000000000000061c: 256: used
[ 7813.142387] [TTM] 0x000000000000061c-0x000000000000061e: 2: used
[ 7813.142388] [TTM] 0x000000000000061e-0x0000000000000620: 2: used
[ 7813.142388] [TTM] 0x0000000000000620-0x0000000000000622: 2: used
[ 7813.142389] [TTM] 0x0000000000000622-0x0000000000000624: 2: used
[ 7813.142390] [TTM] 0x0000000000000624-0x0000000000000626: 2: used
[ 7813.142391] [TTM] 0x0000000000000626-0x0000000000000628: 2: used
[ 7813.142391] [TTM] 0x0000000000000628-0x000000000000062a: 2: used
[ 7813.142392] [TTM] 0x000000000000062a-0x000000000000062c: 2: used
[ 7813.142393] [TTM] 0x000000000000062c-0x000000000000062e: 2: used
[ 7813.142393] [TTM] 0x000000000000062e-0x0000000000000630: 2: used
[ 7813.142394] [TTM] 0x0000000000000630-0x0000000000000632: 2: used
[ 7813.142395] [TTM] 0x0000000000000632-0x0000000000000634: 2: used
[ 7813.142395] [TTM] 0x0000000000000634-0x0000000000000636: 2: used
[ 7813.142396] [TTM] 0x0000000000000636-0x0000000000000638: 2: used
[ 7813.142397] [TTM] 0x0000000000000638-0x000000000000063a: 2: used
[ 7813.142398] [TTM] 0x000000000000063a-0x000000000000063c: 2: used
[ 7813.142399] [TTM] 0x000000000000063c-0x000000000000063e: 2: used
[ 7813.142400] [TTM] 0x000000000000063e-0x000000000000063f: 1: used
[ 7813.142400] [TTM] 0x000000000000063f-0x0000000000000641: 2: used
[ 7813.142401] [TTM] 0x0000000000000641-0x0000000000000643: 2: used
[ 7813.142402] [TTM] 0x0000000000000643-0x0000000000000645: 2: used
[ 7813.142402] [TTM] 0x0000000000000645-0x0000000000000647: 2: used
[ 7813.142403] [TTM] 0x0000000000000647-0x0000000000000649: 2: used
[ 7813.142404] [TTM] 0x0000000000000649-0x000000000000064b: 2: used
[ 7813.142405] [TTM] 0x000000000000064b-0x000000000000064d: 2: used
[ 7813.142406] [TTM] 0x000000000000064d-0x000000000000064f: 2: used
[ 7813.142406] [TTM] 0x000000000000064f-0x0000000000000651: 2: used
[ 7813.142407] [TTM] 0x0000000000000651-0x0000000000000653: 2: used
[ 7813.142408] [TTM] 0x0000000000000653-0x0000000000000655: 2: used
[ 7813.142409] [TTM] 0x0000000000000655-0x0000000000000657: 2: used
[ 7813.142409] [TTM] 0x0000000000000657-0x0000000000000659: 2: used
[ 7813.142410] [TTM] 0x0000000000000659-0x000000000000065b: 2: used
[ 7813.142411] [TTM] 0x000000000000065b-0x0000000000000692: 55: free
[ 7813.142411] [TTM] 0x0000000000000692-0x0000000000000694: 2: used
[ 7813.142412] [TTM] 0x0000000000000694-0x000000000000070f: 123: free
[ 7813.142413] [TTM] 0x000000000000070f-0x0000000000000711: 2: used
[ 7813.142413] [TTM] 0x0000000000000711-0x000000000000079c: 139: free
[ 7813.142414] [TTM] 0x000000000000079c-0x000000000000079e: 2: used
[ 7813.142415] [TTM] 0x000000000000079e-0x00000000000007ee: 80: free
[ 7813.142415] [TTM] 0x00000000000007ee-0x00000000000007f0: 2: used
[ 7813.142461] [TTM] 0x00000000000007f0-0x00000000000007f2: 2: used
[ 7813.142462] [TTM] 0x00000000000007f2-0x00000000000007fe: 12: free
[ 7813.142463] [TTM] 0x00000000000007fe-0x0000000000000800: 2: used
[ 7813.142463] [TTM] 0x0000000000000800-0x0000000000000806: 6: free
[ 7813.142464] [TTM] 0x0000000000000806-0x0000000000000808: 2: used
[ 7813.142464] [TTM] 0x0000000000000808-0x000000000000080e: 6: free
[ 7813.142465] [TTM] 0x000000000000080e-0x000000000000082e: 32: used
[ 7813.142465] [TTM] 0x000000000000082e-0x000000000000083a: 12: free
[ 7813.142466] [TTM] 0x000000000000083a-0x000000000000083c: 2: used
[ 7813.142467] [TTM] 0x000000000000083c-0x000000000000083e: 2: used
[ 7813.142467] [TTM] 0x000000000000083e-0x0000000000000840: 2: used
[ 7813.142469] [TTM] 0x0000000000000840-0x0000000000000842: 2: used
[ 7813.142469] [TTM] 0x0000000000000842-0x0000000000000844: 2: used
[ 7813.142470] [TTM] 0x0000000000000844-0x0000000000000846: 2: used
[ 7813.142471] [TTM] 0x0000000000000846-0x0000000000000848: 2: used
[ 7813.142472] [TTM] 0x0000000000000848-0x000000000000084a: 2: used
[ 7813.142473] [TTM] 0x000000000000084a-0x000000000000084c: 2: used
[ 7813.142473] [TTM] 0x000000000000084c-0x000000000000084e: 2: used
[ 7813.142474] [TTM] 0x000000000000084e-0x0000000000000850: 2: used
[ 7813.142475] [TTM] 0x0000000000000850-0x0000000000000852: 2: used
[ 7813.142475] [TTM] 0x0000000000000852-0x0000000000000854: 2: used
[ 7813.142476] [TTM] 0x0000000000000854-0x000000000000088a: 54: free
[ 7813.142476] [TTM] 0x000000000000088a-0x000000000000088c: 2: used
[ 7813.142477] [TTM] 0x000000000000088c-0x0000000000040000: 259956: free
[ 7813.142478] [TTM] total: 261120, used 677 free 260443
[ 7813.142479] [TTM] man size:786432 pages, gtt available:260443 pages, usage:2054MB
[ 7813.270091] [TTM] Failed to find memory space for buffer 0x00000000812205b0 eviction
[ 7813.270104] [TTM] No space for 00000000812205b0 (524288 pages, 2097152K, 2048M)
[ 7813.270105] [TTM] placement[0]=0x00060002 (1)
[ 7813.270105] [TTM] has_type: 1
[ 7813.270106] [TTM] use_type: 1
[ 7813.270106] [TTM] flags: 0x0000000A
[ 7813.270107] [TTM] gpu_offset: 0xFF00000000
[ 7813.270108] [TTM] size: 786432
[ 7813.270108] [TTM] available_caching: 0x00070000
[ 7813.270109] [TTM] default_caching: 0x00010000
[ 7813.270110] [TTM] 0x0000000000000400-0x0000000000000402: 2: used
[ 7813.270111] [TTM] 0x0000000000000402-0x0000000000000412: 16: used
[ 7813.270112] [TTM] 0x0000000000000412-0x0000000000000414: 2: used
[ 7813.270113] [TTM] 0x0000000000000414-0x0000000000000416: 2: used
[ 7813.270113] [TTM] 0x0000000000000416-0x0000000000000418: 2: used
[ 7813.270114] [TTM] 0x0000000000000418-0x000000000000041a: 2: used
[ 7813.270115] [TTM] 0x000000000000041a-0x000000000000041c: 2: used
[ 7813.270116] [TTM] 0x000000000000041c-0x000000000000051c: 256: used
[ 7813.270116] [TTM] 0x000000000000051c-0x000000000000061c: 256: used
[ 7813.270117] [TTM] 0x000000000000061c-0x000000000000061e: 2: used
[ 7813.270118] [TTM] 0x000000000000061e-0x0000000000040000: 260578: free
[ 7813.270119] [TTM] total: 261120, used 542 free 260578
[ 7813.270120] [TTM] man size:786432 pages, gtt available:261602 pages, usage:2050MB
[ 7813.339330] [TTM] Failed to find memory space for buffer 0x00000000812205b0 eviction
[ 7813.339339] [TTM] No space for 00000000812205b0 (524288 pages, 2097152K, 2048M)
[ 7813.339340] [TTM] placement[0]=0x00060002 (1)
[ 7813.339341] [TTM] has_type: 1
[ 7813.339341] [TTM] use_type: 1
[ 7813.339342] [TTM] flags: 0x0000000A
[ 7813.339343] [TTM] gpu_offset: 0xFF00000000
[ 7813.339343] [TTM] size: 786432
[ 7813.339344] [TTM] available_caching: 0x00070000
[ 7813.339344] [TTM] default_caching: 0x00010000
[ 7813.339347] [TTM] 0x0000000000000400-0x0000000000000402: 2: used
[ 7813.339348] [TTM] 0x0000000000000402-0x0000000000000412: 16: used
[ 7813.339348] [TTM] 0x0000000000000412-0x0000000000000414: 2: used
[ 7813.339349] [TTM] 0x0000000000000414-0x0000000000000416: 2: used
[ 7813.339350] [TTM] 0x0000000000000416-0x0000000000000418: 2: used
[ 7813.339350] [TTM] 0x0000000000000418-0x000000000000041a: 2: used
[ 7813.339351] [TTM] 0x000000000000041a-0x000000000000041c: 2: used
[ 7813.339352] [TTM] 0x000000000000041c-0x000000000000051c: 256: used
[ 7813.339353] [TTM] 0x000000000000051c-0x000000000000061c: 256: used
[ 7813.339353] [TTM] 0x000000000000061c-0x000000000000061e: 2: used
[ 7813.339354] [TTM] 0x000000000000061e-0x0000000000000620: 2: used
[ 7813.339355] [TTM] 0x0000000000000620-0x0000000000000622: 2: used
[ 7813.339356] [TTM] 0x0000000000000622-0x0000000000000624: 2: used
[ 7813.339357] [TTM] 0x0000000000000624-0x0000000000000626: 2: used
[ 7813.339357] [TTM] 0x0000000000000626-0x0000000000000628: 2: used
[ 7813.339358] [TTM] 0x0000000000000628-0x00000000000006fe: 214: free
[ 7813.339359] [TTM] 0x00000000000006fe-0x000000000000071e: 32: used
[ 7813.339360] [TTM] 0x000000000000071e-0x000000000000071f: 1: used
[ 7813.339360] [TTM] 0x000000000000071f-0x0000000000040000: 260321: free
[ 7813.339361] [TTM] total: 261120, used 585 free 260535
[ 7813.339363] [TTM] man size:786432 pages, gtt available:260791 pages, usage:2053MB
[ 7813.437505] [TTM] Failed to find memory space for buffer 0x00000000812205b0 eviction
[ 7813.437516] [TTM] No space for 00000000812205b0 (524288 pages, 2097152K, 2048M)
[ 7813.437517] [TTM] placement[0]=0x00060002 (1)
[ 7813.437518] [TTM] has_type: 1
[ 7813.437519] [TTM] use_type: 1
[ 7813.437519] [TTM] flags: 0x0000000A
[ 7813.437520] [TTM] gpu_offset: 0xFF00000000
[ 7813.437521] [TTM] size: 786432
[ 7813.437521] [TTM] available_caching: 0x00070000
[ 7813.437522] [TTM] default_caching: 0x00010000
[ 7813.437523] [TTM] 0x0000000000000400-0x0000000000000402: 2: used
[ 7813.437524] [TTM] 0x0000000000000402-0x0000000000000412: 16: used
[ 7813.437525] [TTM] 0x0000000000000412-0x0000000000000414: 2: used
[ 7813.437526] [TTM] 0x0000000000000414-0x0000000000000416: 2: used
[ 7813.437527] [TTM] 0x0000000000000416-0x0000000000000418: 2: used
[ 7813.437527] [TTM] 0x0000000000000418-0x000000000000041a: 2: used
[ 7813.437528] [TTM] 0x000000000000041a-0x000000000000041c: 2: used
[ 7813.437529] [TTM] 0x000000000000041c-0x000000000000051c: 256: used
[ 7813.437529] [TTM] 0x000000000000051c-0x000000000000061c: 256: used
[ 7813.437530] [TTM] 0x000000000000061c-0x000000000000061e: 2: used
[ 7813.437531] [TTM] 0x000000000000061e-0x0000000000040000: 260578: free
[ 7813.437531] [TTM] total: 261120, used 542 free 260578
[ 7813.437533] [TTM] man size:786432 pages, gtt available:261602 pages, usage:2050MB
[ 7813.438518] arb_uniform_buf[143135]: segfault at 0 ip 00007f20b6f990d7 sp 00007ffdebfcc8c8 error 6 in libc-2.31.so[7f20b6eff000+178000]
[ 7813.438532] Code: Bad RIP value.
[ 7919.344885] arb_shader_stor[146734]: segfault at 0 ip 00007fe2ab5020d7 sp 00007fff6027eda8 error 6 in libc-2.31.so[7fe2ab468000+178000]
[ 7919.344894] Code: Bad RIP value.
[ 7919.897315] arb_shader_stor[146769]: segfault at 0 ip 00007f10d8fbd0d7 sp 00007ffcf8895608 error 6 in libc-2.31.so[7f10d8f23000+178000]
[ 7919.897332] Code: Bad RIP value.
[ 8009.208256] egl-copy-buffer[147619]: segfault at 18 ip 00007f968e8c9e9b sp 00007ffe7ca12200 error 4 in libEGL_mesa.so.0.0.0[7f968e8a9000+26000]
[ 8009.208263] Code: Bad RIP value.
[ 8032.266864] perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
[ 8070.875068] [TTM] Buffer eviction failed
[ 8080.462745] amdgpu 0000:01:00.0: amdgpu: GPU fault detected: 146 0x00ce8804
[ 8080.462756] amdgpu 0000:01:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00134006
[ 8080.462758] amdgpu 0000:01:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E088004
[ 8080.462759] amdgpu 0000:01:00.0: amdgpu: VM fault (0x04, vmid 7) at page 1261574, read from '' (0x00000000) (136)
[ 8080.478266] amdgpu 0000:01:00.0: amdgpu: GPU fault detected: 146 0x00c28804
[ 8080.478271] amdgpu 0000:01:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00134006
[ 8080.478272] amdgpu 0000:01:00.0: amdgpu: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02088004
[ 8080.478274] amdgpu 0000:01:00.0: amdgpu: VM fault (0x04, vmid 1) at page 1261574, read from '' (0x00000000) (136)
[ 8204.864339] shader_runner[168816]: segfault at 7f64df7fe9d0 ip 00007f6506a47aab sp 00007fff3961d340 error 4
[ 8204.864343] shader_runner[168803]: segfault at 7faa7d7fa9d0 ip 00007faa941ccaab sp 00007ffca97f4490 error 4
[ 8204.864345] in libpthread-2.31.so[7f6506a44000+11000]
[ 8204.864348] Code: Bad RIP value.
[ 8204.864349] in libpthread-2.31.so[7faa941c9000+11000]
[ 8204.864351] Code: Bad RIP value.
[ 8204.864376] shader_runner[168801]: segfault at 7f12bf7fe9d0 ip 00007f12d3155aab sp 00007fff81846f80 error 4 in libpthread-2.31.so[7f12d3152000+11000]
[ 8204.864381] Code: Bad RIP value.
[ 8204.864501] shader_runner[168802]: segfault at 7f7225ffb9d0 ip 00007f723cfa4aab sp 00007ffda6a8a890 error 4 in libpthread-2.31.so[7f723cfa1000+11000]
[ 8204.864507] Code: Bad RIP value.
[ 8207.293001] shader_runner[168847]: segfault at 7f4781ffb9d0 ip 00007f4799379aab sp 00007ffd72820630 error 4 in libpthread-2.31.so[7f4799376000+11000]
[ 8207.293009] Code: Bad RIP value.
[ 8207.303214] shader_runner[168849]: segfault at 7f01a27fc9d0 ip 00007f01c1c58aab sp 00007ffef3fc31d0 error 4 in libpthread-2.31.so[7f01c1c55000+11000]
[ 8207.303220] Code: Bad RIP value.
[ 8207.333651] shader_runner[168872]: segfault at 7f84fffff9d0 ip 00007f852f5f4aab sp 00007ffc03821a30 error 4 in libpthread-2.31.so[7f852f5f1000+11000]
[ 8207.333656] Code: Bad RIP value.
[ 8207.339399] shader_runner[168875]: segfault at 7f5dedffb9d0 ip 00007f5e04e37aab sp 00007ffd41558ad0 error 4 in libpthread-2.31.so[7f5e04e34000+11000]
[ 8207.339405] Code: Bad RIP value.
[ 8207.515900] shader_runner[168890]: segfault at 7f3e677fe9d0 ip 00007f3e76a5baab sp 00007ffe1bdbfa30 error 4 in libpthread-2.31.so[7f3e76a58000+11000]
[ 8207.515907] Code: Bad RIP value.
[ 8207.551837] shader_runner[168915]: segfault at 7f14667fc9d0 ip 00007f147dbdbaab sp 00007ffef737bb30 error 4 in libpthread-2.31.so[7f147dbd8000+11000]
[ 8207.551842] Code: Bad RIP value.
[ 8209.900683] show_signal_msg: 38 callbacks suppressed
[ 8209.900686] shader_runner[169450]: segfault at 7fe88d1119d0 ip 00007fe897dc7aab sp 00007fff9a7994e0 error 4 in libpthread-2.31.so[7fe897dc4000+11000]
[ 8209.900695] Code: Bad RIP value.
[ 8209.958317] shader_runner[169463]: segfault at 7f05d8ff99d0 ip 00007f05e82a9aab sp 00007ffd29495db0 error 4 in libpthread-2.31.so[7f05e82a6000+11000]
[ 8209.958323] Code: Bad RIP value.
[ 8210.016780] shader_runner[169477]: segfault at 7fd1657fa9d0 ip 00007fd174c58aab sp 00007ffd46a738b0 error 4 in libpthread-2.31.so[7fd174c55000+11000]
[ 8210.016787] Code: Bad RIP value.
[ 8210.095393] shader_runner[169492]: segfault at 7f8d79d7c9d0 ip 00007f8d84a32aab sp 00007ffe83c7c320 error 4 in libpthread-2.31.so[7f8d84a2f000+11000]
[ 8210.095398] Code: Bad RIP value.
[ 8210.175068] shader_runner[169506]: segfault at 7f27877fe9d0 ip 00007f27a68b4aab sp 00007ffd39ff79a0 error 4 in libpthread-2.31.so[7f27a68b1000+11000]
[ 8210.175075] Code: Bad RIP value.
[ 8210.202147] shader_runner[169519]: segfault at 7f315a7fc9d0 ip 00007f316970daab sp 00007ffee6c3a210 error 4 in libpthread-2.31.so[7f316970a000+11000]
[ 8210.202156] Code: Bad RIP value.
[ 8210.288298] shader_runner[169534]: segfault at 7f7a3cff99d0 ip 00007f7a4c23baab sp 00007ffc087caeb0 error 4 in libpthread-2.31.so[7f7a4c238000+11000]
[ 8210.288303] Code: Bad RIP value.
[ 8210.329530] shader_runner[169547]: segfault at 7f63f57fa9d0 ip 00007f6404af5aab sp 00007ffdf3e7f790 error 4 in libpthread-2.31.so[7f6404af2000+11000]
[ 8210.329536] Code: Bad RIP value.
[ 8210.412320] shader_runner[169562]: segfault at 7f622471f9d0 ip 00007f622f3d5aab sp 00007fff6f38f6b0 error 4 in libpthread-2.31.so[7f622f3d2000+11000]
[ 8210.412325] Code: Bad RIP value.
[ 8210.455261] shader_runner[169575]: segfault at 7f0d177fe9d0 ip 00007f0d2e351aab sp 00007fff77b01400 error 4 in libpthread-2.31.so[7f0d2e34e000+11000]
[ 8210.455269] Code: Bad RIP value.
[ 8218.886289] show_signal_msg: 27 callbacks suppressed
[ 8218.886292] shader_runner[172286]: segfault at 56393e81e408 ip 00007f4feb9a3ed9 sp 00007ffe74015800 error 4 in radeonsi_dri.so[7f4feb6ad000+d49000]
[ 8218.886297] Code: Bad RIP value.
[ 8218.899687] shader_runner[172285]: segfault at 563750011378 ip 00007ff7236e4ed9 sp 00007ffe2e978e10 error 4 in radeonsi_dri.so[7ff7233ee000+d49000]
[ 8218.899692] Code: Bad RIP value.
[ 8219.001985] shader_runner[172334]: segfault at 5623ce8c4848 ip 00007fa239f2bed9 sp 00007ffcaf7c4170 error 4 in radeonsi_dri.so[7fa239c35000+d49000]
[ 8219.001991] Code: Bad RIP value.
[ 8219.490115] shader_runner[172514]: segfault at 55f2d3009314 ip 00007fad22647500 sp 00007ffe441c0120 error 4 in radeonsi_dri.so[7fad2234f000+d49000]
[ 8219.490123] Code: Bad RIP value.
[ 8219.491095] shader_runner[172516]: segfault at 563bd86d20a4 ip 00007fb9e40f9500 sp 00007ffcd77518b0 error 4 in radeonsi_dri.so[7fb9e3e01000+d49000]
[ 8219.491101] Code: Bad RIP value.
[ 8219.711083] shader_runner[172588]: segfault at 55ca9ae686a4 ip 00007fe140555500 sp 00007ffe9cae1400 error 4 in radeonsi_dri.so[7fe14025d000+d49000]
[ 8219.711090] Code: Bad RIP value.
[ 8430.203633] perf: interrupt took too long (3138 > 3133), lowering kernel.perf_event_max_sample_rate to 63500
[ 9055.012725] audit: type=1400 audit(1595774523.846:84): apparmor="ALLOWED" operation="open" profile="libreoffice-soffice" name="/usr/share/libdrm/amdgpu.ids" pid=383072 comm="soffice.bin" requested_mask="r" denied_mask="r" fsuid=1000 ouid=0
[-- Attachment #3: piglit_tests_amddcsi.ods --]
[-- Type: application/vnd.oasis.opendocument.spreadsheet, Size: 34347 bytes --]
[-- Attachment #4: Type: text/plain, Size: 154 bytes --]
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-07-26 15:31 ` Re: Mauro Rossi
@ 2020-07-27 18:31 ` Alex Deucher
2020-07-27 19:46 ` Re: Mauro Rossi
0 siblings, 1 reply; 1651+ messages in thread
From: Alex Deucher @ 2020-07-27 18:31 UTC (permalink / raw)
To: Mauro Rossi
Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
amd-gfx list
On Sun, Jul 26, 2020 at 11:31 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>
> Hello,
>
> On Fri, Jul 24, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>
>> On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >
>> > Hello,
>> > re-sending and copying full DL
>> >
>> > On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>> >>
>> >> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >> >
>> >> > Hi Christian,
>> >> >
>> >> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
>> >> > <ckoenig.leichtzumerken@gmail.com> wrote:
>> >> > >
>> >> > > Hi Mauro,
>> >> > >
>> >> > > I'm not deep into the whole DC design, so just some general high level
>> >> > > comments on the cover letter:
>> >> > >
>> >> > > 1. Please add a subject line to the cover letter, my spam filter thinks
>> >> > > that this is suspicious otherwise.
>> >> >
>> >> > My mistake in editing the cover letter with git send-email;
>> >> > I may have forgotten to keep the Subject at the top
>> >> >
>> >> > >
>> >> > > 2. Then you should probably note how well (badly?) is that tested. Since
>> >> > > you noted proof of concept it might not even work.
>> >> >
>> >> > The Changelog is to be read as:
>> >> >
>> >> > [RFC] was the initial Proof of Concept, and [PATCH v2] was
>> >> > just a rebase onto amd-staging-drm-next
>> >> >
>> >> > this series [PATCH v3] has all the known changes required for DCE6 specificity
>> >> > and is based on a long offline thread with Alexander Deucher and past
>> >> > dri-devel chats with Harry Wentland.
>> >> >
>> >> > It was tested, within my testing means, with HD7750 and HD7950,
>> >> > with checks in dmesg output for not getting "missing registers/masks"
>> >> > kernel WARNING
>> >> > and with kernel build on Ubuntu 20.04 and with android-x86
>> >> >
>> >> > The proposal I made to Alex is that AMD testing systems will be used
>> >> > for further regression testing,
>> >> > as part of review and validation for eligibility to amd-staging-drm-next
>> >> >
>> >>
>> >> We will certainly test it once it lands, but presumably this is
>> >> working on the SI cards you have access to?
>> >
>> >
>> > Yes, most of my testing was done with android-x86 Android CTS (EGL, GLES2, GLES3, VK)
>> >
>> > I am also in contact with a person with Firepro W5130M who is running a piglit session
>> >
>> > I had bought an HD7850 to test with Pitcairn, but it arrived as defective so I could not test with Pitcairn
>> >
>> >
>> >>
>> >> > >
>> >> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
>> >> >
>> >> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
>> >> > DCE6 (dc/dce60 path) in the last two years from initial submission
>> >> >
>> >> > >
>> >> > > Apart from that it looks like a rather impressive piece of work :)
>> >> > >
>> >> > > Cheers,
>> >> > > Christian.
>> >> >
>> >> > Thanks,
>> >> > please consider that most of the latest DCE6 specific parts were
>> >> > possible due to recent Alex support in getting the correct DCE6
>> >> > headers,
>> >> > his suggestions and continuous feedback.
>> >> >
>> >> > I would suggest that Alex comments on the proposed next steps to follow.
>> >>
>> >> The code looks pretty good to me. I'd like to get some feedback from
>> >> the display team to see if they have any concerns, but beyond that I
>> >> think we can pull it into the tree and continue improving it there.
>> >> Do you have a link to a git tree I can pull directly that contains
>> >> these patches? Is this the right branch?
>> >> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
>> >>
>> >> Thanks!
>> >>
>> >> Alex
>> >
>> >
>> > The following branch was pushed with the series on top of amd-staging-drm-next
>> >
>> > https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
>>
>> I gave this a quick test on all of the SI asics and the various
>> monitors I had available and it looks good. A few minor patches I
>> noticed are attached. If they look good to you, I'll squash them into
>> the series when I commit it. I've pushed it to my fdo tree as well:
>> https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
>>
>> Thanks!
>>
>> Alex
>
>
> The new patches are ok, and with the following information about piglit tests
> the series may be good to go.
>
> I have performed piglit tests on Tahiti HD7950 on kernel 5.8.0-rc6 with AMD DC support for SI
> and comparison with vanilla kernel 5.8.0-rc6
>
> Results are the following
>
> [piglit gpu tests with kernel 5.8.0-rc6-amddcsi]
>
> utente@utente-desktop:~/piglit$ ./piglit run gpu .
> [26714/26714] skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11
> Thank you for running Piglit!
> Results have been written to /home/utente/piglit
>
> [piglit gpu tests with vanilla 5.8.0-rc6]
>
> utente@utente-desktop:~/piglit$ ./piglit run gpu .
> [26714/26714] skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14
> Thank you for running Piglit!
> Results have been written to /home/utente/piglit
>
> In the attachment the comparison of "5.8.0-rc6-amddcsi" vs "5.8.0-rc6" vanilla
> and vice versa; I see no significant regressions, and in the delta of failed tests I don't recognize any DC-related test cases,
> but you may also have a look.
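The two piglit runs quoted above can be diffed mechanically. A minimal sketch, assuming each run has been exported as "name status" lines (the file names and stand-in results here are made up; a real comparison would use piglit's own summary tooling on the JSON result directories):

```shell
#!/bin/sh
# Find tests that regressed (pass on vanilla, not on the DC-SI kernel).
set -eu

# Stand-in data for the two runs; real data would come from the result dirs.
cat > vanilla.txt <<'EOF'
spec@a pass
spec@b pass
spec@c fail
EOF
cat > amddcsi.txt <<'EOF'
spec@a pass
spec@b fail
spec@c fail
EOF

# Tests passing on vanilla but not on the amddcsi kernel.
awk '$2 == "pass" { print $1 }' vanilla.txt | sort > vanilla_pass.txt
awk '$2 == "pass" { print $1 }' amddcsi.txt | sort > amddcsi_pass.txt
comm -23 vanilla_pass.txt amddcsi_pass.txt > regressions.txt
cat regressions.txt   # spec@b
```

The same `comm -23` with the file arguments swapped lists tests that newly pass, which is how the "vice versa" comparison in the attachment can be reproduced.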
Looks good to me. The series is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
> dmesg for "5.8.0-rc6-amddcsi" is also provided to check the crashes
>
> Regarding the other user testing the series with Firepro W5130M
> he found an already existing issue in amdgpu si_support=1 which is independent of my series and matches a problem already reported. [1]
>
amdgpu does not currently implement GPU reset support for SI.
Alex
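For context on the `si_support=1` setup mentioned above: SI (GCN 1.0) cards default to the radeon driver, and opting them into amdgpu is done via module parameters. The parameters are real; whether they go on the kernel command line or into modprobe.d, and the exact file name, depend on the distro (the path below is only typical):

```shell
#!/bin/sh
# Write a modprobe fragment that hands SI parts to amdgpu instead of radeon.
# A real system would place this in /etc/modprobe.d/; written locally here.
cat > ./60-si-amdgpu.conf <<'EOF'
options amdgpu si_support=1
options radeon si_support=0
EOF
# Equivalent kernel command line:
#   radeon.si_support=0 amdgpu.si_support=1
cat ./60-si-amdgpu.conf
```

With DC enabled for SI, this is the configuration under which the display code in this series is exercised.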
> Mauro
>
> [1] https://bbs.archlinux.org/viewtopic.php?id=249097
>
>>
>>
>> >
>> >>
>> >>
>> >> >
>> >> > Mauro
>> >> >
>> >> > >
>> >> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
>> >> > > > The series adds SI support to AMD DC
>> >> > > >
>> >> > > > Changelog:
>> >> > > >
>> >> > > > [RFC]
>> >> > > > Preliminary Proof Of Concept, with DCE8 headers still used in dce60_resources.c
>> >> > > >
>> >> > > > [PATCH v2]
>> >> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
>> >> > > >
>> >> > > > [PATCH v3]
>> >> > > > Add support for DCE6 specific headers,
>> >> > > > ad hoc DCE6 macros, functions and fixes,
>> >> > > > rebase on current amd-staging-drm-next
>> >> > > >
>> >> > > >
>> >> > > > Commits [01/27]..[08/27] SI support added in various DC components
>> >> > > >
>> >> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
>> >> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
>> >> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
>> >> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
>> >> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
>> >> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
>> >> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
>> >> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
>> >> > > >
>> >> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
>> >> > > >
>> >> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
>> >> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
>> >> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
>> >> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
>> >> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
>> >> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
>> >> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
>> >> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
>> >> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
>> >> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
>> >> > > >
>> >> > > >
>> >> > > > Commits [25/27]..[27/27] SI support final enablements
>> >> > > >
>> >> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonaire and later
>> >> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
>> >> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
>> >> > > >
>> >> > > >
>> >> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
>> >> > > >
>> >> > > > _______________________________________________
>> >> > > > amd-gfx mailing list
>> >> > > > amd-gfx@lists.freedesktop.org
>> >> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>> >> > >
>> >> > _______________________________________________
>> >> > amd-gfx mailing list
>> >> > amd-gfx@lists.freedesktop.org
>> >> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-07-27 18:31 ` Re: Alex Deucher
@ 2020-07-27 19:46 ` Mauro Rossi
2020-07-27 19:54 ` Re: Alex Deucher
0 siblings, 1 reply; 1651+ messages in thread
From: Mauro Rossi @ 2020-07-27 19:46 UTC (permalink / raw)
To: Alex Deucher
Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
amd-gfx list
[-- Attachment #1.1: Type: text/plain, Size: 10849 bytes --]
On Mon, Jul 27, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:
> On Sun, Jul 26, 2020 at 11:31 AM Mauro Rossi <issor.oruam@gmail.com>
> wrote:
> >
> > Hello,
> >
> > On Fri, Jul 24, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com>
> wrote:
> >>
> >> On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com>
> wrote:
> >> >
> >> > Hello,
> >> > re-sending and copying full DL
> >> >
> >> > On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com>
> wrote:
> >> >>
> >> >> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com>
> wrote:
> >> >> >
> >> >> > Hi Christian,
> >> >> >
> >> >> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
> >> >> > <ckoenig.leichtzumerken@gmail.com> wrote:
> >> >> > >
> >> >> > > Hi Mauro,
> >> >> > >
> >> >> > > I'm not deep into the whole DC design, so just some general high
> level
> >> >> > > comments on the cover letter:
> >> >> > >
> >> >> > > 1. Please add a subject line to the cover letter, my spam filter
> thinks
> >> >> > > that this is suspicious otherwise.
> >> >> >
> >> >> > My mistake in editing the cover letter with git send-email;
> >> >> > I may have forgotten to keep the Subject at the top
> >> >> >
> >> >> > >
> >> >> > > 2. Then you should probably note how well (badly?) is that
> tested. Since
> >> >> > > you noted proof of concept it might not even work.
> >> >> >
> >> >> > The Changelog is to be read as:
> >> >> >
> >> >> > [RFC] was the initial Proof of Concept, and [PATCH v2] was
> >> >> > just a rebase onto amd-staging-drm-next
> >> >> >
> >> >> > this series [PATCH v3] has all the known changes required for DCE6
> specificity
> >> >> > and is based on a long offline thread with Alexander Deucher and past
> >> >> > dri-devel chats with Harry Wentland.
> >> >> >
> >> >> > It was tested, within my testing means, with HD7750 and
> HD7950,
> >> >> > with checks in dmesg output for not getting "missing
> registers/masks"
> >> >> > kernel WARNING
> >> >> > and with kernel build on Ubuntu 20.04 and with android-x86
> >> >> >
> >> >> > The proposal I made to Alex is that AMD testing systems will be
> used
> >> >> > for further regression testing,
> >> >> > as part of review and validation for eligibility to
> amd-staging-drm-next
> >> >> >
> >> >>
> >> >> We will certainly test it once it lands, but presumably this is
> >> >> working on the SI cards you have access to?
> >> >
> >> >
> >> > Yes, most of my testing was done with android-x86 Android CTS (EGL,
> GLES2, GLES3, VK)
> >> >
> >> > I am also in contact with a person with Firepro W5130M who is running
> a piglit session
> >> >
> >> > I had bought an HD7850 to test with Pitcairn, but it arrived as
> defective so I could not test with Pitcairn
> >> >
> >> >
> >> >>
> >> >> > >
> >> >> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
> >> >> >
> >> >> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
> >> >> > DCE6 (dc/dce60 path) in the last two years from initial submission
> >> >> >
> >> >> > >
> >> >> > > Apart from that it looks like a rather impressive piece of work
> :)
> >> >> > >
> >> >> > > Cheers,
> >> >> > > Christian.
> >> >> >
> >> >> > Thanks,
> >> >> > please consider that most of the latest DCE6 specific parts were
> >> >> > possible due to recent Alex support in getting the correct DCE6
> >> >> > headers,
> >> >> > his suggestions and continuous feedback.
> >> >> >
> >> >> > I would suggest that Alex comments on the proposed next steps to
> follow.
> >> >>
> >> >> The code looks pretty good to me. I'd like to get some feedback from
> >> >> the display team to see if they have any concerns, but beyond that I
> >> >> think we can pull it into the tree and continue improving it there.
> >> >> Do you have a link to a git tree I can pull directly that contains
> >> >> these patches? Is this the right branch?
> >> >> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
> >> >>
> >> >> Thanks!
> >> >>
> >> >> Alex
> >> >
> >> >
> >> > The following branch was pushed with the series on top of
> amd-staging-drm-next
> >> >
> >> > https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
> >>
> >> I gave this a quick test on all of the SI asics and the various
> >> monitors I had available and it looks good. A few minor patches I
> >> noticed are attached. If they look good to you, I'll squash them into
> >> the series when I commit it. I've pushed it to my fdo tree as well:
> >> https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
> >>
> >> Thanks!
> >>
> >> Alex
> >
> >
> > The new patches are ok and with the following information about piglit
> tests,
> > the series may be good to go.
> >
> > I have performed piglit tests on Tahiti HD7950 on kernel 5.8.0-rc6 with
> AMD DC support for SI
> > and comparison with vanilla kernel 5.8.0-rc6
> >
> > Results are the following
> >
> > [piglit gpu tests with kernel 5.8.0-rc6-amddcsi]
> >
> > utente@utente-desktop:~/piglit$ ./piglit run gpu .
> > [26714/26714] skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11
> > Thank you for running Piglit!
> > Results have been written to /home/utente/piglit
> >
> > [piglit gpu tests with vanilla 5.8.0-rc6]
> >
> > utente@utente-desktop:~/piglit$ ./piglit run gpu .
> > [26714/26714] skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14
> > Thank you for running Piglit!
> > Results have been written to /home/utente/piglit
> >
> > In the attachment the comparison of "5.8.0-rc6-amddcsi" vs "5.8.0-rc6"
> vanilla
> > and viceversa, I see no significant regression and in the delta of
> failed tests I don't recognize DC related test cases,
> > but you may also have a look.
>
> Looks good to me. The series is:
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
Thank you Alex for the review and the help in finalizing the series,
and to Harry, who initially encouraged me and provided feedback on the
previous v2 series
>
> >
> > dmesg for "5.8.0-rc6-amddcsi" is also provided to check the crashes
> >
> > Regarding the other user testing the series with Firepro W5130M
> > he found an already existing issue in amdgpu si_support=1 which is
> independent of my series and matches a problem already reported. [1]
> >
>
> amdgpu does not currently implement GPU reset support for SI.
>
> Alex
>
If you have plans to add support and prevent those crashes,
the user would be glad to help with glxgears and piglit testing
on the Firepro W5130M
Please let me know
Mauro
>
> > Mauro
> >
> > [1] https://bbs.archlinux.org/viewtopic.php?id=249097
> >
> >>
> >>
> >> >
> >> >>
> >> >>
> >> >> >
> >> >> > Mauro
> >> >> >
> >> >> > >
> >> >> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
> >> >> > > > The series adds SI support to AMD DC
> >> >> > > >
> >> >> > > > Changelog:
> >> >> > > >
> >> >> > > > [RFC]
> >> >> > > > Preliminary Proof Of Concept, with DCE8 headers still used in
> dce60_resources.c
> >> >> > > >
> >> >> > > > [PATCH v2]
> >> >> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
> >> >> > > >
> >> >> > > > [PATCH v3]
> >> >> > > > Add support for DCE6 specific headers,
> >> >> > > > ad hoc DCE6 macros, functions and fixes,
> >> >> > > > rebase on current amd-staging-drm-next
> >> >> > > >
> >> >> > > >
> >> >> > > > Commits [01/27]..[08/27] SI support added in various DC
> components
> >> >> > > >
> >> >> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers
> (v6)
> >> >> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
> >> >> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6
> support (v9b)
> >> >> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support
> (v2)
> >> >> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
> >> >> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for
> DCE6 (v2)
> >> >> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6
> (v4)
> >> >> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support
> (v4)
> >> >> > > >
> >> >> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
> >> >> > > >
> >> >> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for
> SI parts (v2)
> >> >> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set
> max_cursor_size to 64
> >> >> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific
> macros,functions
> >> >> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific
> macros
> >> >> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific
> macros,functions
> >> >> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific
> macros,functions
> >> >> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6
> specific macros,functions
> >> >> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6
> specific macros,functions
> >> >> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific
> macros,functions
> >> >> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6
> specific macros,functions
> >> >> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers
> (v7)
> >> >> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling
> Horizontal Filter Init
> >> >> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> macros,functions
> >> >> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6
> specific .cursor_lock
> >> >> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add
> DCE6 specific functions
> >> >> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers
> (v6)
> >> >> > > >
> >> >> > > >
> >> >> > > > Commits [25/27]..[27/27] SI support final enablements
> >> >> > > >
> >> >> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation
> property for Bonaire and later
> >> >> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts
> (v2)
> >> >> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the
> Kconfig (v2)
> >> >> > > >
> >> >> > > >
> >> >> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
> >> >> > > >
> >> >> > > > _______________________________________________
> >> >> > > > amd-gfx mailing list
> >> >> > > > amd-gfx@lists.freedesktop.org
> >> >> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >> >> > >
> >> >> > _______________________________________________
> >> >> > amd-gfx mailing list
> >> >> > amd-gfx@lists.freedesktop.org
> >> >> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
[-- Attachment #1.2: Type: text/html, Size: 16193 bytes --]
[-- Attachment #2: Type: text/plain, Size: 154 bytes --]
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-07-27 19:46 ` Re: Mauro Rossi
@ 2020-07-27 19:54 ` Alex Deucher
0 siblings, 0 replies; 1651+ messages in thread
From: Alex Deucher @ 2020-07-27 19:54 UTC (permalink / raw)
To: Mauro Rossi
Cc: Deucher, Alexander, Harry Wentland, Christian Koenig,
amd-gfx list
On Mon, Jul 27, 2020 at 3:46 PM Mauro Rossi <issor.oruam@gmail.com> wrote:
>
>
>
> On Mon, Jul 27, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>>
>> On Sun, Jul 26, 2020 at 11:31 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > On Fri, Jul 24, 2020 at 8:31 PM Alex Deucher <alexdeucher@gmail.com> wrote:
>> >>
>> >> On Wed, Jul 22, 2020 at 3:57 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >> >
>> >> > Hello,
>> >> > re-sending and copying full DL
>> >> >
>> >> > On Wed, Jul 22, 2020 at 4:51 AM Alex Deucher <alexdeucher@gmail.com> wrote:
>> >> >>
>> >> >> On Mon, Jul 20, 2020 at 6:00 AM Mauro Rossi <issor.oruam@gmail.com> wrote:
>> >> >> >
>> >> >> > Hi Christian,
>> >> >> >
>> >> >> > On Mon, Jul 20, 2020 at 11:00 AM Christian König
>> >> >> > <ckoenig.leichtzumerken@gmail.com> wrote:
>> >> >> > >
>> >> >> > > Hi Mauro,
>> >> >> > >
>> >> >> > > I'm not deep into the whole DC design, so just some general high level
>> >> >> > > comments on the cover letter:
>> >> >> > >
>> >> >> > > 1. Please add a subject line to the cover letter, my spam filter thinks
>> >> >> > > that this is suspicious otherwise.
>> >> >> >
>> >> >> > My mistake in editing the cover letter with git send-email;
>> >> >> > I may have forgotten to keep the Subject at the top
>> >> >> >
>> >> >> > >
>> >> >> > > 2. Then you should probably note how well (badly?) is that tested. Since
>> >> >> > > you noted proof of concept it might not even work.
>> >> >> >
>> >> >> > The Changelog is to be read as:
>> >> >> >
>> >> >> > [RFC] was the initial Proof of Concept, and [PATCH v2] was
>> >> >> > just a rebase onto amd-staging-drm-next
>> >> >> >
>> >> >> > this series [PATCH v3] has all the known changes required for DCE6 specificity
>> >> >> > and is based on a long offline thread with Alexander Deucher and past
>> >> >> > dri-devel chats with Harry Wentland.
>> >> >> >
>> >> >> > It was tested, within my testing means, with HD7750 and HD7950,
>> >> >> > with checks in dmesg output for not getting "missing registers/masks"
>> >> >> > kernel WARNING
>> >> >> > and with kernel build on Ubuntu 20.04 and with android-x86
>> >> >> >
>> >> >> > The proposal I made to Alex is that AMD testing systems will be used
>> >> >> > for further regression testing,
>> >> >> > as part of review and validation for eligibility to amd-staging-drm-next
>> >> >> >
>> >> >>
>> >> >> We will certainly test it once it lands, but presumably this is
>> >> >> working on the SI cards you have access to?
>> >> >
>> >> >
>> >> > Yes, most of my testing was done with android-x86 Android CTS (EGL, GLES2, GLES3, VK)
>> >> >
>> >> > I am also in contact with a person with Firepro W5130M who is running a piglit session
>> >> >
>> >> > I had bought an HD7850 to test with Pitcairn, but it arrived as defective so I could not test with Pitcairn
>> >> >
>> >> >
>> >> >>
>> >> >> > >
>> >> >> > > 3. How feature complete (HDMI audio?, Freesync?) is it?
>> >> >> >
>> >> >> > All the changes in DC impacting DCE8 (dc/dce80 path) were ported to
>> >> >> > DCE6 (dc/dce60 path) in the last two years from initial submission
>> >> >> >
>> >> >> > >
>> >> >> > > Apart from that it looks like a rather impressive piece of work :)
>> >> >> > >
>> >> >> > > Cheers,
>> >> >> > > Christian.
>> >> >> >
>> >> >> > Thanks,
>> >> >> > please consider that most of the latest DCE6 specific parts were
>> >> >> > possible due to recent Alex support in getting the correct DCE6
>> >> >> > headers,
>> >> >> > his suggestions and continuous feedback.
>> >> >> >
>> >> >> > I would suggest that Alex comments on the proposed next steps to follow.
>> >> >>
>> >> >> The code looks pretty good to me. I'd like to get some feedback from
>> >> >> the display team to see if they have any concerns, but beyond that I
>> >> >> think we can pull it into the tree and continue improving it there.
>> >> >> Do you have a link to a git tree I can pull directly that contains
>> >> >> these patches? Is this the right branch?
>> >> >> https://github.com/maurossi/linux/commits/kernel-5.8rc4_si_next
>> >> >>
>> >> >> Thanks!
>> >> >>
>> >> >> Alex
>> >> >
>> >> >
>> >> > The following branch was pushed with the series on top of amd-staging-drm-next
>> >> >
>> >> > https://github.com/maurossi/linux/commits/kernel-5.6_si_drm-next
>> >>
>> >> I gave this a quick test on all of the SI asics and the various
>> >> monitors I had available and it looks good. A few minor patches I
>> >> noticed are attached. If they look good to you, I'll squash them into
>> >> the series when I commit it. I've pushed it to my fdo tree as well:
>> >> https://cgit.freedesktop.org/~agd5f/linux/log/?h=si_dc_support
>> >>
>> >> Thanks!
>> >>
>> >> Alex
>> >
>> >
>> > The new patches are ok and with the following information about piglit tests,
>> > the series may be good to go.
>> >
>> > I have performed piglit tests on Tahiti HD7950 on kernel 5.8.0-rc6 with AMD DC support for SI
>> > and comparison with vanilla kernel 5.8.0-rc6
>> >
>> > Results are the following
>> >
>> > [piglit gpu tests with kernel 5.8.0-rc6-amddcsi]
>> >
>> > utente@utente-desktop:~/piglit$ ./piglit run gpu .
>> > [26714/26714] skip: 1731, pass: 24669, warn: 15, fail: 288, crash: 11
>> > Thank you for running Piglit!
>> > Results have been written to /home/utente/piglit
>> >
>> > [piglit gpu tests with vanilla 5.8.0-rc6]
>> >
>> > utente@utente-desktop:~/piglit$ ./piglit run gpu .
>> > [26714/26714] skip: 1731, pass: 24673, warn: 13, fail: 283, crash: 14
>> > Thank you for running Piglit!
>> > Results have been written to /home/utente/piglit
>> >
>> > In the attachment the comparison of "5.8.0-rc6-amddcsi" vs "5.8.0-rc6" vanilla
>> > and viceversa, I see no significant regression and in the delta of failed tests I don't recognize DC related test cases,
>> > but you may also have a look.
>>
>> Looks good to me. The series is:
>> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
>
> Thank you Alex for the review and the help in finalizing the series
> and to Harry, who initially encouraged me and provided feedback on the previous v2 series
>
Thanks for sticking with this!
>
>>
>>
>> >
>> > dmesg for "5.8.0-rc6-amddcsi" is also provided to check the crashes
>> >
>> > Regarding the other user testing the series with Firepro W5130M
>> > he found an already existing issue in amdgpu si_support=1 which is independent of my series and matches a problem already reported. [1]
>> >
>>
>> amdgpu does not currently implement GPU reset support for SI.
>>
>> Alex
>
>
> If you have plans to add support and prevent those crashes,
> the user would be glad to help with glxgears and piglit testing on the Firepro W5130M
Initial patch here:
https://patchwork.freedesktop.org/patch/380648/
Alex
>
> Please let me know
>
> Mauro
>
>>
>>
>> > Mauro
>> >
>> > [1] https://bbs.archlinux.org/viewtopic.php?id=249097
>> >
>> >>
>> >>
>> >> >
>> >> >>
>> >> >>
>> >> >> >
>> >> >> > Mauro
>> >> >> >
>> >> >> > >
>> >> >> > > Am 16.07.20 um 23:22 schrieb Mauro Rossi:
>> >> >> > > > The series adds SI support to AMD DC
>> >> >> > > >
>> >> >> > > > Changelog:
>> >> >> > > >
>> >> >> > > > [RFC]
>> >> >> > > > Preliminary Proof Of Concept, with DCE8 headers still used in dce60_resources.c
>> >> >> > > >
>> >> >> > > > [PATCH v2]
>> >> >> > > > Rebase on amd-staging-drm-next dated 17-Oct-2018
>> >> >> > > >
>> >> >> > > > [PATCH v3]
>> >> >> > > > Add support for DCE6 specific headers,
>> >> >> > > > ad hoc DCE6 macros, functions and fixes,
>> >> >> > > > rebase on current amd-staging-drm-next
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > Commits [01/27]..[08/27] SI support added in various DC components
>> >> >> > > >
>> >> >> > > > [PATCH v3 01/27] drm/amdgpu: add some required DCE6 registers (v6)
>> >> >> > > > [PATCH v3 02/27] drm/amd/display: add asics info for SI parts
>> >> >> > > > [PATCH v3 03/27] drm/amd/display: dc/dce: add initial DCE6 support (v9b)
>> >> >> > > > [PATCH v3 04/27] drm/amd/display: dc/core: add SI/DCE6 support (v2)
>> >> >> > > > [PATCH v3 05/27] drm/amd/display: dc/bios: add support for DCE6
>> >> >> > > > [PATCH v3 06/27] drm/amd/display: dc/gpio: add support for DCE6 (v2)
>> >> >> > > > [PATCH v3 07/27] drm/amd/display: dc/irq: add support for DCE6 (v4)
>> >> >> > > > [PATCH v3 08/27] drm/amd/display: amdgpu_dm: add SI support (v4)
>> >> >> > > >
>> >> >> > > > Commits [09/27]..[24/27] DCE6 specific code adaptions
>> >> >> > > >
>> >> >> > > > [PATCH v3 09/27] drm/amd/display: dc/clk_mgr: add support for SI parts (v2)
>> >> >> > > > [PATCH v3 10/27] drm/amd/display: dc/dce60: set max_cursor_size to 64
>> >> >> > > > [PATCH v3 11/27] drm/amd/display: dce_audio: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 12/27] drm/amd/display: dce_dmcu: add DCE6 specific macros
>> >> >> > > > [PATCH v3 13/27] drm/amd/display: dce_hwseq: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 14/27] drm/amd/display: dce_ipp: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 15/27] drm/amd/display: dce_link_encoder: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 16/27] drm/amd/display: dce_mem_input: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 17/27] drm/amd/display: dce_opp: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 18/27] drm/amd/display: dce_transform: add DCE6 specific macros,functions
>> >> >> > > > [PATCH v3 19/27] drm/amdgpu: add some required DCE6 registers (v7)
>> >> >> > > > [PATCH v3 20/27] drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init
>> >> >> > > > [PATCH v3 21/27] drm/amd/display: dce60_hw_sequencer: add DCE6 macros,functions
>> >> >> > > > [PATCH v3 22/27] drm/amd/display: dce60_hw_sequencer: add DCE6 specific .cursor_lock
>> >> >> > > > [PATCH v3 23/27] drm/amd/display: dce60_timing_generator: add DCE6 specific functions
>> >> >> > > > [PATCH v3 24/27] drm/amd/display: dc/dce60: use DCE6 headers (v6)
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > Commits [25/27]..[27/27] SI support final enablements
>> >> >> > > >
>> >> >> > > > [PATCH v3 25/27] drm/amd/display: create plane rotation property for Bonaire and later
>> >> >> > > > [PATCH v3 26/27] drm/amdgpu: enable DC support for SI parts (v2)
>> >> >> > > > [PATCH v3 27/27] drm/amd/display: enable SI support in the Kconfig (v2)
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
>> >> >> > > >
>> >> >> > > > _______________________________________________
>> >> >> > > > amd-gfx mailing list
>> >> >> > > > amd-gfx@lists.freedesktop.org
>> >> >> > > > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>> >> >> > >
>> >> >> > _______________________________________________
>> >> >> > amd-gfx mailing list
>> >> >> > amd-gfx@lists.freedesktop.org
>> >> >> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium)
@ 2020-08-05 11:02 Amit Pundir
2020-08-06 22:31 ` Konrad Dybcio
0 siblings, 1 reply; 1651+ messages in thread
From: Amit Pundir @ 2020-08-05 11:02 UTC (permalink / raw)
To: Andy Gross, Bjorn Andersson, Rob Herring, John Stultz,
Sumit Semwal
Cc: linux-arm-msm, dt, lkml
On Wed, 5 Aug 2020 at 16:21, Amit Pundir <amit.pundir@linaro.org> wrote:
>
> Add initial dts support for Xiaomi Poco F1 (Beryllium).
>
> This initial support is based on the upstream Dragonboard 845c
> (sdm845) device. With this dts, Beryllium boots AOSP up to
> ADB shell over USB-C.
>
> Supported functionality includes UFS, USB-C (peripheral),
> microSD card and Vol+/Vol-/power keys. Bluetooth should work
> too but couldn't be verified from the adb command line; it was
> verified when enabled from the UI with a few WIP display patches.
>
> Just like the initial db845c support, initializing the SMMU
> clears the mapping used for the splash screen framebuffer,
> which causes the device to hang during boot; recovery then
> needs a hard power reset. This can be worked around using:
>
> fastboot oem select-display-panel none
>
> To switch the display back on, run:
>
> fastboot oem select-display-panel
>
> But this only works on Beryllium devices running bootloader
> version BOOT.XF.2.0-00369-SDM845LZB-1, which shipped with the
> Android-9 based release. Newer bootloader versions do not
> support switching off the display panel at all, so we need
> a few additional SMMU patches (under review) from here to
> boot to shell:
> https://github.com/pundiramit/linux/commits/beryllium-mainline
>
> Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
> ---
> v4: Added more downstream reserved memory regions. They probably
> need more work, but for now I see adsp/cdsp/wlan remoteprocs
> powering up properly. Also removed the regulator nodes not
> required for the device, as suggested by Bjorn.
Forgot to mention that I added a couple of clocks to the protected clocks in v4,
which are needed for the display to work.
> v3: Added a reserved-memory region from downstream kernel to fix
> a boot regression with recent dma-pool changes in v5.8-rc6.
> v2: Updated machine compatible string for seemingly inevitable
> future quirks.
>
> arch/arm64/boot/dts/qcom/Makefile | 1 +
> arch/arm64/boot/dts/qcom/sdm845-beryllium.dts | 383 ++++++++++++++++++++++++++
> 2 files changed, 384 insertions(+)
> create mode 100644 arch/arm64/boot/dts/qcom/sdm845-beryllium.dts
>
> diff --git a/arch/arm64/boot/dts/qcom/Makefile b/arch/arm64/boot/dts/qcom/Makefile
> index 0f2c33d611df..3ef1b48bc0cb 100644
> --- a/arch/arm64/boot/dts/qcom/Makefile
> +++ b/arch/arm64/boot/dts/qcom/Makefile
> @@ -21,6 +21,7 @@ dtb-$(CONFIG_ARCH_QCOM) += sdm845-cheza-r1.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sdm845-cheza-r2.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sdm845-cheza-r3.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sdm845-db845c.dtb
> +dtb-$(CONFIG_ARCH_QCOM) += sdm845-beryllium.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sdm845-mtp.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sdm850-lenovo-yoga-c630.dtb
> dtb-$(CONFIG_ARCH_QCOM) += sm8150-mtp.dtb
> diff --git a/arch/arm64/boot/dts/qcom/sdm845-beryllium.dts b/arch/arm64/boot/dts/qcom/sdm845-beryllium.dts
> new file mode 100644
> index 000000000000..0f9f61bf9fa4
> --- /dev/null
> +++ b/arch/arm64/boot/dts/qcom/sdm845-beryllium.dts
> @@ -0,0 +1,383 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/dts-v1/;
> +
> +#include <dt-bindings/gpio/gpio.h>
> +#include <dt-bindings/pinctrl/qcom,pmic-gpio.h>
> +#include <dt-bindings/regulator/qcom,rpmh-regulator.h>
> +#include "sdm845.dtsi"
> +#include "pm8998.dtsi"
> +#include "pmi8998.dtsi"
> +
> +/ {
> + model = "Xiaomi Technologies Inc. Beryllium";
> + compatible = "xiaomi,beryllium", "qcom,sdm845";
> +
> + /* required for bootloader to select correct board */
> + qcom,board-id = <69 0>;
> + qcom,msm-id = <321 0x20001>;
> +
> + aliases {
> + hsuart0 = &uart6;
> + };
> +
> + gpio-keys {
> + compatible = "gpio-keys";
> + autorepeat;
> +
> + pinctrl-names = "default";
> + pinctrl-0 = <&vol_up_pin_a>;
> +
> + vol-up {
> + label = "Volume Up";
> + linux,code = <KEY_VOLUMEUP>;
> + gpios = <&pm8998_gpio 6 GPIO_ACTIVE_LOW>;
> + };
> + };
> +
> + vreg_s4a_1p8: vreg-s4a-1p8 {
> + compatible = "regulator-fixed";
> + regulator-name = "vreg_s4a_1p8";
> +
> + regulator-min-microvolt = <1800000>;
> + regulator-max-microvolt = <1800000>;
> + regulator-always-on;
> + };
> +};
> +
> +&adsp_pas {
> + status = "okay";
> + firmware-name = "qcom/sdm845/adsp.mdt";
> +};
> +
> +&apps_rsc {
> + pm8998-rpmh-regulators {
> + compatible = "qcom,pm8998-rpmh-regulators";
> + qcom,pmic-id = "a";
> +
> + vreg_l1a_0p875: ldo1 {
> + regulator-min-microvolt = <880000>;
> + regulator-max-microvolt = <880000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l5a_0p8: ldo5 {
> + regulator-min-microvolt = <800000>;
> + regulator-max-microvolt = <800000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l7a_1p8: ldo7 {
> + regulator-min-microvolt = <1800000>;
> + regulator-max-microvolt = <1800000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l12a_1p8: ldo12 {
> + regulator-min-microvolt = <1800000>;
> + regulator-max-microvolt = <1800000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l13a_2p95: ldo13 {
> + regulator-min-microvolt = <1800000>;
> + regulator-max-microvolt = <2960000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l17a_1p3: ldo17 {
> + regulator-min-microvolt = <1304000>;
> + regulator-max-microvolt = <1304000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l20a_2p95: ldo20 {
> + regulator-min-microvolt = <2960000>;
> + regulator-max-microvolt = <2968000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l21a_2p95: ldo21 {
> + regulator-min-microvolt = <2960000>;
> + regulator-max-microvolt = <2968000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l24a_3p075: ldo24 {
> + regulator-min-microvolt = <3088000>;
> + regulator-max-microvolt = <3088000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l25a_3p3: ldo25 {
> + regulator-min-microvolt = <3300000>;
> + regulator-max-microvolt = <3312000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> +
> + vreg_l26a_1p2: ldo26 {
> + regulator-min-microvolt = <1200000>;
> + regulator-max-microvolt = <1200000>;
> + regulator-initial-mode = <RPMH_REGULATOR_MODE_HPM>;
> + };
> + };
> +};
> +
> +&cdsp_pas {
> + status = "okay";
> + firmware-name = "qcom/sdm845/cdsp.mdt";
> +};
> +
> +&gcc {
> + protected-clocks = <GCC_QSPI_CORE_CLK>,
> + <GCC_QSPI_CORE_CLK_SRC>,
> + <GCC_QSPI_CNOC_PERIPH_AHB_CLK>,
> + <GCC_LPASS_Q6_AXI_CLK>,
> + <GCC_LPASS_SWAY_CLK>;
> +};
> +
> +&gpu {
> + zap-shader {
> + memory-region = <&gpu_mem>;
> + firmware-name = "qcom/sdm845/a630_zap.mbn";
> + };
> +};
> +
> +/* Reserved memory changes from downstream */
> +/delete-node/ &adsp_mem;
> +/delete-node/ &wlan_msa_mem;
> +/delete-node/ &mpss_region;
> +/delete-node/ &venus_mem;
> +/delete-node/ &cdsp_mem;
> +/delete-node/ &mba_region;
> +/delete-node/ &slpi_mem;
> +/delete-node/ &spss_mem;
> +/delete-node/ &rmtfs_mem;
> +/ {
> + reserved-memory {
> + // This removed_region is needed to boot the device
> + // TODO: Find out the user of this reserved memory
> + removed_region: memory@88f00000 {
> + reg = <0 0x88f00000 0 0x1a00000>;
> + no-map;
> + };
> +
> + adsp_mem: memory@8c500000 {
> + reg = <0 0x8c500000 0 0x1e00000>;
> + no-map;
> + };
> +
> + wlan_msa_mem: memory@8e300000 {
> + reg = <0 0x8e300000 0 0x100000>;
> + no-map;
> + };
> +
> + mpss_region: memory@8e400000 {
> + reg = <0 0x8e400000 0 0x7800000>;
> + no-map;
> + };
> +
> + venus_mem: memory@95c00000 {
> + reg = <0 0x95c00000 0 0x500000>;
> + no-map;
> + };
> +
> + cdsp_mem: memory@96100000 {
> + reg = <0 0x96100000 0 0x800000>;
> + no-map;
> + };
> +
> + mba_region: memory@96900000 {
> + reg = <0 0x96900000 0 0x200000>;
> + no-map;
> + };
> +
> + slpi_mem: memory@96b00000 {
> + reg = <0 0x96b00000 0 0x1400000>;
> + no-map;
> + };
> +
> + spss_mem: memory@97f00000 {
> + reg = <0 0x97f00000 0 0x100000>;
> + no-map;
> + };
> +
> + rmtfs_mem: memory@f6301000 {
> + compatible = "qcom,rmtfs-mem";
> + reg = <0 0xf6301000 0 0x200000>;
> + no-map;
> +
> + qcom,client-id = <1>;
> + qcom,vmid = <15>;
> + };
> + };
> +};
> +
> +&mss_pil {
> + status = "okay";
> + firmware-name = "qcom/sdm845/mba.mbn", "qcom/sdm845/modem.mdt";
> +};
> +
> +&pm8998_gpio {
> + vol_up_pin_a: vol-up-active {
> + pins = "gpio6";
> + function = "normal";
> + input-enable;
> + bias-pull-up;
> + qcom,drive-strength = <PMIC_GPIO_STRENGTH_NO>;
> + };
> +};
> +
> +&pm8998_pon {
> + resin {
> + compatible = "qcom,pm8941-resin";
> + interrupts = <0x0 0x8 1 IRQ_TYPE_EDGE_BOTH>;
> + debounce = <15625>;
> + bias-pull-up;
> + linux,code = <KEY_VOLUMEDOWN>;
> + };
> +};
> +
> +&qupv3_id_0 {
> + status = "okay";
> +};
> +
> +&sdhc_2 {
> + status = "okay";
> +
> + pinctrl-names = "default";
> + pinctrl-0 = <&sdc2_default_state &sdc2_card_det_n>;
> +
> + vmmc-supply = <&vreg_l21a_2p95>;
> + vqmmc-supply = <&vreg_l13a_2p95>;
> +
> + bus-width = <4>;
> + cd-gpios = <&tlmm 126 GPIO_ACTIVE_HIGH>;
> +};
> +
> +&tlmm {
> + gpio-reserved-ranges = <0 4>, <81 4>;
> +
> + sdc2_default_state: sdc2-default {
> + clk {
> + pins = "sdc2_clk";
> + bias-disable;
> +
> + /*
> + * It seems that mmc_test reports errors if drive
> + * strength is not 16 on clk, cmd, and data pins.
> + */
> + drive-strength = <16>;
> + };
> +
> + cmd {
> + pins = "sdc2_cmd";
> + bias-pull-up;
> + drive-strength = <10>;
> + };
> +
> + data {
> + pins = "sdc2_data";
> + bias-pull-up;
> + drive-strength = <10>;
> + };
> + };
> +
> + sdc2_card_det_n: sd-card-det-n {
> + pins = "gpio126";
> + function = "gpio";
> + bias-pull-up;
> + };
> +};
> +
> +&uart6 {
> + status = "okay";
> +
> + bluetooth {
> + compatible = "qcom,wcn3990-bt";
> +
> + vddio-supply = <&vreg_s4a_1p8>;
> + vddxo-supply = <&vreg_l7a_1p8>;
> + vddrf-supply = <&vreg_l17a_1p3>;
> + vddch0-supply = <&vreg_l25a_3p3>;
> + max-speed = <3200000>;
> + };
> +};
> +
> +&usb_1 {
> + status = "okay";
> +};
> +
> +&usb_1_dwc3 {
> + dr_mode = "peripheral";
> +};
> +
> +&usb_1_hsphy {
> + status = "okay";
> +
> + vdd-supply = <&vreg_l1a_0p875>;
> + vdda-pll-supply = <&vreg_l12a_1p8>;
> + vdda-phy-dpdm-supply = <&vreg_l24a_3p075>;
> +
> + qcom,imp-res-offset-value = <8>;
> + qcom,hstx-trim-value = <QUSB2_V2_HSTX_TRIM_21_6_MA>;
> + qcom,preemphasis-level = <QUSB2_V2_PREEMPHASIS_5_PERCENT>;
> + qcom,preemphasis-width = <QUSB2_V2_PREEMPHASIS_WIDTH_HALF_BIT>;
> +};
> +
> +&usb_1_qmpphy {
> + status = "okay";
> +
> + vdda-phy-supply = <&vreg_l26a_1p2>;
> + vdda-pll-supply = <&vreg_l1a_0p875>;
> +};
> +
> +&ufs_mem_hc {
> + status = "okay";
> +
> + reset-gpios = <&tlmm 150 GPIO_ACTIVE_LOW>;
> +
> + vcc-supply = <&vreg_l20a_2p95>;
> + vcc-max-microamp = <800000>;
> +};
> +
> +&ufs_mem_phy {
> + status = "okay";
> +
> + vdda-phy-supply = <&vreg_l1a_0p875>;
> + vdda-pll-supply = <&vreg_l26a_1p2>;
> +};
> +
> +&wifi {
> + status = "okay";
> +
> + vdd-0.8-cx-mx-supply = <&vreg_l5a_0p8>;
> + vdd-1.8-xo-supply = <&vreg_l7a_1p8>;
> + vdd-1.3-rfa-supply = <&vreg_l17a_1p3>;
> + vdd-3.3-ch0-supply = <&vreg_l25a_3p3>;
> +};
> +
> +/* PINCTRL - additions to nodes defined in sdm845.dtsi */
> +
> +&qup_uart6_default {
> + pinmux {
> + pins = "gpio45", "gpio46", "gpio47", "gpio48";
> + function = "qup6";
> + };
> +
> + cts {
> + pins = "gpio45";
> + bias-disable;
> + };
> +
> + rts-tx {
> + pins = "gpio46", "gpio47";
> + drive-strength = <2>;
> + bias-disable;
> + };
> +
> + rx {
> + pins = "gpio48";
> + bias-pull-up;
> + };
> +};
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2020-08-05 11:02 [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium) Amit Pundir
@ 2020-08-06 22:31 ` Konrad Dybcio
2020-08-12 13:37 ` Amit Pundir
0 siblings, 1 reply; 1651+ messages in thread
From: Konrad Dybcio @ 2020-08-06 22:31 UTC (permalink / raw)
To: amit.pundir
Cc: agross, bjorn.andersson, devicetree, john.stultz, linux-arm-msm,
linux-kernel, robh+dt, sumit.semwal, Konrad Dybcio
Subject: Re: [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium)
>// This removed_region is needed to boot the device
> // TODO: Find out the user of this reserved memory
> removed_region: memory@88f00000 {
This region seems to belong to the Trust Zone. When Linux tries to access it, TZ bites and shuts the device down.
Konrad
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-08-06 22:31 ` Konrad Dybcio
@ 2020-08-12 13:37 ` Amit Pundir
0 siblings, 0 replies; 1651+ messages in thread
From: Amit Pundir @ 2020-08-12 13:37 UTC (permalink / raw)
To: Konrad Dybcio
Cc: Andy Gross, Bjorn Andersson, dt, John Stultz, linux-arm-msm, lkml,
Rob Herring, Sumit Semwal
On Fri, 7 Aug 2020 at 04:02, Konrad Dybcio <konradybcio@gmail.com> wrote:
>
> Subject: Re: [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium)
>
> >// This removed_region is needed to boot the device
> > // TODO: Find out the user of this reserved memory
> > removed_region: memory@88f00000 {
>
> This region seems to belong to the Trust Zone. When Linux tries to access it, TZ bites and shuts the device down.
That is totally possible, and it falls right in between the TZ and QSEE
reserved-memory regions. However, I could not find any credible source
of information to confirm this, so I'm hesitant to update the
TODO item in the above comment.
>
> Konrad
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-09-15 2:40 Dave Airlie
2020-09-15 7:53 ` Christian König
0 siblings, 1 reply; 1651+ messages in thread
From: Dave Airlie @ 2020-09-15 2:40 UTC (permalink / raw)
To: dri-devel; +Cc: christian.koenig, bskeggs
The goal here is to make the ttm_tt object just represent a
memory backing store, not whether the store is bound to a
global translation table. It moves binding up to the bo level.
There's a lot more work on removing the global TT from the core
of TTM, but this seems like a good start.
Dave.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20201008181606.460499-1-sandy.8925@gmail.com>]
* Re:
[not found] <20201008181606.460499-1-sandy.8925@gmail.com>
@ 2020-10-09 6:47 ` Thomas Zimmermann
2020-10-09 7:14 ` Re: Thomas Zimmermann
0 siblings, 1 reply; 1651+ messages in thread
From: Thomas Zimmermann @ 2020-10-09 6:47 UTC (permalink / raw)
To: sandy.8925, alexander.deucher; +Cc: dri-devel
NACK for the entire lack of any form of commit description.
Am 08.10.20 um 20:16 schrieb sandy.8925@gmail.com:
> Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: Re:
2020-10-09 6:47 ` Re: Thomas Zimmermann
@ 2020-10-09 7:14 ` Thomas Zimmermann
2020-10-09 7:38 ` Re: Sandeep Raghuraman
0 siblings, 1 reply; 1651+ messages in thread
From: Thomas Zimmermann @ 2020-10-09 7:14 UTC (permalink / raw)
To: sandy.8925, alexander.deucher; +Cc: dri-devel
Hi
Am 09.10.20 um 08:47 schrieb Thomas Zimmermann:
> NACK for the entire lack of any form of commit description.
Please see the documentation at [1] on how to describe the changes and
getting your patches merged.
Best regards
Thomas
[1]
https://dri.freedesktop.org/docs/drm/process/submitting-patches.html#describe-your-changes
>
> Am 08.10.20 um 20:16 schrieb sandy.8925@gmail.com:
>> Signed-off-by: Sandeep Raghuraman <sandy.8925@gmail.com>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-10-09 7:14 ` Re: Thomas Zimmermann
@ 2020-10-09 7:38 ` Sandeep Raghuraman
2020-10-09 7:51 ` Re: Thomas Zimmermann
0 siblings, 1 reply; 1651+ messages in thread
From: Sandeep Raghuraman @ 2020-10-09 7:38 UTC (permalink / raw)
To: Thomas Zimmermann, alexander.deucher; +Cc: dri-devel
On 10/9/20 12:44 PM, Thomas Zimmermann wrote:
> Hi
>
> Am 09.10.20 um 08:47 schrieb Thomas Zimmermann:
>> NACK for the entire lack of any form of commit description.
>
> Please see the documentation at [1] on how to describe the changes and
> getting your patches merged.
Yes, I tried to use git send-email to send patches this time, and it resulted in this disaster. I'll stick to sending them through Thunderbird.
>
> Best regards
> Thomas
>
> [1]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-10-09 7:38 ` Re: Sandeep Raghuraman
@ 2020-10-09 7:51 ` Thomas Zimmermann
2020-10-09 15:48 ` Re: Alex Deucher
0 siblings, 1 reply; 1651+ messages in thread
From: Thomas Zimmermann @ 2020-10-09 7:51 UTC (permalink / raw)
To: Sandeep Raghuraman, alexander.deucher; +Cc: dri-devel
Hi
Am 09.10.20 um 09:38 schrieb Sandeep Raghuraman:
>
>
> On 10/9/20 12:44 PM, Thomas Zimmermann wrote:
>> Hi
>>
>> Am 09.10.20 um 08:47 schrieb Thomas Zimmermann:
>>> NACK for the entire lack of any form of commit description.
>>
>> Please see the documentation at [1] on how to describe the changes and
>> getting your patches merged.
>
> Yes, I tried to use git send-email to send patches this time, and it resulted in this disaster. I'll stick to sending them through Thunderbird.
What's the problem with send-email?
A typical call for your patchset would look like
git send-email <upstream-branch>...HEAD --cover-letter --annotate
--to=... --cc=...
That allows you to write the cover letter and have it sent out. IIRC you
need to set $EDITOR to your favorite text editor, and configure the SMTP
server in ~/.gitconfig, under [sendemail].
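For reference, a minimal [sendemail] section could look like this (the
server, port, and user below are placeholders, not recommendations; the
exact values depend on your mail provider):

```
[sendemail]
	smtpserver = smtp.example.com
	smtpserverport = 587
	smtpencryption = tls
	smtpuser = you@example.com
```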
Best regards
Thomas
>
>>
>> Best regards
>> Thomas
>>
>> [1]
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-10-09 7:51 ` Re: Thomas Zimmermann
@ 2020-10-09 15:48 ` Alex Deucher
0 siblings, 0 replies; 1651+ messages in thread
From: Alex Deucher @ 2020-10-09 15:48 UTC (permalink / raw)
To: Thomas Zimmermann
Cc: Deucher, Alexander, Sandeep Raghuraman,
Maling list - DRI developers
On Fri, Oct 9, 2020 at 3:51 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
> Hi
>
> Am 09.10.20 um 09:38 schrieb Sandeep Raghuraman:
> >
> >
> > On 10/9/20 12:44 PM, Thomas Zimmermann wrote:
> >> Hi
> >>
> >> Am 09.10.20 um 08:47 schrieb Thomas Zimmermann:
> >>> NACK for the entire lack of any form of commit description.
> >>
> >> Please see the documentation at [1] on how to describe the changes and
> >> getting your patches merged.
> >
> > Yes, I tried to use git send-email to send patches this time, and it resulted in this disaster. I'll stick to sending them through Thunderbird.
>
> What's the problem with send-email?
>
> A typical call for your patchset would look like
>
> git send-mail <upstream-branch>...HEAD --cover-letter --annotate
> --to=... --cc=...
>
> That allows you to write the cover letter and have it sent out. IIRC you
> need ot set $EDITOR to your favorite text editor; and configure the SMTP
> server in ~/.gitconfig, under [sendemail].
>
You can also do `git format-patch -3 --cover-letter` and manually edit
the cover letter as needed, then send the files with git send-email.
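To make that flow concrete, here is a throwaway sketch (scratch repo,
placeholder identity, empty commits purely for demonstration; the final
`git send-email outgoing/*.patch` step needs a configured SMTP server and
is omitted here):

```shell
# Build a scratch repo with two empty commits, then generate a patch for
# the last commit plus an editable 0000-cover-letter.patch.
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git config user.email you@example.com
git config user.name "Your Name"
git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "drm/demo: example change"
# -1: only the last commit; -o: write the files into outgoing/
git format-patch -1 --cover-letter -o outgoing/
ls outgoing/
```

After editing outgoing/0000-cover-letter.patch, the whole set can be sent
in one go with git send-email.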
Alex
> Best regards
> Thomas
>
> >
> >>
> >> Best regards
> >> Thomas
> >>
> >> [1]
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-11-06 10:44 Luis Gerhorst
2020-11-06 14:34 ` Pavel Begunkov
0 siblings, 1 reply; 1651+ messages in thread
From: Luis Gerhorst @ 2020-11-06 10:44 UTC (permalink / raw)
To: asml.silence; +Cc: axboe, io-uring, metze, carter.li
Hello Pavel,
I'm from a university and am searching for a project to work on in the
upcoming year. I am looking into allowing userspace to run multiple
system calls interleaved with application-specific logic using a single
context switch.
I noticed that you, Jens Axboe, and Carter Li discussed the possibility
of integrating eBPF into io_uring earlier this year [1, 2, 3]. Is there
any WIP on this topic?
If not, I am considering implementing this. Besides the fact that AOT
eBPF is only supported for privileged processes, are there any issues
you are aware of or reasons why this was not implemented yet?
Best,
Luis
[1] https://lore.kernel.org/io-uring/67b28e66-f2f8-99a1-dfd1-14f753d11f7a@gmail.com/
[2] https://lore.kernel.org/io-uring/8b3f182c-7c4b-da41-7ec8-bb4f22429ed1@kernel.dk/
[3] https://github.com/axboe/liburing/issues/58
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-11-06 10:44 Luis Gerhorst
@ 2020-11-06 14:34 ` Pavel Begunkov
0 siblings, 0 replies; 1651+ messages in thread
From: Pavel Begunkov @ 2020-11-06 14:34 UTC (permalink / raw)
To: Luis Gerhorst; +Cc: axboe, io-uring, metze, carter.li
On 06/11/2020 10:44, Luis Gerhorst wrote:
> Hello Pavel,
>
> I'm from a university and am searching for a project to work on in the
> upcoming year. I am looking into allowing userspace to run multiple
> system calls interleaved with application-specific logic using a single
> context switch.
>
> I noticed that you, Jens Axboe, and Carter Li discussed the possibility
> of integrating eBPF into io_uring earlier this year [1, 2, 3]. Is there
> any WIP on this topic?
To be honest, I finally returned to it a week ago, just because I got
more free time. I had been patching/refactoring some bits with
this in mind, but rather lazily.
> If not I am considering to implement this. Besides the fact that AOT
> eBPF is only supported for priviledged processes, are there any issues
> you are aware of or reasons why this was not implemented yet?
All the other issues I was anticipating are gone by now. It'd be really
great to work something out for non-privileged processes, but as you know
that doesn't hold us back.
> [1] https://lore.kernel.org/io-uring/67b28e66-f2f8-99a1-dfd1-14f753d11f7a@gmail.com/
> [2] https://lore.kernel.org/io-uring/8b3f182c-7c4b-da41-7ec8-bb4f22429ed1@kernel.dk/
> [3] https://github.com/axboe/liburing/issues/58
>
--
Pavel Begunkov
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2020-11-30 10:31 Oleksandr Tyshchenko
2020-11-30 16:21 ` Alex Bennée
0 siblings, 1 reply; 1651+ messages in thread
From: Oleksandr Tyshchenko @ 2020-11-30 10:31 UTC (permalink / raw)
To: xen-devel
Cc: Oleksandr Tyshchenko, Paul Durrant, Jan Beulich, Andrew Cooper,
Roger Pau Monné, Wei Liu, Julien Grall, George Dunlap,
Ian Jackson, Julien Grall, Stefano Stabellini, Tim Deegan,
Daniel De Graaf, Volodymyr Babchuk, Jun Nakajima, Kevin Tian,
Anthony PERARD, Bertrand Marquis, Wei Chen, Kaly Xin,
Artem Mygaiev, Alex Bennée
From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Date: Sat, 28 Nov 2020 22:33:51 +0200
Subject: [PATCH V3 00/23] IOREQ feature (+ virtio-mmio) on Arm
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Hello all.
The purpose of this patch series is to add IOREQ/DM support to Xen on Arm.
You can find an initial discussion at [1] and RFC/V1/V2 series at [2]/[3]/[4].
Xen on Arm requires an implementation to forward guest MMIO accesses to a device
model in order to implement a virtio-mmio backend or even a mediator outside of the hypervisor.
As Xen on x86 already contains the required support, this series tries to make it common
and introduces Arm-specific bits plus some new functionality. The patch series is based on
Julien's PoC "xen/arm: Add support for Guest IO forwarding to a device emulator".
Besides splitting existing IOREQ/DM support and introducing Arm side, the series
also includes virtio-mmio related changes (last 2 patches for toolstack)
for the reviewers to be able to see how the whole picture could look like.
According to the initial discussion there are a few open questions/concerns
regarding security, performance in VirtIO solution:
1. virtio-mmio vs virtio-pci, SPI vs MSI, different use-cases require different
transport...
2. virtio backend is able to access all guest memory, some kind of protection
is needed: 'virtio-iommu in Xen' vs 'pre-shared-memory & memcpys in guest'
3. interface between toolstack and 'out-of-qemu' virtio backend, avoid using
Xenstore in virtio backend if possible.
4. a lot of 'foreign mappings' could lead to memory exhaustion; Julien
has some ideas regarding that.
Looks like all of them are valid and worth considering, but the first thing
which we need on Arm is a mechanism to forward guest IO to a device emulator,
so let's focus on it in the first place.
***
There are a lot of changes since RFC series, almost all TODOs were resolved on Arm,
Arm code was improved and hardened, common IOREQ/DM code became really arch-agnostic
(without HVM-ism), the "legacy" mechanism of mapping magic pages for the IOREQ servers
was left x86 specific, etc. But one TODO still remains, which is "PIO handling" on Arm.
The "PIO handling" TODO is expected to be left unaddressed for the current series.
It is not a big issue for now while Xen doesn't have support for vPCI on Arm.
On Arm64, PIOs are only used for the PCI IO BAR and we would probably want to expose
them to the emulator as PIO accesses to make a DM completely arch-agnostic. So "PIO handling"
should be implemented when we add support for vPCI.
I left the interface untouched in the following patch
"xen/dm: Introduce xendevicemodel_set_irq_level DM op"
since there is still an open discussion about what interface to use and what
information to pass to the hypervisor.
There is a patch on review this series depends on:
https://patchwork.kernel.org/patch/11816689
Please note, that IOREQ feature is disabled by default on Arm within current series.
***
Patch series [5] was rebased on recent "staging branch"
(181f2c2 evtchn: double per-channel locking can't hit identical channels) and tested on
Renesas Salvator-X board + H3 ES3.0 SoC (Arm64) with virtio-mmio disk backend [6]
running in driver domain and unmodified Linux Guest running on existing
virtio-blk driver (frontend). No issues were observed. Guest domain 'reboot/destroy'
use-cases work properly. Patch series was only build-tested on x86.
Please note, build-test passed for the following modes:
1. x86: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y (default)
2. x86: #CONFIG_HVM is not set / #CONFIG_IOREQ_SERVER is not set
3. Arm64: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
4. Arm64: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set (default)
5. Arm32: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
6. Arm32: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set (default)
***
Any feedback/help would be highly appreciated.
[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00825.html
[2] https://lists.xenproject.org/archives/html/xen-devel/2020-08/msg00071.html
[3] https://lists.xenproject.org/archives/html/xen-devel/2020-09/msg00732.html
[4] https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01077.html
[5] https://github.com/otyshchenko1/xen/commits/ioreq_4.14_ml4
[6] https://github.com/xen-troops/virtio-disk/commits/ioreq_ml1
Julien Grall (5):
xen/dm: Make x86's DM feature common
xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
arm/ioreq: Introduce arch specific bits for IOREQ/DM features
xen/dm: Introduce xendevicemodel_set_irq_level DM op
libxl: Introduce basic virtio-mmio support on Arm
Oleksandr Tyshchenko (18):
x86/ioreq: Prepare IOREQ feature for making it common
x86/ioreq: Add IOREQ_STATUS_* #define-s and update code for moving
x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
xen/ioreq: Make x86's IOREQ feature common
xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
xen/ioreq: Make x86's hvm_mmio_first(last)_byte() common
xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
xen/ioreq: Move x86's ioreq_server to struct domain
xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
xen/ioreq: Remove "hvm" prefixes from involved function names
xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
xen/arm: Stick around in leave_hypervisor_to_guest until I/O has
completed
xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
xen/ioreq: Introduce domain_has_ioreq_server()
xen/arm: io: Abstract sign-extension
xen/ioreq: Make x86's send_invalidate_req() common
xen/arm: Add mapcache invalidation handling
[RFC] libxl: Add support for virtio-disk configuration
MAINTAINERS | 8 +-
tools/include/xendevicemodel.h | 4 +
tools/libs/devicemodel/core.c | 18 +
tools/libs/devicemodel/libxendevicemodel.map | 1 +
tools/libs/light/Makefile | 1 +
tools/libs/light/libxl_arm.c | 94 +-
tools/libs/light/libxl_create.c | 1 +
tools/libs/light/libxl_internal.h | 1 +
tools/libs/light/libxl_types.idl | 16 +
tools/libs/light/libxl_types_internal.idl | 1 +
tools/libs/light/libxl_virtio_disk.c | 109 +++
tools/xl/Makefile | 2 +-
tools/xl/xl.h | 3 +
tools/xl/xl_cmdtable.c | 15 +
tools/xl/xl_parse.c | 116 +++
tools/xl/xl_virtio_disk.c | 46 +
xen/arch/arm/Makefile | 2 +
xen/arch/arm/dm.c | 89 ++
xen/arch/arm/domain.c | 9 +
xen/arch/arm/hvm.c | 4 +
xen/arch/arm/io.c | 29 +-
xen/arch/arm/ioreq.c | 126 +++
xen/arch/arm/p2m.c | 48 +-
xen/arch/arm/traps.c | 58 +-
xen/arch/x86/Kconfig | 1 +
xen/arch/x86/hvm/dm.c | 295 +-----
xen/arch/x86/hvm/emulate.c | 80 +-
xen/arch/x86/hvm/hvm.c | 12 +-
xen/arch/x86/hvm/hypercall.c | 9 +-
xen/arch/x86/hvm/intercept.c | 5 +-
xen/arch/x86/hvm/io.c | 26 +-
xen/arch/x86/hvm/ioreq.c | 1357 ++------------------------
xen/arch/x86/hvm/stdvga.c | 10 +-
xen/arch/x86/hvm/svm/nestedsvm.c | 2 +-
xen/arch/x86/hvm/vmx/realmode.c | 6 +-
xen/arch/x86/hvm/vmx/vvmx.c | 2 +-
xen/arch/x86/mm.c | 46 +-
xen/arch/x86/mm/p2m.c | 13 +-
xen/arch/x86/mm/shadow/common.c | 2 +-
xen/common/Kconfig | 3 +
xen/common/Makefile | 2 +
xen/common/dm.c | 292 ++++++
xen/common/ioreq.c | 1307 +++++++++++++++++++++++++
xen/common/memory.c | 73 +-
xen/include/asm-arm/domain.h | 3 +
xen/include/asm-arm/hvm/ioreq.h | 139 +++
xen/include/asm-arm/mm.h | 8 -
xen/include/asm-arm/mmio.h | 1 +
xen/include/asm-arm/p2m.h | 19 +-
xen/include/asm-arm/traps.h | 24 +
xen/include/asm-x86/hvm/domain.h | 43 -
xen/include/asm-x86/hvm/emulate.h | 2 +-
xen/include/asm-x86/hvm/io.h | 17 -
xen/include/asm-x86/hvm/ioreq.h | 58 +-
xen/include/asm-x86/hvm/vcpu.h | 18 -
xen/include/asm-x86/mm.h | 4 -
xen/include/asm-x86/p2m.h | 24 +-
xen/include/public/arch-arm.h | 5 +
xen/include/public/hvm/dm_op.h | 16 +
xen/include/xen/dm.h | 44 +
xen/include/xen/ioreq.h | 146 +++
xen/include/xen/p2m-common.h | 4 +
xen/include/xen/sched.h | 32 +
xen/include/xsm/dummy.h | 4 +-
xen/include/xsm/xsm.h | 6 +-
xen/xsm/dummy.c | 2 +-
xen/xsm/flask/hooks.c | 5 +-
67 files changed, 3084 insertions(+), 1884 deletions(-)
create mode 100644 tools/libs/light/libxl_virtio_disk.c
create mode 100644 tools/xl/xl_virtio_disk.c
create mode 100644 xen/arch/arm/dm.c
create mode 100644 xen/arch/arm/ioreq.c
create mode 100644 xen/common/dm.c
create mode 100644 xen/common/ioreq.c
create mode 100644 xen/include/asm-arm/hvm/ioreq.h
create mode 100644 xen/include/xen/dm.h
create mode 100644 xen/include/xen/ioreq.h
--
2.7.4
^ permalink raw reply	[flat|nested] 1651+ messages in thread
* Re:
2020-11-30 10:31 Oleksandr Tyshchenko
@ 2020-11-30 16:21 ` Alex Bennée
2020-12-29 15:32 ` Re: Roger Pau Monné
0 siblings, 1 reply; 1651+ messages in thread
From: Alex Bennée @ 2020-11-30 16:21 UTC (permalink / raw)
To: Oleksandr Tyshchenko
Cc: xen-devel, Oleksandr Tyshchenko, Paul Durrant, Jan Beulich,
Andrew Cooper, Roger Pau Monné, Wei Liu, Julien Grall,
George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
Tim Deegan, Daniel De Graaf, Volodymyr Babchuk, Jun Nakajima,
Kevin Tian, Anthony PERARD, Bertrand Marquis, Wei Chen, Kaly Xin,
Artem Mygaiev
Oleksandr Tyshchenko <olekstysh@gmail.com> writes:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>
>
> Date: Sat, 28 Nov 2020 22:33:51 +0200
> Subject: [PATCH V3 00/23] IOREQ feature (+ virtio-mmio) on Arm
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Hello all.
>
> The purpose of this patch series is to add IOREQ/DM support to Xen on Arm.
> You can find an initial discussion at [1] and RFC/V1/V2 series at [2]/[3]/[4].
> Xen on Arm requires some implementation to forward guest MMIO accesses to a device
> model in order to implement a virtio-mmio backend or even a mediator outside of the
> hypervisor. As Xen on x86 already contains the required support, this series tries to
> make it common and introduces the Arm-specific bits plus some new functionality. The
> patch series is based on Julien's PoC "xen/arm: Add support for Guest IO forwarding
> to a device emulator". Besides splitting the existing IOREQ/DM support and
> introducing the Arm side, the series also includes virtio-mmio related changes (the
> last 2 patches, for the toolstack) so that reviewers can see how the whole picture
> could look.
Thanks for posting the latest version.
>
> According to the initial discussion there are a few open questions/concerns
> regarding security, performance in VirtIO solution:
> 1. virtio-mmio vs virtio-pci, SPI vs MSI, different use-cases require different
> transport...
I think I'm repeating things here I've said in various ephemeral video
chats over the last few weeks but I should probably put things down on
the record.
I think the original intention of the virtio framers is that advanced
features would build on virtio-pci, because you get a bunch of things
"for free" - notably enumeration and MSI support. There is an assumption
that by the time you add these features to virtio-mmio you end up
re-creating your own less well tested version of virtio-pci. I've not
been terribly convinced by the argument that the guest implementation of
PCI presents a sufficiently large blob of code to make the simpler MMIO
desirable. My attempts to build two virtio kernels (PCI/MMIO) with
otherwise the same devices weren't terribly conclusive either way.
That said, virtio-mmio still has life in it: the cloudy, slimmed-down
guests moved to using it because the enumeration of PCI is a road
block to their fast boot-up requirements. I'm sure they would also
appreciate an MSI implementation to reduce the overhead that handling
notifications currently has on trap-and-emulate.
AIUI for Xen the other downside to PCI is you would have to emulate it
in the hypervisor which would be additional code at the most privileged
level.
> 2. virtio backend is able to access all guest memory, some kind of protection
> is needed: 'virtio-iommu in Xen' vs 'pre-shared-memory & memcpys in
> guest'
This is also an area of interest for Project Stratos and something we
would like to be solved generally for all hypervisors. There is a good
write up of some approaches that Jean Phillipe did on the stratos
mailing list:
From: Jean-Philippe Brucker <jean-philippe@linaro.org>
Subject: Limited memory sharing investigation
Message-ID: <20201002134336.GA2196245@myrica>
I suspect there is a good argument for the simplicity of a combined
virt queue but it is unlikely to be very performance orientated.
> 3. interface between toolstack and 'out-of-qemu' virtio backend, avoid using
> Xenstore in virtio backend if possible.
I wonder how much work it would be for a rust expert to make:
https://github.com/slp/vhost-user-blk
handle an IOREQ signalling pathway instead of the vhost-user/eventfd
pathway? That would give a good indication on how "hypervisor blind"
these daemons could be made.
<snip>
>
> Please note, build-test passed for the following modes:
> 1. x86: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y (default)
> 2. x86: #CONFIG_HVM is not set / #CONFIG_IOREQ_SERVER is not set
> 3. Arm64: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
Forgive my relative newness to Xen: how do I convince the hypervisor to
build with this on? I've tried variants of:
make -j9 CROSS_COMPILE=aarch64-linux-gnu- XEN_TARGET_ARCH=arm64 menuconfig XEN_EXPERT=y [CONFIG_|XEN_|_]IOREQ_SERVER=y
with no joy...
> 4. Arm64: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set (default)
> 5. Arm32: CONFIG_HVM=y / CONFIG_IOREQ_SERVER=y
> 6. Arm32: CONFIG_HVM=y / #CONFIG_IOREQ_SERVER is not set (default)
>
> ***
>
> Any feedback/help would be highly appreciated.
<snip>
--
Alex Bennée
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2020-11-30 16:21 ` Alex Bennée
@ 2020-12-29 15:32 ` Roger Pau Monné
0 siblings, 0 replies; 1651+ messages in thread
From: Roger Pau Monné @ 2020-12-29 15:32 UTC (permalink / raw)
To: Alex Bennée
Cc: Oleksandr Tyshchenko, xen-devel, Oleksandr Tyshchenko,
Paul Durrant, Jan Beulich, Andrew Cooper, Wei Liu, Julien Grall,
George Dunlap, Ian Jackson, Julien Grall, Stefano Stabellini,
Tim Deegan, Daniel De Graaf, Volodymyr Babchuk, Jun Nakajima,
Kevin Tian, Anthony PERARD, Bertrand Marquis, Wei Chen, Kaly Xin,
Artem Mygaiev
On Mon, Nov 30, 2020 at 04:21:59PM +0000, Alex Bennée wrote:
>
> Oleksandr Tyshchenko <olekstysh@gmail.com> writes:
>
> > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> >
> >
> > Date: Sat, 28 Nov 2020 22:33:51 +0200
> > Subject: [PATCH V3 00/23] IOREQ feature (+ virtio-mmio) on Arm
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=UTF-8
> > Content-Transfer-Encoding: 8bit
> >
> > Hello all.
> >
> > The purpose of this patch series is to add IOREQ/DM support to Xen on Arm.
> > You can find an initial discussion at [1] and RFC/V1/V2 series at [2]/[3]/[4].
> > Xen on Arm requires some implementation to forward guest MMIO accesses to a device
> > model in order to implement a virtio-mmio backend or even a mediator outside of the
> > hypervisor. As Xen on x86 already contains the required support, this series tries to
> > make it common and introduces the Arm-specific bits plus some new functionality. The
> > patch series is based on Julien's PoC "xen/arm: Add support for Guest IO forwarding
> > to a device emulator". Besides splitting the existing IOREQ/DM support and
> > introducing the Arm side, the series also includes virtio-mmio related changes (the
> > last 2 patches, for the toolstack) so that reviewers can see how the whole picture
> > could look.
>
> Thanks for posting the latest version.
>
> >
> > According to the initial discussion there are a few open questions/concerns
> > regarding security, performance in VirtIO solution:
> > 1. virtio-mmio vs virtio-pci, SPI vs MSI, different use-cases require different
> > transport...
>
> I think I'm repeating things here I've said in various ephemeral video
> chats over the last few weeks but I should probably put things down on
> the record.
>
> I think the original intention of the virtio framers is that advanced
> features would build on virtio-pci, because you get a bunch of things
> "for free" - notably enumeration and MSI support. There is an assumption
> that by the time you add these features to virtio-mmio you end up
> re-creating your own less well tested version of virtio-pci. I've not
> been terribly convinced by the argument that the guest implementation of
> PCI presents a sufficiently large blob of code to make the simpler MMIO
> desirable. My attempts to build two virtio kernels (PCI/MMIO) with
> otherwise the same devices weren't terribly conclusive either way.
>
> That said, virtio-mmio still has life in it: the cloudy, slimmed-down
> guests moved to using it because the enumeration of PCI is a road
> block to their fast boot-up requirements. I'm sure they would also
> appreciate an MSI implementation to reduce the overhead that handling
> notifications currently has on trap-and-emulate.
>
> AIUI for Xen the other downside to PCI is you would have to emulate it
> in the hypervisor which would be additional code at the most privileged
> level.
Xen already emulates (or maybe it would be better to say decodes) PCI
accesses on the hypervisor and forwards them to the appropriate device
model using the IOREQ interface, so that's not something new. It's
not really emulating the PCI config space, but just detecting accesses
and forwarding them to the device model that should handle them.
You can register different emulators in user space that handle
accesses to different PCI devices from a guest.
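To make that shape concrete, here is a rough sketch of the decode-and-forward
pattern (all names, types and the registration scheme below are made up for
illustration; this is not the actual Xen IOREQ interface):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A device model registers a read handler for the config space of one
 * BDF (bus/device/function).  The privileged side never interprets the
 * registers itself; it only decodes and dispatches. */
typedef uint32_t (*cfg_read_fn)(void *opaque, uint16_t bdf, uint16_t reg);

struct cfg_emulator {
    uint16_t bdf;
    cfg_read_fn read;
    void *opaque;
};

#define MAX_EMULATORS 8
static struct cfg_emulator emulators[MAX_EMULATORS];
static size_t nr_emulators;

static int register_cfg_emulator(uint16_t bdf, cfg_read_fn read, void *opaque)
{
    if (nr_emulators == MAX_EMULATORS)
        return -1;
    emulators[nr_emulators++] = (struct cfg_emulator){ bdf, read, opaque };
    return 0;
}

/* Called on a trapped config-space read: decode which BDF was targeted,
 * forward to whoever claimed it, or return all-ones ("no device"). */
static uint32_t forward_cfg_read(uint16_t bdf, uint16_t reg)
{
    for (size_t i = 0; i < nr_emulators; i++)
        if (emulators[i].bdf == bdf)
            return emulators[i].read(emulators[i].opaque, bdf, reg);
    return 0xffffffff;
}
```

The point being that the most privileged level only grows the small
decode/dispatch logic, while the register-level emulation stays in the
device models.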
Thanks, Roger.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH] lib/find_bit: Add find_prev_*_bit functions.
@ 2020-12-02 1:10 Yun Levi
2020-12-02 9:47 ` Andy Shevchenko
0 siblings, 1 reply; 1651+ messages in thread
From: Yun Levi @ 2020-12-02 1:10 UTC (permalink / raw)
To: dushistov, arnd, akpm, gustavo, vilhelm.gray, richard.weiyang,
andriy.shevchenko, joseph.qi, skalluru, yury.norov, jpoimboe
Cc: linux-kernel, linux-arch
Inspired by the find_next_*_bit function series, add a find_prev_*_bit series.
I'm not sure whether it'll be used right now, but I'm adding these functions
for future use.
Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
---
fs/ufs/util.h | 24 +++---
include/asm-generic/bitops/find.h | 69 ++++++++++++++++
include/asm-generic/bitops/le.h | 33 ++++++++
include/linux/bitops.h | 34 +++++---
lib/find_bit.c | 126 +++++++++++++++++++++++++++++-
5 files changed, 260 insertions(+), 26 deletions(-)
diff --git a/fs/ufs/util.h b/fs/ufs/util.h
index 4931bec1a01c..7c87c77d10ca 100644
--- a/fs/ufs/util.h
+++ b/fs/ufs/util.h
@@ -2,7 +2,7 @@
/*
* linux/fs/ufs/util.h
*
- * Copyright (C) 1998
+ * Copyright (C) 1998
* Daniel Pirkl <daniel.pirkl@email.cz>
* Charles University, Faculty of Mathematics and Physics
*/
@@ -263,7 +263,7 @@ extern int ufs_prepare_chunk(struct page *page, loff_t pos, unsigned len);
/*
* These functions manipulate ufs buffers
*/
-#define ubh_bread(sb,fragment,size) _ubh_bread_(uspi,sb,fragment,size)
+#define ubh_bread(sb,fragment,size) _ubh_bread_(uspi,sb,fragment,size)
extern struct ufs_buffer_head * _ubh_bread_(struct ufs_sb_private_info *, struct super_block *, u64 , u64);
extern struct ufs_buffer_head * ubh_bread_uspi(struct ufs_sb_private_info *, struct super_block *, u64, u64);
extern void ubh_brelse (struct ufs_buffer_head *);
@@ -296,7 +296,7 @@ static inline void *get_usb_offset(struct ufs_sb_private_info *uspi,
unsigned int offset)
{
unsigned int index;
-
+
index = offset >> uspi->s_fshift;
offset &= ~uspi->s_fmask;
return uspi->s_ubh.bh[index]->b_data + offset;
@@ -411,9 +411,9 @@ static inline unsigned _ubh_find_next_zero_bit_(
offset = 0;
}
return (base << uspi->s_bpfshift) + pos - begin;
-}
+}
-static inline unsigned find_last_zero_bit (unsigned char * bitmap,
+static inline unsigned __ubh_find_last_zero_bit(unsigned char * bitmap,
unsigned size, unsigned offset)
{
unsigned bit, i;
@@ -453,7 +453,7 @@ static inline unsigned _ubh_find_last_zero_bit_(
size + (uspi->s_bpf - start), uspi->s_bpf)
- (uspi->s_bpf - start);
size -= count;
- pos = find_last_zero_bit (ubh->bh[base]->b_data,
+ pos = __ubh_find_last_zero_bit(ubh->bh[base]->b_data,
start, start - count);
if (pos > start - count || !size)
break;
@@ -461,7 +461,7 @@ static inline unsigned _ubh_find_last_zero_bit_(
start = uspi->s_bpf;
}
return (base << uspi->s_bpfshift) + pos - begin;
-}
+}
#define ubh_isblockclear(ubh,begin,block) (!_ubh_isblockset_(uspi,ubh,begin,block))
@@ -483,7 +483,7 @@ static inline int _ubh_isblockset_(struct ufs_sb_private_info * uspi,
mask = 0x01 << (block & 0x07);
return (*ubh_get_addr (ubh, begin + (block >> 3)) & mask) == mask;
}
- return 0;
+ return 0;
}
#define ubh_clrblock(ubh,begin,block) _ubh_clrblock_(uspi,ubh,begin,block)
@@ -492,8 +492,8 @@ static inline void _ubh_clrblock_(struct ufs_sb_private_info * uspi,
{
switch (uspi->s_fpb) {
case 8:
- *ubh_get_addr (ubh, begin + block) = 0x00;
- return;
+ *ubh_get_addr (ubh, begin + block) = 0x00;
+ return;
case 4:
*ubh_get_addr (ubh, begin + (block >> 1)) &= ~(0x0f << ((block & 0x01) << 2));
return;
@@ -531,9 +531,9 @@ static inline void ufs_fragacct (struct super_block * sb, unsigned blockmap,
{
struct ufs_sb_private_info * uspi;
unsigned fragsize, pos;
-
+
uspi = UFS_SB(sb)->s_uspi;
-
+
fragsize = 0;
for (pos = 0; pos < uspi->s_fpb; pos++) {
if (blockmap & (1 << pos)) {
diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..ca18b2ec954c 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -16,6 +16,20 @@ extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
size, unsigned long offset);
#endif
+#ifndef find_prev_bit
+/**
+ * find_prev_bit - find the prev set bit in a memory region
+ * @addr: The address to base the search on
+ * @offset: The bitnumber to start searching at
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number for the prev set bit
+ * If no bits are set, returns @size.
+ */
+extern unsigned long find_prev_bit(const unsigned long *addr, unsigned long
+ size, unsigned long offset);
+#endif
+
#ifndef find_next_and_bit
/**
* find_next_and_bit - find the next set bit in both memory regions
@@ -32,6 +46,22 @@ extern unsigned long find_next_and_bit(const unsigned long *addr1,
unsigned long offset);
#endif
+#ifndef find_prev_and_bit
+/**
+ * find_prev_and_bit - find the prev set bit in both memory regions
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @offset: The bitnumber to start searching at
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number for the prev set bit
+ * If no bits are set, returns @size.
+ */
+extern unsigned long find_prev_and_bit(const unsigned long *addr1,
+ const unsigned long *addr2, unsigned long size,
+ unsigned long offset);
+#endif
+
#ifndef find_next_zero_bit
/**
* find_next_zero_bit - find the next cleared bit in a memory region
@@ -46,6 +76,20 @@ extern unsigned long find_next_zero_bit(const unsigned long *addr, unsigned
long size, unsigned long offset);
#endif
+#ifndef find_prev_zero_bit
+/**
+ * find_prev_zero_bit - find the prev cleared bit in a memory region
+ * @addr: The address to base the search on
+ * @offset: The bitnumber to start searching at
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number of the prev zero bit
+ * If no bits are zero, returns @size.
+ */
+extern unsigned long find_prev_zero_bit(const unsigned long *addr, unsigned
+ long size, unsigned long offset);
+#endif
+
#ifdef CONFIG_GENERIC_FIND_FIRST_BIT
/**
@@ -80,6 +124,31 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,
#endif /* CONFIG_GENERIC_FIND_FIRST_BIT */
+#ifndef find_last_bit
+/**
+ * find_last_bit - find the last set bit in a memory region
+ * @addr: The address to start the search at
+ * @size: The number of bits to search
+ *
+ * Returns the bit number of the last set bit, or size.
+ */
+extern unsigned long find_last_bit(const unsigned long *addr,
+ unsigned long size);
+#endif
+
+#ifndef find_last_zero_bit
+/**
+ * find_last_zero_bit - find the last cleared bit in a memory region
+ * @addr: The address to start the search at
+ * @size: The maximum number of bits to search
+ *
+ * Returns the bit number of the first cleared bit.
+ * If no bits are zero, returns @size.
+ */
+extern unsigned long find_last_zero_bit(const unsigned long *addr,
+ unsigned long size);
+#endif
+
/**
* find_next_clump8 - find next 8-bit clump with set bits in a memory region
* @clump: location to store copy of found clump
diff --git a/include/asm-generic/bitops/le.h b/include/asm-generic/bitops/le.h
index 188d3eba3ace..d0bd15bc34d9 100644
--- a/include/asm-generic/bitops/le.h
+++ b/include/asm-generic/bitops/le.h
@@ -27,6 +27,24 @@ static inline unsigned long find_first_zero_bit_le(const void *addr,
return find_first_zero_bit(addr, size);
}
+static inline unsigned long find_prev_zero_bit_le(const void *addr,
+ unsigned long size, unsigned long offset)
+{
+ return find_prev_zero_bit(addr, size, offset);
+}
+
+static inline unsigned long find_prev_bit_le(const void *addr,
+ unsigned long size, unsigned long offset)
+{
+ return find_prev_bit(addr, size, offset);
+}
+
+static inline unsigned long find_last_zero_bit_le(const void *addr,
+ unsigned long size)
+{
+ return find_last_zero_bit(addr, size);
+}
+
#elif defined(__BIG_ENDIAN)
#define BITOP_LE_SWIZZLE ((BITS_PER_LONG-1) & ~0x7)
@@ -41,11 +59,26 @@ extern unsigned long find_next_bit_le(const void *addr,
unsigned long size, unsigned long offset);
#endif
+#ifndef find_prev_zero_bit_le
+extern unsigned long find_prev_zero_bit_le(const void *addr,
+ unsigned long size, unsigned long offset);
+#endif
+
+#ifndef find_prev_bit_le
+extern unsigned long find_prev_bit_le(const void *addr,
+ unsigned long size, unsigned long offset);
+#endif
+
#ifndef find_first_zero_bit_le
#define find_first_zero_bit_le(addr, size) \
find_next_zero_bit_le((addr), (size), 0)
#endif
+#ifndef find_last_zero_bit_le
+#define find_last_zero_bit_le(addr, size) \
+ find_prev_zero_bit_le((addr), (size), (size - 1))
+#endif
+
#else
#error "Please fix <asm/byteorder.h>"
#endif
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 5b74bdf159d6..124c68929861 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -50,6 +50,28 @@ extern unsigned long __sw_hweight64(__u64 w);
(bit) < (size); \
(bit) = find_next_zero_bit((addr), (size), (bit) + 1))
+#define for_each_set_bit_reverse(bit, addr, size) \
+ for ((bit) = find_last_bit((addr), (size)); \
+ (bit) < (size); \
+ (bit) = find_prev_bit((addr), (size), (bit) - 1))
+
+/* same as for_each_set_bit_reverse() but use bit as value to start with */
+#define for_each_set_bit_from_reverse(bit, addr, size) \
+ for ((bit) = find_prev_bit((addr), (size), (bit)); \
+ (bit) < (size); \
+ (bit) = find_prev_bit((addr), (size), (bit) - 1))
+
+#define for_each_clear_bit_reverse(bit, addr, size) \
+ for ((bit) = find_last_zero_bit((addr), (size)); \
+ (bit) < (size); \
+ (bit) = find_prev_zero_bit((addr), (size), (bit) - 1))
+
+/* same as for_each_clear_bit_reverse() but use bit as value to start with */
+#define for_each_clear_bit_from_reverse(bit, addr, size) \
+ for ((bit) = find_prev_zero_bit((addr), (size), (bit)); \
+ (bit) < (size); \
+ (bit) = find_prev_zero_bit((addr), (size), (bit) - 1))
+
/**
* for_each_set_clump8 - iterate over bitmap for each 8-bit clump with set bits
* @start: bit offset to start search and to store the current iteration offset
@@ -283,17 +305,5 @@ static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
})
#endif
-#ifndef find_last_bit
-/**
- * find_last_bit - find the last set bit in a memory region
- * @addr: The address to start the search at
- * @size: The number of bits to search
- *
- * Returns the bit number of the last set bit, or size.
- */
-extern unsigned long find_last_bit(const unsigned long *addr,
- unsigned long size);
-#endif
-
#endif /* __KERNEL__ */
#endif
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 4a8751010d59..cbe06abd3d21 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -69,6 +69,58 @@ static unsigned long _find_next_bit(const unsigned long *addr1,
}
#endif
+#if !defined(find_prev_bit) || !defined(find_prev_zero_bit) || \
+	!defined(find_prev_bit_le) || !defined(find_prev_zero_bit_le) || \
+	!defined(find_prev_and_bit)
+/*
+ * This is a common helper function for find_prev_bit, find_prev_zero_bit, and
+ * find_prev_and_bit. The differences are:
+ * - The "invert" argument, which is XORed with each fetched word before
+ * searching it for one bits.
+ * - The optional "addr2", which is anded with "addr1" if present.
+ */
+static unsigned long _find_prev_bit(const unsigned long *addr1,
+ const unsigned long *addr2, unsigned long nbits,
+ unsigned long start, unsigned long invert, unsigned long le)
+{
+ unsigned long tmp, mask;
+
+ if (unlikely(start >= nbits))
+ return nbits;
+
+ tmp = addr1[start / BITS_PER_LONG];
+ if (addr2)
+ tmp &= addr2[start / BITS_PER_LONG];
+ tmp ^= invert;
+
+ /* Handle 1st word. */
+ mask = BITMAP_LAST_WORD_MASK(start + 1);
+ if (le)
+ mask = swab(mask);
+
+ tmp &= mask;
+
+ start = round_down(start, BITS_PER_LONG);
+
+ while (!tmp) {
+ start -= BITS_PER_LONG;
+ if (start >= nbits)
+ return nbits;
+
+ tmp = addr1[start / BITS_PER_LONG];
+ if (addr2)
+ tmp &= addr2[start / BITS_PER_LONG];
+ tmp ^= invert;
+ }
+
+ if (le)
+ tmp = swab(tmp);
+
+ return start + __fls(tmp);
+}
+#endif
+
+
#ifndef find_next_bit
/*
* Find the next set bit in a memory region.
@@ -81,6 +133,18 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
EXPORT_SYMBOL(find_next_bit);
#endif
+#ifndef find_prev_bit
+/*
+ * Find the prev set bit in a memory region.
+ */
+unsigned long find_prev_bit(const unsigned long *addr, unsigned long size,
+ unsigned long offset)
+{
+ return _find_prev_bit(addr, NULL, size, offset, 0UL, 0);
+}
+EXPORT_SYMBOL(find_prev_bit);
+#endif
+
#ifndef find_next_zero_bit
unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
unsigned long offset)
@@ -90,7 +154,16 @@ unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
EXPORT_SYMBOL(find_next_zero_bit);
#endif
-#if !defined(find_next_and_bit)
+#ifndef find_prev_zero_bit
+unsigned long find_prev_zero_bit(const unsigned long *addr, unsigned long size,
+ unsigned long offset)
+{
+ return _find_prev_bit(addr, NULL, size, offset, ~0UL, 0);
+}
+EXPORT_SYMBOL(find_prev_zero_bit);
+#endif
+
+#ifndef find_next_and_bit
unsigned long find_next_and_bit(const unsigned long *addr1,
const unsigned long *addr2, unsigned long size,
unsigned long offset)
@@ -100,6 +173,16 @@ unsigned long find_next_and_bit(const unsigned long *addr1,
EXPORT_SYMBOL(find_next_and_bit);
#endif
+#ifndef find_prev_and_bit
+unsigned long find_prev_and_bit(const unsigned long *addr1,
+ const unsigned long *addr2, unsigned long size,
+ unsigned long offset)
+{
+ return _find_prev_bit(addr1, addr2, size, offset, 0UL, 0);
+}
+EXPORT_SYMBOL(find_prev_and_bit);
+#endif
+
#ifndef find_first_bit
/*
* Find the first set bit in a memory region.
@@ -141,7 +224,7 @@ unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
{
if (size) {
unsigned long val = BITMAP_LAST_WORD_MASK(size);
- unsigned long idx = (size-1) / BITS_PER_LONG;
+ unsigned long idx = (size - 1) / BITS_PER_LONG;
do {
val &= addr[idx];
@@ -156,6 +239,27 @@ unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
EXPORT_SYMBOL(find_last_bit);
#endif
+#ifndef find_last_zero_bit
+unsigned long find_last_zero_bit(const unsigned long *addr, unsigned long size)
+{
+ if (size) {
+ unsigned long val = BITMAP_LAST_WORD_MASK(size);
+ unsigned long idx = (size - 1) / BITS_PER_LONG;
+
+ do {
+ val &= ~addr[idx];
+ if (val)
+ return idx * BITS_PER_LONG + __fls(val);
+
+ val = ~0ul;
+ } while (idx--);
+ }
+
+ return size;
+}
+EXPORT_SYMBOL(find_last_zero_bit);
+#endif
+
#ifdef __BIG_ENDIAN
#ifndef find_next_zero_bit_le
@@ -167,6 +271,15 @@ unsigned long find_next_zero_bit_le(const void *addr, unsigned
EXPORT_SYMBOL(find_next_zero_bit_le);
#endif
+#ifndef find_prev_zero_bit_le
+unsigned long find_prev_zero_bit_le(const void *addr, unsigned
+ long size, unsigned long offset)
+{
+ return _find_prev_bit(addr, NULL, size, offset, ~0UL, 1);
+}
+EXPORT_SYMBOL(find_prev_zero_bit_le);
+#endif
+
#ifndef find_next_bit_le
unsigned long find_next_bit_le(const void *addr, unsigned
long size, unsigned long offset)
@@ -176,6 +289,15 @@ unsigned long find_next_bit_le(const void *addr, unsigned
EXPORT_SYMBOL(find_next_bit_le);
#endif
+#ifndef find_prev_bit_le
+unsigned long find_prev_bit_le(const void *addr, unsigned
+ long size, unsigned long offset)
+{
+ return _find_prev_bit(addr, NULL, size, offset, 0UL, 1);
+}
+EXPORT_SYMBOL(find_prev_bit_le);
+#endif
+
#endif /* __BIG_ENDIAN */
unsigned long find_next_clump8(unsigned long *clump, const unsigned long *addr,
--
2.29.2
^ permalink raw reply related	[flat|nested] 1651+ messages in thread
* Re: [PATCH] lib/find_bit: Add find_prev_*_bit functions.
2020-12-02 1:10 [PATCH] lib/find_bit: Add find_prev_*_bit functions Yun Levi
@ 2020-12-02 9:47 ` Andy Shevchenko
2020-12-02 10:04 ` Rasmus Villemoes
0 siblings, 1 reply; 1651+ messages in thread
From: Andy Shevchenko @ 2020-12-02 9:47 UTC (permalink / raw)
To: Yun Levi
Cc: dushistov, arnd, akpm, gustavo, vilhelm.gray, richard.weiyang,
joseph.qi, skalluru, yury.norov, jpoimboe, linux-kernel,
linux-arch
On Wed, Dec 02, 2020 at 10:10:09AM +0900, Yun Levi wrote:
> Inspired by the find_next_*_bit function series, add a find_prev_*_bit series.
> I'm not sure whether it'll be used right now, but I'm adding these functions
> for future use.
This patch has few issues:
- it has more things than described (should be several patches instead)
- new functionality can be split logically to couple or more pieces as well
- it proposes functionality w/o user (dead code)
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] lib/find_bit: Add find_prev_*_bit functions.
2020-12-02 9:47 ` Andy Shevchenko
@ 2020-12-02 10:04 ` Rasmus Villemoes
2020-12-02 11:50 ` Yun Levi
0 siblings, 1 reply; 1651+ messages in thread
From: Rasmus Villemoes @ 2020-12-02 10:04 UTC (permalink / raw)
To: Andy Shevchenko, Yun Levi
Cc: dushistov, arnd, akpm, gustavo, vilhelm.gray, richard.weiyang,
joseph.qi, skalluru, yury.norov, jpoimboe, linux-kernel,
linux-arch
On 02/12/2020 10.47, Andy Shevchenko wrote:
> On Wed, Dec 02, 2020 at 10:10:09AM +0900, Yun Levi wrote:
>> Inspired by the find_next_*_bit function series, add a find_prev_*_bit series.
>> I'm not sure whether it'll be used right now, but I'm adding these functions
>> for future use.
>
> This patch has few issues:
> - it has more things than described (should be several patches instead)
> - new functionality can be split logically to couple or more pieces as well
> - it proposes functionality w/o user (dead code)
Yeah, the last point means it can't be applied - please submit it again
if and when you have an actual use case. And I'll add
- it lacks extension of the bitmap test module to cover the new
functions (that also wants to be a separate patch).
Rasmus
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] lib/find_bit: Add find_prev_*_bit functions.
2020-12-02 10:04 ` Rasmus Villemoes
@ 2020-12-02 11:50 ` Yun Levi
[not found] ` <CAAH8bW-jUeFVU-0OrJzK-MuGgKJgZv38RZugEQzFRJHSXFRRDA@mail.gmail.com>
0 siblings, 1 reply; 1651+ messages in thread
From: Yun Levi @ 2020-12-02 11:50 UTC (permalink / raw)
To: Rasmus Villemoes
Cc: Andy Shevchenko, dushistov, arnd, akpm, gustavo, vilhelm.gray,
richard.weiyang, joseph.qi, skalluru, yury.norov, jpoimboe,
linux-kernel, linux-arch
Thanks for the kind advice, but I have a few questions below:
> - it proposes functionality w/o user (dead code)
Actually, I added this series of functions to rewrite some of the
resource clean-up routines.
A typical case is the rollback label in ethtool_set_per_queue_coalesce().
Could this usage be an actual use case?
>- it lacks extension of the bitmap test module to cover the new
> functions (that also wants to be a separate patch).
I see. Then could I add some test cases to lib/test_bitops.c for testing?
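For what it's worth, here is a quick userspace sketch of the property such a
test case could check (illustrative only — not the actual test module API,
and it assumes 64-bit unsigned long): a word-wise find_prev_bit(), mirroring
the patch's _find_prev_bit() helper, must agree with a naive bit-by-bit
reference for every offset.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
/* Mask of the low (nbits % BITS_PER_LONG) bits, or all bits when nbits
 * is a multiple of BITS_PER_LONG (as BITMAP_LAST_WORD_MASK does). */
#define LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

/* Word-wise search for the highest set bit <= start; returns nbits
 * when there is none (same contract as the proposed find_prev_bit). */
static unsigned long find_prev_bit(const unsigned long *addr,
                                   unsigned long nbits, unsigned long start)
{
    unsigned long tmp, msb;

    if (start >= nbits)
        return nbits;

    /* First word: clear bits above 'start'. */
    tmp = addr[start / BITS_PER_LONG] & LAST_WORD_MASK(start + 1);
    start -= start % BITS_PER_LONG;

    while (!tmp) {
        start -= BITS_PER_LONG;       /* wraps past zero, ending the loop */
        if (start >= nbits)
            return nbits;
        tmp = addr[start / BITS_PER_LONG];
    }

    for (msb = BITS_PER_LONG - 1; !(tmp & (1UL << msb)); msb--)
        ;                             /* __fls() equivalent */
    return start + msb;
}

/* Naive bit-by-bit reference with the same contract. */
static unsigned long find_prev_bit_ref(const unsigned long *addr,
                                       unsigned long nbits, unsigned long start)
{
    unsigned long b;

    if (start >= nbits)
        return nbits;
    for (b = start + 1; b-- > 0; )
        if (addr[b / BITS_PER_LONG] & (1UL << (b % BITS_PER_LONG)))
            return b;
    return nbits;
}
```

A test_bitops-style case could then just run both over random bitmaps and
all offsets and compare the results.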
On Wed, Dec 2, 2020 at 7:04 PM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> On 02/12/2020 10.47, Andy Shevchenko wrote:
> > On Wed, Dec 02, 2020 at 10:10:09AM +0900, Yun Levi wrote:
> >> Inspired by the find_next_*_bit function series, add a find_prev_*_bit series.
> >> I'm not sure whether it'll be used right now, but I'm adding these functions
> >> for future use.
> >
> > This patch has few issues:
> > - it has more things than described (should be several patches instead)
> > - new functionality can be split logically to couple or more pieces as well
> > - it proposes functionality w/o user (dead code)
>
> Yeah, the last point means it can't be applied - please submit it again
> if and when you have an actual use case. And I'll add
>
> - it lacks extension of the bitmap test module to cover the new
> functions (that also wants to be a separate patch).
>
> Rasmus
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAGMNF6W8baS_zLYL8DwVsbfPWTP2ohzRB7xutW0X=MUzv93pbA@mail.gmail.com>]
* Re:
[not found] <CAGMNF6W8baS_zLYL8DwVsbfPWTP2ohzRB7xutW0X=MUzv93pbA@mail.gmail.com>
@ 2020-12-02 17:09 ` Kun Yi
0 siblings, 0 replies; 1651+ messages in thread
From: Kun Yi @ 2020-12-02 17:09 UTC (permalink / raw)
To: Kun Yi, Guenter Roeck, robh+dt, Venkatesh, Supreeth
Cc: OpenBMC Maillist, linux-hwmon, linux-kernel
My apologies for the super late reply. I was out for an extended
period of time due to personal circumstances.
I have now addressed most of the comments in the v4 series.
Also cc'ed Supreeth who works on the AMD System Manageability stack.
On Wed, Dec 2, 2020 at 8:57 AM Kun Yi <kunyi@google.com> wrote:
>
> On Sat, Apr 04, 2020 at 08:01:16PM -0700, Kun Yi wrote:
> > SB Temperature Sensor Interface (SB-TSI) is an SMBus compatible
> > interface that reports AMD SoC's Ttcl (normalized temperature),
> > and resembles a typical 8-pin remote temperature sensor's I2C interface
> > to BMC.
> >
> > This commit adds basic support using this interface to read CPU
> > temperature, and read/write high/low CPU temp thresholds.
> >
> > To instantiate this driver on an AMD CPU with SB-TSI
> > support, the i2c bus number would be the bus connected from the board
> > management controller (BMC) to the CPU. The i2c address is specified in
> > Section 6.3.1 of the spec [1]: The SB-TSI address is normally 98h for socket 0
> > and 90h for socket 1, but it could vary based on hardware address select pins.
> >
> > [1]: https://www.amd.com/system/files/TechDocs/56255_OSRR.pdf
> >
> > Test status: tested reading temp1_input, and reading/writing
> > temp1_max/min.
> >
> > Signed-off-by: Kun Yi <kunyi at google.com>
> > ---
> > drivers/hwmon/Kconfig | 10 ++
> > drivers/hwmon/Makefile | 1 +
> > drivers/hwmon/sbtsi_temp.c | 259 +++++++++++++++++++++++++++++++++++++
> > 3 files changed, 270 insertions(+)
> > create mode 100644 drivers/hwmon/sbtsi_temp.c
> >
> > diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> > index 05a30832c6ba..9585dcd01d1b 100644
> > --- a/drivers/hwmon/Kconfig
> > +++ b/drivers/hwmon/Kconfig
> > @@ -1412,6 +1412,16 @@ config SENSORS_RASPBERRYPI_HWMON
> > This driver can also be built as a module. If so, the module
> > will be called raspberrypi-hwmon.
> >
> > +config SENSORS_SBTSI
> > + tristate "Emulated SB-TSI temperature sensor"
> > + depends on I2C
> > + help
> > + If you say yes here you get support for emulated temperature
> > + sensors on AMD SoCs with SB-TSI interface connected to a BMC device.
> > +
> > + This driver can also be built as a module. If so, the module will
> > + be called sbtsi_temp.
> > +
> > config SENSORS_SHT15
> > tristate "Sensiron humidity and temperature sensors. SHT15 and compat."
> > depends on GPIOLIB || COMPILE_TEST
> > diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> > index b0b9c8e57176..cd109f003ce4 100644
> > --- a/drivers/hwmon/Makefile
> > +++ b/drivers/hwmon/Makefile
> > @@ -152,6 +152,7 @@ obj-$(CONFIG_SENSORS_POWR1220) += powr1220.o
> > obj-$(CONFIG_SENSORS_PWM_FAN) += pwm-fan.o
> > obj-$(CONFIG_SENSORS_RASPBERRYPI_HWMON) += raspberrypi-hwmon.o
> > obj-$(CONFIG_SENSORS_S3C) += s3c-hwmon.o
> > +obj-$(CONFIG_SENSORS_SBTSI) += sbtsi_temp.o
> > obj-$(CONFIG_SENSORS_SCH56XX_COMMON)+= sch56xx-common.o
> > obj-$(CONFIG_SENSORS_SCH5627) += sch5627.o
> > obj-$(CONFIG_SENSORS_SCH5636) += sch5636.o
> > diff --git a/drivers/hwmon/sbtsi_temp.c b/drivers/hwmon/sbtsi_temp.c
> > new file mode 100644
> > index 000000000000..e3ad6a9f7ec1
> > --- /dev/null
> > +++ b/drivers/hwmon/sbtsi_temp.c
> > @@ -0,0 +1,259 @@
> > +// SPDX-License-Identifier: GPL-2.0-or-later
> > +/*
> > + * sbtsi_temp.c - hwmon driver for a SBI Temperature Sensor Interface (SB-TSI)
> > + * compliant AMD SoC temperature device.
> > + *
> > + * Copyright (c) 2020, Google Inc.
> > + * Copyright (c) 2020, Kun Yi <kunyi at google.com>
> > + */
> > +
> > +#include <linux/err.h>
> > +#include <linux/i2c.h>
> > +#include <linux/init.h>
> > +#include <linux/hwmon.h>
> > +#include <linux/module.h>
> > +#include <linux/mutex.h>
> > +#include <linux/of_device.h>
> > +#include <linux/of.h>
> > +
> > +/*
> > + * SB-TSI registers only support SMBus byte data access. "_INT" registers are
> > + * the integer part of a temperature value or limit, and "_DEC" registers are
> > + * corresponding decimal parts.
> > + */
> > +#define SBTSI_REG_TEMP_INT 0x01 /* RO */
> > +#define SBTSI_REG_STATUS 0x02 /* RO */
> > +#define SBTSI_REG_CONFIG 0x03 /* RO */
> > +#define SBTSI_REG_TEMP_HIGH_INT 0x07 /* RW */
> > +#define SBTSI_REG_TEMP_LOW_INT 0x08 /* RW */
> > +#define SBTSI_REG_TEMP_DEC 0x10 /* RW */
> > +#define SBTSI_REG_TEMP_HIGH_DEC 0x13 /* RW */
> > +#define SBTSI_REG_TEMP_LOW_DEC 0x14 /* RW */
> > +#define SBTSI_REG_REV 0xFF /* RO */
>
> The revision register is not actually used.
Thanks. Removed. I agree that the register is not well documented, at
least publicly.
It shouldn't affect functionality of this driver, so I removed the
definition altogether.
>
> > +
> > +#define SBTSI_CONFIG_READ_ORDER_SHIFT 5
> > +
> > +#define SBTSI_TEMP_MIN 0
> > +#define SBTSI_TEMP_MAX 255875
> > +#define SBTSI_REV_MAX_VALID_ID 4
>
> Not actually used, and I am not sure if it would make sense to check it.
> If at all, it would only make sense if you also check SBTSIxFE (Manufacture
> ID). Unfortunately, the actual SB-TSI specification seems to be non-public,
> so I can't check if the driver as-is supports versions 0..3 (assuming those
> exist).
Thanks. Removed.
>
> > +
> > +/* Each client has this additional data */
> > +struct sbtsi_data {
> > + struct i2c_client *client;
> > + struct mutex lock;
> > +};
> > +
> > +/*
> > + * From SB-TSI spec: CPU temperature readings and limit registers encode the
> > + * temperature in increments of 0.125 from 0 to 255.875. The "high byte"
> > + * register encodes the base-2 of the integer portion, and the upper 3 bits of
> > + * the "low byte" encode in base-2 the decimal portion.
> > + *
> > + * e.g. INT=0x19, DEC=0x20 represents 25.125 degrees Celsius
> > + *
> > + * Therefore temperature in millidegree Celsius =
> > + * (INT + DEC / 256) * 1000 = (INT * 8 + DEC / 32) * 125
> > + */
> > +static inline int sbtsi_reg_to_mc(s32 integer, s32 decimal)
> > +{
> > + return ((integer << 3) + (decimal >> 5)) * 125;
> > +}
> > +
> > +/*
> > + * Inversely, given temperature in millidegree Celsius
> > + * INT = (TEMP / 125) / 8
> > + * DEC = ((TEMP / 125) % 8) * 32
> > + * Caller have to make sure temp doesn't exceed 255875, the max valid value.
> > + */
> > +static inline void sbtsi_mc_to_reg(s32 temp, u8 *integer, u8 *decimal)
> > +{
> > + temp /= 125;
> > + *integer = temp >> 3;
> > + *decimal = (temp & 0x7) << 5;
> > +}
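As a quick sanity check on the encoding described in the comments above, the two helpers round-trip exactly for any representable value (standalone copies of the conversion functions, kept here for illustration):

```c
#include <assert.h>

/*
 * Millidegree <-> register conversions as described in the patch:
 * temp_mc = (INT * 8 + DEC / 32) * 125, with DEC carrying the
 * fractional eighths of a degree in its top three bits.
 */
static int sbtsi_reg_to_mc(int integer, int decimal)
{
	return ((integer << 3) + (decimal >> 5)) * 125;
}

static void sbtsi_mc_to_reg(int temp, unsigned char *integer,
			    unsigned char *decimal)
{
	temp /= 125;
	*integer = (unsigned char)(temp >> 3);
	*decimal = (unsigned char)((temp & 0x7) << 5);
}
```

The spec's example of INT=0x19, DEC=0x20 maps to 25125 millidegrees Celsius and back.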
> > +
> > +static int sbtsi_read(struct device *dev, enum hwmon_sensor_types type,
> > + u32 attr, int channel, long *val)
> > +{
> > + struct sbtsi_data *data = dev_get_drvdata(dev);
> > + s32 temp_int, temp_dec;
> > + int err, reg_int, reg_dec;
> > + u8 read_order;
> > +
> > + if (type != hwmon_temp)
> > + return -EINVAL;
> > +
> > + read_order = 0;
> > + switch (attr) {
> > + case hwmon_temp_input:
> > + /*
> > + * ReadOrder bit specifies the reading order of integer and
> > + * decimal part of CPU temp for atomic reads. If bit == 0,
> > + * reading integer part triggers latching of the decimal part,
> > + * so integer part should be read first. If bit == 1, read
> > + * order should be reversed.
> > + */
> > + err = i2c_smbus_read_byte_data(data->client, SBTSI_REG_CONFIG);
> > + if (err < 0)
> > + return err;
> > +
> As I understand it, the idea is to set this configuration bit once and then
> just use it. Any chance to do that ? This would save an i2c read operation
> each time the temperature is read, and the if/else complexity below.
Unfortunately, the read-order register bit is read-only.
>
> > + read_order = (u8)err & BIT(SBTSI_CONFIG_READ_ORDER_SHIFT);
>
> Nit: typecast is unnecessary.
Done.
>
> > + reg_int = SBTSI_REG_TEMP_INT;
> > + reg_dec = SBTSI_REG_TEMP_DEC;
> > + break;
> > + case hwmon_temp_max:
> > + reg_int = SBTSI_REG_TEMP_HIGH_INT;
> > + reg_dec = SBTSI_REG_TEMP_HIGH_DEC;
> > + break;
> > + case hwmon_temp_min:
> > + reg_int = SBTSI_REG_TEMP_LOW_INT;
> > + reg_dec = SBTSI_REG_TEMP_LOW_DEC;
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + if (read_order == 0) {
> > + temp_int = i2c_smbus_read_byte_data(data->client, reg_int);
> > + temp_dec = i2c_smbus_read_byte_data(data->client, reg_dec);
> > + } else {
> > + temp_dec = i2c_smbus_read_byte_data(data->client, reg_dec);
> > + temp_int = i2c_smbus_read_byte_data(data->client, reg_int);
> > + }
>
> Just a thought: if you use regmap and tell it that the limit registers
> are non-volatile, this wouldn't actually read from the chip more than once.
That's a great suggestion, although in our normal use cases the limit
values are read and cached by the userspace application. Changing to
regmap would require some massaging of the code. Would it be acceptable
to keep the initial driver as-is and do that in a follow-up patch?
>
> Also, since the read involves reading two registers, and the first read
> locks the value for the second, you'll need mutex protection when reading
> the current temperature (not for limits, though).
Added mutex locking before/after the temp input reading.
>
> > +
> > + if (temp_int < 0)
> > + return temp_int;
> > + if (temp_dec < 0)
> > + return temp_dec;
> > +
> > + *val = sbtsi_reg_to_mc(temp_int, temp_dec);
> > +
> > + return 0;
> > +}
> > +
> > +static int sbtsi_write(struct device *dev, enum hwmon_sensor_types type,
> > + u32 attr, int channel, long val)
> > +{
> > + struct sbtsi_data *data = dev_get_drvdata(dev);
> > + int reg_int, reg_dec, err;
> > + u8 temp_int, temp_dec;
> > +
> > + if (type != hwmon_temp)
> > + return -EINVAL;
> > +
> > + switch (attr) {
> > + case hwmon_temp_max:
> > + reg_int = SBTSI_REG_TEMP_HIGH_INT;
> > + reg_dec = SBTSI_REG_TEMP_HIGH_DEC;
> > + break;
> > + case hwmon_temp_min:
> > + reg_int = SBTSI_REG_TEMP_LOW_INT;
> > + reg_dec = SBTSI_REG_TEMP_LOW_DEC;
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> > +
> > + val = clamp_val(val, SBTSI_TEMP_MIN, SBTSI_TEMP_MAX);
> > + mutex_lock(&data->lock);
> > + sbtsi_mc_to_reg(val, &temp_int, &temp_dec);
> > + err = i2c_smbus_write_byte_data(data->client, reg_int, temp_int);
> > + if (err)
> > + goto exit;
> > +
> > + err = i2c_smbus_write_byte_data(data->client, reg_dec, temp_dec);
> > +exit:
> > + mutex_unlock(&data->lock);
> > + return err;
> > +}
> > +
> > +static umode_t sbtsi_is_visible(const void *data,
> > + enum hwmon_sensor_types type,
> > + u32 attr, int channel)
> > +{
> > + switch (type) {
> > + case hwmon_temp:
> > + switch (attr) {
> > + case hwmon_temp_input:
> > + return 0444;
> > + case hwmon_temp_min:
> > + return 0644;
> > + case hwmon_temp_max:
> > + return 0644;
> > + }
> > + break;
> > + default:
> > + break;
> > + }
> > + return 0;
> > +}
> > +
> > +static const struct hwmon_channel_info *sbtsi_info[] = {
> > + HWMON_CHANNEL_INFO(chip,
> > + HWMON_C_REGISTER_TZ),
> > + HWMON_CHANNEL_INFO(temp,
> > + HWMON_T_INPUT | HWMON_T_MIN | HWMON_T_MAX),
>
> For your consideration: SB-TSI supports reporting high/low alerts.
> With this, it would be possible to implement respective alarm attributes.
> In conjunction with https://patchwork.kernel.org/patch/11277347/mbox/,
> it should also be possible to add interrupt and thus userspace notification
> for those attributes.
>
> SBTSI also supports setting the update rate (SBTSIx04) and setting
> the temperature offset (SBTSIx11, SBTSIx12), which could also be
> implemented as standard attributes.
>
> I won't require that for the initial version, just something to keep
> in mind.
Ack and thanks for the suggestions. I will keep in mind for future improvements.
>
> > + NULL
> > +};
> > +
> > +static const struct hwmon_ops sbtsi_hwmon_ops = {
> > + .is_visible = sbtsi_is_visible,
> > + .read = sbtsi_read,
> > + .write = sbtsi_write,
> > +};
> > +
> > +static const struct hwmon_chip_info sbtsi_chip_info = {
> > + .ops = &sbtsi_hwmon_ops,
> > + .info = sbtsi_info,
> > +};
> > +
> > +static int sbtsi_probe(struct i2c_client *client,
> > + const struct i2c_device_id *id)
> > +{
> > + struct device *dev = &client->dev;
> > + struct device *hwmon_dev;
> > + struct sbtsi_data *data;
> > +
> > + data = devm_kzalloc(dev, sizeof(struct sbtsi_data), GFP_KERNEL);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + data->client = client;
> > + mutex_init(&data->lock);
> > +
> > + hwmon_dev =
> > + devm_hwmon_device_register_with_info(dev, client->name, data,
> > + &sbtsi_chip_info, NULL);
> > +
> > + return PTR_ERR_OR_ZERO(hwmon_dev);
> > +}
> > +
> > +static const struct i2c_device_id sbtsi_id[] = {
> > + {"sbtsi", 0},
> > + {}
> > +};
> > +MODULE_DEVICE_TABLE(i2c, sbtsi_id);
> > +
> > +static const struct of_device_id __maybe_unused sbtsi_of_match[] = {
> > + {
> > + .compatible = "amd,sbtsi",
> > + },
> > + { },
> > +};
> > +MODULE_DEVICE_TABLE(of, sbtsi_of_match);
> > +
> > +static struct i2c_driver sbtsi_driver = {
> > + .class = I2C_CLASS_HWMON,
> > + .driver = {
> > + .name = "sbtsi",
> > + .of_match_table = of_match_ptr(sbtsi_of_match),
> > + },
> > + .probe = sbtsi_probe,
> > + .id_table = sbtsi_id,
> > +};
> > +
> > +module_i2c_driver(sbtsi_driver);
> > +
> > +MODULE_AUTHOR("Kun Yi <kunyi at google.com>");
> > +MODULE_DESCRIPTION("Hwmon driver for AMD SB-TSI emulated sensor");
> > +MODULE_LICENSE("GPL");
--
Regards,
Kun
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2021-01-08 10:32 Misono Tomohiro
2021-01-08 12:30 ` Re: Arnd Bergmann
0 siblings, 1 reply; 1651+ messages in thread
From: Misono Tomohiro @ 2021-01-08 10:32 UTC (permalink / raw)
To: linux-arm-kernel, soc; +Cc: will, catalin.marinas, arnd, olof, misono.tomohiro
Subject: [RFC PATCH 00/10] Add Fujitsu A64FX soc entry/hardware barrier driver
Hello,
This series adds Fujitsu A64FX SoC entry in drivers/soc and hardware
barrier driver for it.
[Driver Description]
The A64FX CPU has several functions for HPC workloads, and the hardware
barrier is one of them. It is a mechanism that realizes fast
synchronization among PEs belonging to the same L3 cache domain, using
implementation-defined hardware registers.
For more details, see the A64FX HPC extension specification at
https://github.com/fujitsu/A64FX
The driver mainly offers a set of ioctls to manipulate related registers.
Patches 1-9 implement the driver code, and patch 10 adds the Kconfig,
Makefile, and MAINTAINERS entries for the driver.
Also, a C library and test program for this driver are available at:
https://github.com/fujitsu/hardware_barrier
The driver is based on v5.11-rc2 and tested in an FX700 environment.
[RFC]
This is the first time we are upstreaming drivers for our chip, and I
want to confirm the driver location and the patch submission process.
Based on my observation, the drivers/soc folder seems to be the right
place for this driver, so I added a Kconfig entry to the arm64 platform
config, created the soc/fujitsu folder, and updated the MAINTAINERS
entry accordingly (last patch). Is that right?
Also, for the final submission, I think I need to 1) create a public git
tree to push the driver code to (GitHub or similar), and 2) send a pull
request to the SoC team (soc@kernel.org). Is that the correct procedure?
I would appreciate any help/comments.
Side note: we plan to post other drivers for the A64FX HPC extension
(prefetch control and cache control) soon as well.
Misono Tomohiro (10):
soc: fujitsu: hwb: Add hardware barrier driver init/exit code
soc: fujtisu: hwb: Add open operation
soc: fujitsu: hwb: Add IOC_BB_ALLOC ioctl
soc: fujitsu: hwb: Add IOC_BW_ASSIGN ioctl
soc: fujitsu: hwb: Add IOC_BW_UNASSIGN ioctl
soc: fujitsu: hwb: Add IOC_BB_FREE ioctl
soc: fujitsu: hwb: Add IOC_GET_PE_INFO ioctl
soc: fujitsu: hwb: Add release operation
soc: fujitsu: hwb: Add sysfs entry
soc: fujitsu: hwb: Add Kconfig/Makefile to build fujitsu_hwb driver
MAINTAINERS | 7 +
arch/arm64/Kconfig.platforms | 5 +
drivers/soc/Kconfig | 1 +
drivers/soc/Makefile | 1 +
drivers/soc/fujitsu/Kconfig | 24 +
drivers/soc/fujitsu/Makefile | 2 +
drivers/soc/fujitsu/fujitsu_hwb.c | 1253 ++++++++++++++++++++++++
include/uapi/linux/fujitsu_hpc_ioctl.h | 41 +
8 files changed, 1334 insertions(+)
create mode 100644 drivers/soc/fujitsu/Kconfig
create mode 100644 drivers/soc/fujitsu/Makefile
create mode 100644 drivers/soc/fujitsu/fujitsu_hwb.c
create mode 100644 include/uapi/linux/fujitsu_hpc_ioctl.h
--
2.26.2
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2021-01-08 12:30 ` Arnd Bergmann
0 siblings, 0 replies; 1651+ messages in thread
From: Arnd Bergmann @ 2021-01-08 12:30 UTC (permalink / raw)
To: Misono Tomohiro
Cc: Linux ARM, SoC Team, Will Deacon, Catalin Marinas, Arnd Bergmann,
Olof Johansson
On Fri, Jan 8, 2021 at 11:32 AM Misono Tomohiro
<misono.tomohiro@jp.fujitsu.com> wrote:
> Subject: [RFC PATCH 00/10] Add Fujitsu A64FX soc entry/hardware barrier driver
> [RFC]
> This is the first time we upstream drivers for our chip and I want to
> confirm driver location and patch submission process.
>
> Based on my observation, the drivers/soc folder seems to be the right
> place for this driver, so I added a Kconfig entry to the arm64 platform
> config, created the soc/fujitsu folder, and updated the MAINTAINERS
> entry accordingly (last patch). Is that right?
This looks good as a start. It may be possible that during review, we
come up with a different location or a different user interface that may
change the code, but if it stays in drivers/soc/fujitsu, then the other
steps are absolutely right.
> Also, for the final submission, I think I need to 1) create a public git
> tree to push the driver code to (GitHub or similar), and 2) send a pull
> request to the SoC team (soc@kernel.org). Is that the correct procedure?
Yes. I would prefer something other than github, e.g. an account
on a fujitsu.com host, on kernel.org, or on git.linaro.org, but github
works if none of the alternatives are easy for you.
When you send a pull request, make sure you sign the tag with
a gpg key, ideally after getting it on the kernel.org keyring [1].
Arnd
[1] https://korg.docs.kernel.org/pgpkeys.html
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
@ 2021-01-08 12:30 ` Arnd Bergmann
0 siblings, 0 replies; 1651+ messages in thread
From: Arnd Bergmann @ 2021-01-08 12:30 UTC (permalink / raw)
To: Misono Tomohiro
Cc: Arnd Bergmann, Catalin Marinas, SoC Team, Olof Johansson,
Will Deacon, Linux ARM
On Fri, Jan 8, 2021 at 11:32 AM Misono Tomohiro
<misono.tomohiro@jp.fujitsu.com> wrote:
> Subject: [RFC PATCH 00/10] Add Fujitsu A64FX soc entry/hardware barrier driver
> [RFC]
> This is the first time we upstream drivers for our chip and I want to
> confirm driver location and patch submission process.
>
> Based on my observation it seems drivers/soc folder is right place to put
> this driver, so I added Kconfig entry for arm64 platform config, created
> soc/fujitsu folder and updated MAINTAINER entry accordingly (last patch).
> Is it right?
This looks good as a start. It may be possible that during review, we
come up with a different location or a different user interface that may
change the code, but if it stays in drivers/soc/fujitsu, then the other
steps are absolutely right.
> Also, for the final submission I think I need to 1) create some public git
> tree to push the driver code to (github or something), and 2) make a pull request to
> the SOC team (soc@kernel.org). Is that the correct procedure?
Yes. I would prefer something other than github, e.g. an account
on a fujitsu.com host, on kernel.org, or on git.linaro.org, but github
works if none of the alternatives are easy for you.
When you send a pull request, make sure you sign the tag with
a gpg key, ideally after getting it on the kernel.org keyring [1].
Arnd
[1] https://korg.docs.kernel.org/pgpkeys.html
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
@ 2021-01-08 10:35 misono.tomohiro
0 siblings, 0 replies; 1651+ messages in thread
From: misono.tomohiro @ 2021-01-08 10:35 UTC (permalink / raw)
To: misono.tomohiro@fujitsu.com,
'linux-arm-kernel@lists.infradead.org',
'soc@kernel.org'
Cc: 'will@kernel.org', 'catalin.marinas@arm.com',
'arnd@arndb.de', 'olof@lixom.net'
Sorry, I failed to add a proper subject to the cover letter, so it does not show up in the lore archive.
I will resend the whole series. Please discard this one.
Tomohiro
> -----Original Message-----
> From: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
> Sent: Friday, January 8, 2021 7:32 PM
> To: linux-arm-kernel@lists.infradead.org; soc@kernel.org
> Cc: will@kernel.org; catalin.marinas@arm.com; arnd@arndb.de; olof@lixom.net; Misono, Tomohiro/味曽野 智礼
> <misono.tomohiro@fujitsu.com>
> Subject:
>
> Subject: [RFC PATCH 00/10] Add Fujitsu A64FX soc entry/hardware barrier driver
>
> Hello,
>
> This series adds a Fujitsu A64FX SoC entry in drivers/soc and a hardware barrier driver for it.
>
> [Driver Description]
> The A64FX CPU has several functions for HPC workloads, and the hardware barrier is one of them. It is a mechanism that
> realizes fast synchronization among PEs belonging to the same L3 cache domain using implementation-defined hardware
> registers.
> For more details, see the A64FX HPC extension specification at https://github.com/fujitsu/A64FX
>
> The driver mainly offers a set of ioctls to manipulate the related registers.
> Patches 1-9 implement the driver code, and patch 10 finally adds the Kconfig, Makefile and MAINTAINERS entries for the driver.
>
> Also, a C library and a test program for this driver are available at:
> https://github.com/fujitsu/hardware_barrier
>
> The driver is based on v5.11-rc2 and tested in an FX700 environment.
>
> [RFC]
> This is the first time we have upstreamed drivers for our chip and I want to confirm the driver location and the patch
> submission process.
>
> Based on my observation it seems the drivers/soc folder is the right place to put this driver, so I added a Kconfig entry
> to the arm64 platform config, created a soc/fujitsu folder and updated the MAINTAINERS entry accordingly (last patch).
> Is it right?
>
> Also, for the final submission I think I need to 1) create some public git tree to push the driver code to (github or
> something), and 2) make a pull request to the SOC team (soc@kernel.org). Is that the correct procedure?
>
> I would appreciate any help/comments.
>
> Sidenote: We plan to post other drivers for the A64FX HPC extension (prefetch control and cache control) sometime soon as well.
>
> Misono Tomohiro (10):
> soc: fujitsu: hwb: Add hardware barrier driver init/exit code
> soc: fujtisu: hwb: Add open operation
> soc: fujitsu: hwb: Add IOC_BB_ALLOC ioctl
> soc: fujitsu: hwb: Add IOC_BW_ASSIGN ioctl
> soc: fujitsu: hwb: Add IOC_BW_UNASSIGN ioctl
> soc: fujitsu: hwb: Add IOC_BB_FREE ioctl
> soc: fujitsu: hwb: Add IOC_GET_PE_INFO ioctl
> soc: fujitsu: hwb: Add release operation
> soc: fujitsu: hwb: Add sysfs entry
> soc: fujitsu: hwb: Add Kconfig/Makefile to build fujitsu_hwb driver
>
> MAINTAINERS | 7 +
> arch/arm64/Kconfig.platforms | 5 +
> drivers/soc/Kconfig | 1 +
> drivers/soc/Makefile | 1 +
> drivers/soc/fujitsu/Kconfig | 24 +
> drivers/soc/fujitsu/Makefile | 2 +
> drivers/soc/fujitsu/fujitsu_hwb.c | 1253 ++++++++++++++++++++++++
> include/uapi/linux/fujitsu_hpc_ioctl.h | 41 +
> 8 files changed, 1334 insertions(+)
> create mode 100644 drivers/soc/fujitsu/Kconfig create mode 100644 drivers/soc/fujitsu/Makefile create mode
> 100644 drivers/soc/fujitsu/fujitsu_hwb.c create mode 100644 include/uapi/linux/fujitsu_hpc_ioctl.h
>
> --
> 2.26.2
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <w2q9lf-sait7s-qswxlnzeof4i-7j13q0-zgu9pt-xk3x5enp994p-kewn2p-o86qyug0mutj-91m157sheva0-4k2l8v20kyjp-heu04baxqdc7op987-9zc0bxi0jcgo-wyl26layz5p9-esqncc-g48ass.1610618007875@email.android.com>]
* Re:
[not found] <w2q9lf-sait7s-qswxlnzeof4i-7j13q0-zgu9pt-xk3x5enp994p-kewn2p-o86qyug0mutj-91m157sheva0-4k2l8v20kyjp-heu04baxqdc7op987-9zc0bxi0jcgo-wyl26layz5p9-esqncc-g48ass.1610618007875@email.android.com>
@ 2021-01-14 10:09 ` Alexander Kapshuk
0 siblings, 0 replies; 1651+ messages in thread
From: Alexander Kapshuk @ 2021-01-14 10:09 UTC (permalink / raw)
To: bigbird2444@163.com; +Cc: kernelnewbies@kernelnewbies.org
On Thu, Jan 14, 2021 at 11:54 AM bigbird2444@163.com
<bigbird2444@163.com> wrote:
>
> On Thu, Jan 14, 2021 at 8:01 AM Alexander Kapshuk
> <alexander.kapshuk@gmail.com> wrote:
> >
> > On Thu, Jan 14, 2021 at 8:14 AM bigbird2444@163.com <bigbird2444@163.com> wrote:
> > >
> > >
> > > I've just joined the newbies mailing list. How do I join other mailing lists? I'd like to see what other people are discussing.
> > >
> >
> > Not sure what other lists you were referring to, but you may want to
> > check out these mailing lists, http://vger.kernel.org/vger-lists.html,
> > and see if that's what you were after.
>
> >If you just would like to read the mails on >the different mailing
> >list, you do not need to subscribe.
>
> >You can find all emails at >https://lore.kernel.org/lists.html, just
> >look into the various mailing lists and see >what is of interest to
> >you.
>
> >Lukas
>
>
> Thank you, how do I subscribe to other mailing lists?
>
> Liang Peng
>
>
You subscribe by clicking on the link for the mailing list of interest, e.g.
linux-next, http://vger.kernel.org/vger-lists.html#linux-next,
and then clicking on the subscribe link, which launches your
email client, if available, with majordomo@vger.kernel.org as the
recipient and the following email body:
subscribe name-of-mailing-list
Alternatively, you could simply send the subscription request above
using an email client of your preference.
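The steps above can be sketched from the command line as well; `linux-next` here is only an example list name, and the final send step is shown commented out since it requires a configured MTA:

```shell
# Compose the majordomo command: the command goes in the message body,
# not the subject. "linux-next" is just an example list name.
list="linux-next"
body="subscribe ${list}"
printf '%s\n' "$body"
# To actually send it (requires mail(1) and a configured MTA):
#   printf '%s\n' "$body" | mail majordomo@vger.kernel.org
```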
_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2021-01-19 0:10 David Howells
2021-01-20 14:46 ` Jarkko Sakkinen
0 siblings, 1 reply; 1651+ messages in thread
From: David Howells @ 2021-01-19 0:10 UTC (permalink / raw)
To: torvalds
Cc: Tobias Markus, Tianjia Zhang, dhowells, keyrings, linux-crypto,
linux-security-module, stable, linux-kernel
From: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
On the following call path, `sig->pkey_algo` is not assigned
in asymmetric_key_verify_signature(), which causes a runtime
crash in public_key_verify_signature().
keyctl_pkey_verify
asymmetric_key_verify_signature
verify_signature
public_key_verify_signature
This patch simply checks for this situation and fixes the crash
caused by the NULL pointer.
Fixes: 215525639631 ("X.509: support OSCCA SM2-with-SM3 certificate verification")
Reported-by: Tobias Markus <tobias@markus-regensburg.de>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Tested-by: João Fonseca <jpedrofonseca@ua.pt>
Cc: stable@vger.kernel.org # v5.10+
---
crypto/asymmetric_keys/public_key.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
index 8892908ad58c..788a4ba1e2e7 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -356,7 +356,8 @@ int public_key_verify_signature(const struct public_key *pkey,
if (ret)
goto error_free_key;
- if (strcmp(sig->pkey_algo, "sm2") == 0 && sig->data_size) {
+ if (sig->pkey_algo && strcmp(sig->pkey_algo, "sm2") == 0 &&
+ sig->data_size) {
ret = cert_sig_digest_update(sig, tfm);
if (ret)
goto error_free_key;
^ permalink raw reply related [flat|nested] 1651+ messages in thread* Re:
2021-01-19 0:10 David Howells
@ 2021-01-20 14:46 ` Jarkko Sakkinen
0 siblings, 0 replies; 1651+ messages in thread
From: Jarkko Sakkinen @ 2021-01-20 14:46 UTC (permalink / raw)
To: David Howells
Cc: torvalds, Tobias Markus, Tianjia Zhang, keyrings, linux-crypto,
linux-security-module, stable, linux-kernel
On Tue, Jan 19, 2021 at 12:10:33AM +0000, David Howells wrote:
>
> From: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
>
> On the following call path, `sig->pkey_algo` is not assigned
> in asymmetric_key_verify_signature(), which causes a runtime
> crash in public_key_verify_signature().
>
> keyctl_pkey_verify
> asymmetric_key_verify_signature
> verify_signature
> public_key_verify_signature
>
> This patch simply checks for this situation and fixes the crash
> caused by the NULL pointer.
>
> Fixes: 215525639631 ("X.509: support OSCCA SM2-with-SM3 certificate verification")
> Reported-by: Tobias Markus <tobias@markus-regensburg.de>
> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Reviewed-and-tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Tested-by: João Fonseca <jpedrofonseca@ua.pt>
> Cc: stable@vger.kernel.org # v5.10+
> ---
For what it's worth
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
/Jarkko
>
> crypto/asymmetric_keys/public_key.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
> index 8892908ad58c..788a4ba1e2e7 100644
> --- a/crypto/asymmetric_keys/public_key.c
> +++ b/crypto/asymmetric_keys/public_key.c
> @@ -356,7 +356,8 @@ int public_key_verify_signature(const struct public_key *pkey,
> if (ret)
> goto error_free_key;
>
> - if (strcmp(sig->pkey_algo, "sm2") == 0 && sig->data_size) {
> + if (sig->pkey_algo && strcmp(sig->pkey_algo, "sm2") == 0 &&
> + sig->data_size) {
> ret = cert_sig_digest_update(sig, tfm);
> if (ret)
> goto error_free_key;
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAMCTd2kkax9P-OFNHYYz8nKuaKOOkz-zoJ7h2nZ6maUGmjXC-g@mail.gmail.com>]
* (no subject)
@ 2021-04-05 0:01 Mitali Borkar
2021-04-06 7:03 ` Arnd Bergmann
0 siblings, 1 reply; 1651+ messages in thread
From: Mitali Borkar @ 2021-04-05 0:01 UTC (permalink / raw)
To: manish, GR-Linux-NIC-Dev, gregkh; +Cc: linux-staging, linux-kernel
outreachy-kernel@googlegroups.com, mitaliborkar810@gmail.com
Bcc:
Subject: [PATCH] staging: qlge:remove else after break
Reply-To:
Fixed warning: else is not needed after break.
break terminates the loop if encountered; the else is unnecessary and
increases indentation.
Signed-off-by: Mitali Borkar <mitaliborkar810@gmail.com>
---
drivers/staging/qlge/qlge_mpi.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/staging/qlge/qlge_mpi.c b/drivers/staging/qlge/qlge_mpi.c
index 2630ebf50341..3a49f187203b 100644
--- a/drivers/staging/qlge/qlge_mpi.c
+++ b/drivers/staging/qlge/qlge_mpi.c
@@ -935,13 +935,11 @@ static int qlge_idc_wait(struct qlge_adapter *qdev)
netif_err(qdev, drv, qdev->ndev, "IDC Success.\n");
status = 0;
break;
- } else {
- netif_err(qdev, drv, qdev->ndev,
+ } netif_err(qdev, drv, qdev->ndev,
"IDC: Invalid State 0x%.04x.\n",
mbcp->mbox_out[0]);
status = -EIO;
break;
- }
}
return status;
--
2.30.2
^ permalink raw reply related [flat|nested] 1651+ messages in thread* Re:
2021-04-05 0:01 Mitali Borkar
@ 2021-04-06 7:03 ` Arnd Bergmann
0 siblings, 0 replies; 1651+ messages in thread
From: Arnd Bergmann @ 2021-04-06 7:03 UTC (permalink / raw)
To: Mitali Borkar
Cc: manish, GR-Linux-NIC-Dev, gregkh, linux-staging,
Linux Kernel Mailing List
On Mon, Apr 5, 2021 at 2:03 AM Mitali Borkar <mitaliborkar810@gmail.com> wrote:
>
> outreachy-kernel@googlegroups.com, mitaliborkar810@gmail.com
> Bcc:
> Subject: [PATCH] staging: qlge:remove else after break
> Reply-To:
>
> Fixed warning: else is not needed after break.
> break terminates the loop if encountered; the else is unnecessary and
> increases indentation.
>
> Signed-off-by: Mitali Borkar <mitaliborkar810@gmail.com>
> ---
> drivers/staging/qlge/qlge_mpi.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/staging/qlge/qlge_mpi.c b/drivers/staging/qlge/qlge_mpi.c
> index 2630ebf50341..3a49f187203b 100644
> --- a/drivers/staging/qlge/qlge_mpi.c
> +++ b/drivers/staging/qlge/qlge_mpi.c
> @@ -935,13 +935,11 @@ static int qlge_idc_wait(struct qlge_adapter *qdev)
> netif_err(qdev, drv, qdev->ndev, "IDC Success.\n");
> status = 0;
> break;
> - } else {
> - netif_err(qdev, drv, qdev->ndev,
> + } netif_err(qdev, drv, qdev->ndev,
> "IDC: Invalid State 0x%.04x.\n",
> mbcp->mbox_out[0]);
> status = -EIO;
> break;
> - }
> }
It looks like you got this one wrong in multiple ways:
- This is not an equivalent transformation, since the error is now
printed in the first part of the 'if()' block as well.
- The indentation is wrong now, with the netif_err() starting in the
same line as the '}'.
- The description mentions a change in indentation, but you did not
actually change it.
- The changelog text appears mangled.
Arnd
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2021-04-05 21:12 David Villasana Jiménez
2021-04-06 5:17 ` Greg KH
0 siblings, 1 reply; 1651+ messages in thread
From: David Villasana Jiménez @ 2021-04-05 21:12 UTC (permalink / raw)
To: mchehab; +Cc: linux-media, linux-staging
linux-kernel@vger.kernel.org, outreachy-kernel@googlegroups.com
Bcc:
Subject: [PATCH] staging: media: atomisp: i2c: Fix alignment to match open
parenthesis
Reply-To:
Change alignment of arguments in the function
__gc0310_write_reg_is_consecutive() to match open parenthesis. Issue found
by checkpatch.pl
Signed-off-by: David Villasana Jiménez <davidvillasana14@gmail.com>
---
drivers/staging/media/atomisp/i2c/atomisp-gc0310.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/staging/media/atomisp/i2c/atomisp-gc0310.c b/drivers/staging/media/atomisp/i2c/atomisp-gc0310.c
index 2b71de722ec3..6be3ee1d93a5 100644
--- a/drivers/staging/media/atomisp/i2c/atomisp-gc0310.c
+++ b/drivers/staging/media/atomisp/i2c/atomisp-gc0310.c
@@ -192,8 +192,8 @@ static int __gc0310_buf_reg_array(struct i2c_client *client,
}
static int __gc0310_write_reg_is_consecutive(struct i2c_client *client,
- struct gc0310_write_ctrl *ctrl,
- const struct gc0310_reg *next)
+ struct gc0310_write_ctrl *ctrl,
+ const struct gc0310_reg *next)
{
if (ctrl->index == 0)
return 1;
--
2.30.2
^ permalink raw reply related [flat|nested] 1651+ messages in thread* (no subject)
@ 2021-04-15 13:41 Emmanuel Blot
2021-04-15 16:07 ` Palmer Dabbelt
2021-04-15 22:27 ` Re: Alistair Francis
0 siblings, 2 replies; 1651+ messages in thread
From: Emmanuel Blot @ 2021-04-15 13:41 UTC (permalink / raw)
To: qemu-riscv
Cc: Emmanuel Blot, Palmer Dabbelt, Alistair Francis, Sagar Karandikar,
Bastian Koppelmann
Date: Tue, 13 Apr 2021 18:01:52 +0200
Subject: [PATCH] target/riscv: fix exception index on instruction access fault
When no MMU is used and the guest code attempts to fetch an instruction
from an invalid memory location, the exception index defaults to a data
load access fault, rather than an instruction access fault.
Signed-off-by: Emmanuel Blot <emmanuel.blot@sifive.com>
---
target/riscv/cpu_helper.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 21c54ef5613..4e107b1bd23 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -691,8 +691,10 @@ void riscv_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
if (access_type == MMU_DATA_STORE) {
cs->exception_index = RISCV_EXCP_STORE_AMO_ACCESS_FAULT;
- } else {
+ } else if (access_type == MMU_DATA_LOAD) {
cs->exception_index = RISCV_EXCP_LOAD_ACCESS_FAULT;
+ } else {
+ cs->exception_index = RISCV_EXCP_INST_ACCESS_FAULT;
}
env->badaddr = addr;
--
2.31.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread* Re:
2021-04-15 13:41 Emmanuel Blot
@ 2021-04-15 16:07 ` Palmer Dabbelt
2021-04-15 22:27 ` Re: Alistair Francis
1 sibling, 0 replies; 1651+ messages in thread
From: Palmer Dabbelt @ 2021-04-15 16:07 UTC (permalink / raw)
To: emmanuel.blot
Cc: qemu-riscv, emmanuel.blot, Alistair Francis, sagark,
Bastian Koppelmann
On Thu, 15 Apr 2021 06:41:29 PDT (-0700), emmanuel.blot@sifive.com wrote:
> Date: Tue, 13 Apr 2021 18:01:52 +0200
> Subject: [PATCH] target/riscv: fix exception index on instruction access fault
>
> When no MMU is used and the guest code attempts to fetch an instruction
> from an invalid memory location, the exception index defaults to a data
> load access fault, rather than an instruction access fault.
>
> Signed-off-by: Emmanuel Blot <emmanuel.blot@sifive.com>
>
> ---
> target/riscv/cpu_helper.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 21c54ef5613..4e107b1bd23 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -691,8 +691,10 @@ void riscv_cpu_do_transaction_failed(CPUState *cs, hwaddr physaddr,
>
> if (access_type == MMU_DATA_STORE) {
> cs->exception_index = RISCV_EXCP_STORE_AMO_ACCESS_FAULT;
> - } else {
> + } else if (access_type == MMU_DATA_LOAD) {
> cs->exception_index = RISCV_EXCP_LOAD_ACCESS_FAULT;
> + } else {
> + cs->exception_index = RISCV_EXCP_INST_ACCESS_FAULT;
> }
>
> env->badaddr = addr;
Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2021-04-15 13:41 Emmanuel Blot
2021-04-15 16:07 ` Palmer Dabbelt
@ 2021-04-15 22:27 ` Alistair Francis
1 sibling, 0 replies; 1651+ messages in thread
From: Alistair Francis @ 2021-04-15 22:27 UTC (permalink / raw)
To: qemu-riscv@nongnu.org, emmanuel.blot@sifive.com
Cc: palmer@dabbelt.com, kbastian@mail.uni-paderborn.de,
sagark@eecs.berkeley.edu
On Thu, 2021-04-15 at 15:41 +0200, Emmanuel Blot wrote:
> Date: Tue, 13 Apr 2021 18:01:52 +0200
> Subject: [PATCH] target/riscv: fix exception index on instruction
> access fault
>
> When no MMU is used and the guest code attempts to fetch an instruction
> from an invalid memory location, the exception index defaults to a data
> load access fault, rather than an instruction access fault.
>
> Signed-off-by: Emmanuel Blot <emmanuel.blot@sifive.com>
Thanks for the patch. Can you send it to the QEMU mailing list?
qemu-devel@nongnu.org
Alistair
>
> ---
> target/riscv/cpu_helper.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 21c54ef5613..4e107b1bd23 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -691,8 +691,10 @@ void riscv_cpu_do_transaction_failed(CPUState *cs,
> hwaddr physaddr,
>
> if (access_type == MMU_DATA_STORE) {
> cs->exception_index = RISCV_EXCP_STORE_AMO_ACCESS_FAULT;
> - } else {
> + } else if (access_type == MMU_DATA_LOAD) {
> cs->exception_index = RISCV_EXCP_LOAD_ACCESS_FAULT;
> + } else {
> + cs->exception_index = RISCV_EXCP_INST_ACCESS_FAULT;
> }
>
> env->badaddr = addr;
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <b84772b0-e009-3b68-4e74-525ad8531f95@gmail.com>]
* Re:
[not found] <b84772b0-e009-3b68-4e74-525ad8531f95@gmail.com>
@ 2021-04-23 13:57 ` Ivan Koveshnikov
2021-04-23 20:35 ` Re: Kajetan Puchalski
0 siblings, 1 reply; 1651+ messages in thread
From: Ivan Koveshnikov @ 2021-04-23 13:57 UTC (permalink / raw)
To: Kajetan Puchalski; +Cc: rust-for-linux
Hi Kajetan,
You need to send a `subscribe rust-for-linux` command to
<majordomo@vger.kernel.org>, not to the mailing list itself.
The http://vger.kernel.org/vger-lists.html page contains links that
prepare an email message with the correct format and address.
Best regards,
Ivan Koveshnikov
On Fri, 23 Apr 2021 at 02:53, Kajetan Puchalski <mrkajetanp@gmail.com> wrote:
>
> subscribe
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2021-04-23 13:57 ` Re: Ivan Koveshnikov
@ 2021-04-23 20:35 ` Kajetan Puchalski
0 siblings, 0 replies; 1651+ messages in thread
From: Kajetan Puchalski @ 2021-04-23 20:35 UTC (permalink / raw)
To: Ivan Koveshnikov; +Cc: rust-for-linux
On 4/23/21 2:57 PM, Ivan Koveshnikov wrote:
> Hi Kajetan,
>
> You need to send a `subscribe rust-for-linux` command to
> <majordomo@vger.kernel.org>, not to the mailing list itself.
> The http://vger.kernel.org/vger-lists.html page contains links that
> prepare an email message with the correct format and address.
>
> Best regards,
> Ivan Koveshnikov
>
>
> On Fri, 23 Apr 2021 at 02:53, Kajetan Puchalski <mrkajetanp@gmail.com> wrote:
>>
>> subscribe
Hi Ivan,
Thank you, apologies, I realised that I wasn't supposed to do that
literally right after I had sent the email.
Regards,
Kajetan
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAJr+-6ZR2oH0J4D_Ou13JvX8HLUUK=MKQwD0Kn53cmvAuT99bg@mail.gmail.com>]
* (no subject)
@ 2021-05-15 22:57 Dmitry Baryshkov
2021-06-02 21:45 ` Re: Dmitry Baryshkov
0 siblings, 1 reply; 1651+ messages in thread
From: Dmitry Baryshkov @ 2021-05-15 22:57 UTC (permalink / raw)
To: Bjorn Andersson, Rob Clark, Sean Paul, Abhinav Kumar
Cc: Jonathan Marek, Stephen Boyd, David Airlie, Daniel Vetter,
linux-arm-msm, dri-devel, freedreno
From Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # This line is ignored.
From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reply-To:
Subject: [PATCH v2 0/6] drm/msm/dpu: simplify RM code
In-Reply-To:
There is no need to request most of the hardware blocks through the resource
manager (RM), since there is typically a 1:1 or N:1 relationship between
corresponding blocks. Each LM is tied to a single PP. Each MERGE_3D
can be used by a specified pair of PPs. Each DSPP is also tied to a
single LM. So instead of allocating them through the RM, get them via
static configuration.
Depends on: https://lore.kernel.org/linux-arm-msm/20210515190909.1809050-1-dmitry.baryshkov@linaro.org
Changes since v1:
- Split into separate patch series to ease review.
----------------------------------------------------------------
Dmitry Baryshkov (6):
drm/msm/dpu: get DSPP blocks directly rather than through RM
drm/msm/dpu: get MERGE_3D blocks directly rather than through RM
drm/msm/dpu: get PINGPONG blocks directly rather than through RM
drm/msm/dpu: get INTF blocks directly rather than through RM
drm/msm/dpu: drop unused lm_max_width from RM
drm/msm/dpu: simplify peer LM handling
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 54 +---
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h | 8 -
drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 5 -
.../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 8 -
.../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 8 -
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h | 7 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c | 7 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h | 4 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 53 +++-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 5 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 310 ++-------------------
drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h | 9 +-
16 files changed, 115 insertions(+), 401 deletions(-)
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2021-05-15 22:57 Dmitry Baryshkov
@ 2021-06-02 21:45 ` Dmitry Baryshkov
0 siblings, 0 replies; 1651+ messages in thread
From: Dmitry Baryshkov @ 2021-06-02 21:45 UTC (permalink / raw)
To: Bjorn Andersson, Rob Clark, Sean Paul, Abhinav Kumar
Cc: Jonathan Marek, Stephen Boyd, David Airlie, Daniel Vetter,
linux-arm-msm, dri-devel, freedreno
On 16/05/2021 01:57, Dmitry Baryshkov wrote:
> From Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # This line is ignored.
> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Reply-To:
> Subject: [PATCH v2 0/6] drm/msm/dpu: simplify RM code
> In-Reply-To:
>
> There is no need to request most of the hardware blocks through the resource
> manager (RM), since there is typically a 1:1 or N:1 relationship between
> corresponding blocks. Each LM is tied to a single PP. Each MERGE_3D
> can be used by a specified pair of PPs. Each DSPP is also tied to a
> single LM. So instead of allocating them through the RM, get them via
> static configuration.
>
> Depends on: https://lore.kernel.org/linux-arm-msm/20210515190909.1809050-1-dmitry.baryshkov@linaro.org
>
> Changes since v1:
> - Split into separate patch series to ease review.
Another gracious ping, now for this series.
I want to send the next version with minor changes, but I'd like to hear
your overall opinion before doing that.
>
> ----------------------------------------------------------------
> Dmitry Baryshkov (6):
> drm/msm/dpu: get DSPP blocks directly rather than through RM
> drm/msm/dpu: get MERGE_3D blocks directly rather than through RM
> drm/msm/dpu: get PINGPONG blocks directly rather than through RM
> drm/msm/dpu: get INTF blocks directly rather than through RM
> drm/msm/dpu: drop unused lm_max_width from RM
> drm/msm/dpu: simplify peer LM handling
>
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 54 +---
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h | 8 -
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 5 -
> .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 8 -
> .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 8 -
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h | 7 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c | 7 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h | 4 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 53 +++-
> drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 5 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 310 ++-------------------
> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h | 18 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h | 9 +-
> 16 files changed, 115 insertions(+), 401 deletions(-)
>
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2021-06-02 21:45 ` Dmitry Baryshkov
0 siblings, 0 replies; 1651+ messages in thread
From: Dmitry Baryshkov @ 2021-06-02 21:45 UTC (permalink / raw)
To: Bjorn Andersson, Rob Clark, Sean Paul, Abhinav Kumar
Cc: Jonathan Marek, Stephen Boyd, linux-arm-msm, dri-devel,
David Airlie, freedreno
On 16/05/2021 01:57, Dmitry Baryshkov wrote:
> From Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # This line is ignored.
> From: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Reply-To:
> Subject: [PATCH v2 0/6] drm/msm/dpu: simplify RM code
> In-Reply-To:
>
> There is no need to request most of the hardware blocks through the resource
> manager (RM), since there is typically a 1:1 or N:1 relationship between
> corresponding blocks. Each LM is tied to a single PP. Each MERGE_3D
> can be used by a specified pair of PPs. Each DSPP is also tied to a
> single LM. So instead of allocating them through the RM, get them via
> static configuration.
>
> Depends on: https://lore.kernel.org/linux-arm-msm/20210515190909.1809050-1-dmitry.baryshkov@linaro.org
>
> Changes since v1:
> - Split into separate patch series to ease review.
Another gracious ping, now for this series.
I want to send the next version with minor changes, but I'd like to hear
your overall opinion before doing that.
>
> ----------------------------------------------------------------
> Dmitry Baryshkov (6):
> drm/msm/dpu: get DSPP blocks directly rather than through RM
> drm/msm/dpu: get MERGE_3D blocks directly rather than through RM
> drm/msm/dpu: get PINGPONG blocks directly rather than through RM
> drm/msm/dpu: get INTF blocks directly rather than through RM
> drm/msm/dpu: drop unused lm_max_width from RM
> drm/msm/dpu: simplify peer LM handling
>
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 54 +---
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h | 8 -
> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder_phys.h | 5 -
> .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_cmd.c | 8 -
> .../gpu/drm/msm/disp/dpu1/dpu_encoder_phys_vid.c | 8 -
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h | 4 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.c | 14 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_lm.h | 7 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.c | 7 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_hw_pingpong.h | 4 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 53 +++-
> drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 5 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 310 ++-------------------
> drivers/gpu/drm/msm/disp/dpu1/dpu_rm.h | 18 +-
> drivers/gpu/drm/msm/disp/dpu1/dpu_trace.h | 9 +-
> 16 files changed, 115 insertions(+), 401 deletions(-)
>
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <60a57e3a.lbqA81rLGmtH2qoy%Radisson97@gmx.de>]
* Re:
[not found] <60a57e3a.lbqA81rLGmtH2qoy%Radisson97@gmx.de>
@ 2021-05-21 11:04 ` Alejandro Colomar (man-pages)
0 siblings, 0 replies; 1651+ messages in thread
From: Alejandro Colomar (man-pages) @ 2021-05-21 11:04 UTC (permalink / raw)
To: Radisson97; +Cc: linux-man, Michael Kerrisk (man-pages)
Hi Walter,
On 5/19/21 11:08 PM, Radisson97@gmx.de wrote:
> From 765db7b7714514780b4e613c6d09d2ff454b1ef8 Mon Sep 17 00:00:00 2001
> From: Harms <wharms@bfs.de>
> Date: Wed, 19 May 2021 22:25:08 +0200
> Subject: [PATCH] gamma.3:Add reentrant functions gamma_r
>
> Add three variants of gamma_r and explain
> the use of the second argument sig
>
> Signed-off-by: Harms <wharms@bfs.de>
I just read the manual page about gamma, and those functions/macros are
deprecated (use either lgamma or tgamma instead). As far as I can read,
those alternative functions have all the functionality one can need, so
I guess there is zero reason to use gamma at all, which is a misleading
alias for lgamma. I think I won't patch that page at all.
The glibc source code itself has a comment saying that gamma macros are
obsolete:
[
#if defined __USE_MISC || (defined __USE_XOPEN && !defined __USE_XOPEN2K)
# if !__MATH_DECLARING_FLOATN
/* Obsolete alias for `lgamma'. */
__MATHCALL (gamma,, (_Mdouble_));
# endif
#endif
]
Thanks,
Alex
--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
Senior SW Engineer; http://www.alejandro-colomar.es/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2021-06-06 19:19 Davidlohr Bueso
2021-06-07 16:02 ` André Almeida
0 siblings, 1 reply; 1651+ messages in thread
From: Davidlohr Bueso @ 2021-06-06 19:19 UTC (permalink / raw)
To: André Almeida
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
linux-kernel, Steven Rostedt, Sebastian Andrzej Siewior, kernel,
krisman, pgriffais, z.figura12, joel, malteskarupke, linux-api,
fweimer, libc-alpha, linux-kselftest, shuah, acme, corbet,
Peter Oskolkov, Andrey Semashev, mtk.manpages
Bcc:
Subject: Re: [PATCH v4 07/15] docs: locking: futex2: Add documentation
Reply-To:
In-Reply-To: <20210603195924.361327-8-andrealmeid@collabora.com>
On Thu, 03 Jun 2021, André Almeida wrote:
>Add a new documentation file specifying both userspace API and internal
>implementation details of futex2 syscalls.
I think equally important would be to provide a manpage for each new
syscall you are introducing, and keep mkt in the loop as in the past he
extensively documented and improved futex manpages, and overall has a
lot of experience with dealing with kernel interfaces.
Thanks,
Davidlohr
>
>Signed-off-by: André Almeida <andrealmeid@collabora.com>
>---
> Documentation/locking/futex2.rst | 198 +++++++++++++++++++++++++++++++
> Documentation/locking/index.rst | 1 +
> 2 files changed, 199 insertions(+)
> create mode 100644 Documentation/locking/futex2.rst
>
>diff --git a/Documentation/locking/futex2.rst b/Documentation/locking/futex2.rst
>new file mode 100644
>index 000000000000..2f74d7c97a55
>--- /dev/null
>+++ b/Documentation/locking/futex2.rst
>@@ -0,0 +1,198 @@
>+.. SPDX-License-Identifier: GPL-2.0
>+
>+======
>+futex2
>+======
>+
>+:Author: André Almeida <andrealmeid@collabora.com>
>+
>+futex, or fast user mutex, is a set of syscalls that allows userspace to
>+create performant synchronization mechanisms, such as mutexes, semaphores
>+and condition variables, in userspace. C standard libraries, like glibc,
>+use it as a means to implement higher-level interfaces like pthreads.
>+
>+The interface
>+=============
>+
>+uAPI functions
>+--------------
>+
>+.. kernel-doc:: kernel/futex2.c
>+ :identifiers: sys_futex_wait sys_futex_wake sys_futex_waitv sys_futex_requeue
>+
>+uAPI structures
>+---------------
>+
>+.. kernel-doc:: include/uapi/linux/futex.h
>+
>+The ``flag`` argument
>+---------------------
>+
>+The flag is used to specify the size of the futex word
>+(FUTEX_[8, 16, 32, 64]). It's mandatory to define one, since there's no
>+default size.
>+
>+By default, the timeout uses a monotonic clock; a realtime clock can be
>+selected instead with the FUTEX_REALTIME_CLOCK flag.
>+
>+By default, futexes are of the private type, which means that the user
>+address will only be accessed by threads that share the same memory region.
>+This allows for some internal optimizations, so they are faster. However, if
>+the address needs to be shared with different processes (e.g. when using
>+``mmap()`` or ``shm()``), it needs to be defined as shared by setting the
>+FUTEX_SHARED_FLAG flag.
>+
>+By default, the operation has no NUMA-awareness, meaning that the user can't
>+choose the memory node where the kernel side futex data will be stored. The
>+user can choose the node where it wants to operate by setting the
>+FUTEX_NUMA_FLAG and using the following structure (where X can be 8, 16, 32 or
>+64)::
>+
>+ struct futexX_numa {
>+ __uX value;
>+ __sX hint;
>+ };
>+
>+This structure should be passed as the ``void *uaddr`` of futex functions.
>+The address of the structure is what will be waited on/woken on, and
>+``value`` will be compared to ``val`` as usual. The ``hint`` member is used
>+to define which node the futex will use. When waiting, the futex will be
>+registered in a kernel-side table stored on that node; when waking, the
>+futex will be searched for in that same table. That means that there's no
>+redundancy between tables, and a wrong ``hint`` value will lead to undesired
>+behavior. Userspace is responsible for dealing with node migration issues
>+that may occur. ``hint`` can range over [0, MAX_NUMA_NODES), to specify a
>+node, or be -1, to use the same node the current process is running on.
>+
>+When not using FUTEX_NUMA_FLAG on a NUMA system, the futex will be stored in
>+a global table allocated on the first node.
>+
>+The ``timo`` argument
>+---------------------
>+
>+As per the Y2038 work done in the kernel, new interfaces shouldn't add
>+timeout options known to be buggy. Given that, ``timo`` should be a 64-bit
>+timeout on all platforms, using an absolute timeout value.
>+
>+Implementation
>+==============
>+
>+The internal implementation follows a similar design to the original futex.
>+Given that we want to replicate the same external behavior of current futex,
>+this should be somewhat expected.
>+
>+Waiting
>+-------
>+
>+All wait operations are treated as if you want to wait on N futexes, so the
>+path for futex_wait() and futex_waitv() is basically the same. For both
>+syscalls, the first step is to prepare an internal list of the futexes to
>+wait for (using struct futexv_head). For futex_wait() calls, this list will
>+have a single object.
>+
>+We have a hash table, where waiters register themselves before sleeping.
>+Then the wake function checks this table, looking for waiters at uaddr. The
>+hash bucket to be used is determined by a struct futex_key, which stores
>+information to uniquely identify an address from a given process. Given the
>+huge address space, there will be hash collisions, so we store information
>+to be used later for collision handling.
>+
>+First, for every futex we want to wait on, we check if (``*uaddr == val``).
>+This check is done holding the bucket lock, so we are correctly serialized with
>+any futex_wake() calls. If any waiter fails the check above, we dequeue all
>+futexes. The check (``*uaddr == val``) can fail for two reasons:
>+
>+- The values are different, and we return -EAGAIN. However, if while
>+ dequeueing we found that some futexes were awakened, we prioritize this
>+ and return success.
>+
>+- When trying to access the user address, we do so with page faults
>+ disabled because we are holding a bucket's spin lock (and can't sleep
>+ while holding a spin lock). If there's an error, it might be a page
>+ fault, or an invalid address. We release the lock, dequeue everyone
>+ (because it's illegal to sleep while there are futexes enqueued; we
>+ could lose wakeups) and try again with page faults enabled. If we
>+ succeed, this means that the address is valid, but we need to do
>+ all the work again. For serialization reasons, we need to hold the
>+ spin lock when getting the user value. Additionally, for shared
>+ futexes, we also need to recalculate the hash, since the underlying
>+ mapping mechanisms could have changed when handling the page fault.
>+ If, even with page faults enabled, we can't access the address, it
>+ means it's an invalid user address, and we return -EFAULT. In that
>+ case, we prioritize the error, even if some futexes were awakened.
>+
>+If the check is OK, the futex is enqueued on a linked list in its bucket,
>+and we proceed to the next one. If all checks succeed, we put the thread to
>+sleep until a futex_wake() call, the timeout expires or we get a signal.
>+After waking up, we dequeue everyone, and check if some futex was awakened.
>+This dequeue is done by iteratively walking each element of the struct
>+futex_head list.
>+
>+All enqueuing/dequeuing operations require holding the bucket lock, to
>+avoid races while modifying the list.
>+
>+Waking
>+------
>+
>+We get the bucket that's storing the waiters at uaddr, and wake the
>+required number of waiters, checking for hash collisions.
>+
>+There's an optimization that makes futex_wake() not take the bucket lock if
>+there's no one to be woken in that bucket. It checks an atomic counter that
>+each bucket has; if it is 0, the syscall exits. In order for this to work,
>+the waiter thread increments it before taking the lock, so the wake thread
>+will correctly see that there's someone waiting and will continue down the
>+path to take the bucket lock. To get the correct serialization, the waiter
>+issues a memory barrier after incrementing the bucket counter and the waker
>+issues a memory barrier before checking it.
>+
>+Requeuing
>+---------
>+
>+The requeue path first checks each struct futex_requeue and its flags.
>+Then, it compares the expected value with the one at uaddr1::uaddr.
>+Following the same serialization explained at Waking_, we increment the
>+atomic counter for the bucket of uaddr2 before taking the lock. We need to
>+hold both bucket locks at the same time so we don't race with other futex
>+operations. To ensure the locks are taken in the same order for all threads
>+(and thus avoid deadlocks), every requeue operation takes the "smaller"
>+bucket first, comparing both addresses.
>+
>+If the comparison with the user value succeeds, we proceed by waking
>+``nr_wake`` futexes, and then requeuing ``nr_requeue`` from the bucket of
>+uaddr1 to that of uaddr2. This consists of a simple list deletion/addition
>+and replacing the old futex key with the new one.
>+
>+Futex keys
>+----------
>+
>+There are two types of futexes: private and shared. Private futexes are
>+meant to be used by threads that share the same memory space, are easier to
>+uniquely identify and thus can have some performance optimizations. The
>+elements for identifying one are: the start address of the page where the
>+address is, the address offset within the page and the current->mm pointer.
>+
>+Now, for uniquely identifying a shared futex:
>+
>+- If the page containing the user address is an anonymous page, we can
>+ just use the same data used for private futexes (the start address of
>+ the page, the address offset within the page and the current->mm
>+ pointer); that will be enough for uniquely identifying such futex. We
>+ also set one bit at the key to differentiate if a private futex is
>+ used on the same address (mixing shared and private calls does not
>+ work).
>+
>+- If the page is file-backed, current->mm may not be the same one for
>+ every user of this futex, so we need to use other data: the
>+ page->index, a UUID for the struct inode and the offset within the
>+ page.
>+
>+Note that members of futex_key don't have any particular meaning after they
>+are part of the struct - they are just bytes to identify a futex. Given
>+that, we don't need to use a particular name or type that matches the
>+original data; we only need to care about the bit size of each component
>+and make both private and shared keys fit in the same memory space.
>+
>+Source code documentation
>+=========================
>+
>+.. kernel-doc:: kernel/futex2.c
>+ :no-identifiers: sys_futex_wait sys_futex_wake sys_futex_waitv sys_futex_requeue
>diff --git a/Documentation/locking/index.rst b/Documentation/locking/index.rst
>index 7003bd5aeff4..9bf03c7fa1ec 100644
>--- a/Documentation/locking/index.rst
>+++ b/Documentation/locking/index.rst
>@@ -24,6 +24,7 @@ locking
> percpu-rw-semaphore
> robust-futexes
> robust-futex-ABI
>+ futex2
>
> .. only:: subproject and html
>
>--
>2.31.1
>
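The counter-plus-barrier handshake described under "Waking" in the quoted
document can be sketched in userspace with C11 atomics. This is a hedged
illustration only: the struct, field and function names below are invented,
and the kernel uses its own spinlocks and smp_mb() rather than these
stdatomic primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bucket type for illustration; real kernel buckets also
 * hold a spinlock and a list of waiters. */
struct bucket {
    atomic_int waiters; /* threads enqueued (or about to enqueue) here */
};

/* Waiter side: advertise ourselves BEFORE taking the bucket lock, then
 * issue a full barrier, as the document describes. */
static void waiter_enqueue(struct bucket *b)
{
    atomic_fetch_add_explicit(&b->waiters, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst); /* barrier after the bump */
    /* ...take the bucket lock, enqueue on the list, unlock, sleep... */
}

/* Waker side: pair the barrier, then skip the bucket lock entirely if
 * nobody is waiting - the futex_wake() fast path. */
static bool wake_fast_path_can_skip(struct bucket *b)
{
    atomic_thread_fence(memory_order_seq_cst); /* barrier before the read */
    return atomic_load_explicit(&b->waiters, memory_order_relaxed) == 0;
}
```

The paired fences ensure a waker that observes a zero counter cannot have
missed a waiter that already published itself and is about to sleep.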
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2021-06-06 19:19 Davidlohr Bueso
@ 2021-06-07 16:02 ` André Almeida
0 siblings, 0 replies; 1651+ messages in thread
From: André Almeida @ 2021-06-07 16:02 UTC (permalink / raw)
To: Davidlohr Bueso
Cc: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart,
linux-kernel, Steven Rostedt, Sebastian Andrzej Siewior, kernel,
krisman, pgriffais, z.figura12, joel, malteskarupke, linux-api,
fweimer, libc-alpha, linux-kselftest, shuah, acme, corbet,
Peter Oskolkov, Andrey Semashev, mtk.manpages
At 16:19 on 06/06/21, Davidlohr Bueso wrote:
> Bcc:
> Subject: Re: [PATCH v4 07/15] docs: locking: futex2: Add documentation
> Reply-To:
> In-Reply-To: <20210603195924.361327-8-andrealmeid@collabora.com>
>
> On Thu, 03 Jun 2021, André Almeida wrote:
>
>> Add a new documentation file specifying both userspace API and internal
>> implementation details of futex2 syscalls.
>
> I think equally important would be to provide a manpage for each new
> syscall you are introducing, and keep mkt in the loop as in the past he
> extensively documented and improved futex manpages, and overall has a
> lot of experience with dealing with kernel interfaces.
Right, I'll add the man pages in a future version and make sure to have
mkt in the loop, thanks for the tip.
>
> Thanks,
> Davidlohr
>
>>
>> Signed-off-by: André Almeida <andrealmeid@collabora.com>
>> ---
>> Documentation/locking/futex2.rst | 198 +++++++++++++++++++++++++++++++
>> Documentation/locking/index.rst | 1 +
>> 2 files changed, 199 insertions(+)
>> create mode 100644 Documentation/locking/futex2.rst
>>
>> diff --git a/Documentation/locking/futex2.rst
>> b/Documentation/locking/futex2.rst
>> new file mode 100644
>> index 000000000000..2f74d7c97a55
>> --- /dev/null
>> +++ b/Documentation/locking/futex2.rst
>> @@ -0,0 +1,198 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +======
>> +futex2
>> +======
>> +
>> +:Author: André Almeida <andrealmeid@collabora.com>
>> +
>> +futex, or fast user mutex, is a set of syscalls to allow userspace to
>> create
>> +performant synchronization mechanisms, such as mutexes, semaphores and
>> +conditional variables in userspace. C standard libraries, like glibc,
>> uses it
>> +as a means to implement more high level interfaces like pthreads.
>> +
>> +The interface
>> +=============
>> +
>> +uAPI functions
>> +--------------
>> +
>> +.. kernel-doc:: kernel/futex2.c
>> + :identifiers: sys_futex_wait sys_futex_wake sys_futex_waitv
>> sys_futex_requeue
>> +
>> +uAPI structures
>> +---------------
>> +
>> +.. kernel-doc:: include/uapi/linux/futex.h
>> +
>> +The ``flag`` argument
>> +---------------------
>> +
>> +The flag is used to specify the size of the futex word
>> +(FUTEX_[8, 16, 32, 64]). It's mandatory to define one, since there's no
>> +default size.
>> +
>> +By default, the timeout uses a monotonic clock, but can be used as a
>> realtime
>> +one by using the FUTEX_REALTIME_CLOCK flag.
>> +
>> +By default, futexes are of the private type, that means that this
>> user address
>> +will be accessed by threads that share the same memory region. This
>> allows for
>> +some internal optimizations, so they are faster. However, if the
>> address needs
>> +to be shared with different processes (like using ``mmap()`` or
>> ``shm()``), they
>> +need to be defined as shared and the flag FUTEX_SHARED_FLAG is used
>> to set that.
>> +
>> +By default, the operation has no NUMA-awareness, meaning that the
>> user can't
>> +choose the memory node where the kernel side futex data will be
>> stored. The
>> +user can choose the node where it wants to operate by setting the
>> +FUTEX_NUMA_FLAG and using the following structure (where X can be 8,
>> 16, 32 or
>> +64)::
>> +
>> + struct futexX_numa {
>> + __uX value;
>> + __sX hint;
>> + };
>> +
>> +This structure should be passed at the ``void *uaddr`` of futex
>> functions. The
>> +address of the structure will be used to be waited on/waken on, and the
>> +``value`` will be compared to ``val`` as usual. The ``hint`` member
>> is used to
>> +define which node the futex will use. When waiting, the futex will be
>> +registered on a kernel-side table stored on that node; when waking,
>> the futex
>> +will be searched for on that given table. That means that there's no
>> redundancy
>> +between tables, and the wrong ``hint`` value will lead to undesired
>> behavior.
>> +Userspace is responsible for dealing with node migrations issues that
>> may
>> +occur. ``hint`` can range from [0, MAX_NUMA_NODES), for specifying a
>> node, or
>> +-1, to use the same node the current process is using.
>> +
>> +When not using FUTEX_NUMA_FLAG on a NUMA system, the futex will be
>> stored on a
>> +global table on allocated on the first node.
>> +
>> +The ``timo`` argument
>> +---------------------
>> +
>> +As per the Y2038 work done in the kernel, new interfaces shouldn't
>> add timeout
>> +options known to be buggy. Given that, ``timo`` should be a 64-bit
>> timeout at
>> +all platforms, using an absolute timeout value.
>> +
>> +Implementation
>> +==============
>> +
>> +The internal implementation follows a similar design to the original
>> futex.
>> +Given that we want to replicate the same external behavior of current
>> futex,
>> +this should be somewhat expected.
>> +
>> +Waiting
>> +-------
>> +
>> +For the wait operations, they are all treated as if you want to wait
>> on N
>> +futexes, so the path for futex_wait and futex_waitv is the basically
>> the same.
>> +For both syscalls, the first step is to prepare an internal list for
>> the list
>> +of futexes to wait for (using struct futexv_head). For futex_wait()
>> calls, this
>> +list will have a single object.
>> +
>> +We have a hash table, where waiters register themselves before
>> sleeping. Then
>> +the wake function checks this table looking for waiters at uaddr.
>> The hash
>> +bucket to be used is determined by a struct futex_key, that stores
>> information
>> +to uniquely identify an address from a given process. Given the huge
>> address
>> +space, there'll be hash collisions, so we store information to be
>> later used on
>> +collision treatment.
>> +
>> +First, for every futex we want to wait on, we check if (``*uaddr ==
>> val``).
>> +This check is done holding the bucket lock, so we are correctly
>> serialized with
>> +any futex_wake() calls. If any waiter fails the check above, we
>> dequeue all
>> +futexes. The check (``*uaddr == val``) can fail for two reasons:
>> +
>> +- The values are different, and we return -EAGAIN. However, if while
>> + dequeueing we found that some futexes were awakened, we prioritize
>> this
>> + and return success.
>> +
>> +- When trying to access the user address, we do so with page faults
>> + disabled because we are holding a bucket's spin lock (and can't sleep
>> + while holding a spin lock). If there's an error, it might be a page
>> + fault, or an invalid address. We release the lock, dequeue everyone
>> + (because it's illegal to sleep while there are futexes enqueued, we
>> + could lose wakeups) and try again with page fault enabled. If we
>> + succeed, this means that the address is valid, but we need to do
>> + all the work again. For serialization reasons, we need to have the
>> + spin lock when getting the user value. Additionally, for shared
>> + futexes, we also need to recalculate the hash, since the underlying
>> + mapping mechanisms could have changed when dealing with page fault.
>> + If, even with page fault enabled, we can't access the address, it
>> + means it's an invalid user address, and we return -EFAULT. For this
>> + case, we prioritize the error, even if some futexes were awaken.
>> +
>> +If the check is OK, they are enqueued on a linked list in our bucket,
>> and
>> +proceed to the next one. If all waiters succeed, we put the thread to
>> sleep
>> +until a futex_wake() call, timeout expires or we get a signal. After
>> waking up,
>> +we dequeue everyone, and check if some futex was awakened. This
>> dequeue is done
>> +by iteratively walking at each element of struct futex_head list.
>> +
>> +All enqueuing/dequeuing operations requires to hold the bucket lock,
>> to avoid
>> +racing while modifying the list.
>> +
>> +Waking
>> +------
>> +
>> +We get the bucket that's storing the waiters at uaddr, and wake the
>> required
>> +number of waiters, checking for hash collision.
>> +
>> +There's an optimization that makes futex_wake() not take the bucket
>> lock if
>> +there's no one to be woken on that bucket. It checks an atomic
>> counter that each
>> +bucket has, if it says 0, then the syscall exits. In order for this
>> to work, the
>> +waiter thread increases it before taking the lock, so the wake thread
>> will
>> +correctly see that there's someone waiting and will continue the path
>> to take
>> +the bucket lock. To get the correct serialization, the waiter issues
>> a memory
>> +barrier after increasing the bucket counter and the waker issues a
>> memory
>> +barrier before checking it.
>> +
>> +Requeuing
>> +---------
>> +
>> +The requeue path first checks for each struct futex_requeue and their
>> flags.
>> +Then, it will compare the expected value with the one at uaddr1::uaddr.
>> +Following the same serialization explained at Waking_, we increase
>> the atomic
>> +counter for the bucket of uaddr2 before taking the lock. We need to
>> have both
>> +buckets locks at same time so we don't race with other futex
>> operation. To
>> +ensure the locks are taken in the same order for all threads (and
>> thus avoiding
>> +deadlocks), every requeue operation takes the "smaller" bucket first,
>> when
>> +comparing both addresses.
>> +
>> +If the compare with user value succeeds, we proceed by waking
>> ``nr_wake``
>> +futexes, and then requeuing ``nr_requeue`` from bucket of uaddr1 to
>> the uaddr2.
>> +This consists in a simple list deletion/addition and replacing the
>> old futex key
>> +with the new one.
>> +
>> +Futex keys
>> +----------
>> +
>> +There are two types of futexes: private and shared ones. The private
>> are futexes
>> +meant to be used by threads that share the same memory space, are
>> easier to be
>> +uniquely identified and thus can have some performance optimization. The
>> +elements for identifying one are: the start address of the page where
>> the
>> +address is, the address offset within the page and the current->mm
>> pointer.
>> +
>> +Now, for uniquely identifying a shared futex:
>> +
>> +- If the page containing the user address is an anonymous page, we can
>> + just use the same data used for private futexes (the start address of
>> + the page, the address offset within the page and the current->mm
>> + pointer); that will be enough for uniquely identifying such futex. We
>> + also set one bit at the key to differentiate if a private futex is
>> + used on the same address (mixing shared and private calls does not
>> + work).
>> +
>> +- If the page is file-backed, current->mm maybe isn't the same one for
>> + every user of this futex, so we need to use other data: the
>> + page->index, a UUID for the struct inode and the offset within the
>> + page.
>> +
>> +Note that members of futex_key don't have any particular meaning
>> after they
>> +are part of the struct - they are just bytes to identify a futex.
>> Given that,
>> +we don't need to use a particular name or type that matches the
>> original data,
>> +we only need to care about the bitsize of each component and make
>> both private
>> +and shared fit in the same memory space.
>> +
>> +Source code documentation
>> +=========================
>> +
>> +.. kernel-doc:: kernel/futex2.c
>> + :no-identifiers: sys_futex_wait sys_futex_wake sys_futex_waitv
>> sys_futex_requeue
>> diff --git a/Documentation/locking/index.rst
>> b/Documentation/locking/index.rst
>> index 7003bd5aeff4..9bf03c7fa1ec 100644
>> --- a/Documentation/locking/index.rst
>> +++ b/Documentation/locking/index.rst
>> @@ -24,6 +24,7 @@ locking
>> percpu-rw-semaphore
>> robust-futexes
>> robust-futex-ABI
>> + futex2
>>
>> .. only:: subproject and html
>>
>> --
>> 2.31.1
>>
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAFBCWQJX4Xy8Sot7en5JBTuKrzy=_6xFkc+QgOxJEC7G6x+jzg@mail.gmail.com>]
[parent not found: <CAGsV3ysM+p_HAq+LgOe4db09e+zRtvELHUQzCjF8FVE2UF+3Ow@mail.gmail.com>]
* (no subject)
@ 2021-07-16 17:07 Subhasmita Swain
2021-07-16 18:15 ` Lukas Bulwahn
0 siblings, 1 reply; 1651+ messages in thread
From: Subhasmita Swain @ 2021-07-16 17:07 UTC (permalink / raw)
To: lukas.bulwahn, linux-kernel-mentees
[-- Attachment #1.1: Type: text/plain, Size: 125 bytes --]
I am interested in the Mining Maintainers mentorship program and I would
like to work on the tasks for the mentee selection.
[-- Attachment #1.2: Type: text/html, Size: 343 bytes --]
[-- Attachment #2: Type: text/plain, Size: 201 bytes --]
_______________________________________________
Linux-kernel-mentees mailing list
Linux-kernel-mentees@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2021-07-16 17:07 Subhasmita Swain
@ 2021-07-16 18:15 ` Lukas Bulwahn
0 siblings, 0 replies; 1651+ messages in thread
From: Lukas Bulwahn @ 2021-07-16 18:15 UTC (permalink / raw)
To: Subhasmita Swain; +Cc: linux-kernel-mentees
On Fri, Jul 16, 2021 at 7:07 PM Subhasmita Swain
<subhasmitaofc@gmail.com> wrote:
>
> I am interested in the Mining Maintainers mentorship program and I would like to work on the tasks for the mentee selection.
Thanks for your interest.
For the mentee selection, please work on the following exercise:
First, download the kernel git repository and compile the kernel with
the x86-defconfig build configuration.
There is a file called MAINTAINERS in the root directory of the git
repository. Read the introduction at the beginning of the MAINTAINERS
file and understand the content in the file and its organisation.
Explain in your words: What is stored in the MAINTAINERS file?
Now, search for specific MAINTAINER entries; Please answer: Who are
the maintainers and reviewers of the following sections?
AMD IOMMU (AMD-VI)
DRIVER CORE, KOBJECTS, DEBUGFS AND SYSFS
DRM DRIVERS
FUTEX SUBSYSTEM
I2C SUBSYSTEM
JAILHOUSE HYPERVISOR INTERFACE
KCOV
KCSAN
LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
NAND FLASH SUBSYSTEM
THE REST
Please let me know about your answers and always send your responses
to the linux-kernel-mentees mailing list.
After that first exercise, exercises 2 and 3 will follow.
Lukas
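The lookup asked for in the exercise amounts to pulling named sections out
of the MAINTAINERS file. A self-contained sketch of that lookup follows;
the sample entries and maintainer names below are fabricated so the
commands run anywhere (in a real kernel checkout you would grep the
top-level MAINTAINERS file, or use ./scripts/get_maintainer.pl):

```shell
# Fabricated two-entry sample in the MAINTAINERS format; entries are
# separated by blank lines and fields use single-letter prefixes.
cat > MAINTAINERS.sample <<'EOF'
FUTEX SUBSYSTEM
M:	Some Maintainer <maintainer@example.com>
L:	linux-kernel@vger.kernel.org
S:	Maintained

KCOV
M:	Another Maintainer <other@example.com>
S:	Maintained
EOF

# Print one section and its fields; the blank line ends each entry.
awk '/^FUTEX SUBSYSTEM$/,/^$/' MAINTAINERS.sample
```

The same awk range pattern works against the real file once the section
name is matched exactly.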
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v9] iomap: Support file tail packing
@ 2021-07-27 2:59 Gao Xiang
2021-07-27 15:10 ` Darrick J. Wong
0 siblings, 1 reply; 1651+ messages in thread
From: Gao Xiang @ 2021-07-27 2:59 UTC (permalink / raw)
To: linux-erofs, linux-fsdevel
Cc: Andreas Gruenbacher, Darrick J . Wong, LKML, Matthew Wilcox,
Joseph Qi, Christoph Hellwig
The existing inline data support only works for cases where the entire
file is stored as inline data. For larger files, EROFS stores the
initial blocks separately and then can pack a small tail adjacent to the
inode. Generalise inline data to allow for tail packing. Tails may not
cross a page boundary in memory.
We currently have no filesystems that support tail packing and writing,
so that case is currently disabled (see iomap_write_begin_inline).
Cc: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
---
v8: https://lore.kernel.org/r/20210726145734.214295-1-hsiangkao@linux.alibaba.com
changes since v8:
- update the subject to 'iomap: Support file tail packing' as there
are clearly a number of ways to make the inline data support more
flexible (Matthew);
- add one extra safety check (Darrick):
if (WARN_ON_ONCE(size > iomap->length))
return -EIO;
fs/iomap/buffered-io.c | 42 ++++++++++++++++++++++++++++++------------
fs/iomap/direct-io.c | 10 ++++++----
include/linux/iomap.h | 18 ++++++++++++++++++
3 files changed, 54 insertions(+), 16 deletions(-)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 87ccb3438bec..f429b9d87dbe 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -205,25 +205,32 @@ struct iomap_readpage_ctx {
struct readahead_control *rac;
};
-static void
-iomap_read_inline_data(struct inode *inode, struct page *page,
+static int iomap_read_inline_data(struct inode *inode, struct page *page,
struct iomap *iomap)
{
- size_t size = i_size_read(inode);
+ size_t size = i_size_read(inode) - iomap->offset;
void *addr;
if (PageUptodate(page))
- return;
+ return 0;
- BUG_ON(page_has_private(page));
- BUG_ON(page->index);
- BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
+ /* inline data must start page aligned in the file */
+ if (WARN_ON_ONCE(offset_in_page(iomap->offset)))
+ return -EIO;
+ if (WARN_ON_ONCE(size > PAGE_SIZE -
+ offset_in_page(iomap->inline_data)))
+ return -EIO;
+ if (WARN_ON_ONCE(size > iomap->length))
+ return -EIO;
+ if (WARN_ON_ONCE(page_has_private(page)))
+ return -EIO;
addr = kmap_atomic(page);
memcpy(addr, iomap->inline_data, size);
memset(addr + size, 0, PAGE_SIZE - size);
kunmap_atomic(addr);
SetPageUptodate(page);
+ return 0;
}
static inline bool iomap_block_needs_zeroing(struct inode *inode,
@@ -247,8 +254,10 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
sector_t sector;
if (iomap->type == IOMAP_INLINE) {
- WARN_ON_ONCE(pos);
- iomap_read_inline_data(inode, page, iomap);
+ int ret = iomap_read_inline_data(inode, page, iomap);
+
+ if (ret)
+ return ret;
return PAGE_SIZE;
}
@@ -589,6 +598,15 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
return 0;
}
+static int iomap_write_begin_inline(struct inode *inode,
+ struct page *page, struct iomap *srcmap)
+{
+ /* needs more work for the tailpacking case, disable for now */
+ if (WARN_ON_ONCE(srcmap->offset != 0))
+ return -EIO;
+ return iomap_read_inline_data(inode, page, srcmap);
+}
+
static int
iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
struct page **pagep, struct iomap *iomap, struct iomap *srcmap)
@@ -618,7 +636,7 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
}
if (srcmap->type == IOMAP_INLINE)
- iomap_read_inline_data(inode, page, srcmap);
+ status = iomap_write_begin_inline(inode, page, srcmap);
else if (iomap->flags & IOMAP_F_BUFFER_HEAD)
status = __block_write_begin_int(page, pos, len, NULL, srcmap);
else
@@ -671,11 +689,11 @@ static size_t iomap_write_end_inline(struct inode *inode, struct page *page,
void *addr;
WARN_ON_ONCE(!PageUptodate(page));
- BUG_ON(pos + copied > PAGE_SIZE - offset_in_page(iomap->inline_data));
+ BUG_ON(!iomap_inline_data_valid(iomap));
flush_dcache_page(page);
addr = kmap_atomic(page);
- memcpy(iomap->inline_data + pos, addr + pos, copied);
+ memcpy(iomap_inline_data(iomap, pos), addr + pos, copied);
kunmap_atomic(addr);
mark_inode_dirty(inode);
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 9398b8c31323..41ccbfc9dc82 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -378,23 +378,25 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length,
struct iomap_dio *dio, struct iomap *iomap)
{
struct iov_iter *iter = dio->submit.iter;
+ void *inline_data = iomap_inline_data(iomap, pos);
size_t copied;
- BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data));
+ if (WARN_ON_ONCE(!iomap_inline_data_valid(iomap)))
+ return -EIO;
if (dio->flags & IOMAP_DIO_WRITE) {
loff_t size = inode->i_size;
if (pos > size)
- memset(iomap->inline_data + size, 0, pos - size);
- copied = copy_from_iter(iomap->inline_data + pos, length, iter);
+ memset(iomap_inline_data(iomap, size), 0, pos - size);
+ copied = copy_from_iter(inline_data, length, iter);
if (copied) {
if (pos + copied > size)
i_size_write(inode, pos + copied);
mark_inode_dirty(inode);
}
} else {
- copied = copy_to_iter(iomap->inline_data + pos, length, iter);
+ copied = copy_to_iter(inline_data, length, iter);
}
dio->size += copied;
return copied;
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 479c1da3e221..b8ec145b2975 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -97,6 +97,24 @@ iomap_sector(struct iomap *iomap, loff_t pos)
return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT;
}
+/*
+ * Returns the inline data pointer for logical offset @pos.
+ */
+static inline void *iomap_inline_data(struct iomap *iomap, loff_t pos)
+{
+ return iomap->inline_data + pos - iomap->offset;
+}
+
+/*
+ * Check if the mapping's length is within the valid range for inline data.
+ * This is used to guard against accessing data beyond the page inline_data
+ * points at.
+ */
+static inline bool iomap_inline_data_valid(struct iomap *iomap)
+{
+ return iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data);
+}
+
/*
* When a filesystem sets page_ops in an iomap mapping it returns, page_prepare
* and page_done will be called for each page written to. This only applies to
--
2.24.4
* (no subject)
2021-07-27 2:59 [PATCH v9] iomap: Support file tail packing Gao Xiang
@ 2021-07-27 15:10 ` Darrick J. Wong
2021-07-27 15:23 ` Andreas Grünbacher
2021-07-27 15:30 ` Re: Gao Xiang
0 siblings, 2 replies; 1651+ messages in thread
From: Darrick J. Wong @ 2021-07-27 15:10 UTC (permalink / raw)
To: Gao Xiang
Cc: Andreas Gruenbacher, LKML, Matthew Wilcox, Joseph Qi,
linux-fsdevel, linux-erofs, Christoph Hellwig
I'll change the subject to:
iomap: support reading inline data from non-zero pos
The existing inline data support only works for cases where the entire
file is stored as inline data. For larger files, EROFS stores the
initial blocks separately and the remainder of the file ("file tail")
adjacent to the inode. Generalise inline data to allow reading the
inline file tail. Tails may not cross a page boundary in memory.
We currently have no filesystems that support tails and writing,
so that case is currently disabled (see iomap_write_begin_inline).
If that's ok with everyone,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
On Tue, Jul 27, 2021 at 10:59:56AM +0800, Gao Xiang wrote:
> The existing inline data support only works for cases where the entire
> file is stored as inline data. For larger files, EROFS stores the
> initial blocks separately and then can pack a small tail adjacent to the
> inode. Generalise inline data to allow for tail packing. Tails may not
> cross a page boundary in memory.
>
> We currently have no filesystems that support tail packing and writing,
> so that case is currently disabled (see iomap_write_begin_inline).
>
> Cc: Darrick J. Wong <djwong@kernel.org>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
> ---
> v8: https://lore.kernel.org/r/20210726145734.214295-1-hsiangkao@linux.alibaba.com
> changes since v8:
> - update the subject to 'iomap: Support file tail packing' as there
> are clearly a number of ways to make the inline data support more
> flexible (Matthew);
>
> - add one extra safety check (Darrick):
> if (WARN_ON_ONCE(size > iomap->length))
> return -EIO;
>
> fs/iomap/buffered-io.c | 42 ++++++++++++++++++++++++++++++------------
> fs/iomap/direct-io.c | 10 ++++++----
> include/linux/iomap.h | 18 ++++++++++++++++++
> 3 files changed, 54 insertions(+), 16 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 87ccb3438bec..f429b9d87dbe 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -205,25 +205,32 @@ struct iomap_readpage_ctx {
> struct readahead_control *rac;
> };
>
> -static void
> -iomap_read_inline_data(struct inode *inode, struct page *page,
> +static int iomap_read_inline_data(struct inode *inode, struct page *page,
> struct iomap *iomap)
> {
> - size_t size = i_size_read(inode);
> + size_t size = i_size_read(inode) - iomap->offset;
> void *addr;
>
> if (PageUptodate(page))
> - return;
> + return 0;
>
> - BUG_ON(page_has_private(page));
> - BUG_ON(page->index);
> - BUG_ON(size > PAGE_SIZE - offset_in_page(iomap->inline_data));
> + /* inline data must start page aligned in the file */
> + if (WARN_ON_ONCE(offset_in_page(iomap->offset)))
> + return -EIO;
> + if (WARN_ON_ONCE(size > PAGE_SIZE -
> + offset_in_page(iomap->inline_data)))
> + return -EIO;
> + if (WARN_ON_ONCE(size > iomap->length))
> + return -EIO;
> + if (WARN_ON_ONCE(page_has_private(page)))
> + return -EIO;
>
> addr = kmap_atomic(page);
> memcpy(addr, iomap->inline_data, size);
> memset(addr + size, 0, PAGE_SIZE - size);
> kunmap_atomic(addr);
> SetPageUptodate(page);
> + return 0;
> }
>
> static inline bool iomap_block_needs_zeroing(struct inode *inode,
> @@ -247,8 +254,10 @@ iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
> sector_t sector;
>
> if (iomap->type == IOMAP_INLINE) {
> - WARN_ON_ONCE(pos);
> - iomap_read_inline_data(inode, page, iomap);
> + int ret = iomap_read_inline_data(inode, page, iomap);
> +
> + if (ret)
> + return ret;
> return PAGE_SIZE;
> }
>
> @@ -589,6 +598,15 @@ __iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, int flags,
> return 0;
> }
>
> +static int iomap_write_begin_inline(struct inode *inode,
> + struct page *page, struct iomap *srcmap)
> +{
> + /* needs more work for the tailpacking case, disable for now */
> + if (WARN_ON_ONCE(srcmap->offset != 0))
> + return -EIO;
> + return iomap_read_inline_data(inode, page, srcmap);
> +}
> +
> static int
> iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
> struct page **pagep, struct iomap *iomap, struct iomap *srcmap)
> @@ -618,7 +636,7 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
> }
>
> if (srcmap->type == IOMAP_INLINE)
> - iomap_read_inline_data(inode, page, srcmap);
> + status = iomap_write_begin_inline(inode, page, srcmap);
> else if (iomap->flags & IOMAP_F_BUFFER_HEAD)
> status = __block_write_begin_int(page, pos, len, NULL, srcmap);
> else
> @@ -671,11 +689,11 @@ static size_t iomap_write_end_inline(struct inode *inode, struct page *page,
> void *addr;
>
> WARN_ON_ONCE(!PageUptodate(page));
> - BUG_ON(pos + copied > PAGE_SIZE - offset_in_page(iomap->inline_data));
> + BUG_ON(!iomap_inline_data_valid(iomap));
>
> flush_dcache_page(page);
> addr = kmap_atomic(page);
> - memcpy(iomap->inline_data + pos, addr + pos, copied);
> + memcpy(iomap_inline_data(iomap, pos), addr + pos, copied);
> kunmap_atomic(addr);
>
> mark_inode_dirty(inode);
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 9398b8c31323..41ccbfc9dc82 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -378,23 +378,25 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length,
> struct iomap_dio *dio, struct iomap *iomap)
> {
> struct iov_iter *iter = dio->submit.iter;
> + void *inline_data = iomap_inline_data(iomap, pos);
> size_t copied;
>
> - BUG_ON(pos + length > PAGE_SIZE - offset_in_page(iomap->inline_data));
> + if (WARN_ON_ONCE(!iomap_inline_data_valid(iomap)))
> + return -EIO;
>
> if (dio->flags & IOMAP_DIO_WRITE) {
> loff_t size = inode->i_size;
>
> if (pos > size)
> - memset(iomap->inline_data + size, 0, pos - size);
> - copied = copy_from_iter(iomap->inline_data + pos, length, iter);
> + memset(iomap_inline_data(iomap, size), 0, pos - size);
> + copied = copy_from_iter(inline_data, length, iter);
> if (copied) {
> if (pos + copied > size)
> i_size_write(inode, pos + copied);
> mark_inode_dirty(inode);
> }
> } else {
> - copied = copy_to_iter(iomap->inline_data + pos, length, iter);
> + copied = copy_to_iter(inline_data, length, iter);
> }
> dio->size += copied;
> return copied;
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 479c1da3e221..b8ec145b2975 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -97,6 +97,24 @@ iomap_sector(struct iomap *iomap, loff_t pos)
> return (iomap->addr + pos - iomap->offset) >> SECTOR_SHIFT;
> }
>
> +/*
> + * Returns the inline data pointer for logical offset @pos.
> + */
> +static inline void *iomap_inline_data(struct iomap *iomap, loff_t pos)
> +{
> + return iomap->inline_data + pos - iomap->offset;
> +}
> +
> +/*
> + * Check if the mapping's length is within the valid range for inline data.
> + * This is used to guard against accessing data beyond the page inline_data
> + * points at.
> + */
> +static inline bool iomap_inline_data_valid(struct iomap *iomap)
> +{
> + return iomap->length <= PAGE_SIZE - offset_in_page(iomap->inline_data);
> +}
> +
> /*
> * When a filesystem sets page_ops in an iomap mapping it returns, page_prepare
> * and page_done will be called for each page written to. This only applies to
> --
> 2.24.4
>
* Re:
2021-07-27 15:10 ` Darrick J. Wong
@ 2021-07-27 15:23 ` Andreas Grünbacher
2021-07-27 15:30 ` Re: Gao Xiang
1 sibling, 0 replies; 1651+ messages in thread
From: Andreas Grünbacher @ 2021-07-27 15:23 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Andreas Gruenbacher, LKML, Matthew Wilcox, Joseph Qi,
Linux FS-devel Mailing List, linux-erofs, Christoph Hellwig
Am Di., 27. Juli 2021 um 17:11 Uhr schrieb Darrick J. Wong <djwong@kernel.org>:
> I'll change the subject to:
>
> iomap: support reading inline data from non-zero pos
That surely works for me.
Thanks,
Andreas
* Re:
2021-07-27 15:10 ` Darrick J. Wong
2021-07-27 15:23 ` Andreas Grünbacher
@ 2021-07-27 15:30 ` Gao Xiang
1 sibling, 0 replies; 1651+ messages in thread
From: Gao Xiang @ 2021-07-27 15:30 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Andreas Gruenbacher, LKML, Matthew Wilcox, Joseph Qi,
linux-fsdevel, linux-erofs, Christoph Hellwig
On Tue, Jul 27, 2021 at 08:10:51AM -0700, Darrick J. Wong wrote:
> I'll change the subject to:
>
> iomap: support reading inline data from non-zero pos
I'm fine with this too. Many thanks for updating!
Thanks,
Gao Xiang
* (no subject)
@ 2021-08-21 14:40 TECOB270_Ganesh Pawar
2021-08-21 23:52 ` Jeff King
0 siblings, 1 reply; 1651+ messages in thread
From: TECOB270_Ganesh Pawar @ 2021-08-21 14:40 UTC (permalink / raw)
To: git
I'm not sure if this is a bug.
To reproduce:
1. Set the contents of .git/hooks/prepare-commit-msg to this:
```
#!/bin/sh
COMMIT_MSG_FILE=$1
echo "Initial Commit." > "$COMMIT_MSG_FILE"
echo "" >> "$COMMIT_MSG_FILE"
echo "# Some random comment." >> "$COMMIT_MSG_FILE"
```
Notice the comment being added to the file.
2. Amend the commit with the --no-edit flag:
`git commit --amend --no-edit`
The comment ("Some random comment" in this case) is included in the
final commit message, but it shouldn't be, right?
If I don't pass the flag and just save the commit without changing
anything, the comment isn't included. Shouldn't this be the case with
the --no-edit flag too?
* Re:
2021-08-21 14:40 TECOB270_Ganesh Pawar
@ 2021-08-21 23:52 ` Jeff King
0 siblings, 0 replies; 1651+ messages in thread
From: Jeff King @ 2021-08-21 23:52 UTC (permalink / raw)
To: TECOB270_Ganesh Pawar; +Cc: git
On Sat, Aug 21, 2021 at 08:10:59PM +0530, TECOB270_Ganesh Pawar wrote:
> To reproduce:
> 1. Set the contents of .git/hooks/prepare-commit-msg to this:
> ```
> #!/bin/sh
>
> COMMIT_MSG_FILE=$1
>
> echo "Initial Commit." > "$COMMIT_MSG_FILE"
> echo "" >> "$COMMIT_MSG_FILE"
> echo "# Some random comment." >> "$COMMIT_MSG_FILE"
> ```
> Notice the comment being added to the file.
>
> 2. Append a commit with the --no-edit flag.
> `git commit --amend --no-edit`
>
> The comment ("Some random comment" in this case) is included in the
> final commit message, but it shouldn't right?
>
> If I don't pass the flag and just save the commit without changing
> anything, the comment isn't included. Shouldn't this be the case with
> the --no-edit flag too?
No, the behavior you're seeing is expected. Try this:
git commit --cleanup=strip --amend --no-edit
The default for "--cleanup" is "strip" when the editor is run, and
"whitespace" otherwise. I.e., if Git did not insert comments, then it
doesn't remove them either.
If you have a hook which is inserting comments which may need to be
stripped, you may want to set the commit.cleanup config to tell Git to
always remove them (but beware that invocations like "git commit -F"
will also start stripping comments).
See "--cleanup" in "git help commit" for the possible values.
-Peff
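The behavior Peff describes is easy to reproduce in a throwaway repository. The script below is an illustrative sketch (paths and messages are made up, and it assumes no commit.cleanup override in your global config): it amends non-interactively, first with the default cleanup and then with --cleanup=strip:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo hello > file
git add file
git commit -q -m 'Initial Commit.'

# Non-interactive amend: the default cleanup is "whitespace", so the
# comment line survives into the commit message.
printf 'Initial Commit.\n\n# Some random comment.\n' | git commit -q --amend -F -
git log -1 --format=%B | grep -c '^# Some random comment.$'   # prints 1

# With --cleanup=strip the comment line is removed, just as it would be
# after an editor session.
printf 'Initial Commit.\n\n# Some random comment.\n' | \
	git commit -q --amend --cleanup=strip -F -
git log -1 --format=%B | grep -c '^# Some random comment.$' \
	|| echo 'comment stripped'   # prints 0, then the fallback message
```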
[parent not found: <CAKPXbjesQH_k1Z7k4kNwpoAf-jYgbUaPqPCgNTJZ35peVBy_pA@mail.gmail.com>]
* RE:
@ 2021-09-03 20:51 Mr. James Khmalo
0 siblings, 0 replies; 1651+ messages in thread
From: Mr. James Khmalo @ 2021-09-03 20:51 UTC (permalink / raw)
To: soc
Good Day,
I know this email might come to you as a surprise as first coming from one you haven’t met with before.
I am Mr. James Khmalo, the bank manager with ABSA bank of South Africa, and a personal banker of Dr.Mohamed Farouk Ibrahim, an Egyptian who happened to be a medical contractor attached to the overthrown Afghan government by the Taliban government.
Dr.Mohamed Farouk Ibrahim deposits some sum of money with our bank but passed away with his family while trying to escape from Kandahar.
The said sum can be used for an investment if you are interested. Details relating to the funds are in my position and will present you as the Next-of-Kin because there was none, and I shall furnish you with more detail once your response.
Regards,
Mr. James Khmalo
Tel: 27-632696383
South Africa
* (no subject)
@ 2021-10-08 1:24 Dmitry Baryshkov
2021-10-12 23:59 ` Linus Walleij
2021-10-17 21:35 ` Re: Linus Walleij
0 siblings, 2 replies; 1651+ messages in thread
From: Dmitry Baryshkov @ 2021-10-08 1:24 UTC (permalink / raw)
To: Andy Gross, Bjorn Andersson, Linus Walleij, Rob Herring
Cc: linux-gpio, devicetree, linux-arm-msm
In 2019 (in kernel 5.4) spmi-gpio and ssbi-gpio drivers were converted
to hierarchical IRQ helpers, however MPP drivers were not converted at
that moment. Complete this by converting MPP drivers.
Changes since v2:
- Add patches fixing/updating mpps nodes in the existing device trees
Changes since v1:
- Drop the interrupt-controller from initial schema conversion
- Add gpio-line-names to the qcom,pmic-mpp schema and to the example
The following changes since commit 6880fa6c56601bb8ed59df6c30fd390cc5f6dd8f:
Linux 5.15-rc1 (2021-09-12 16:28:37 -0700)
are available in the Git repository at:
https://git.linaro.org/people/dmitry.baryshkov/kernel.git spmi-mpp
for you to fetch changes up to 9bccc31fc5cec98f26ca639a2ee21a9831efe1de:
arm64: dts: qcom: pm8994: add interrupt controller properties (2021-10-08 04:19:33 +0300)
----------------------------------------------------------------
Dmitry Baryshkov (25):
dt-bindings: pinctrl: qcom,pmic-mpp: Convert qcom pmic mpp bindings to YAML
dt-bindings: mfd: qcom-pm8xxx: add missing child nodes
ARM: dts: qcom-apq8064: add gpio-ranges to mpps nodes
ARM: dts: qcom-msm8660: add gpio-ranges to mpps nodes
ARM: dts: qcom-pm8841: add gpio-ranges to mpps nodes
ARM: dts: qcom-pm8941: add gpio-ranges to mpps nodes
ARM: dts: qcom-pma8084: add gpio-ranges to mpps nodes
ARM: dts: qcom-mdm9615: add gpio-ranges to mpps node, fix its name
ARM: dts: qcom-apq8060-dragonboard: fix mpps state names
arm64: dts: qcom: pm8916: fix mpps device tree node
arm64: dts: qcom: pm8994: fix mpps device tree node
arm64: dts: qcom: apq8016-sbc: fix mpps state names
pinctrl: qcom: ssbi-mpp: hardcode IRQ counts
pinctrl: qcom: ssbi-mpp: add support for hierarchical IRQ chip
pinctrl: qcom: spmi-mpp: hardcode IRQ counts
pinctrl: qcom: spmi-mpp: add support for hierarchical IRQ chip
dt-bindings: pinctrl: qcom,pmic-mpp: switch to #interrupt-cells
ARM: dts: qcom-apq8064: add interrupt controller properties
ARM: dts: qcom-mdm9615: add interrupt controller properties
ARM: dts: qcom-msm8660: add interrupt controller properties
ARM: dts: qcom-pm8841: add interrupt controller properties
ARM: dts: qcom-pm8941: add interrupt controller properties
ARM: dts: qcom-pma8084: add interrupt controller properties
arm64: dts: qcom: pm8916: add interrupt controller properties
arm64: dts: qcom: pm8994: add interrupt controller properties
.../devicetree/bindings/mfd/qcom-pm8xxx.yaml | 12 ++
.../devicetree/bindings/pinctrl/qcom,pmic-mpp.txt | 187 --------------------
.../devicetree/bindings/pinctrl/qcom,pmic-mpp.yaml | 188 +++++++++++++++++++++
arch/arm/boot/dts/qcom-apq8060-dragonboard.dts | 4 +-
arch/arm/boot/dts/qcom-apq8064.dtsi | 23 +--
arch/arm/boot/dts/qcom-mdm9615.dtsi | 12 +-
arch/arm/boot/dts/qcom-msm8660.dtsi | 17 +-
arch/arm/boot/dts/qcom-pm8841.dtsi | 7 +-
arch/arm/boot/dts/qcom-pm8941.dtsi | 11 +-
arch/arm/boot/dts/qcom-pma8084.dtsi | 11 +-
arch/arm64/boot/dts/qcom/apq8016-sbc.dtsi | 4 +-
arch/arm64/boot/dts/qcom/pm8916.dtsi | 9 +-
arch/arm64/boot/dts/qcom/pm8994.dtsi | 13 +-
drivers/pinctrl/qcom/pinctrl-spmi-mpp.c | 111 ++++++++----
drivers/pinctrl/qcom/pinctrl-ssbi-mpp.c | 133 +++++++++++----
15 files changed, 414 insertions(+), 328 deletions(-)
delete mode 100644 Documentation/devicetree/bindings/pinctrl/qcom,pmic-mpp.txt
create mode 100644 Documentation/devicetree/bindings/pinctrl/qcom,pmic-mpp.yaml
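For readers unfamiliar with the hierarchical IRQ conversion, the net effect on a DT consumer is that an mpps node becomes its own interrupt controller instead of listing one interrupt per pin. A rough sketch of the resulting node shape — the label, address and pin count here are illustrative; the qcom,pmic-mpp YAML binding added by the series is the authoritative form:

```
pm8941_mpps: mpps@a000 {
	compatible = "qcom,pm8941-mpp", "qcom,spmi-mpp";
	reg = <0xa000>;
	gpio-controller;
	#gpio-cells = <2>;
	gpio-ranges = <&pm8941_mpps 0 0 8>;
	interrupt-controller;
	#interrupt-cells = <2>;
};
```

A consumer then references a pin's IRQ through the mpps node itself (e.g. via interrupts-extended), and the driver allocates the parent SPMI/SSBI interrupt on demand through the hierarchy.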
* Re:
2021-10-08 1:24 Dmitry Baryshkov
@ 2021-10-12 23:59 ` Linus Walleij
2021-10-13 3:46 ` Re: Dmitry Baryshkov
2021-10-17 16:54 ` Re: Bjorn Andersson
2021-10-17 21:35 ` Re: Linus Walleij
1 sibling, 2 replies; 1651+ messages in thread
From: Linus Walleij @ 2021-10-12 23:59 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Andy Gross, Bjorn Andersson, Rob Herring,
open list:GPIO SUBSYSTEM,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM
On Fri, Oct 8, 2021 at 3:25 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
> In 2019 (in kernel 5.4) spmi-gpio and ssbi-gpio drivers were converted
> to hierarchical IRQ helpers, however MPP drivers were not converted at
> that moment. Complete this by converting MPP drivers.
>
> Changes since v2:
> - Add patches fixing/updating mpps nodes in the existing device trees
Thanks a *lot* for being thorough and fixing all this properly!
I am happy to apply the pinctrl portions to the pinctrl tree, I'm
uncertain about Rob's syntax checker robot here, are there real
problems? Sometimes it complains about things being changed
in the DTS files at the same time.
I could apply all of this (including DTS changes) to an immutable
branch and offer to Bjorn if he is fine with the patches and
the general approach.
Yours,
Linus Walleij
* Re:
2021-10-12 23:59 ` Linus Walleij
@ 2021-10-13 3:46 ` Dmitry Baryshkov
2021-10-13 23:39 ` Re: Linus Walleij
2021-10-17 16:54 ` Re: Bjorn Andersson
1 sibling, 1 reply; 1651+ messages in thread
From: Dmitry Baryshkov @ 2021-10-13 3:46 UTC (permalink / raw)
To: Linus Walleij
Cc: Andy Gross, Bjorn Andersson, Rob Herring,
open list:GPIO SUBSYSTEM,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM
On Wed, 13 Oct 2021 at 02:59, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Fri, Oct 8, 2021 at 3:25 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
>
> > In 2019 (in kernel 5.4) spmi-gpio and ssbi-gpio drivers were converted
> > to hierarchical IRQ helpers, however MPP drivers were not converted at
> > that moment. Complete this by converting MPP drivers.
> >
> > Changes since v2:
> > - Add patches fixing/updating mpps nodes in the existing device trees
>
> Thanks a *lot* for being thorough and fixing all this properly!
>
> I am happy to apply the pinctrl portions to the pinctrl tree, I'm
> uncertain about Rob's syntax checker robot here, are there real
> problems? Sometimes it complains about things being changed
> in the DTS files at the same time.
Rob's checker reports issues that are being fixed by the respective
patches. I think I've updated all dts entries for the mpp device tree
nodes.
> I could apply all of this (including DTS changes) to an immutable
> branch and offer to Bjorn if he is fine with the patches and
> the general approach.
I'm fine with either approach.
--
With best wishes
Dmitry
* Re:
2021-10-13 3:46 ` Re: Dmitry Baryshkov
@ 2021-10-13 23:39 ` Linus Walleij
0 siblings, 0 replies; 1651+ messages in thread
From: Linus Walleij @ 2021-10-13 23:39 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Andy Gross, Bjorn Andersson, Rob Herring,
open list:GPIO SUBSYSTEM,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM
On Wed, Oct 13, 2021 at 5:46 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
> On Wed, 13 Oct 2021 at 02:59, Linus Walleij <linus.walleij@linaro.org> wrote:
> > I am happy to apply the pinctrl portions to the pinctrl tree, I'm
> > uncertain about Rob's syntax checker robot here, are there real
> > problems? Sometimes it complains about things being changed
> > in the DTS files at the same time.
>
> Rob's checker reports issue that are being fixed by respective
> patches. I think I've updated all dts entries for the mpp devices tree
> nodes.
>
> > I could apply all of this (including DTS changes) to an immutable
> > branch and offer to Bjorn if he is fine with the patches and
> > the general approach.
>
> I'm fine with either approach.
Let's see what Bjorn says, if nothing happens poke me again and I'll
create an immutable branch and merge it.
Yours,
Linus Walleij
* Re:
2021-10-12 23:59 ` Linus Walleij
2021-10-13 3:46 ` Re: Dmitry Baryshkov
@ 2021-10-17 16:54 ` Bjorn Andersson
2021-10-17 21:31 ` Re: Linus Walleij
1 sibling, 1 reply; 1651+ messages in thread
From: Bjorn Andersson @ 2021-10-17 16:54 UTC (permalink / raw)
To: Linus Walleij
Cc: Dmitry Baryshkov, Andy Gross, Rob Herring,
open list:GPIO SUBSYSTEM,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM
On Tue 12 Oct 18:59 CDT 2021, Linus Walleij wrote:
> On Fri, Oct 8, 2021 at 3:25 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
>
> > In 2019 (in kernel 5.4) spmi-gpio and ssbi-gpio drivers were converted
> > to hierarchical IRQ helpers, however MPP drivers were not converted at
> > that moment. Complete this by converting MPP drivers.
> >
> > Changes since v2:
> > - Add patches fixing/updating mpps nodes in the existing device trees
>
> Thanks a *lot* for being thorough and fixing all this properly!
>
> I am happy to apply the pinctrl portions to the pinctrl tree, I'm
> uncertain about Rob's syntax checker robot here, are there real
> problems? Sometimes it complains about things being changed
> in the DTS files at the same time.
>
> I could apply all of this (including DTS changes) to an immutable
> branch and offer to Bjorn if he is fine with the patches and
> the general approach.
>
I like the driver changes and I'm wrapping up a second pull for the dts
pieces in the coming few days. So if you're happy to take the driver
patches I'll include the DT changes for 5.16 as well.
Thanks,
Bjorn
* Re:
2021-10-08 1:24 Dmitry Baryshkov
2021-10-12 23:59 ` Linus Walleij
@ 2021-10-17 21:35 ` Linus Walleij
1 sibling, 0 replies; 1651+ messages in thread
From: Linus Walleij @ 2021-10-17 21:35 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Andy Gross, Bjorn Andersson, Rob Herring,
open list:GPIO SUBSYSTEM,
open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS, MSM
I queued these patches in the pinctrl tree for v5.16:
On Fri, Oct 8, 2021 at 3:25 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
> dt-bindings: pinctrl: qcom,pmic-mpp: Convert qcom pmic mpp bindings to YAML
> pinctrl: qcom: ssbi-mpp: hardcode IRQ counts
> pinctrl: qcom: ssbi-mpp: add support for hierarchical IRQ chip
> pinctrl: qcom: spmi-mpp: hardcode IRQ counts
> pinctrl: qcom: spmi-mpp: add support for hierarchical IRQ chip
> dt-bindings: pinctrl: qcom,pmic-mpp: switch to #interrupt-cells
Any breakages will be fixed when Bjorn applies the DTS changes to his
tree.
I wonder about the MFD patch, maybe Lee can expedite merging that too
or ACK it for Bjorn to merge with the remainders.
Yours,
Linus Walleij
[parent not found: <20211011231530.GA22856@t>]
* (no subject)
[not found] <20211011231530.GA22856@t>
@ 2021-10-12 1:23 ` James Bottomley
2021-10-12 2:30 ` Bart Van Assche
0 siblings, 1 reply; 1651+ messages in thread
From: James Bottomley @ 2021-10-12 1:23 UTC (permalink / raw)
To: docfate111, linux-scsi
On Mon, 2021-10-11 at 19:15 -0400, docfate111 wrote:
> linux-scsi@vger.kernel.org,
> linux-kernel@vger.kernel.org,
> martin.petersen@oracle.com
> Bcc:
> Subject: [PATCH] scsi_lib fix the NULL pointer dereference
> Reply-To:
>
> scsi_setup_scsi_cmnd should check for the pointer before
> scsi_command_size dereferences it.
Have you seen this? As in do you have a trace? This should be an
impossible condition, so we need to see where it came from. The patch
as proposed is not right, because if something is setting cmd_len
without setting the cmnd pointer we need the cause fixed rather than
applying a band aid in scsi_setup_scsi_cmnd().
James
* Re:
2021-10-12 1:23 ` James Bottomley
@ 2021-10-12 2:30 ` Bart Van Assche
0 siblings, 0 replies; 1651+ messages in thread
From: Bart Van Assche @ 2021-10-12 2:30 UTC (permalink / raw)
To: jejb, docfate111, linux-scsi
On 10/11/21 18:23, James Bottomley wrote:
> On Mon, 2021-10-11 at 19:15 -0400, docfate111 wrote:
>> linux-scsi@vger.kernel.org,
>> linux-kernel@vger.kernel.org,
>> martin.petersen@oracle.com
>> Bcc:
>> Subject: [PATCH] scsi_lib fix the NULL pointer dereference
>> Reply-To:
>>
>> scsi_setup_scsi_cmnd should check for the pointer before
>> scsi_command_size dereferences it.
>
> Have you seen this? As in do you have a trace? This should be an
> impossible condition, so we need to see where it came from. The patch
> as proposed is not right, because if something is setting cmd_len
> without setting the cmnd pointer we need the cause fixed rather than
> applying a band aid in scsi_setup_scsi_cmnd().
Hi James and Thelford,
This patch looks like a duplicate of a patch posted one month ago? I
think Christoph agrees to remove the cmd_len == 0 check. See also
https://lore.kernel.org/linux-scsi/20210904064534.1919476-1-qiulaibin@huawei.com/.
Thanks,
Bart.
* [PATCH v5 00/11] Add support for X86/ACPI camera sensor/PMIC setup with clk and regulator platform data
@ 2021-11-02 9:48 Hans de Goede
2021-11-02 9:49 ` [PATCH v5 05/11] clk: Introduce clk-tps68470 driver Hans de Goede
0 siblings, 1 reply; 1651+ messages in thread
From: Hans de Goede @ 2021-11-02 9:48 UTC (permalink / raw)
To: Rafael J . Wysocki, Mark Gross, Andy Shevchenko, Wolfram Sang,
Mika Westerberg, Daniel Scally, Laurent Pinchart,
Mauro Carvalho Chehab, Liam Girdwood, Mark Brown,
Michael Turquette, Stephen Boyd
Cc: Hans de Goede, Len Brown, linux-acpi, platform-driver-x86,
linux-kernel, linux-i2c, Sakari Ailus, Kate Hsuan, linux-media,
linux-clk
Here is v5 of my patch-set adding support for camera sensor connected to a
TPS68470 PMIC on x86/ACPI devices.
Changes in v5:
- Update regulator_init_data in patch 10/11 to include the VCM regulator
- Address various small review remarks from Andy
- Make a couple of functions / vars static in the clk + regulator drivers
Reported-by: kernel test robot <lkp@intel.com>
Changes in v4:
[PATCH 01/11] ACPI: delay enumeration of devices with a _DEP
pointing to an INT3472 device:
- Move the acpi_dev_ready_for_enumeration() check to acpi_bus_attach()
(replacing the acpi_device_is_present() check there)
[PATCH 04/11] regulator: Introduce tps68470-regulator driver:
- Make the top comment block use c++ style comments
- Drop the bogus builtin regulator_init_data
- Make the driver enable the PMIC clk when enabling the Core buck
regulator, this switching regulator needs the PLL to be on
- Kconfig: add || COMPILE_TEST, fix help text
[PATCH 05/11] clk: Introduce clk-tps68470 driver
- Kconfig: select REGMAP_I2C, add || COMPILE_TEST, fix help text
- tps68470_clk_prepare(): Wait for the PLL to lock before returning
- tps68470_clk_unprepare(): Remove unnecessary clearing of divider regs
- tps68470_clk_probe(): Use devm_clk_hw_register()
- Misc. small cleanups
I'm quite happy with how this works now, so from my pov this is the final
version of the device-instantiation deferral code / approach.
###
The clk and regulator frameworks expect clk/regulator consumer-devices
to have info about the consumed clks/regulators described in the device's
fw_node, but on ACPI this info is missing.
This series works around this by providing platform_data with the info to
the TPS68470 clk/regulator MFD cells.
Patches 1 - 2 deal with a probe-ordering problem this introduces,
since the lookups are only registered when the provider-driver binds,
trying to get these clks/regulators before then results in a -ENOENT
error for clks and a dummy regulator for regulators. See the patches
for more details.
Patch 3 adds a header file which adds tps68470_clk_platform_data and
tps68470_regulator_platform_data structs. The further patches depend on
this new header file.
Patch 4 + 5 add the TPS68470 clk and regulator drivers
Patches 6 - 11 modify the INT3472 driver which instantiates the MFD cells to
provide the necessary platform-data.
Assuming this series is acceptable to everyone, we need to talk about how
to merge this.
Patch 2 has already been acked by Wolfram for merging by Rafael, so patch
1 + 2 can be merged into linux-pm, independent of the rest of the series
(there are some runtime deps on other changes for everything to work,
but the camera-sensors impacted by this are not fully supported yet in
the mainline kernel anyways).
For "[PATCH 03/13] platform_data: Add linux/platform_data/tps68470.h file",
which all further patches depend on I plan to provide an immutable branch
myself (once it has been reviewed), which the clk / regulator maintainers
can then merge before merging the clk / regulator driver which depends on
this.
And I will merge that IM-branch + patches 6-11 into the pdx86 tree myself.
Regards,
Hans
Daniel Scally (1):
platform/x86: int3472: Enable I2c daisy chain
Hans de Goede (10):
ACPI: delay enumeration of devices with a _DEP pointing to an INT3472
device
i2c: acpi: Use acpi_dev_ready_for_enumeration() helper
platform_data: Add linux/platform_data/tps68470.h file
regulator: Introduce tps68470-regulator driver
clk: Introduce clk-tps68470 driver
platform/x86: int3472: Split into 2 drivers
platform/x86: int3472: Add get_sensor_adev_and_name() helper
platform/x86: int3472: Pass tps68470_clk_platform_data to the
tps68470-regulator MFD-cell
platform/x86: int3472: Pass tps68470_regulator_platform_data to the
tps68470-regulator MFD-cell
platform/x86: int3472: Deal with probe ordering issues
drivers/acpi/scan.c | 37 ++-
drivers/clk/Kconfig | 8 +
drivers/clk/Makefile | 1 +
drivers/clk/clk-tps68470.c | 257 ++++++++++++++++++
drivers/i2c/i2c-core-acpi.c | 5 +-
drivers/platform/x86/intel/int3472/Makefile | 9 +-
...lk_and_regulator.c => clk_and_regulator.c} | 2 +-
drivers/platform/x86/intel/int3472/common.c | 82 ++++++
.../{intel_skl_int3472_common.h => common.h} | 6 +-
...ntel_skl_int3472_discrete.c => discrete.c} | 51 ++--
.../intel/int3472/intel_skl_int3472_common.c | 106 --------
...ntel_skl_int3472_tps68470.c => tps68470.c} | 99 ++++++-
drivers/platform/x86/intel/int3472/tps68470.h | 25 ++
.../x86/intel/int3472/tps68470_board_data.c | 134 +++++++++
drivers/regulator/Kconfig | 9 +
drivers/regulator/Makefile | 1 +
drivers/regulator/tps68470-regulator.c | 212 +++++++++++++++
include/acpi/acpi_bus.h | 5 +-
include/linux/mfd/tps68470.h | 11 +
include/linux/platform_data/tps68470.h | 35 +++
20 files changed, 944 insertions(+), 151 deletions(-)
create mode 100644 drivers/clk/clk-tps68470.c
rename drivers/platform/x86/intel/int3472/{intel_skl_int3472_clk_and_regulator.c => clk_and_regulator.c} (99%)
create mode 100644 drivers/platform/x86/intel/int3472/common.c
rename drivers/platform/x86/intel/int3472/{intel_skl_int3472_common.h => common.h} (94%)
rename drivers/platform/x86/intel/int3472/{intel_skl_int3472_discrete.c => discrete.c} (91%)
delete mode 100644 drivers/platform/x86/intel/int3472/intel_skl_int3472_common.c
rename drivers/platform/x86/intel/int3472/{intel_skl_int3472_tps68470.c => tps68470.c} (54%)
create mode 100644 drivers/platform/x86/intel/int3472/tps68470.h
create mode 100644 drivers/platform/x86/intel/int3472/tps68470_board_data.c
create mode 100644 drivers/regulator/tps68470-regulator.c
create mode 100644 include/linux/platform_data/tps68470.h
--
2.31.1
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v5 05/11] clk: Introduce clk-tps68470 driver
2021-11-02 9:48 [PATCH v5 00/11] Add support for X86/ACPI camera sensor/PMIC setup with clk and regulator platform data Hans de Goede
@ 2021-11-02 9:49 ` Hans de Goede
[not found] ` <163588780885.2993099.2088131017920983969@swboyd.mtv.corp.google.com>
0 siblings, 1 reply; 1651+ messages in thread
From: Hans de Goede @ 2021-11-02 9:49 UTC (permalink / raw)
To: Rafael J . Wysocki, Mark Gross, Andy Shevchenko, Wolfram Sang,
Mika Westerberg, Daniel Scally, Laurent Pinchart,
Mauro Carvalho Chehab, Liam Girdwood, Mark Brown,
Michael Turquette, Stephen Boyd
Cc: Hans de Goede, Len Brown, linux-acpi, platform-driver-x86,
linux-kernel, linux-i2c, Sakari Ailus, Kate Hsuan, linux-media,
linux-clk
The TPS68470 PMIC provides Clocks, GPIOs and Regulators. At present in
the kernel the Regulators and Clocks are controlled by an OpRegion
driver designed to work with power control methods defined in ACPI, but
some platforms lack those methods, meaning drivers need to be able to
consume the resources of these chips through the usual frameworks.
This commit adds a driver for the clocks provided by the tps68470,
and is designed to bind to the platform_device registered by the
intel_skl_int3472 module.
This is based on an out-of-tree driver written by Intel:
https://github.com/intel/linux-intel-lts/blob/4.14/base/drivers/clk/clk-tps68470.c
with various cleanups added.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
Changes in v5:
- Small comment cleanups based on review from Andy
Changes in v4:
- Kconfig: select REGMAP_I2C, add || COMPILE_TEST, fix help text
- tps68470_clk_prepare(): Wait for the PLL to lock before returning
- tps68470_clk_unprepare(): Remove unnecessary clearing of divider regs
- tps68470_clk_probe(): Use devm_clk_hw_register()
- Misc. small cleanups
Changes in v2:
- Update the comment on why a subsys_initcall is used to register the drv
- Fix trailing whitespace on line 100
---
drivers/clk/Kconfig | 8 ++
drivers/clk/Makefile | 1 +
drivers/clk/clk-tps68470.c | 257 +++++++++++++++++++++++++++++++++++
include/linux/mfd/tps68470.h | 11 ++
4 files changed, 277 insertions(+)
create mode 100644 drivers/clk/clk-tps68470.c
diff --git a/drivers/clk/Kconfig b/drivers/clk/Kconfig
index c5b3dc97396a..4e9098d79249 100644
--- a/drivers/clk/Kconfig
+++ b/drivers/clk/Kconfig
@@ -169,6 +169,14 @@ config COMMON_CLK_CDCE706
help
This driver supports TI CDCE706 programmable 3-PLL clock synthesizer.
+config COMMON_CLK_TPS68470
+ tristate "Clock Driver for TI TPS68470 PMIC"
+ depends on I2C
+ depends on INTEL_SKL_INT3472 || COMPILE_TEST
+ select REGMAP_I2C
+ help
+ This driver supports the clocks provided by the TPS68470 PMIC.
+
config COMMON_CLK_CDCE925
tristate "Clock driver for TI CDCE913/925/937/949 devices"
depends on I2C
diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile
index e42312121e51..6b6a88ae1425 100644
--- a/drivers/clk/Makefile
+++ b/drivers/clk/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_COMMON_CLK_SI570) += clk-si570.o
obj-$(CONFIG_COMMON_CLK_STM32F) += clk-stm32f4.o
obj-$(CONFIG_COMMON_CLK_STM32H7) += clk-stm32h7.o
obj-$(CONFIG_COMMON_CLK_STM32MP157) += clk-stm32mp1.o
+obj-$(CONFIG_COMMON_CLK_TPS68470) += clk-tps68470.o
obj-$(CONFIG_CLK_TWL6040) += clk-twl6040.o
obj-$(CONFIG_ARCH_VT8500) += clk-vt8500.o
obj-$(CONFIG_COMMON_CLK_VC5) += clk-versaclock5.o
diff --git a/drivers/clk/clk-tps68470.c b/drivers/clk/clk-tps68470.c
new file mode 100644
index 000000000000..2ad0ac2f4096
--- /dev/null
+++ b/drivers/clk/clk-tps68470.c
@@ -0,0 +1,257 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Clock driver for TPS68470 PMIC
+ *
+ * Copyright (c) 2021 Red Hat Inc.
+ * Copyright (C) 2018 Intel Corporation
+ *
+ * Authors:
+ * Hans de Goede <hdegoede@redhat.com>
+ * Zaikuo Wang <zaikuo.wang@intel.com>
+ * Tianshu Qiu <tian.shu.qiu@intel.com>
+ * Jian Xu Zheng <jian.xu.zheng@intel.com>
+ * Yuning Pu <yuning.pu@intel.com>
+ * Antti Laakso <antti.laakso@intel.com>
+ */
+
+#include <linux/clk-provider.h>
+#include <linux/clkdev.h>
+#include <linux/kernel.h>
+#include <linux/mfd/tps68470.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/platform_data/tps68470.h>
+#include <linux/regmap.h>
+
+#define TPS68470_CLK_NAME "tps68470-clk"
+
+#define to_tps68470_clkdata(clkd) \
+ container_of(clkd, struct tps68470_clkdata, clkout_hw)
+
+static struct tps68470_clkout_freqs {
+ unsigned long freq;
+ unsigned int xtaldiv;
+ unsigned int plldiv;
+ unsigned int postdiv;
+ unsigned int buckdiv;
+ unsigned int boostdiv;
+} clk_freqs[] = {
+/*
+ * The PLL is used to multiply the crystal oscillator
+ * frequency range of 3 MHz to 27 MHz by a programmable
+ * factor of F = (M/N)*(1/P) such that the output
+ * available at the HCLK_A or HCLK_B pins are in the range
+ * of 4 MHz to 64 MHz in increments of 0.1 MHz.
+ *
+ * hclk_# = osc_in * (((plldiv*2)+320) / (xtaldiv+30)) * (1 / 2^postdiv)
+ *
+ * PLL_REF_CLK should be as close as possible to 100kHz
+ * PLL_REF_CLK = input clk / (XTALDIV[7:0] + 30)
+ *
+ * PLL_VCO_CLK = (PLL_REF_CLK * (plldiv*2 + 320))
+ *
+ * BOOST should be as close as possible to 2 MHz
+ * BOOST = PLL_VCO_CLK / (BOOSTDIV[4:0] + 16)
+ *
+ * BUCK should be as close as possible to 5.2 MHz
+ * BUCK = PLL_VCO_CLK / (BUCKDIV[3:0] + 5)
+ *
+ * osc_in xtaldiv plldiv postdiv hclk_#
+ * 20 MHz 170 32 1 19.2 MHz
+ * 20 MHz 170 40 1 20 MHz
+ * 20 MHz 170 80 1 24 MHz
+ */
+ { 19200000, 170, 32, 1, 2, 3 },
+ { 20000000, 170, 40, 1, 3, 4 },
+ { 24000000, 170, 80, 1, 4, 8 },
+};
+
+struct tps68470_clkdata {
+ struct clk_hw clkout_hw;
+ struct regmap *regmap;
+ unsigned int clk_cfg_idx;
+};
+
+static int tps68470_clk_is_prepared(struct clk_hw *hw)
+{
+ struct tps68470_clkdata *clkdata = to_tps68470_clkdata(hw);
+ int val;
+
+ if (regmap_read(clkdata->regmap, TPS68470_REG_PLLCTL, &val))
+ return 0;
+
+ return val & TPS68470_PLL_EN_MASK;
+}
+
+static int tps68470_clk_prepare(struct clk_hw *hw)
+{
+ struct tps68470_clkdata *clkdata = to_tps68470_clkdata(hw);
+ unsigned int idx = clkdata->clk_cfg_idx;
+
+ regmap_write(clkdata->regmap, TPS68470_REG_BOOSTDIV, clk_freqs[idx].boostdiv);
+ regmap_write(clkdata->regmap, TPS68470_REG_BUCKDIV, clk_freqs[idx].buckdiv);
+ regmap_write(clkdata->regmap, TPS68470_REG_PLLSWR, TPS68470_PLLSWR_DEFAULT);
+ regmap_write(clkdata->regmap, TPS68470_REG_XTALDIV, clk_freqs[idx].xtaldiv);
+ regmap_write(clkdata->regmap, TPS68470_REG_PLLDIV, clk_freqs[idx].plldiv);
+ regmap_write(clkdata->regmap, TPS68470_REG_POSTDIV, clk_freqs[idx].postdiv);
+ regmap_write(clkdata->regmap, TPS68470_REG_POSTDIV2, clk_freqs[idx].postdiv);
+ regmap_write(clkdata->regmap, TPS68470_REG_CLKCFG2, TPS68470_CLKCFG2_DRV_STR_2MA);
+
+ regmap_write(clkdata->regmap, TPS68470_REG_PLLCTL,
+ TPS68470_OSC_EXT_CAP_DEFAULT << TPS68470_OSC_EXT_CAP_SHIFT |
+ TPS68470_CLK_SRC_XTAL << TPS68470_CLK_SRC_SHIFT);
+
+ regmap_write(clkdata->regmap, TPS68470_REG_CLKCFG1,
+ (TPS68470_PLL_OUTPUT_ENABLE << TPS68470_OUTPUT_A_SHIFT) |
+ (TPS68470_PLL_OUTPUT_ENABLE << TPS68470_OUTPUT_B_SHIFT));
+
+ regmap_update_bits(clkdata->regmap, TPS68470_REG_PLLCTL,
+ TPS68470_PLL_EN_MASK, TPS68470_PLL_EN_MASK);
+
+ /*
+ * The PLLCTL reg lock bit is set by the PMIC after approx. 4ms and
+ * does not indicate a true lock, so just wait 4 ms.
+ */
+ usleep_range(4000, 5000);
+
+ return 0;
+}
+
+static void tps68470_clk_unprepare(struct clk_hw *hw)
+{
+ struct tps68470_clkdata *clkdata = to_tps68470_clkdata(hw);
+
+ /* Disable clock first ... */
+ regmap_update_bits(clkdata->regmap, TPS68470_REG_PLLCTL, TPS68470_PLL_EN_MASK, 0);
+
+ /* ... and then tri-state the clock outputs. */
+ regmap_write(clkdata->regmap, TPS68470_REG_CLKCFG1, 0);
+}
+
+static unsigned long tps68470_clk_recalc_rate(struct clk_hw *hw, unsigned long parent_rate)
+{
+ struct tps68470_clkdata *clkdata = to_tps68470_clkdata(hw);
+
+ return clk_freqs[clkdata->clk_cfg_idx].freq;
+}
+
+/*
+ * This returns the index of the clk_freqs[] cfg with the closest rate for
+ * use in tps68470_clk_round_rate(). tps68470_clk_set_rate() checks that
+ * the rate of the returned cfg is an exact match.
+ */
+static unsigned int tps68470_clk_cfg_lookup(unsigned long rate)
+{
+ long diff, best_diff = LONG_MAX;
+ unsigned int i, best_idx = 0;
+
+ for (i = 0; i < ARRAY_SIZE(clk_freqs); i++) {
+ diff = clk_freqs[i].freq - rate;
+ if (diff == 0)
+ return i;
+
+ diff = abs(diff);
+ if (diff < best_diff) {
+ best_diff = diff;
+ best_idx = i;
+ }
+ }
+
+ return best_idx;
+}
+
+static long tps68470_clk_round_rate(struct clk_hw *hw, unsigned long rate,
+ unsigned long *parent_rate)
+{
+ unsigned int idx = tps68470_clk_cfg_lookup(rate);
+
+ return clk_freqs[idx].freq;
+}
+
+static int tps68470_clk_set_rate(struct clk_hw *hw, unsigned long rate,
+ unsigned long parent_rate)
+{
+ struct tps68470_clkdata *clkdata = to_tps68470_clkdata(hw);
+ unsigned int idx = tps68470_clk_cfg_lookup(rate);
+
+ if (rate != clk_freqs[idx].freq)
+ return -EINVAL;
+
+ clkdata->clk_cfg_idx = idx;
+
+ return 0;
+}
+
+static const struct clk_ops tps68470_clk_ops = {
+ .is_prepared = tps68470_clk_is_prepared,
+ .prepare = tps68470_clk_prepare,
+ .unprepare = tps68470_clk_unprepare,
+ .recalc_rate = tps68470_clk_recalc_rate,
+ .round_rate = tps68470_clk_round_rate,
+ .set_rate = tps68470_clk_set_rate,
+};
+
+static const struct clk_init_data tps68470_clk_initdata = {
+ .name = TPS68470_CLK_NAME,
+ .ops = &tps68470_clk_ops,
+};
+
+static int tps68470_clk_probe(struct platform_device *pdev)
+{
+ struct tps68470_clk_platform_data *pdata = pdev->dev.platform_data;
+ struct tps68470_clkdata *tps68470_clkdata;
+ int ret;
+
+ tps68470_clkdata = devm_kzalloc(&pdev->dev, sizeof(*tps68470_clkdata),
+ GFP_KERNEL);
+ if (!tps68470_clkdata)
+ return -ENOMEM;
+
+ tps68470_clkdata->regmap = dev_get_drvdata(pdev->dev.parent);
+ tps68470_clkdata->clkout_hw.init = &tps68470_clk_initdata;
+ ret = devm_clk_hw_register(&pdev->dev, &tps68470_clkdata->clkout_hw);
+ if (ret)
+ return ret;
+
+ ret = devm_clk_hw_register_clkdev(&pdev->dev, &tps68470_clkdata->clkout_hw,
+ TPS68470_CLK_NAME, NULL);
+ if (ret)
+ return ret;
+
+ if (pdata) {
+ ret = devm_clk_hw_register_clkdev(&pdev->dev,
+ &tps68470_clkdata->clkout_hw,
+ pdata->consumer_con_id,
+ pdata->consumer_dev_name);
+ }
+
+ return ret;
+}
+
+static struct platform_driver tps68470_clk_driver = {
+ .driver = {
+ .name = TPS68470_CLK_NAME,
+ },
+ .probe = tps68470_clk_probe,
+};
+
+/*
+ * The ACPI tps68470 probe-ordering depends on the clk/gpio/regulator drivers
+ * registering before the drivers for the camera-sensors which use them bind.
+ * subsys_initcall() ensures this when the drivers are builtin.
+ */
+static int __init tps68470_clk_init(void)
+{
+ return platform_driver_register(&tps68470_clk_driver);
+}
+subsys_initcall(tps68470_clk_init);
+
+static void __exit tps68470_clk_exit(void)
+{
+ platform_driver_unregister(&tps68470_clk_driver);
+}
+module_exit(tps68470_clk_exit);
+
+MODULE_ALIAS("platform:tps68470-clk");
+MODULE_DESCRIPTION("Clock driver for TPS68470 PMIC");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/mfd/tps68470.h b/include/linux/mfd/tps68470.h
index ffe81127d91c..7807fa329db0 100644
--- a/include/linux/mfd/tps68470.h
+++ b/include/linux/mfd/tps68470.h
@@ -75,6 +75,17 @@
#define TPS68470_CLKCFG1_MODE_A_MASK GENMASK(1, 0)
#define TPS68470_CLKCFG1_MODE_B_MASK GENMASK(3, 2)
+#define TPS68470_CLKCFG2_DRV_STR_2MA 0x05
+#define TPS68470_PLL_OUTPUT_ENABLE 0x02
+#define TPS68470_CLK_SRC_XTAL BIT(0)
+#define TPS68470_PLLSWR_DEFAULT GENMASK(1, 0)
+#define TPS68470_OSC_EXT_CAP_DEFAULT 0x05
+
+#define TPS68470_OUTPUT_A_SHIFT 0x00
+#define TPS68470_OUTPUT_B_SHIFT 0x02
+#define TPS68470_CLK_SRC_SHIFT GENMASK(2, 0)
+#define TPS68470_OSC_EXT_CAP_SHIFT BIT(2)
+
#define TPS68470_GPIO_CTL_REG_A(x) (TPS68470_REG_GPCTL0A + (x) * 2)
#define TPS68470_GPIO_CTL_REG_B(x) (TPS68470_REG_GPCTL0B + (x) * 2)
#define TPS68470_GPIO_MODE_MASK GENMASK(1, 0)
--
2.31.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
[parent not found: <CAGGnn3JZdc3ETS_AijasaFUqLY9e5Q1ZHK3+806rtsEBnAo5Og@mail.gmail.com>]
* Re:
[not found] <CAGGnn3JZdc3ETS_AijasaFUqLY9e5Q1ZHK3+806rtsEBnAo5Og@mail.gmail.com>
@ 2021-11-23 17:20 ` Christian COMMARMOND
0 siblings, 0 replies; 1651+ messages in thread
From: Christian COMMARMOND @ 2021-11-23 17:20 UTC (permalink / raw)
To: linux-btrfs
Hi,
I use a TERRAMASTER F5-422 with 14 TB of disks and 3 btrfs partitions.
After repeated power outages, the 3rd partition mounts, but no data is
visible other than the top-level root directory.
I tried to repair the disk and got this:
[root@TNAS-00E1FD ~]# btrfsck --repair /dev/mapper/vg0-lv2
enabling repair mode
...
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/mapper/vg0-lv2
UUID: a7b536f5-1827-479c-9170-eccbbc624370
[1/7] checking root items
Error: could not find btree root extent for root 257
ERROR: failed to repair root items: No such file or directory
(I put the full /var/log/messages at the end of this mail).
What can I do to get my data back?
This is a backup disk, and I am supposed to have a copy of it in
another place, but there too, Murphy's law struck: I had some disk failures
and lost some of my data.
So it would be very good to be able to recover some data from these disks.
Other information:
lsblk:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 3.7T 0 disk
|-sda1 8:1 0 285M 0 part
|-sda2 8:2 0 1.9G 0 part
| `-md9 9:9 0 1.9G 0 raid1 /
|-sda3 8:3 0 977M 0 part
| `-md8 9:8 0 976.4M 0 raid1 [SWAP]
`-sda4 8:4 0 3.7T 0 part
`-md0 9:0 0 14.6T 0 raid5
|-vg0-lv0 251:0 0 2T 0 lvm /mnt/md0
|-vg0-lv1 251:1 0 3.9T 0 lvm /mnt/md1
`-vg0-lv2 251:2 0 8.7T 0 lvm /mnt/md2
sdb 8:16 0 3.7T 0 disk
|-sdb1 8:17 0 285M 0 part
|-sdb2 8:18 0 1.9G 0 part
| `-md9 9:9 0 1.9G 0 raid1 /
|-sdb3 8:19 0 977M 0 part
| `-md8 9:8 0 976.4M 0 raid1 [SWAP]
`-sdb4 8:20 0 3.7T 0 part
`-md0 9:0 0 14.6T 0 raid5
|-vg0-lv0 251:0 0 2T 0 lvm /mnt/md0
|-vg0-lv1 251:1 0 3.9T 0 lvm /mnt/md1
`-vg0-lv2 251:2 0 8.7T 0 lvm /mnt/md2
sdc 8:32 0 3.7T 0 disk
|-sdc1 8:33 0 285M 0 part
|-sdc2 8:34 0 1.9G 0 part
| `-md9 9:9 0 1.9G 0 raid1 /
|-sdc3 8:35 0 977M 0 part
| `-md8 9:8 0 976.4M 0 raid1 [SWAP]
`-sdc4 8:36 0 3.7T 0 part
`-md0 9:0 0 14.6T 0 raid5
|-vg0-lv0 251:0 0 2T 0 lvm /mnt/md0
|-vg0-lv1 251:1 0 3.9T 0 lvm /mnt/md1
`-vg0-lv2 251:2 0 8.7T 0 lvm /mnt/md2
sdd 8:48 0 3.7T 0 disk
|-sdd1 8:49 0 285M 0 part
|-sdd2 8:50 0 1.9G 0 part
| `-md9 9:9 0 1.9G 0 raid1 /
|-sdd3 8:51 0 977M 0 part
| `-md8 9:8 0 976.4M 0 raid1 [SWAP]
`-sdd4 8:52 0 3.7T 0 part
`-md0 9:0 0 14.6T 0 raid5
|-vg0-lv0 251:0 0 2T 0 lvm /mnt/md0
|-vg0-lv1 251:1 0 3.9T 0 lvm /mnt/md1
`-vg0-lv2 251:2 0 8.7T 0 lvm /mnt/md2
sde 8:64 0 3.7T 0 disk
|-sde1 8:65 0 285M 0 part
|-sde2 8:66 0 1.9G 0 part
| `-md9 9:9 0 1.9G 0 raid1 /
|-sde3 8:67 0 977M 0 part
| `-md8 9:8 0 976.4M 0 raid1 [SWAP]
`-sde4 8:68 0 3.7T 0 part
`-md0 9:0 0 14.6T 0 raid5
|-vg0-lv0 251:0 0 2T 0 lvm /mnt/md0
|-vg0-lv1 251:1 0 3.9T 0 lvm /mnt/md1
`-vg0-lv2 251:2 0 8.7T 0 lvm /mnt/md2
df -h:
Filesystem Size Used Available Use% Mounted on
/dev/md9 1.8G 576.8M 1.2G 32% /
devtmpfs 1.8G 0 1.8G 0% /dev
tmpfs 1.8G 0 1.8G 0% /dev/shm
tmpfs 1.8G 1.1M 1.8G 0% /tmp
tmpfs 1.8G 236.0K 1.8G 0% /run
tmpfs 1.8G 6.3M 1.8G 0% /opt/var
/dev/mapper/vg0-lv0 2.0T 34.5M 2.0T 0% /mnt/md0
/dev/mapper/vg0-lv1 3.9T 16.3M 3.9T 0% /mnt/md1
/dev/mapper/vg0-lv2 8.7T 2.9T 5.8T 33% /mnt/md2
These physical disks are new (a few months old) and do not show errors.
I hope there is a way to fix this.
regards,
Christian COMMARMOND
Here is the full log (restricted to 'kernel' messages), from the lines
where I begin to see errors:
Nov 23 17:00:46 TNAS-00E1FD kernel: [ 34.540572] Detached from
scsi7, channel 0, id 0, lun 0, type 0
Nov 23 17:00:48 TNAS-00E1FD kernel: [ 37.148169] md: md8 stopped.
Nov 23 17:00:48 TNAS-00E1FD kernel: [ 37.154395] md/raid1:md8:
active with 1 out of 72 mirrors
Nov 23 17:00:48 TNAS-00E1FD kernel: [ 37.155564] md8: detected
capacity change from 0 to 1023868928
Nov 23 17:00:49 TNAS-00E1FD kernel: [ 38.240910] md: recovery of
RAID array md8
Nov 23 17:00:49 TNAS-00E1FD kernel: [ 38.276712] md: md8: recovery
interrupted.
Nov 23 17:00:50 TNAS-00E1FD kernel: [ 38.346552] md: recovery of
RAID array md8
Nov 23 17:00:50 TNAS-00E1FD kernel: [ 38.392148] md: md8: recovery
interrupted.
Nov 23 17:00:50 TNAS-00E1FD kernel: [ 38.458126] md: recovery of
RAID array md8
Nov 23 17:00:50 TNAS-00E1FD kernel: [ 38.494025] md: md8: recovery
interrupted.
Nov 23 17:00:50 TNAS-00E1FD kernel: [ 38.576871] md: recovery of
RAID array md8
Nov 23 17:00:50 TNAS-00E1FD kernel: [ 38.837269] Adding 999868k swap
on /dev/md8. Priority:-1 extents:1 across:999868k
Nov 23 17:00:51 TNAS-00E1FD kernel: [ 39.801285] md: md0 stopped.
Nov 23 17:00:51 TNAS-00E1FD kernel: [ 39.859798] md/raid:md0: device
sda4 operational as raid disk 0
Nov 23 17:00:51 TNAS-00E1FD kernel: [ 39.861417] md/raid:md0: device
sde4 operational as raid disk 4
Nov 23 17:00:51 TNAS-00E1FD kernel: [ 39.863675] md/raid:md0: device
sdd4 operational as raid disk 3
Nov 23 17:00:51 TNAS-00E1FD kernel: [ 39.865059] md/raid:md0: device
sdc4 operational as raid disk 2
Nov 23 17:00:51 TNAS-00E1FD kernel: [ 39.866373] md/raid:md0: device
sdb4 operational as raid disk 1
Nov 23 17:00:51 TNAS-00E1FD kernel: [ 39.869300] md/raid:md0: raid
level 5 active with 5 out of 5 devices, algorithm 2
Nov 23 17:00:51 TNAS-00E1FD kernel: [ 39.926721] md0: detected
capacity change from 0 to 15989118861312
Nov 23 17:00:57 TNAS-00E1FD kernel: [ 46.111539] md: md8: recovery done.
Nov 23 17:00:57 TNAS-00E1FD kernel: [ 46.269349] flashcache:
flashcache-3.1.1 initialized
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 46.394510] BTRFS: device fsid
bdc3dbee-00a3-4541-99b4-096cd27939f2 devid 1 transid 679
/dev/mapper/vg0-lv0
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 46.397072] BTRFS info (device
dm-0): metadata ratio 50
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 46.399122] BTRFS info (device
dm-0): using free space tree
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 46.400380] BTRFS info (device
dm-0): has skinny extents
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 46.471236] BTRFS info (device
dm-0): new size for /dev/mapper/vg0-lv0 is 2147483648000
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 47.087622] BTRFS: device fsid
a5828e5a-1b11-4743-891c-11d0d8aeb1ae devid 1 transid 107
/dev/mapper/vg0-lv1
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 47.089943] BTRFS info (device
dm-1): metadata ratio 50
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 47.091505] BTRFS info (device
dm-1): using free space tree
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 47.093062] BTRFS info (device
dm-1): has skinny extents
Nov 23 17:00:58 TNAS-00E1FD kernel: [ 47.150713] BTRFS info (device
dm-1): new size for /dev/mapper/vg0-lv1 is 4294967296000
Nov 23 17:00:59 TNAS-00E1FD kernel: [ 47.737119] BTRFS: device fsid
a7b536f5-1827-479c-9170-eccbbc624370 devid 1 transid 142633
/dev/mapper/vg0-lv2
Nov 23 17:00:59 TNAS-00E1FD kernel: [ 47.739313] BTRFS info (device
dm-2): metadata ratio 50
Nov 23 17:00:59 TNAS-00E1FD kernel: [ 47.740630] BTRFS info (device
dm-2): using free space tree
Nov 23 17:00:59 TNAS-00E1FD kernel: [ 47.741892] BTRFS info (device
dm-2): has skinny extents
Nov 23 17:00:59 TNAS-00E1FD kernel: [ 47.946451] BTRFS info (device
dm-2): bdev /dev/mapper/vg0-lv2 errs: wr 0, rd 0, flush 0, corrupt 0,
gen 8
Nov 23 17:01:01 TNAS-00E1FD kernel: [ 49.693394] BTRFS info (device
dm-2): checking UUID tree
Nov 23 17:01:01 TNAS-00E1FD kernel: [ 49.700560] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:01 TNAS-00E1FD kernel: [ 49.707394] BTRFS info (device
dm-2): new size for /dev/mapper/vg0-lv2 is 9546663723008
Nov 23 17:01:01 TNAS-00E1FD kernel: [ 49.713109] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:01 TNAS-00E1FD kernel: [ 49.715107] BTRFS warning
(device dm-2): iterating uuid_tree failed -5
Nov 23 17:01:01 TNAS-00E1FD kernel: [ 49.795716] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:01 TNAS-00E1FD kernel: [ 49.798231] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:03 TNAS-00E1FD kernel: [ 52.272802] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:03 TNAS-00E1FD kernel: [ 52.275264] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:03 TNAS-00E1FD kernel: [ 52.277208] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:03 TNAS-00E1FD kernel: [ 52.278483] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:04 TNAS-00E1FD kernel: [ 52.570033] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:04 TNAS-00E1FD kernel: [ 52.571487] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:05 TNAS-00E1FD kernel: [ 54.250527] nf_conntrack:
default automatic helper assignment has been turned off for security
reasons and CT-based firewall rule not found. Use the iptables CT
target to attach helpers instead.
Nov 23 17:01:07 TNAS-00E1FD kernel: [ 56.050418]
verify_parent_transid: 2 callbacks suppressed
Nov 23 17:01:07 TNAS-00E1FD kernel: [ 56.050424] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:07 TNAS-00E1FD kernel: [ 56.063012] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:07 TNAS-00E1FD kernel: [ 56.166746] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:07 TNAS-00E1FD kernel: [ 56.167903] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:07 TNAS-00E1FD kernel: [ 56.274188] NFSD: starting
90-second grace period (net ffffffff9db5abc0)
Nov 23 17:01:09 TNAS-00E1FD kernel: [ 57.524631] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:09 TNAS-00E1FD kernel: [ 57.525878] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:09 TNAS-00E1FD kernel: [ 57.589706] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:09 TNAS-00E1FD kernel: [ 57.590882] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:01:10 TNAS-00E1FD kernel: [ 58.315852] warning: `smbd'
uses legacy ethtool link settings API, link modes are only partially
reported
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.882060] BTRFS error (device
dm-2): incorrect extent count for 29360128; counted 740, expected 677
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.883000] BTRFS: error
(device dm-2) in convert_free_space_to_extents:457: errno=-5 IO
failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.883946] BTRFS info (device
dm-2): forced readonly
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.884896] BTRFS: error
(device dm-2) in add_to_free_space_tree:1052: errno=-5 IO failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.885863] BTRFS: error
(device dm-2) in __btrfs_free_extent:7106: errno=-5 IO failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.886825] BTRFS: error
(device dm-2) in btrfs_run_delayed_refs:3009: errno=-5 IO failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.887803] BTRFS warning
(device dm-2): Skipping commit of aborted transaction.
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.888807] BTRFS: error
(device dm-2) in cleanup_transaction:1873: errno=-5 IO failure
Nov 23 17:01:31 TNAS-00E1FD kernel: [ 79.892906] BTRFS error (device
dm-2): incorrect extent count for 29360128; counted 739, expected 676
Nov 23 17:02:55 TNAS-00E1FD kernel: [ 164.199509] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [ 164.212280] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [ 164.214362] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [ 164.216331] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [ 164.224184] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [ 164.225500] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [ 164.227338] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:02:55 TNAS-00E1FD kernel: [ 164.228636] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.915492] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.936745] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.938543] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.940375] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.951375] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.952810] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.972430] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.973548] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.974819] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:37 TNAS-00E1FD kernel: [ 205.975984] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [ 222.807122]
verify_parent_transid: 6 callbacks suppressed
Nov 23 17:03:54 TNAS-00E1FD kernel: [ 222.807127] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [ 222.819996] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [ 222.923926] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [ 222.925434] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [ 223.061241] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:54 TNAS-00E1FD kernel: [ 223.062463] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:59 TNAS-00E1FD kernel: [ 227.554549] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:03:59 TNAS-00E1FD kernel: [ 227.556100] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:04:13 TNAS-00E1FD kernel: [ 242.190152] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:04:13 TNAS-00E1FD kernel: [ 242.202843] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:04:13 TNAS-00E1FD kernel: [ 242.215390] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:04:13 TNAS-00E1FD kernel: [ 242.217241] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:05:15 TNAS-00E1FD kernel: [ 303.772878] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:05:15 TNAS-00E1FD kernel: [ 303.785862] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:14 TNAS-00E1FD kernel: [ 362.480763] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:14 TNAS-00E1FD kernel: [ 362.493848] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [ 392.055419] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [ 392.068306] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [ 392.069074] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [ 392.069862] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [ 392.076040] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [ 392.076821] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [ 392.077643] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:06:43 TNAS-00E1FD kernel: [ 392.078360] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:02 TNAS-00E1FD kernel: [ 830.643054] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:02 TNAS-00E1FD kernel: [ 830.664937] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:11 TNAS-00E1FD kernel: [ 839.988330] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:11 TNAS-00E1FD kernel: [ 839.989850] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:11 TNAS-00E1FD kernel: [ 839.991371] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:11 TNAS-00E1FD kernel: [ 839.992867] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:12 TNAS-00E1FD kernel: [ 840.488126] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:14:12 TNAS-00E1FD kernel: [ 840.488998] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [ 985.266877] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [ 985.288688] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [ 985.289624] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [ 985.290454] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [ 985.300198] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [ 985.300917] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [ 985.301704] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:16:36 TNAS-00E1FD kernel: [ 985.302318] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:34:24 TNAS-00E1FD kernel: [ 2052.815271] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:34:24 TNAS-00E1FD kernel: [ 2052.838506] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:34:52 TNAS-00E1FD kernel: [ 2081.273231] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:34:52 TNAS-00E1FD kernel: [ 2081.296585] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:39:26 TNAS-00E1FD kernel: [ 2354.866442] BTRFS error (device
dm-2): cleaner transaction attach returned -30
Nov 23 17:56:30 TNAS-00E1FD kernel: [ 3378.825461] BTRFS info (device
dm-2): using free space tree
Nov 23 17:56:30 TNAS-00E1FD kernel: [ 3378.825891] BTRFS info (device
dm-2): has skinny extents
Nov 23 17:56:30 TNAS-00E1FD kernel: [ 3378.968533] BTRFS info (device
dm-2): bdev /dev/mapper/vg0-lv2 errs: wr 0, rd 0, flush 0, corrupt 0,
gen 8
Nov 23 17:56:32 TNAS-00E1FD kernel: [ 3380.525294] BTRFS info (device
dm-2): checking UUID tree
Nov 23 17:56:32 TNAS-00E1FD kernel: [ 3380.535839] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:56:32 TNAS-00E1FD kernel: [ 3380.544791] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:56:32 TNAS-00E1FD kernel: [ 3380.545579] BTRFS warning
(device dm-2): iterating uuid_tree failed -5
Nov 23 17:56:42 TNAS-00E1FD kernel: [ 3391.302453] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:56:43 TNAS-00E1FD kernel: [ 3391.328368] BTRFS error (device
dm-2): parent transid verify failed on 174735360 wanted 37018 found
37023
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.806326] BTRFS error (device
dm-2): incorrect extent count for 29360128; counted 740, expected 677
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.806836] BTRFS: error
(device dm-2) in convert_free_space_to_extents:457: errno=-5 IO
failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.807367] BTRFS info (device
dm-2): forced readonly
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.807904] BTRFS: error
(device dm-2) in add_to_free_space_tree:1052: errno=-5 IO failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.808493] BTRFS: error
(device dm-2) in __btrfs_free_extent:7106: errno=-5 IO failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.809160] BTRFS: error
(device dm-2) in btrfs_run_delayed_refs:3009: errno=-5 IO failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.809785] BTRFS warning
(device dm-2): Skipping commit of aborted transaction.
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.810444] BTRFS: error
(device dm-2) in cleanup_transaction:1873: errno=-5 IO failure
Nov 23 17:57:01 TNAS-00E1FD kernel: [ 3409.814113] BTRFS error (device
dm-2): incorrect extent count for 29360128; counted 739, expected 676
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20211126221034.21331-1-lukasz.bartosik@semihalf.com--annotate>]
* Re:
[not found] <20211126221034.21331-1-lukasz.bartosik@semihalf.com--annotate>
@ 2021-11-29 21:59 ` sean.wang
0 siblings, 0 replies; 1651+ messages in thread
From: sean.wang @ 2021-11-29 21:59 UTC (permalink / raw)
To: lb
Cc: marcel, johan.hedberg, luiz.dentz, upstream, linux-bluetooth,
linux-mediatek, linux-kernel, Sean Wang
From: Sean Wang <sean.wang@mediatek.com>
>Enable msft opcode for btmtksdio driver.
>
>Signed-off-by: Łukasz Bartosik <lb@semihalf.com>
>---
> drivers/bluetooth/btmtksdio.c | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/drivers/bluetooth/btmtksdio.c b/drivers/bluetooth/btmtksdio.c index d9cf0c492e29..2a7a615663b9 100644
>--- a/drivers/bluetooth/btmtksdio.c
>+++ b/drivers/bluetooth/btmtksdio.c
>@@ -887,6 +887,7 @@ static int btmtksdio_setup(struct hci_dev *hdev)
> if (enable_autosuspend)
> pm_runtime_allow(bdev->dev);
>
>+ hci_set_msft_opcode(hdev, 0xFD30);
Hi Łukasz,
The msft feature is supposed to be supported only on mt7921. Could you rework the patch to enable the msft opcode only for mt7921?
Sean
> bt_dev_info(hdev, "Device setup in %llu usecs", duration);
>
> return 0;
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2021-12-20 6:46 Ralf Beck
2021-12-20 7:55 ` Greg KH
2021-12-20 10:01 ` Re: Oliver Neukum
0 siblings, 2 replies; 1651+ messages in thread
From: Ralf Beck @ 2021-12-20 6:46 UTC (permalink / raw)
To: linux-usb
Currently the usb core is disabling the use of an endpoint if the endpoint address is present in two different USB interface descriptors within the same USB configuration.
This behaviour is obviously based on the following passage in the USB specification:
"An endpoint is not shared among interfaces within a single configuration unless the endpoint is used by alternate settings of the same interface."
However, this behaviour prevents using some interfaces (in my case the Motu AVB audio devices) in their vendor specific mode.
They use a single USB configuration with two sets of interfaces, which use the same isochronous endpoint numbers.
One set with audio class specific interfaces for use by an audio class driver.
The other set with vendor specific interfaces for use by the vendor driver.
Obviously the class specific interfaces and vendor specific interfaces are not intended to be used by a driver simultaneously.
There must be another solution to deal with this. It is unacceptable to require users of these devices to disable the duplicate endpoint check and recompile the kernel on every update in order to be able to use their devices in vendor mode.
Sincerely,
Ralf Beck
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2021-12-20 6:46 Ralf Beck
@ 2021-12-20 7:55 ` Greg KH
2021-12-20 10:01 ` Re: Oliver Neukum
1 sibling, 0 replies; 1651+ messages in thread
From: Greg KH @ 2021-12-20 7:55 UTC (permalink / raw)
To: Ralf Beck; +Cc: linux-usb
On Mon, Dec 20, 2021 at 07:46:34AM +0100, Ralf Beck wrote:
>
> Currently the usb core is disabling the use of an endpoint if the endpoint address is present in two different USB interface descriptors within the same USB configuration.
> This behaviour is obviously based on the following passage in the USB specification:
>
> "An endpoint is not shared among interfaces within a single configuration unless the endpoint is used by alternate settings of the same interface."
>
> However, this behaviour prevents using some interfaces (in my case the Motu AVB audio devices) in their vendor specific mode.
>
> They use a single USB configuration with two sets of interfaces, which use the same isochronous endpoint numbers.
>
> One set with audio class specific interfaces for use by an audio class driver.
> The other set with vendor specific interfaces for use by the vendor driver.
> Obviously the class specific interfaces and vendor specific interfaces are not intended to be used by a driver simultaneously.
>
> There must be another solution to deal with this. It is unacceptable to require users of these devices to disable the duplicate endpoint check and recompile the kernel on every update in order to be able to use their devices in vendor mode.
The device sounds like it does not follow the USB specification, so how
does it work with any operating system?
What in-kernel driver binds to the device in vendor mode?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2021-12-20 6:46 Ralf Beck
2021-12-20 7:55 ` Greg KH
@ 2021-12-20 10:01 ` Oliver Neukum
1 sibling, 0 replies; 1651+ messages in thread
From: Oliver Neukum @ 2021-12-20 10:01 UTC (permalink / raw)
To: Ralf Beck, linux-usb
On 20.12.21 07:46, Ralf Beck wrote:
> One set with audio class specific interfaces for use by an audio class driver.
> The other set with vendor specific interfaces for use by the vendor driver.
> Obviously the class specific interfaces and vendor specific interfaces are not intended to be used by a driver simultaneously.
Such devices are buggy. We usually define quirks for such devices.
> There must be another solution to deal with this. It is unacceptable to require users of these devices to disable the duplicate endpoint check and recompile the kernel on every update in order to be able to use their devices in vendor mode.
I suggest you write a patch to introduce a quirk that disables one of the
interfaces and disregards disabled interfaces for purposes of the check.
Regards
Oliver
PS: Please use a subject line when you post.
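The duplicate-endpoint condition discussed above can be modeled in a few lines of standalone C. This is a simplified, hypothetical sketch (the real check lives in the USB core's descriptor parsing and works on parsed descriptor structures), but it shows the rule being enforced:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of one configuration's interfaces;
 * the real USB core walks parsed descriptor structures instead. */
struct iface {
    const unsigned char *ep; /* bEndpointAddress values */
    size_t nep;
};

/* True if any endpoint address appears in two different interfaces of
 * the same configuration -- the situation the spec passage quoted in
 * this thread forbids (alt settings of one interface excepted). */
static bool config_has_duplicate_ep(const struct iface *ifs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            for (size_t a = 0; a < ifs[i].nep; a++)
                for (size_t b = 0; b < ifs[j].nep; b++)
                    if (ifs[i].ep[a] == ifs[j].ep[b])
                        return true;
    return false;
}
```

A quirk of the kind Oliver suggests would amount to skipping disabled interfaces in the loops above.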
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20211229092443.GA10533@L-PF27918B-1352.localdomain>]
* Re:
[not found] <20211229092443.GA10533@L-PF27918B-1352.localdomain>
@ 2022-01-05 6:05 ` Jason Wang
2022-01-05 6:27 ` Re: Jason Wang
0 siblings, 1 reply; 1651+ messages in thread
From: Jason Wang @ 2022-01-05 6:05 UTC (permalink / raw)
To: Wu Zongyong; +Cc: virtualization
On Wed, Dec 29, 2021 at 5:31 PM Wu Zongyong
<wuzongyong@linux.alibaba.com> wrote:
>
> linux-kernel@vger.kernel.org
> Bcc:
> Subject: Should we call vdpa_config_ops->get_vq_num_{max,min} with a
> virtqueue index?
> Reply-To: Wu Zongyong <wuzongyong@linux.alibaba.com>
>
> Hi Jason,
>
> AFAIK, a virtio device may have multiple virtqueues of different sizes.
> It is okay for modern devices with the vdpa_config_ops->get_vq_num_max
> implementation with a static number currently since modern devices can
> reset the queue size. But for legacy-virtio based devices, we cannot
> allocate correct sizes for these virtqueues since it is not supported to
> negotiate the queue size with hardware.
>
> So as the title said, I wonder whether it is necessary to add a new parameter
> `index` to vdpa_config_ops->get_vq_num_{max,min} to help us get the size
> of a dedicated virtqueue.
I've posted something like this in the past here:
https://lore.kernel.org/lkml/CACycT3tMd750PQ0mgqCjHnxM4RmMcx2+Eo=2RBs2E2W3qPJang@mail.gmail.com/
>
> Or we can introduce a new callback like get_config_vq_num?
>
> What do you think?
If you wish, you can carry on my work. We can start by reusing the
current ops, if it doesn't work, we can use new.
Thanks
>
> Thanks
>
>
>
>
>
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
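Wu's proposed signature change can be sketched in standalone C. Everything here is a toy model with hypothetical names; the real ops table is struct vdpa_config_ops in include/linux/vdpa.h:

```c
#include <assert.h>

/* Toy device: a legacy virtio device whose two queues have fixed,
 * differing ring sizes that cannot be renegotiated. */
struct toy_vdpa_dev {
    unsigned short vq_size[2];
};

/* Current shape: one maximum reported for every virtqueue. */
static unsigned short get_vq_num_max_flat(struct toy_vdpa_dev *d)
{
    /* Forced to pick a single value, which is wrong for any queue
     * whose real size differs. */
    return d->vq_size[0];
}

/* Proposed shape: the caller passes the virtqueue index. */
static unsigned short get_vq_num_max_indexed(struct toy_vdpa_dev *d,
                                             unsigned int idx)
{
    return d->vq_size[idx];
}
```

The flat form has no way to report 128 for the second queue below, which is exactly the legacy-device problem described in the thread.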
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-05 6:05 ` Re: Jason Wang
@ 2022-01-05 6:27 ` Jason Wang
0 siblings, 0 replies; 1651+ messages in thread
From: Jason Wang @ 2022-01-05 6:27 UTC (permalink / raw)
To: Wu Zongyong; +Cc: virtualization
On Wed, Jan 5, 2022 at 2:05 PM Jason Wang <jasowang@redhat.com> wrote:
>
> On Wed, Dec 29, 2021 at 5:31 PM Wu Zongyong
> <wuzongyong@linux.alibaba.com> wrote:
> >
> > linux-kernel@vger.kernel.org
> > Bcc:
> > Subject: Should we call vdpa_config_ops->get_vq_num_{max,min} with a
> > virtqueue index?
> > Reply-To: Wu Zongyong <wuzongyong@linux.alibaba.com>
> >
> > Hi Jason,
> >
> > AFAIK, a virtio device may have multiple virtqueues of different sizes.
> > It is okay for modern devices with the vdpa_config_ops->get_vq_num_max
> > implementation with a static number currently since modern devices can
> > reset the queue size. But for legacy-virtio based devices, we cannot
> > allocate correct sizes for these virtqueues since it is not supported to
> > negotiate the queue size with hardware.
> >
> > So as the title said, I wonder whether it is necessary to add a new parameter
> > `index` to vdpa_config_ops->get_vq_num_{max,min} to help us get the size
> > of a dedicated virtqueue.
>
> I've posted something like this in the past here:
>
> https://lore.kernel.org/lkml/CACycT3tMd750PQ0mgqCjHnxM4RmMcx2+Eo=2RBs2E2W3qPJang@mail.gmail.com/
>
> >
> > Or we can introduce a new callback like get_config_vq_num?
> >
> > What do you think?
>
> If you wish, you can carry on my work. We can start by reusing the
> current ops, if it doesn't work, we can use new.
Just to clarify, I meant, we probably need to introduce a new uAPI on
top of the above version.
Thanks
>
> Thanks
>
> >
> > Thanks
> >
> >
> >
> >
> >
> >
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-01-13 17:53 Varun Sethi
2022-01-14 17:17 ` Fabio Estevam
0 siblings, 1 reply; 1651+ messages in thread
From: Varun Sethi @ 2022-01-13 17:53 UTC (permalink / raw)
To: festevam@gmail.com
Cc: linux-crypto@vger.kernel.org, andrew.smirnov@gmail.com,
Horia Geanta, Gaurav Jain, Pankaj Gupta
Hi Fabio, Andrey,
So far we have observed this issue on i.MX6 only. Disabling prediction resistance isn't the solution to the problem. We are working on identifying the proper fix and will post a patch for it.
Regards
Varun
>>On Tue, Jan 11, 2022 at 4:41 AM Fabio Estevam <festevam@gmail.com> wrote:
>>
>> From: Fabio Estevam <festevam@denx.de>
>>
>> Since commit 358ba762d9f1 ("crypto: caam - enable prediction resistance
>> in HRWNG") the following CAAM errors can be seen on i.MX6:
>>
>> caam_jr 2101000.jr: 20003c5b: CCB: desc idx 60: RNG: Hardware error
>> hwrng: no data available
>> caam_jr 2101000.jr: 20003c5b: CCB: desc idx 60: RNG: Hardware error
>> hwrng: no data available
>> caam_jr 2101000.jr: 20003c5b: CCB: desc idx 60: RNG: Hardware error
>> hwrng: no data available
>> caam_jr 2101000.jr: 20003c5b: CCB: desc idx 60: RNG: Hardware error
>> hwrng: no data available
>>
>> OP_ALG_PR_ON is enabled unconditionally, which may cause the problem
>> on i.MX devices.
>Is this true for every i.MX device? I haven't worked with the
>i.MX6Q/i.MX8 hardware I was enabling this feature for in a while, so
>I'm not 100% up to date on all of the problems we've seen with those,
>but last time enabling prediction resistance didn't seem to cause any
>issues besides a noticeable slowdown of random data generation.
>Can this be a Kconfig option or maybe a runtime flag so that it'd
>still be possible for some i.MX users to keep PR enabled?
>>
>> Fix the problem by only enabling OP_ALG_PR_ON on platforms that have
> >Management Complex support.
>>
> >Fixes: 358ba762d9f1 ("crypto: caam - enable prediction resistance in HRWNG")
> >Signed-off-by: Fabio Estevam <festevam@denx.de>
> >---
> >drivers/crypto/caam/caamrng.c | 15 +++++++++++----
> > 1 file changed, 11 insertions(+), 4 deletions(-)
>>
> >diff --git a/drivers/crypto/caam/caamrng.c b/drivers/crypto/caam/caamrng.c
> >index 77d048dfe5d0..3514fe5de2a5 100644
> >--- a/drivers/crypto/caam/caamrng.c
> >+++ b/drivers/crypto/caam/caamrng.c
> >@@ -63,12 +63,19 @@ static void caam_rng_done(struct device *jrdev, u32 *desc, u32 err,
> > complete(jctx->done);
> > }
>>
> >-static u32 *caam_init_desc(u32 *desc, dma_addr_t dst_dma)
>> +static u32 *caam_init_desc(struct device *jrdev, u32 *desc, dma_addr_t dst_dma)
> > {
>> + struct caam_drv_private *priv = dev_get_drvdata(jrdev->parent);
> >+
> > init_job_desc(desc, 0); /* + 1 cmd_sz */
> > /* Generate random bytes: + 1 cmd_sz */
> >- append_operation(desc, OP_ALG_ALGSEL_RNG | OP_TYPE_CLASS1_ALG |
>> - OP_ALG_PR_ON);
>> +
>> + if (priv->mc_en)
>> + append_operation(desc, OP_ALG_ALGSEL_RNG | OP_TYPE_CLASS1_ALG |
>> + OP_ALG_PR_ON);
>> + else
>> + append_operation(desc, OP_ALG_ALGSEL_RNG | OP_TYPE_CLASS1_ALG);
>> +
> > /* Store bytes: + 1 cmd_sz + caam_ptr_sz */
> > append_fifo_store(desc, dst_dma,
> > CAAM_RNG_MAX_FIFO_STORE_SIZE, FIFOST_TYPE_RNGSTORE);
> >@@ -101,7 +108,7 @@ static int caam_rng_read_one(struct device *jrdev,
>>
> > init_completion(done);
> > err = caam_jr_enqueue(jrdev,
>> - caam_init_desc(desc, dst_dma),
>> + caam_init_desc(jrdev, desc, dst_dma),
> > caam_rng_done, &jctx);
>> if (err == -EINPROGRESS) {
> > wait_for_completion(done);
>> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-13 17:53 Varun Sethi
@ 2022-01-14 17:17 ` Fabio Estevam
0 siblings, 0 replies; 1651+ messages in thread
From: Fabio Estevam @ 2022-01-14 17:17 UTC (permalink / raw)
To: Varun Sethi
Cc: linux-crypto@vger.kernel.org, andrew.smirnov@gmail.com,
Horia Geanta, Gaurav Jain, Pankaj Gupta
Hi Varun,
On Thu, Jan 13, 2022 at 2:53 PM Varun Sethi <V.Sethi@nxp.com> wrote:
>
> Hi Fabio, Andrey,
> So far we have observed this issue on i.MX6 only. Disabling prediction resistance isn't the solution for the problem. We are working on identifying the proper fix for this issue and would post the patch for the same.
Please copy me when you submit a fix for this issue.
Thanks!
Fabio Estevam
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-01-14 10:54 Li RongQing
2022-01-14 10:55 ` Paolo Bonzini
0 siblings, 1 reply; 1651+ messages in thread
From: Li RongQing @ 2022-01-14 10:54 UTC (permalink / raw)
To: pbonzini, seanjc, vkuznets, wanpengli, jmattson, tglx, bp, x86,
kvm, joro, peterz
After adding support for paravirtualized TLB shootdowns, steal_time.preempted
includes not only KVM_VCPU_PREEMPTED but also KVM_VCPU_FLUSH_TLB,
so kvm_vcpu_is_preempted() should test only the KVM_VCPU_PREEMPTED bit.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
diff with v1:
clear the rest of rax, suggested by Sean and peter
remove Fixes tag, since no issue in practice
arch/x86/kernel/kvm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index b061d17..45c9ce8d 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1025,8 +1025,8 @@ asm(
".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
"__raw_callee_save___kvm_vcpu_is_preempted:"
"movq __per_cpu_offset(,%rdi,8), %rax;"
-"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
-"setne %al;"
+"movb " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax), %al;"
+"and $" __stringify(KVM_VCPU_PREEMPTED) ", %rax;"
"ret;"
".size __raw_callee_save___kvm_vcpu_is_preempted, .-__raw_callee_save___kvm_vcpu_is_preempted;"
".popsection");
--
2.9.4
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2022-01-14 10:54 Li RongQing
@ 2022-01-14 10:55 ` Paolo Bonzini
2022-01-14 17:13 ` Re: Sean Christopherson
0 siblings, 1 reply; 1651+ messages in thread
From: Paolo Bonzini @ 2022-01-14 10:55 UTC (permalink / raw)
To: Li RongQing, seanjc, vkuznets, wanpengli, jmattson, tglx, bp, x86,
kvm, joro, peterz
On 1/14/22 11:54, Li RongQing wrote:
> After adding support for paravirtualized TLB shootdowns, steal_time.preempted
> includes not only KVM_VCPU_PREEMPTED but also KVM_VCPU_FLUSH_TLB,
>
> so kvm_vcpu_is_preempted() should test only the KVM_VCPU_PREEMPTED bit
>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
> diff with v1:
> clear the rest of rax, suggested by Sean and peter
> remove Fixes tag, since no issue in practice
>
> arch/x86/kernel/kvm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index b061d17..45c9ce8d 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -1025,8 +1025,8 @@ asm(
> ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
> "__raw_callee_save___kvm_vcpu_is_preempted:"
> "movq __per_cpu_offset(,%rdi,8), %rax;"
> -"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
> -"setne %al;"
> +"movb " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax), %al;"
> +"and $" __stringify(KVM_VCPU_PREEMPTED) ", %rax;"
This assumes that KVM_VCPU_PREEMPTED is 1. It could also be %eax
(slightly cheaper). Overall, I prefer to leave the code as is using setne.
Paolo
> "ret;"
> ".size __raw_callee_save___kvm_vcpu_is_preempted, .-__raw_callee_save___kvm_vcpu_is_preempted;"
> ".popsection");
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-14 10:55 ` Paolo Bonzini
@ 2022-01-14 17:13 ` Sean Christopherson
2022-01-14 17:17 ` Re: Paolo Bonzini
0 siblings, 1 reply; 1651+ messages in thread
From: Sean Christopherson @ 2022-01-14 17:13 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Li RongQing, vkuznets, wanpengli, jmattson, tglx, bp, x86, kvm,
joro, peterz
On Fri, Jan 14, 2022, Paolo Bonzini wrote:
> On 1/14/22 11:54, Li RongQing wrote:
> > After adding support for paravirtualized TLB shootdowns, steal_time.preempted
> > includes not only KVM_VCPU_PREEMPTED but also KVM_VCPU_FLUSH_TLB,
> >
> > so kvm_vcpu_is_preempted() should test only the KVM_VCPU_PREEMPTED bit
> >
> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > ---
> > diff with v1:
> > clear the rest of rax, suggested by Sean and peter
> > remove Fixes tag, since no issue in practice
> >
> > arch/x86/kernel/kvm.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index b061d17..45c9ce8d 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -1025,8 +1025,8 @@ asm(
> > ".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
> > "__raw_callee_save___kvm_vcpu_is_preempted:"
> > "movq __per_cpu_offset(,%rdi,8), %rax;"
> > -"cmpb $0, " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax);"
> > -"setne %al;"
> > +"movb " __stringify(KVM_STEAL_TIME_preempted) "+steal_time(%rax), %al;"
> > +"and $" __stringify(KVM_VCPU_PREEMPTED) ", %rax;"
>
> This assumes that KVM_VCPU_PREEMPTED is 1.
Ah, right, because technically the compiler is only required to be able to store
'1' and '0' in the boolean. That said, KVM_VCPU_PREEMPTED is ABI and isn't going
to change, so this could be "solved" with a comment.
> It could also be %eax (slightly cheaper).
Ya.
> Overall, I prefer to leave the code as is using setne.
But that also makes dangerous assumptions: (a) that the return type is bool,
and (b) that the compiler uses a single byte for bools.
If the assumption about KVM_VCPU_PREEMPTED being '1' is a sticking point, what
about combining the two to make everyone happy?
andl $" __stringify(KVM_VCPU_PREEMPTED) ", %eax
setnz %al
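The bug and the masked fix are easy to demonstrate in plain C. The flag values below match the x86 kvm_para.h uapi; the functions are models of the asm sequences being discussed, not the real callee-save routine:

```c
#include <assert.h>
#include <stdbool.h>

#define KVM_VCPU_PREEMPTED  (1 << 0) /* from arch/x86 kvm_para.h uapi */
#define KVM_VCPU_FLUSH_TLB  (1 << 1)

/* Model of the original check: "cmpb $0 / setne" treats ANY set bit
 * in steal_time.preempted as "preempted". */
static bool is_preempted_setne(unsigned char preempted)
{
    return preempted != 0;
}

/* Model of the fixed check: mask with KVM_VCPU_PREEMPTED first, as in
 * the "andl + setnz" suggestion above. */
static bool is_preempted_masked(unsigned char preempted)
{
    return (preempted & KVM_VCPU_PREEMPTED) != 0;
}
```

With only the TLB-flush bit set, the unmasked check wrongly reports the vCPU as preempted; the masked check does not.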
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-14 17:13 ` Re: Sean Christopherson
@ 2022-01-14 17:17 ` Paolo Bonzini
0 siblings, 0 replies; 1651+ messages in thread
From: Paolo Bonzini @ 2022-01-14 17:17 UTC (permalink / raw)
To: Sean Christopherson
Cc: Li RongQing, vkuznets, wanpengli, jmattson, tglx, bp, x86, kvm,
joro, peterz
On 1/14/22 18:13, Sean Christopherson wrote:
> If the assumption about KVM_VCPU_PREEMPTED being '1' is a sticking point, what
> about combining the two to make everyone happy?
>
> andl $" __stringify(KVM_VCPU_PREEMPTED) ", %eax
> setnz %al
Sure, that's indeed a nice solution (I appreciate the attention to
detail in setne->setnz, too :)).
Paolo
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-01-20 15:28 Myrtle Shah
2022-01-20 15:37 ` Vitaly Wool
0 siblings, 1 reply; 1651+ messages in thread
From: Myrtle Shah @ 2022-01-20 15:28 UTC (permalink / raw)
To: linux-riscv, paul.walmsley, palmer; +Cc: linux-kernel
These are some initial patches for bugs I found while attempting to
get an XIP kernel working on hardware:
- 32-bit VexRiscv processor
- kernel in SPI flash, at 0x00200000
- 16MB of RAM at 0x10000000
- MMU enabled
I still have some more debugging to do, but these at least
get the kernel as far as initialising the MMU, and I would
appreciate feedback if anyone else is working on RISC-V XIP.
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-20 15:28 Myrtle Shah
@ 2022-01-20 15:37 ` Vitaly Wool
2022-01-20 23:29 ` Re: Damien Le Moal
2022-02-04 21:45 ` Re: Palmer Dabbelt
0 siblings, 2 replies; 1651+ messages in thread
From: Vitaly Wool @ 2022-01-20 15:37 UTC (permalink / raw)
To: Myrtle Shah; +Cc: linux-riscv, Paul Walmsley, Palmer Dabbelt, LKML
Hey,
On Thu, Jan 20, 2022 at 4:30 PM Myrtle Shah <gatecat@ds0.me> wrote:
>
> These are some initial patches for bugs I found while attempting to
> get an XIP kernel working on hardware:
> - 32-bit VexRiscv processor
> - kernel in SPI flash, at 0x00200000
> - 16MB of RAM at 0x10000000
> - MMU enabled
>
> I still have some more debugging to do, but these at least
> get the kernel as far as initialising the MMU, and I would
> appreciate feedback if anyone else is working on RISC-V XIP.
I'll try to support you as much as I can, unfortunately I don't have
any 32-bit RISC-V around so I was rather thinking of extending the
RISC-V XIP support to 64-bit non-MMU targets.
For now just please keep in mind that there might be some inherent
assumptions that a target is 64 bit.
Best regards,
Vitaly
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-20 15:37 ` Vitaly Wool
@ 2022-01-20 23:29 ` Damien Le Moal
2022-02-04 21:45 ` Re: Palmer Dabbelt
1 sibling, 0 replies; 1651+ messages in thread
From: Damien Le Moal @ 2022-01-20 23:29 UTC (permalink / raw)
To: Vitaly Wool, Myrtle Shah; +Cc: linux-riscv, Paul Walmsley, Palmer Dabbelt, LKML
On 2022/01/21 0:37, Vitaly Wool wrote:
> Hey,
>
> On Thu, Jan 20, 2022 at 4:30 PM Myrtle Shah <gatecat@ds0.me> wrote:
>>
>> These are some initial patches for bugs I found while attempting to
>> get an XIP kernel working on hardware:
>> - 32-bit VexRiscv processor
>> - kernel in SPI flash, at 0x00200000
>> - 16MB of RAM at 0x10000000
>> - MMU enabled
>>
>> I still have some more debugging to do, but these at least
>> get the kernel as far as initialising the MMU, and I would
>> appreciate feedback if anyone else is working on RISC-V XIP.
>
> I'll try to support you as much as I can, unfortunately I don't have
> any 32-bit RISC-V around so I was rather thinking of extending the
> RISC-V XIP support to 64-bit non-MMU targets.
That would be great ! I am completing the buildroot patches for the K210. Got
u-boot almost working for SD card boot too (fighting a problem with rootfs
kernel mount on boot when using u-boot though).
> For now just please keep in mind that there might be some inherent
> assumptions that a target is 64 bit.
>
> Best regards,
> Vitaly
>
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-20 15:37 ` Vitaly Wool
2022-01-20 23:29 ` Re: Damien Le Moal
@ 2022-02-04 21:45 ` Palmer Dabbelt
1 sibling, 0 replies; 1651+ messages in thread
From: Palmer Dabbelt @ 2022-02-04 21:45 UTC (permalink / raw)
To: vitaly.wool; +Cc: gatecat, linux-riscv, Paul Walmsley, linux-kernel
On Thu, 20 Jan 2022 07:37:00 PST (-0800), vitaly.wool@konsulko.com wrote:
> Hey,
>
> On Thu, Jan 20, 2022 at 4:30 PM Myrtle Shah <gatecat@ds0.me> wrote:
>>
>> These are some initial patches for bugs I found while attempting to
>> get an XIP kernel working on hardware:
>> - 32-bit VexRiscv processor
>> - kernel in SPI flash, at 0x00200000
>> - 16MB of RAM at 0x10000000
>> - MMU enabled
>>
>> I still have some more debugging to do, but these at least
>> get the kernel as far as initialising the MMU, and I would
>> appreciate feedback if anyone else is working on RISC-V XIP.
>
> I'll try to support you as much as I can, unfortunately I don't have
> any 32-bit RISC-V around so I was rather thinking of extending the
> RISC-V XIP support to 64-bit non-MMU targets.
> For now just please keep in mind that there might be some inherent
> assumptions that a target is 64 bit.
I don't test any of the XIP configs, but if you guys have something that's sane
to run in QEMU I'm happy to do so. Given that there's now some folks finding
boot bugs it's probably worth getting what does boot into a regression test so
it's less likely to break moving forwards.
These are on fixes, with the second one split up so it's got a better chance of
landing in the stable trees.
Thanks!
>
> Best regards,
> Vitaly
>
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-01-24 12:43 Arınç ÜNAL
2022-01-25 14:03 ` Sergio Paracuellos
0 siblings, 1 reply; 1651+ messages in thread
From: Arınç ÜNAL @ 2022-01-24 12:43 UTC (permalink / raw)
To: Greg KH, Sergio Paracuellos, NeilBrown, DENG Qingfang,
Andrew Lunn, Luiz Angelo Daros de Luca
Cc: linux-staging
Hey everyone,
In preparation for mainlining mt7621-dts: fix formatting, a dtc warning on
the switch0@0 node, and pinctrl properties for the ethernet node in mt7621.dtsi.
Move the GB-PC2 specific external phy configuration from the main dtsi to
GB-PC2's devicetree, gbpc2.dts.
Now that pinctrl properties are properly defined on the ethernet node,
GMAC1 will start working.
Traffic flow on GMAC1 was tested on an mt7621a board in these modes:
External phy <-> GMAC1
PHY 0/4 <-> GMAC1
Arınç
[0]: https://lore.kernel.org/netdev/83a35aa3-6cb8-2bc4-2ff4-64278bbcd8c8@arinc9.com/T/
Arınç ÜNAL (4):
staging: mt7621-dts: fix formatting
staging: mt7621-dts: fix switch0@0 warnings
staging: mt7621-dts: use trgmii on gmac0 and enable flow control on port@6
staging: mt7621-dts: fix pinctrl properties for ethernet
drivers/staging/mt7621-dts/gbpc2.dts | 16 +++++++++++-----
drivers/staging/mt7621-dts/mt7621.dtsi | 32 ++++++++++++++++----------------
2 files changed, 27 insertions(+), 21 deletions(-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-24 12:43 Arınç ÜNAL
@ 2022-01-25 14:03 ` Sergio Paracuellos
2022-01-25 15:24 ` Re: Arınç ÜNAL
0 siblings, 1 reply; 1651+ messages in thread
From: Sergio Paracuellos @ 2022-01-25 14:03 UTC (permalink / raw)
To: Arınç ÜNAL
Cc: Greg KH, NeilBrown, DENG Qingfang, Andrew Lunn,
Luiz Angelo Daros de Luca, linux-staging
Hi Arinc!
On Mon, Jan 24, 2022 at 1:45 PM Arınç ÜNAL <arinc.unal@arinc9.com> wrote:
>
> Hey everyone,
>
> In preparation to mainline mt7621-dts; fix formatting, dtc warning on
> switch0@0 node and pinctrl properties for ethernet node on the mt7621.dtsi.
> Move the GB-PC2 specific external phy configuration on the main dtsi to
> GB-PC2's devicetree, gbpc2.dts.
>
> Now that pinctrl properties are properly defined on the ethernet node,
> GMAC1 will start working.
>
> Traffic flow on GMAC1 was tested on a mt7621a board with these modes:
> External phy <-> GMAC1
> PHY 0/4 <-> GMAC1
>
> Cheers.
> Arınç
Nitpick: next time try to put also a subject like "staging:
mt7621-dts: cleanups (or whatever)" in the cover letter of the series.
>
> [0]: https://lore.kernel.org/netdev/83a35aa3-6cb8-2bc4-2ff4-64278bbcd8c8@arinc9.com/T/
>
> Arınç ÜNAL (4):
> staging: mt7621-dts: fix formatting
> staging: mt7621-dts: fix switch0@0 warnings
> staging: mt7621-dts: use trgmii on gmac0 and enable flow control on port@6
> staging: mt7621-dts: fix pinctrl properties for ethernet
>
> drivers/staging/mt7621-dts/gbpc2.dts | 16 +++++++++++-----
> drivers/staging/mt7621-dts/mt7621.dtsi | 32 ++++++++++++++++----------------
> 2 files changed, 27 insertions(+), 21 deletions(-)
>
>
Thanks for doing this!
Best regards,
Sergio Paracuellos
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-25 14:03 ` Sergio Paracuellos
@ 2022-01-25 15:24 ` Arınç ÜNAL
2022-01-25 15:50 ` Re: Sergio Paracuellos
0 siblings, 1 reply; 1651+ messages in thread
From: Arınç ÜNAL @ 2022-01-25 15:24 UTC (permalink / raw)
To: Sergio Paracuellos
Cc: Greg KH, NeilBrown, DENG Qingfang, Andrew Lunn,
Luiz Angelo Daros de Luca, linux-staging
Hey Sergio,
On 25/01/2022 17:03, Sergio Paracuellos wrote:
> Hi Arinc!
>
> On Mon, Jan 24, 2022 at 1:45 PM Arınç ÜNAL <arinc.unal@arinc9.com> wrote:
>>
>> Hey everyone,
>>
>> In preparation to mainline mt7621-dts; fix formatting, dtc warning on
>> switch0@0 node and pinctrl properties for ethernet node on the mt7621.dtsi.
>> Move the GB-PC2 specific external phy configuration on the main dtsi to
>> GB-PC2's devicetree, gbpc2.dts.
>>
>> Now that pinctrl properties are properly defined on the ethernet node,
>> GMAC1 will start working.
>>
>> Traffic flow on GMAC1 was tested on a mt7621a board with these modes:
>> External phy <-> GMAC1
>> PHY 0/4 <-> GMAC1
>>
>> Cheers.
>> Arınç
>
> Nitpick: next time try to put also a subject like "staging:
> mt7621-dts: cleanups (or whatever)" in the cover letter of the series.
I had already sent v2 with that. I'll send v3 with your input on the
series, thanks!
>
>>
>> [0]: https://lore.kernel.org/netdev/83a35aa3-6cb8-2bc4-2ff4-64278bbcd8c8@arinc9.com/T/
>>
>> Arınç ÜNAL (4):
>> staging: mt7621-dts: fix formatting
>> staging: mt7621-dts: fix switch0@0 warnings
>> staging: mt7621-dts: use trgmii on gmac0 and enable flow control on port@6
>> staging: mt7621-dts: fix pinctrl properties for ethernet
>>
>> drivers/staging/mt7621-dts/gbpc2.dts | 16 +++++++++++-----
>> drivers/staging/mt7621-dts/mt7621.dtsi | 32 ++++++++++++++++----------------
>> 2 files changed, 27 insertions(+), 21 deletions(-)
>>
>>
>
> Thanks for doing this!
>
> Best regards,
> Sergio Paracuellos
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-25 15:24 ` Re: Arınç ÜNAL
@ 2022-01-25 15:50 ` Sergio Paracuellos
0 siblings, 0 replies; 1651+ messages in thread
From: Sergio Paracuellos @ 2022-01-25 15:50 UTC (permalink / raw)
To: Arınç ÜNAL
Cc: Greg KH, NeilBrown, DENG Qingfang, Andrew Lunn,
Luiz Angelo Daros de Luca, linux-staging
On Tue, Jan 25, 2022 at 4:24 PM Arınç ÜNAL <arinc.unal@arinc9.com> wrote:
>
> Hey Sergio,
>
> On 25/01/2022 17:03, Sergio Paracuellos wrote:
> > Hi Arinc!
> >
> > On Mon, Jan 24, 2022 at 1:45 PM Arınç ÜNAL <arinc.unal@arinc9.com> wrote:
> >>
> >> Hey everyone,
> >>
> >> In preparation to mainline mt7621-dts; fix formatting, dtc warning on
> >> switch0@0 node and pinctrl properties for ethernet node on the mt7621.dtsi.
> >> Move the GB-PC2 specific external phy configuration on the main dtsi to
> >> GB-PC2's devicetree, gbpc2.dts.
> >>
> >> Now that pinctrl properties are properly defined on the ethernet node,
> >> GMAC1 will start working.
> >>
> >> Traffic flow on GMAC1 was tested on a mt7621a board with these modes:
> >> External phy <-> GMAC1
> >> PHY 0/4 <-> GMAC1
> >>
> >> Cheers.
> >> Arınç
> >
> > Nitpick: next time try to put also a subject like "staging:
> > mt7621-dts: cleanups (or whatever)" in the cover letter of the series.
>
> I had already sent v2 with that. I'll send v3 with your input on the
> series, thanks!
True, sorry I missed that!
Thanks,
Sergio Paracuellos
>
> >
> >>
> >> [0]: https://lore.kernel.org/netdev/83a35aa3-6cb8-2bc4-2ff4-64278bbcd8c8@arinc9.com/T/
> >>
> >> Arınç ÜNAL (4):
> >> staging: mt7621-dts: fix formatting
> >> staging: mt7621-dts: fix switch0@0 warnings
> >> staging: mt7621-dts: use trgmii on gmac0 and enable flow control on port@6
> >> staging: mt7621-dts: fix pinctrl properties for ethernet
> >>
> >> drivers/staging/mt7621-dts/gbpc2.dts | 16 +++++++++++-----
> >> drivers/staging/mt7621-dts/mt7621.dtsi | 32 ++++++++++++++++----------------
> >> 2 files changed, 27 insertions(+), 21 deletions(-)
> >>
> >>
> >
> > Thanks for doing this!
> >
> > Best regards,
> > Sergio Paracuellos
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <10b1995b392e490aaa2db645f219015e@dji.com>]
* Fwd:
[not found] <10b1995b392e490aaa2db645f219015e@dji.com>
@ 2022-01-17 12:54 ` Caine Chen
2022-02-03 11:49 ` Daniel Vacek
0 siblings, 1 reply; 1651+ messages in thread
From: Caine Chen @ 2022-01-17 12:54 UTC (permalink / raw)
To: linux-rt-users@vger.kernel.org
Hi guys:
We found that some IRQ threads will block in local_bh_disable() for a
long time in some situations and we hope to get your valuable suggestions.
My kernel version is 5.4 and the irq-delay is caused by the use of
write_lock_bh().
It can be described in the following figure:
(1) Thread_1 which is a SCHED_NORMAL thread runs on CPU1,
and it uses read_lock_bh() to protect some data.
(2) Thread_2 which is a SCHED_RR thread runs on CPU1 and it preempts thread_1
after thread_1 invoked read_lock_bh(). Thread_2 may run 60 ms in my system.
(3) Thread_3 which is a SCHED_NORMAL thread runs on CPU0. This thread acquires
writer's lock by invoking write_lock_bh(). This function will disable
the bottom half first by invoking local_bh_disable(). But it will block in
rt_write_lock(), because the read lock is held by thread_1.
(4) At this time, if an irq thread without the IRQF_NO_THREAD flag on CPU0 tries to
acquire bh_lock (it has been renamed softirq_ctrl.lock now), the irq
thread will block because this lock is held by thread_3.
-----------------------------------------------------------------------------
              CPU1                 |                 CPU0
-----------------------------------|-----------------------------------------
 thread_2       thread_1           |  thread_3             irq_thread
 --------       --------           |  --------             ----------
                read_lock_bh()     |
 /* preempts                       |
    thread_1,   ......             |  write_lock_bh()
    do work,                       |    local_bh_disable()
    ~60 ms */                      |    /* blocks in       /* irq thread
                                   |       rt_write_lock() */  blocks here */
                read_unlock_bh()   |
                                   |  /* do work */
                                   |  write_unlock_bh()
                                   |                       irq_thread_fn()
-----------------------------------------------------------------------------
In this case, if SCHED_RR thread_2 preempts thread_1 and runs for too long, all
irq threads on CPU0 will be blocked.
It looks like a priority-inversion problem caused by real-time thread preemption.
How can I avoid this problem? I have a few thoughts:
(1) The key point, I think, is that write_lock_bh()/read_lock_bh() will disable
the bottom half, which will disable some irq threads too. Could I use
write_lock_irq()/read_lock_irq() instead?
(2) If my irq handler wants to get better performance, I should request a
threaded handler for the IRQ as Sebastian suggested in LKML
<RE: irq thread latency caused by softirq_ctrl.lock contention>.
Is threaded handler designed for low irq delay?
(3) Thread_2 takes too long time for running. So it is not suitable to set this
thread with high rt-priority. Should I reduce this thread's priority to
solve this problem?
Are there better ways to avoid this problem? We hope to get your valuable
suggestions. Thanks!
Best regards,
Caine.chen
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.
此电子邮件及附件所包含内容具有机密性,且仅限于接收人使用。未经允许,禁止第三人阅读、复制或传播该电子邮件中的任何信息。如果您不属于以上电子邮件的目标接收者,请您立即通知发送人并删除原电子邮件及其相关的附件。
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-01-17 12:54 ` Fwd: Caine Chen
@ 2022-02-03 11:49 ` Daniel Vacek
0 siblings, 0 replies; 1651+ messages in thread
From: Daniel Vacek @ 2022-02-03 11:49 UTC (permalink / raw)
To: Caine Chen; +Cc: linux-rt-users@vger.kernel.org
Hi Caine,
On Tue, Jan 18, 2022 at 4:44 AM Caine Chen <caine.chen@dji.com> wrote:
>
> Hi guys:
> We found that some IRQ threads will block in local_bh_disable() for a
> long time in some situations and we hope to get your valuable suggestions.
> My kernel version is 5.4 and the irq-delay is caused by the use of
> write_lock_bh().
> It can be described in the following figure:
> (1) Thread_1 which is a SCHED_NORMAL thread runs on CPU1,
> and it uses read_lock_bh() to protect some data.
> (2) Thread_2 which is a SCHED_RR thread runs on CPU1 and it preempts thread_1
> after thread_1 invoked read_lock_bh(). Thread_2 may run 60 ms in my system.
> (3) Thread_3 which is a SCHED_NORMAL thread runs on CPU0. This thread acquires
> writer's lock by invoking write_lock_bh(). This function will disable
> the bottom half first by invoking local_bh_disable(). But it will block in
> rt_write_lock(), because the read lock is held by thread_1.
> (4) At this time, if an irq thread without the IRQF_NO_THREAD flag on CPU0 tries to
> acquire bh_lock (it has been renamed softirq_ctrl.lock now), the irq
> thread will block because this lock is held by thread_3.
>
> -----------------------------------------------------------------------------
>               CPU1                 |                 CPU0
> -----------------------------------|-----------------------------------------
>  thread_2       thread_1           |  thread_3             irq_thread
>  --------       --------           |  --------             ----------
>                 read_lock_bh()     |
>  /* preempts                       |
>     thread_1,   ......             |  write_lock_bh()
>     do work,                       |    local_bh_disable()
>     ~60 ms */                      |    /* blocks in       /* irq thread
>                                    |       rt_write_lock() */  blocks here */
>                 read_unlock_bh()   |
>                                    |  /* do work */
>                                    |  write_unlock_bh()
>                                    |                       irq_thread_fn()
> -----------------------------------------------------------------------------
>
> In this case, if SCHED_RR thread_2 preempts thread_1 and runs for too long, all
> irq threads on CPU0 will be blocked.
> It looks like a priority-inversion problem caused by real-time thread preemption.
Not really. I guess there's one misunderstanding in your description.
Disabling the bottom half is local to the running thread and not to the
CPU which executes that thread. In effect, preemption practically
enables the bottom half again (as long as the new thread did not have
it already disabled before, of course...).
That said, the irq_thread will _not_ be blocked as the bottom half is not
disabled in its context. From your chart, it's disabled only in
thread_3 context and thread_1 context. But these two are independent
(due to the different thread contexts and not the different CPU
contexts as you misassumed) and they do not block each other either,
it's the rw_lock serializing these threads, right?
You should be able to see this with tracing. There should be no issue
or the issue is different than you think it is and different than you
described here.
Hopefully the above helps you,
Daniel
> How can I avoid this problem? I have a few thoughts:
> (1) The key point, I think, is that write_lock_bh()/read_lock_bh() will disable
> the bottom half, which will disable some irq threads too. Could I use
> write_lock_irq()/read_lock_irq() instead?
> (2) If my irq handler wants to get better performance, I should request a
> threaded handler for the IRQ as Sebastian suggested in LKML
> <RE: irq thread latency caused by softirq_ctrl.lock contention>.
> Is threaded handler designed for low irq delay?
> (3) Thread_2 takes too long time for running. So it is not suitable to set this
> thread with high rt-priority. Should I reduce this thread's priority to
> solve this problem?
>
> Are there better ways to avoid this problem? We hope to get your valuable
> suggestions. Thanks!
>
> Best regards,
> Caine.chen
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH] net/failsafe: Fix crash due to global devargs syntax parsing from secondary process
@ 2022-02-10 7:10 madhuker.mythri
2022-02-10 15:00 ` Ferruh Yigit
0 siblings, 1 reply; 1651+ messages in thread
From: madhuker.mythri @ 2022-02-10 7:10 UTC (permalink / raw)
To: grive; +Cc: dev, Madhuker Mythri
From: Madhuker Mythri <madhuker.mythri@oracle.com>
Failsafe pmd started crashing with global devargs syntax, as devargs is
not memset to zero. Accessing it in rte_devargs_parse() resulted in a
crash when called from a secondary process.
Bugzilla Id: 933
Signed-off-by: Madhuker Mythri <madhuker.mythri@oracle.com>
---
drivers/net/failsafe/failsafe.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 3c754a5f66..aa93cc6000 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -360,6 +360,7 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
if (sdev->devargs.name[0] == '\0')
continue;
+ memset(&devargs, 0, sizeof(devargs));
/* rebuild devargs to be able to get the bus name. */
ret = rte_devargs_parse(&devargs,
sdev->devargs.name);
--
2.32.0.windows.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* (no subject)
2022-02-10 7:10 [PATCH] net/failsafe: Fix crash due to global devargs syntax parsing from secondary process madhuker.mythri
@ 2022-02-10 15:00 ` Ferruh Yigit
2022-02-10 16:08 ` Gaëtan Rivet
0 siblings, 1 reply; 1651+ messages in thread
From: Ferruh Yigit @ 2022-02-10 15:00 UTC (permalink / raw)
To: madhuker.mythri, grive; +Cc: dev
On 2/10/2022 7:10 AM, madhuker.mythri@oracle.com wrote:
> From: Madhuker Mythri <madhuker.mythri@oracle.com>
>
> Failsafe pmd started crashing with global devargs syntax, as devargs is
> not memset to zero. Accessing it in rte_devargs_parse() resulted in a
> crash when called from a secondary process.
>
> Bugzilla Id: 933
>
> Signed-off-by: Madhuker Mythri <madhuker.mythri@oracle.com>
> ---
> drivers/net/failsafe/failsafe.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
> index 3c754a5f66..aa93cc6000 100644
> --- a/drivers/net/failsafe/failsafe.c
> +++ b/drivers/net/failsafe/failsafe.c
> @@ -360,6 +360,7 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
> if (sdev->devargs.name[0] == '\0')
> continue;
>
> + memset(&devargs, 0, sizeof(devargs));
> /* rebuild devargs to be able to get the bus name. */
> ret = rte_devargs_parse(&devargs,
> sdev->devargs.name);
if 'rte_devargs_parse()' requires the 'devargs' parameter to be memset,
what do you think about memsetting it in the API?
This prevents forgotten cases like this.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-02-10 15:00 ` Ferruh Yigit
@ 2022-02-10 16:08 ` Gaëtan Rivet
0 siblings, 0 replies; 1651+ messages in thread
From: Gaëtan Rivet @ 2022-02-10 16:08 UTC (permalink / raw)
To: Ferruh Yigit, madhuker.mythri; +Cc: dev
On Thu, Feb 10, 2022, at 16:00, Ferruh Yigit wrote:
> On 2/10/2022 7:10 AM, madhuker.mythri@oracle.com wrote:
>> From: Madhuker Mythri <madhuker.mythri@oracle.com>
>>
>> Failsafe pmd started crashing with global devargs syntax, as devargs is
>> not memset to zero. Accessing it in rte_devargs_parse() resulted in a
>> crash when called from a secondary process.
>>
>> Bugzilla Id: 933
>>
>> Signed-off-by: Madhuker Mythri <madhuker.mythri@oracle.com>
>> ---
>> drivers/net/failsafe/failsafe.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
>> index 3c754a5f66..aa93cc6000 100644
>> --- a/drivers/net/failsafe/failsafe.c
>> +++ b/drivers/net/failsafe/failsafe.c
>> @@ -360,6 +360,7 @@ rte_pmd_failsafe_probe(struct rte_vdev_device *vdev)
>> if (sdev->devargs.name[0] == '\0')
>> continue;
>>
>> + memset(&devargs, 0, sizeof(devargs));
>> /* rebuild devargs to be able to get the bus name. */
>> ret = rte_devargs_parse(&devargs,
>> sdev->devargs.name);
>
> if 'rte_devargs_parse()' requires the 'devargs' parameter to be memset,
> what do you think about memsetting it in the API?
> This prevents forgotten cases like this.
Hi,
I was looking at it this morning.
Before the last release, rte_devargs_parse() was only supporting legacy syntax.
It never read from the devargs structure, only wrote to it. So it was safe to
use with a non-memset devargs.
The rte_devargs_layer_parse() however is more complex. To allow rte_dev_iterator_init() to call it without doing memory allocation, it reads parts of the devargs to make decisions.
Doing a first call to rte_devargs_layer_parse() as part of rte_devargs_parse() thus
modified the contract it had with the users, that it would never read from devargs.
It is not possible to completely avoid reading from devargs in rte_devargs_layer_parse().
It is necessary for RTE_DEV_FOREACH() to be safe to interrupt without having to do iterator cleanup.
This is my current understanding. In that context, yes, I think it is preferable to
do the memset() within rte_devargs_parse(). It will restore the previous API guarantee that calling it with a non-memset devargs is safe.
Thanks,
--
Gaetan Rivet
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: Re:
@ 2022-02-11 15:06 Caine Chen
0 siblings, 0 replies; 1651+ messages in thread
From: Caine Chen @ 2022-02-11 15:06 UTC (permalink / raw)
To: neelx.g@gmail.com; +Cc: linux-rt-users@vger.kernel.org
Hi Daniel:
Thanks for your reply.
> Not really. I guess there's one misunderstanding in your description.
> Disabling the bottom half is local to running thread and not to the
> CPU which executes that thread. As an effect, preemption practically
> enables the bottom half again (as long as the new thread did not have
> it already disabled before, of course...).
It's a bit confusing to me why disabling the bottom half is local to the thread
and not to the CPU. From my humble perspective, every forced-threaded
irq thread will invoke local_bh_disable() and try to get bh_lock before it
enters the irq handler. If bh_lock (now softirq_ctrl.lock) is held by another thread,
all forced-threaded irq threads on this CPU will wait until the lock is released.
So how does preemption enable the bottom half again?
To test this, I did an experiment in v5.4 kernel.
First, I created a kthread and bound it to CPU0:
int test_init( )
{
......
p = kthread_create(my_debug_func, NULL, "my_test");
kthread_bind(p, 0);
wake_up_process(p);
......
}
This kthread will invoke local_bh_disable()/local_bh_enable() periodically:
int my_debug_func(void *arg)
{
......
while(!kthread_should_stop()) {
......
local_bh_disable();
/* just do some busy work, such as memcpy, kmalloc and so on */
do_some_work();
local_bh_enable();
}
......
}
What's more, I added some logs in some forced-threaded irq handlers to find out when they were executed.
After the "my_test" thread disabled local bh, there were no forced-threaded irq threads running on CPU0.
But after the "my_test" thread enabled local bh, forced-threaded irqs came again.
It seems that disabling the bottom half is local to the CPU.
> That said, the irq_thread will _not_ be blocked as bottom half is not
> disabled in it's context. From your chart, it's disabled only in
> thread_3 context and thread_1 context. But these two are independent
> (due to the different thread contexts and not the different CPU
> contexts as you misassumed) and they do not block each other either,
> it's the rw_lock serializing these threads, right?
> You should be able to see this with tracing. There should be no issue
> or the issue is different than you think it is and different than you
> described here.
> Hopefully the above helps you,
> Daniel
Thanks
Caine
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-02-13 22:40 Ronnie Sahlberg
2022-02-14 7:52 ` ronnie sahlberg
0 siblings, 1 reply; 1651+ messages in thread
From: Ronnie Sahlberg @ 2022-02-13 22:40 UTC (permalink / raw)
To: linux-cifs; +Cc: Steve French
Steve, List,
Here is a small patch that fixes an issue with modefromsid where
it would strip off and remove all the ACEs that grant us access to the file.
It fixes this by restoring the "allow AuthenticatedUsers access" ACE that is stripped in
set_chmod_dacl():
/* If it's any one of the ACE we're replacing, skip! */
if (((compare_sids(&pntace->sid, &sid_unix_NFS_mode) == 0) ||
(compare_sids(&pntace->sid, pownersid) == 0) ||
(compare_sids(&pntace->sid, pgrpsid) == 0) ||
(compare_sids(&pntace->sid, &sid_everyone) == 0) ||
(compare_sids(&pntace->sid, &sid_authusers) == 0))) {
goto next_ace;
}
This part is confusing since for many of these cases we are NOT replacing
all of these ACEs but only some of them, yet the code unconditionally removes
all of them, contrary to what the comment suggests.
I think some of my confusion here is that afaik we don't have good documentation
of how modefromsid, and idsfromsid, are supposed to work, what the
restrictions are or the expected semantics.
We need to document both modefromsid and idsfromsid and what the expected
semantics are for when either of them or both of them are enabled.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-02-13 22:40 Ronnie Sahlberg
@ 2022-02-14 7:52 ` ronnie sahlberg
0 siblings, 0 replies; 1651+ messages in thread
From: ronnie sahlberg @ 2022-02-14 7:52 UTC (permalink / raw)
To: Ronnie Sahlberg; +Cc: linux-cifs, Steve French
Steve,
I have added a test to the buildbot to verify the fix.
I.e. two ACEs when the file is created, one for mode and one for AuthenticatedUsers,
and after chmod we still have two ACEs, but the one for mode has
been updated.
The test is cifs/107
and it can also show how we can now modify the mount options we need on
a test-by-test
basis by using -o remount, ... wooohooo new mount api :-)
On Mon, Feb 14, 2022 at 9:47 AM Ronnie Sahlberg <lsahlber@redhat.com> wrote:
>
> Steve, List,
>
> Here is a small patch that fixes an issue with modefromsid where
> it would strip off and remove all the ACEs that grant us access to the file.
> It fixes this by restoring the "allow AuthenticatedUsers access" ACE that is stripped in
>
> set_chmod_dacl():
> /* If it's any one of the ACE we're replacing, skip! */
> if (((compare_sids(&pntace->sid, &sid_unix_NFS_mode) == 0) ||
> (compare_sids(&pntace->sid, pownersid) == 0) ||
> (compare_sids(&pntace->sid, pgrpsid) == 0) ||
> (compare_sids(&pntace->sid, &sid_everyone) == 0) ||
> (compare_sids(&pntace->sid, &sid_authusers) == 0))) {
> goto next_ace;
> }
>
> This part is confusing since for many of these cases we are NOT replacing
> all of these ACEs but only some of them, yet the code unconditionally removes
> all of them, contrary to what the comment suggests.
>
> I think some of my confusion here is that afaik we don't have good documentation
> of how modefromsid, and idsfromsid, are supposed to work, what the
> restrictions are or the expected semantics.
> We need to document both modefromsid and idsfromsid and what the expected
> semantics are for when either of them or both of them are enabled.
>
>
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2022-03-04 8:47 Harald Hauge
0 siblings, 0 replies; 1651+ messages in thread
From: Harald Hauge @ 2022-03-04 8:47 UTC (permalink / raw)
To: bpf
Hello,
I'm Harald Hauge, an Investment Manager from Norway.
I will need your assistance in executing this business from my country
to yours.
This is a short term investment with good returns. Kindly
reply to confirm the validity of your email so I can give you comprehensive details about the project.
Best Regards,
Harald Hauge
Business Consultant
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20220301070226.2477769-1-jaydeepjd.8914>]
[parent not found: <Yj1hkpyUqJE9sQ2p@redhat.com>]
* Re:
[not found] <Yj1hkpyUqJE9sQ2p@redhat.com>
@ 2022-03-25 7:52 ` Jason Wang
2022-03-25 9:10 ` Re: Michael S. Tsirkin
0 siblings, 1 reply; 1651+ messages in thread
From: Jason Wang @ 2022-03-25 7:52 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> Bcc:
> Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> Reply-To:
> In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
>
> On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> >
> > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > This is a rework on the previous IRQ hardening that is done for
> > > > virtio-pci where several drawbacks were found and were reverted:
> > > >
> > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > that is used by some device such as virtio-blk
> > > > 2) done only for PCI transport
> > > >
> > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > by introducing a global irq_soft_enabled variable for each
> > > > virtio_device. Then we can toggle it during
> > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > the future, we may provide config_ops for the transport that doesn't
> > > > use IRQ. With this, vring_interrupt() can check and return early if
> > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used
> > > > but the cost should be acceptable.
> > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> >
> >
> > Even if we allow the transport driver to synchronize through
> > synchronize_irq() we still need a check in the vring_interrupt().
> >
> > We do something like the following previously:
> >
> > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > return IRQ_NONE;
> >
> > But it looks like a bug since speculative read can be done before the check
> > where the interrupt handler can't see the uncommitted setup which is done by
> > the driver.
>
> I don't think so - if you sync after setting the value then
> you are guaranteed that any handler running afterwards
> will see the new value.
The problem is not the disable but the enable. We use smp_store_release()
to make sure the driver commits the setup before enabling the irq. It
means the read needs to be ordered as well in vring_interrupt().
>
> Although I couldn't find anything about this in memory-barriers.txt
> which surprises me.
>
> CC Paul to help make sure I'm right.
>
>
> >
> > >
> > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > module parameter is introduced to enable the hardening so function
> > > > hardening is disabled by default.
> > > Which devices are these? How come they send an interrupt before there
> > > are any buffers in any queues?
> >
> >
> > I copied this from the commit log for 22b7050a024d7
> >
> > "
> >
> > This change will also benefit old hypervisors (before 2009)
> > that send interrupts without checking DRIVER_OK: previously,
> > the callback could race with driver-specific initialization.
> > "
> >
> > If this is only for config interrupt, I can remove the above log.
>
>
> This is only for config interrupt.
Ok.
>
> >
> > >
> > > > Note that the hardening is only done for vring interrupt since the
> > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > ("virtio: defer config changed notifications"). But the method that is
> > > > used by config interrupt can't be reused by the vring interrupt
> > > > handler because it uses spinlock to do the synchronization which is
> > > > expensive.
> > > >
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > >
> > > > ---
> > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > include/linux/virtio.h | 4 ++++
> > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > --- a/drivers/virtio/virtio.c
> > > > +++ b/drivers/virtio/virtio.c
> > > > @@ -7,6 +7,12 @@
> > > > #include <linux/of.h>
> > > > #include <uapi/linux/virtio_ids.h>
> > > > +static bool irq_hardening = false;
> > > > +
> > > > +module_param(irq_hardening, bool, 0444);
> > > > +MODULE_PARM_DESC(irq_hardening,
> > > > + "Disalbe IRQ software processing when it is not expected");
> > > > +
> > > > /* Unique numbering for virtio devices. */
> > > > static DEFINE_IDA(virtio_index_ida);
> > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > * */
> > > > void virtio_reset_device(struct virtio_device *dev)
> > > > {
> > > > + /*
> > > > + * The below synchronize_rcu() guarantees that any
> > > > + * interrupt for this line arriving after
> > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > + * irq_soft_enabled == false.
> > > News to me; I did not know synchronize_rcu() has anything to do
> > > with interrupts. Didn't you intend to use synchronize_irq?
> > > I am not even 100% sure synchronize_rcu() is by design a memory barrier,
> > > though it most likely is ...
> >
> >
> > According to the comment above tree RCU version of synchronize_rcu():
> >
> > """
> >
> > * RCU read-side critical sections are delimited by rcu_read_lock()
> > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > * v5.0 and later, regions of code across which interrupts, preemption,
> > * or softirqs have been disabled also serve as RCU read-side critical
> > * sections. This includes hardware interrupt handlers, softirq handlers,
> > * and NMI handlers.
> > """
> >
> > So interrupt handlers are treated as read-side critical sections.
> >
> > And it has this comment explaining the barrier:
> >
> > """
> >
> > * Note that this guarantee implies further memory-ordering guarantees.
> > * On systems with more than one CPU, when synchronize_rcu() returns,
> > * each CPU is guaranteed to have executed a full memory barrier since
> > * the end of its last RCU read-side critical section whose beginning
> > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > """
> >
> > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > barrier: if the interrupt comes after the WRITE_ONCE(), it will see
> > irq_soft_enabled as false.
> >
>
> You are right. So then
> 1. I do not think we need load_acquire - why is it needed? Just
> READ_ONCE should do.
See above.
> 2. isn't synchronize_irq also doing the same thing?
Yes, but it requires a config ops since the IRQ knowledge is transport specific.
>
>
> > >
> > > > + */
> > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > + synchronize_rcu();
> > > > +
> > > > dev->config->reset(dev);
> > > > }
> > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > Please add comment explaining where it will be enabled.
> > > Also, we *really* don't need to synch if it was already disabled,
> > > let's not add useless overhead to the boot sequence.
> >
> >
> > Ok.
> >
> >
> > >
> > >
> > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > spin_lock_init(&dev->config_lock);
> > > > dev->config_enabled = false;
> > > > dev->config_change_pending = false;
> > > > + dev->irq_soft_check = irq_hardening;
> > > > +
> > > > + if (dev->irq_soft_check)
> > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > /* We always start by resetting the device, in case a previous
> > > > * driver messed it up. This also tests that code path a little. */
> > > one of the points of hardening is it's also helpful for buggy
> > > devices. this flag defeats the purpose.
> >
> >
> > Do you mean:
> >
> > 1) we need something like config_enable? This seems hard to implement
> > without obvious overhead, mainly the synchronization with the
> > interrupt handlers
>
> But the synchronization is only on the tear-down path. That is not critical
> for any users at the moment, even less so than probe.
I meant that if we have vq->irq_pending, we need to call vring_interrupt()
in virtio_device_ready() and synchronize with the IRQ handlers via a
spinlock or similar.
>
> > 2) enable this by default, so I don't object, but this may have some risk
> > for old hypervisors
>
>
> The risk is a driver adding buffers without setting DRIVER_OK.
Probably not: we have devices that accept arbitrary input from outside
(net, console, input, etc.). I've done a round of audits of the QEMU
code; it all looks fine since day 0.
> So with this approach, how about we rename the flag "driver_ok"?
> And then add_buf can actually test it and BUG_ON if not there (at least
> in the debug build).
This looks like hardening of the driver in the core instead of the
device. I think it can be done, but in a separate series.
>
> And going down from there, how about we cache status in the
> device? Then we don't need to keep re-reading it every time,
> speeding boot up a tiny bit.
I don't fully understand here; actually, the spec requires the status to be
read back for validation in many cases.
Thanks
>
> >
> > >
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > }
> > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > {
> > > > + struct virtqueue *_vq = v;
> > > > + struct virtio_device *vdev = _vq->vdev;
> > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > + return IRQ_NONE;
> > > > + }
> > > > +
> > > > if (!more_used(vq)) {
> > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > return IRQ_NONE;
> > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > index 5464f398912a..957d6ad604ac 100644
> > > > --- a/include/linux/virtio.h
> > > > +++ b/include/linux/virtio.h
> > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > * @config_enabled: configuration change reporting enabled
> > > > * @config_change_pending: configuration change reported while disabled
> > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > + * @irq_soft_enabled: callbacks enabled
> > > > * @config_lock: protects configuration change reporting
> > > > * @dev: underlying device.
> > > > * @id: the device type identification (used to match it with a driver).
> > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > bool failed;
> > > > bool config_enabled;
> > > > bool config_change_pending;
> > > > + bool irq_soft_check;
> > > > + bool irq_soft_enabled;
> > > > spinlock_t config_lock;
> > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > struct device dev;
> > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > --- a/include/linux/virtio_config.h
> > > > +++ b/include/linux/virtio_config.h
> > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > return __virtio_test_bit(vdev, fbit);
> > > > }
> > > > +/*
> > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > + * @vdev: the device
> > > > + */
> > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > +{
> > > > + if (!vdev->irq_soft_check)
> > > > + return true;
> > > > +
> > > > + /*
> > > > + * Read irq_soft_enabled before reading other device specific
> > > > + * data. Paried with smp_store_relase() in
> > > paired
> >
> >
> > Will fix.
> >
> > Thanks
> >
> >
> > >
> > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > + * virtio_reset_device().
> > > > + */
> > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > +}
> > > > +
> > > > /**
> > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > * @vdev: the device
> > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > if (dev->config->enable_cbs)
> > > > dev->config->enable_cbs(dev);
> > > > + /*
> > > > + * Commit the driver setup before enabling the virtqueue
> > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > + * virtio_irq_soft_enabled()
> > > > + */
> > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > +
> > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > }
> > > > --
> > > > 2.25.1
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply	[flat|nested] 1651+ messages in thread

* Re:
2022-03-25 7:52 ` Re: Jason Wang
@ 2022-03-25 9:10 ` Michael S. Tsirkin
2022-03-25 9:20 ` Re: Jason Wang
0 siblings, 1 reply; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-25 9:10 UTC (permalink / raw)
To: Jason Wang
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > Bcc:
> > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > Reply-To:
> > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> >
> > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > >
> > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > >
> > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > that is used by some device such as virtio-blk
> > > > > 2) done only for PCI transport
> > > > >
> > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > by introducing a global irq_soft_enabled variable for each
> > > > > virtio_device. Then we can toggle it during
> > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > the future, we may provide config_ops for the transports that don't
> > > > > use IRQs. With this, vring_interrupt() can check and return early if
> > > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > > but the cost should be acceptable.
> > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > >
> > >
> > > Even if we allow the transport driver to synchronize through
> > > synchronize_irq(), we still need a check in vring_interrupt().
> > >
> > > We did something like the following previously:
> > >
> > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > return IRQ_NONE;
> > >
> > > But it looks like a bug, since a speculative read can be done before the
> > > check, in which case the interrupt handler can't see the uncommitted setup
> > > done by the driver.
> >
> > I don't think so - if you sync after setting the value then
> > you are guaranteed that any handler running afterwards
> > will see the new value.
>
> The problem is not the disable path but the enable path.
So a misbehaving device can lose interrupts? That's not a problem at all
imo.
> We use smp_store_release()
> to make sure the driver commits the setup before enabling the IRQ. It
> means the read needs to be ordered as well in vring_interrupt().
>
> >
> > Although I couldn't find anything about this in memory-barriers.txt
> > which surprises me.
> >
> > CC Paul to help make sure I'm right.
> >
> >
> > >
> > > >
> > > > > To avoid breaking legacy devices which can send IRQs before DRIVER_OK, a
> > > > > module parameter is introduced to enable the hardening, so the
> > > > > hardening is disabled by default.
> > > > Which devices are these? How come they send an interrupt before there
> > > > are any buffers in any queues?
> > >
> > >
> > > I copied this from the commit log for 22b7050a024d7
> > >
> > > "
> > >
> > > This change will also benefit old hypervisors (before 2009)
> > > that send interrupts without checking DRIVER_OK: previously,
> > > the callback could race with driver-specific initialization.
> > > "
> > >
> > > If this is only for config interrupt, I can remove the above log.
> >
> >
> > This is only for config interrupt.
>
> Ok.
>
> >
> > >
> > > >
> > > > > Note that the hardening is only done for vring interrupt since the
> > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > handler because it uses spinlock to do the synchronization which is
> > > > > expensive.
> > > > >
> > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > >
> > > > > ---
> > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > include/linux/virtio.h | 4 ++++
> > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > --- a/drivers/virtio/virtio.c
> > > > > +++ b/drivers/virtio/virtio.c
> > > > > @@ -7,6 +7,12 @@
> > > > > #include <linux/of.h>
> > > > > #include <uapi/linux/virtio_ids.h>
> > > > > +static bool irq_hardening = false;
> > > > > +
> > > > > +module_param(irq_hardening, bool, 0444);
> > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > +
> > > > > /* Unique numbering for virtio devices. */
> > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > * */
> > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > {
> > > > > + /*
> > > > > + * The below synchronize_rcu() guarantees that any
> > > > > + * interrupt for this line arriving after
> > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > + * irq_soft_enabled == false.
> > > > News to me; I did not know synchronize_rcu() has anything to do
> > > > with interrupts. Didn't you intend to use synchronize_irq?
> > > > I am not even 100% sure synchronize_rcu() is by design a memory barrier,
> > > > though it most likely is ...
> > >
> > >
> > > According to the comment above tree RCU version of synchronize_rcu():
> > >
> > > """
> > >
> > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > * or softirqs have been disabled also serve as RCU read-side critical
> > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > * and NMI handlers.
> > > """
> > >
> > > So interrupt handlers are treated as read-side critical sections.
> > >
> > > And it has this comment explaining the barrier:
> > >
> > > """
> > >
> > > * Note that this guarantee implies further memory-ordering guarantees.
> > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > * each CPU is guaranteed to have executed a full memory barrier since
> > > * the end of its last RCU read-side critical section whose beginning
> > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > """
> > >
> > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > barrier: if the interrupt comes after the WRITE_ONCE(), it will see
> > > irq_soft_enabled as false.
> > >
> >
> > You are right. So then
> > 1. I do not think we need load_acquire - why is it needed? Just
> > READ_ONCE should do.
>
> See above.
>
> > 2. isn't synchronize_irq also doing the same thing?
>
>
> Yes, but it requires a config ops since the IRQ knowledge is transport specific.
>
> >
> >
> > > >
> > > > > + */
> > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > + synchronize_rcu();
> > > > > +
> > > > > dev->config->reset(dev);
> > > > > }
> > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > Please add comment explaining where it will be enabled.
> > > > Also, we *really* don't need to synch if it was already disabled,
> > > > let's not add useless overhead to the boot sequence.
> > >
> > >
> > > Ok.
> > >
> > >
> > > >
> > > >
> > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > spin_lock_init(&dev->config_lock);
> > > > > dev->config_enabled = false;
> > > > > dev->config_change_pending = false;
> > > > > + dev->irq_soft_check = irq_hardening;
> > > > > +
> > > > > + if (dev->irq_soft_check)
> > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > /* We always start by resetting the device, in case a previous
> > > > > * driver messed it up. This also tests that code path a little. */
> > > > one of the points of hardening is it's also helpful for buggy
> > > > devices. this flag defeats the purpose.
> > >
> > >
> > > Do you mean:
> > >
> > > 1) we need something like config_enable? This seems hard to implement
> > > without obvious overhead, mainly the synchronization with the
> > > interrupt handlers
> >
> > But the synchronization is only on the tear-down path. That is not critical
> > for any users at the moment, even less so than probe.
>
> I meant that if we have vq->irq_pending, we need to call vring_interrupt()
> in virtio_device_ready() and synchronize with the IRQ handlers via a
> spinlock or similar.
>
> >
> > > 2) enable this by default, so I don't object, but this may have some risk
> > > for old hypervisors
> >
> >
> > The risk is a driver adding buffers without setting DRIVER_OK.
>
> Probably not: we have devices that accept arbitrary input from outside
> (net, console, input, etc.). I've done a round of audits of the QEMU
> code; it all looks fine since day 0.
>
> > So with this approach, how about we rename the flag "driver_ok"?
> > And then add_buf can actually test it and BUG_ON if not there (at least
> > in the debug build).
>
> This looks like hardening of the driver in the core instead of the
> device. I think it can be done, but in a separate series.
>
> >
> > And going down from there, how about we cache status in the
> > device? Then we don't need to keep re-reading it every time,
> > speeding boot up a tiny bit.
>
> I don't fully understand here; actually, the spec requires the status to be
> read back for validation in many cases.
>
> Thanks
>
> >
> > >
> > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > }
> > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > {
> > > > > + struct virtqueue *_vq = v;
> > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > + return IRQ_NONE;
> > > > > + }
> > > > > +
> > > > > if (!more_used(vq)) {
> > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > return IRQ_NONE;
> > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > --- a/include/linux/virtio.h
> > > > > +++ b/include/linux/virtio.h
> > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > * @config_enabled: configuration change reporting enabled
> > > > > * @config_change_pending: configuration change reported while disabled
> > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > * @config_lock: protects configuration change reporting
> > > > > * @dev: underlying device.
> > > > > * @id: the device type identification (used to match it with a driver).
> > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > bool failed;
> > > > > bool config_enabled;
> > > > > bool config_change_pending;
> > > > > + bool irq_soft_check;
> > > > > + bool irq_soft_enabled;
> > > > > spinlock_t config_lock;
> > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > struct device dev;
> > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > --- a/include/linux/virtio_config.h
> > > > > +++ b/include/linux/virtio_config.h
> > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > return __virtio_test_bit(vdev, fbit);
> > > > > }
> > > > > +/*
> > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > + * @vdev: the device
> > > > > + */
> > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > +{
> > > > > + if (!vdev->irq_soft_check)
> > > > > + return true;
> > > > > +
> > > > > + /*
> > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > + * data. Paried with smp_store_relase() in
> > > > paired
> > >
> > >
> > > Will fix.
> > >
> > > Thanks
> > >
> > >
> > > >
> > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > + * virtio_reset_device().
> > > > > + */
> > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > +}
> > > > > +
> > > > > /**
> > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > * @vdev: the device
> > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > if (dev->config->enable_cbs)
> > > > > dev->config->enable_cbs(dev);
> > > > > + /*
> > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > + * virtio_irq_soft_enabled()
> > > > > + */
> > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > +
> > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > }
> > > > > --
> > > > > 2.25.1
> >
^ permalink raw reply	[flat|nested] 1651+ messages in thread

* Re:
2022-03-25 9:10 ` Re: Michael S. Tsirkin
@ 2022-03-25 9:20 ` Jason Wang
2022-03-25 10:09 ` Re: Michael S. Tsirkin
0 siblings, 1 reply; 1651+ messages in thread
From: Jason Wang @ 2022-03-25 9:20 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > Bcc:
> > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > Reply-To:
> > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > >
> > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > >
> > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > >
> > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > that is used by some device such as virtio-blk
> > > > > > 2) done only for PCI transport
> > > > > >
> > > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > virtio_device. Then we can toggle it during
> > > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > the future, we may provide config_ops for the transports that don't
> > > > > > use IRQs. With this, vring_interrupt() can check and return early if
> > > > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > > > but the cost should be acceptable.
> > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > >
> > > >
> > > > Even if we allow the transport driver to synchronize through
> > > > synchronize_irq(), we still need a check in vring_interrupt().
> > > >
> > > > We did something like the following previously:
> > > >
> > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > return IRQ_NONE;
> > > >
> > > > But it looks like a bug, since a speculative read can be done before the
> > > > check, in which case the interrupt handler can't see the uncommitted setup
> > > > done by the driver.
> > >
> > > I don't think so - if you sync after setting the value then
> > > you are guaranteed that any handler running afterwards
> > > will see the new value.
> >
> > The problem is not the disable path but the enable path.
>
> So a misbehaving device can lose interrupts? That's not a problem at all
> imo.
It's an interrupt raised before irq_soft_enabled is set to true:

CPU 0 (probe): driver-specific setup (not committed)
CPU 1 (IRQ handler): reads the uninitialized variable
CPU 0 (probe): sets irq_soft_enabled to true
CPU 1 (IRQ handler): reads irq_soft_enabled as true
CPU 1 (IRQ handler): uses the uninitialized variable

Thanks
>
> > We use smp_store_release()
> > to make sure the driver commits the setup before enabling the IRQ. It
> > means the read needs to be ordered as well in vring_interrupt().
> >
> > >
> > > Although I couldn't find anything about this in memory-barriers.txt
> > > which surprises me.
> > >
> > > CC Paul to help make sure I'm right.
> > >
> > >
> > > >
> > > > >
> > > > > > To avoid breaking legacy devices which can send IRQs before DRIVER_OK, a
> > > > > > module parameter is introduced to enable the hardening, so the
> > > > > > hardening is disabled by default.
> > > > > Which devices are these? How come they send an interrupt before there
> > > > > are any buffers in any queues?
> > > >
> > > >
> > > > I copied this from the commit log for 22b7050a024d7
> > > >
> > > > "
> > > >
> > > > This change will also benefit old hypervisors (before 2009)
> > > > that send interrupts without checking DRIVER_OK: previously,
> > > > the callback could race with driver-specific initialization.
> > > > "
> > > >
> > > > If this is only for config interrupt, I can remove the above log.
> > >
> > >
> > > This is only for config interrupt.
> >
> > Ok.
> >
> > >
> > > >
> > > > >
> > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > expensive.
> > > > > >
> > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > >
> > > > > > ---
> > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > include/linux/virtio.h | 4 ++++
> > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > --- a/drivers/virtio/virtio.c
> > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > @@ -7,6 +7,12 @@
> > > > > > #include <linux/of.h>
> > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > +static bool irq_hardening = false;
> > > > > > +
> > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > +
> > > > > > /* Unique numbering for virtio devices. */
> > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > * */
> > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > {
> > > > > > + /*
> > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > + * interrupt for this line arriving after
> > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > + * irq_soft_enabled == false.
> > > > > News to me; I did not know synchronize_rcu() has anything to do
> > > > > with interrupts. Didn't you intend to use synchronize_irq?
> > > > > I am not even 100% sure synchronize_rcu() is by design a memory barrier,
> > > > > though it most likely is ...
> > > >
> > > >
> > > > According to the comment above tree RCU version of synchronize_rcu():
> > > >
> > > > """
> > > >
> > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > * and NMI handlers.
> > > > """
> > > >
> > > > So interrupt handlers are treated as read-side critical sections.
> > > >
> > > > And it has this comment explaining the barrier:
> > > >
> > > > """
> > > >
> > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > * the end of its last RCU read-side critical section whose beginning
> > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > """
> > > >
> > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > barrier: if the interrupt comes after the WRITE_ONCE(), it will see
> > > > irq_soft_enabled as false.
> > > >
> > >
> > > You are right. So then
> > > 1. I do not think we need load_acquire - why is it needed? Just
> > > READ_ONCE should do.
> >
> > See above.
> >
> > > 2. isn't synchronize_irq also doing the same thing?
> >
> >
> > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> >
> > >
> > >
> > > > >
> > > > > > + */
> > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > + synchronize_rcu();
> > > > > > +
> > > > > > dev->config->reset(dev);
> > > > > > }
> > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > Please add comment explaining where it will be enabled.
> > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > let's not add useless overhead to the boot sequence.
> > > >
> > > >
> > > > Ok.
> > > >
> > > >
> > > > >
> > > > >
> > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > spin_lock_init(&dev->config_lock);
> > > > > > dev->config_enabled = false;
> > > > > > dev->config_change_pending = false;
> > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > +
> > > > > > + if (dev->irq_soft_check)
> > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > /* We always start by resetting the device, in case a previous
> > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > one of the points of hardening is it's also helpful for buggy
> > > > > devices. this flag defeats the purpose.
> > > >
> > > >
> > > > Do you mean:
> > > >
> > > > 1) we need something like config_enable? This seems not easy to be
> > > > implemented without obvious overhead, mainly the synchronize with the
> > > > interrupt handlers
> > >
> > > But synchronize is only on tear-down path. That is not critical for any
> > > users at the moment, even less than probe.
> >
> > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > in the virtio_device_ready() and synchronize the IRQ handlers with
> > spinlock or others.
> >
> > >
> > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > for old hypervisors
> > >
> > >
> > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> >
> > Probably not, we have devices that accept random inputs from outside,
> > net, console, input etc. I've done a round of audits of the Qemu
> > codes. They look all fine since day0.
> >
> > > So with this approach, how about we rename the flag "driver_ok"?
> > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > in the debug build).
> >
> > This looks like a hardening of the driver in the core instead of the
> > device. I think it can be done but in a separate series.
> >
> > >
> > > And going down from there, how about we cache status in the
> > > device? Then we don't need to keep re-reading it every time,
> > > speeding boot up a tiny bit.
> >
> > I don't fully understand here, actually spec requires status to be
> > read back for validation in many cases.
> >
> > Thanks
> >
> > >
> > > >
> > > > >
> > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > }
> > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > {
> > > > > > + struct virtqueue *_vq = v;
> > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > + return IRQ_NONE;
> > > > > > + }
> > > > > > +
> > > > > > if (!more_used(vq)) {
> > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > return IRQ_NONE;
> > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > --- a/include/linux/virtio.h
> > > > > > +++ b/include/linux/virtio.h
> > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > * @config_lock: protects configuration change reporting
> > > > > > * @dev: underlying device.
> > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > bool failed;
> > > > > > bool config_enabled;
> > > > > > bool config_change_pending;
> > > > > > + bool irq_soft_check;
> > > > > > + bool irq_soft_enabled;
> > > > > > spinlock_t config_lock;
> > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > struct device dev;
> > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > --- a/include/linux/virtio_config.h
> > > > > > +++ b/include/linux/virtio_config.h
> > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > }
> > > > > > +/*
> > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > + * @vdev: the device
> > > > > > + */
> > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > +{
> > > > > > + if (!vdev->irq_soft_check)
> > > > > > + return true;
> > > > > > +
> > > > > > + /*
> > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > + * data. Paried with smp_store_relase() in
> > > > > paired
> > > >
> > > >
> > > > Will fix.
> > > >
> > > > Thanks
> > > >
> > > >
> > > > >
> > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > + * virtio_reset_device().
> > > > > > + */
> > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > +}
> > > > > > +
> > > > > > /**
> > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > * @vdev: the device
> > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > if (dev->config->enable_cbs)
> > > > > > dev->config->enable_cbs(dev);
> > > > > > + /*
> > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > + * virtio_irq_soft_enabled()
> > > > > > + */
> > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > +
> > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > }
> > > > > > --
> > > > > > 2.25.1
> > >
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply	[flat|nested] 1651+ messages in thread

* Re:
2022-03-25 9:20 ` Re: Jason Wang
@ 2022-03-25 10:09 ` Michael S. Tsirkin
2022-03-28 4:56 ` Re: Jason Wang
0 siblings, 1 reply; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-25 10:09 UTC (permalink / raw)
To: Jason Wang
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > Bcc:
> > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > Reply-To:
> > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > >
> > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > >
> > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > >
> > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > that is used by some device such as virtio-blk
> > > > > > > 2) done only for PCI transport
> > > > > > >
> > > > > > > In this patch, we tries to borrow the idea from the INTX IRQ hardening
> > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
> > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > use IRQ. With this, vring_interrupt() can return check and early if
> > > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to be used
> > > > > > > but the cost should be acceptable.
> > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > >
> > > > >
> > > > > Even if we allow the transport driver to synchornize through
> > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > >
> > > > > We do something like the following previously:
> > > > >
> > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > return IRQ_NONE;
> > > > >
> > > > > But it looks like a bug since speculative read can be done before the check
> > > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > > the driver.
> > > >
> > > > I don't think so - if you sync after setting the value then
> > > > you are guaranteed that any handler running afterwards
> > > > will see the new value.
> > >
> > > The problem is not disabled but the enable.
> >
> > So a misbehaving device can lose interrupts? That's not a problem at all
> > imo.
>
> It's the interrupt raised before setting irq_soft_enabled to true:
>
> CPU 0 probe) driver specific setup (not commited)
> CPU 1 IRQ handler) read the uninitialized variable
> CPU 0 probe) set irq_soft_enabled to true
> CPU 1 IRQ handler) read irq_soft_enable as true
> CPU 1 IRQ handler) use the uninitialized variable
>
> Thanks
Yea, it hurts if you do it. So do not do it then ;).
irq_soft_enabled (I think driver_ok or status is a better name)
should be initialized to false *before* irq is requested.
And requesting irq commits all memory otherwise all drivers would be
broken, if it doesn't it just needs to be fixed, not worked around in
virtio.
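The ordering being debated here can be sketched as follows. This is a kernel-style sketch using the names from the patch under discussion (irq_soft_enabled, vring_interrupt, virtio_device_ready), not the literal patch code:

```c
/* probe path (CPU 0) -- sketch, names follow the patch under review */
err = request_irq(irq, vring_interrupt, 0, name, vq);
/* irq_soft_enabled is still false here, so a spurious IRQ bails out */
/* ... driver-specific setup, possibly not yet visible to other CPUs ... */
virtio_device_ready(vdev);
/* -> smp_store_release(&vdev->irq_soft_enabled, true);
 *    publishes all the setup done above */

/* IRQ path (CPU 1) -- entry of vring_interrupt() */
if (!virtio_irq_soft_enabled(vdev))
        return IRQ_NONE;
/* -> smp_load_acquire(&vdev->irq_soft_enabled) pairs with the
 *    release above: a handler that reads true is guaranteed to
 *    also see the driver-specific setup */
```

The point of contention is whether the implicit barriers in request_irq() make the explicit release/acquire pair redundant for setup done before request_irq(); they do not help for setup done between request_irq() and virtio_device_ready().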
> >
> > > We use smp_store_relase()
> > > to make sure the driver commits the setup before enabling the irq. It
> > > means the read needs to be ordered as well in vring_interrupt().
> > >
> > > >
> > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > which surprises me.
> > > >
> > > > CC Paul to help make sure I'm right.
> > > >
> > > >
> > > > >
> > > > > >
> > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > hardening is disabled by default.
> > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > are any buffers in any queues?
> > > > >
> > > > >
> > > > > I copied this from the commit log for 22b7050a024d7
> > > > >
> > > > > "
> > > > >
> > > > > This change will also benefit old hypervisors (before 2009)
> > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > the callback could race with driver-specific initialization.
> > > > > "
> > > > >
> > > > > If this is only for config interrupt, I can remove the above log.
> > > >
> > > >
> > > > This is only for config interrupt.
> > >
> > > Ok.
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > expensive.
> > > > > > >
> > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > >
> > > > > > > ---
> > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > @@ -7,6 +7,12 @@
> > > > > > > #include <linux/of.h>
> > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > +static bool irq_hardening = false;
> > > > > > > +
> > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > +
> > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > * */
> > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > {
> > > > > > > + /*
> > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > + * interrupt for this line arriving after
> > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > + * irq_soft_enabled == false.
> > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > though it's most likely is ...
> > > > >
> > > > >
> > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > >
> > > > > """
> > > > >
> > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > * and NMI handlers.
> > > > > """
> > > > >
> > > > > So interrupt handlers are treated as read-side critical sections.
> > > > >
> > > > > And it has the comment for explain the barrier:
> > > > >
> > > > > """
> > > > >
> > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > """
> > > > >
> > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > irq_soft_enabled as false.
> > > > >
> > > >
> > > > You are right. So then
> > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > READ_ONCE should do.
> > >
> > > See above.
> > >
> > > > 2. isn't synchronize_irq also doing the same thing?
> > >
> > >
> > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > >
> > > >
> > > >
> > > > > >
> > > > > > > + */
> > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > + synchronize_rcu();
> > > > > > > +
> > > > > > > dev->config->reset(dev);
> > > > > > > }
> > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > Please add comment explaining where it will be enabled.
> > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > let's not add useless overhead to the boot sequence.
> > > > >
> > > > >
> > > > > Ok.
> > > > >
> > > > >
> > > > > >
> > > > > >
> > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > dev->config_enabled = false;
> > > > > > > dev->config_change_pending = false;
> > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > +
> > > > > > > + if (dev->irq_soft_check)
> > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > devices. this flag defeats the purpose.
> > > > >
> > > > >
> > > > > Do you mean:
> > > > >
> > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > interrupt handlers
> > > >
> > > > But synchronize is only on tear-down path. That is not critical for any
> > > > users at the moment, even less than probe.
> > >
> > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > spinlock or others.
> > >
> > > >
> > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > for old hypervisors
> > > >
> > > >
> > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > >
> > > Probably not, we have devices that accept random inputs from outside,
> > > net, console, input etc. I've done a round of audits of the Qemu
> > > codes. They look all fine since day0.
> > >
> > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > in the debug build).
> > >
> > > This looks like a hardening of the driver in the core instead of the
> > > device. I think it can be done but in a separate series.
> > >
> > > >
> > > > And going down from there, how about we cache status in the
> > > > device? Then we don't need to keep re-reading it every time,
> > > > speeding boot up a tiny bit.
> > >
> > > I don't fully understand here, actually spec requires status to be
> > > read back for validation in many cases.
> > >
> > > Thanks
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > }
> > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > {
> > > > > > > + struct virtqueue *_vq = v;
> > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > + return IRQ_NONE;
> > > > > > > + }
> > > > > > > +
> > > > > > > if (!more_used(vq)) {
> > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > return IRQ_NONE;
> > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > --- a/include/linux/virtio.h
> > > > > > > +++ b/include/linux/virtio.h
> > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > * @dev: underlying device.
> > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > bool failed;
> > > > > > > bool config_enabled;
> > > > > > > bool config_change_pending;
> > > > > > > + bool irq_soft_check;
> > > > > > > + bool irq_soft_enabled;
> > > > > > > spinlock_t config_lock;
> > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > struct device dev;
> > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > }
> > > > > > > +/*
> > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > + * @vdev: the device
> > > > > > > + */
> > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > +{
> > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > + return true;
> > > > > > > +
> > > > > > > + /*
> > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > paired
> > > > >
> > > > >
> > > > > Will fix.
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > >
> > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > + * virtio_reset_device().
> > > > > > > + */
> > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > +}
> > > > > > > +
> > > > > > > /**
> > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > * @vdev: the device
> > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > if (dev->config->enable_cbs)
> > > > > > > dev->config->enable_cbs(dev);
> > > > > > > + /*
> > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > + */
> > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > +
> > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > }
> > > > > > > --
> > > > > > > 2.25.1
> > > >
> >
^ permalink raw reply	[flat|nested] 1651+ messages in thread

* Re:
2022-03-25 10:09 ` Re: Michael S. Tsirkin
@ 2022-03-28 4:56 ` Jason Wang
2022-03-28 5:59 ` Re: Michael S. Tsirkin
0 siblings, 1 reply; 1651+ messages in thread
From: Jason Wang @ 2022-03-28 4:56 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > Bcc:
> > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > Reply-To:
> > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > >
> > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > >
> > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > >
> > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > that is used by some device such as virtio-blk
> > > > > > > > 2) done only for PCI transport
> > > > > > > >
> > > > > > > > In this patch, we tries to borrow the idea from the INTX IRQ hardening
> > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
> > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > > use IRQ. With this, vring_interrupt() can return check and early if
> > > > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to be used
> > > > > > > > but the cost should be acceptable.
> > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > >
> > > > > >
> > > > > > Even if we allow the transport driver to synchornize through
> > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > >
> > > > > > We do something like the following previously:
> > > > > >
> > > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > return IRQ_NONE;
> > > > > >
> > > > > > But it looks like a bug since speculative read can be done before the check
> > > > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > > > the driver.
> > > > >
> > > > > I don't think so - if you sync after setting the value then
> > > > > you are guaranteed that any handler running afterwards
> > > > > will see the new value.
> > > >
> > > > The problem is not disabled but the enable.
> > >
> > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > imo.
> >
> > It's the interrupt raised before setting irq_soft_enabled to true:
> >
> > CPU 0 probe) driver specific setup (not commited)
> > CPU 1 IRQ handler) read the uninitialized variable
> > CPU 0 probe) set irq_soft_enabled to true
> > CPU 1 IRQ handler) read irq_soft_enable as true
> > CPU 1 IRQ handler) use the uninitialized variable
> >
> > Thanks
>
> Yea, it hurts if you do it. So do not do it then ;).
>
> irq_soft_enabled (I think driver_ok or status is a better name)
I can change it to driver_ok.
> should be initialized to false *before* irq is requested.
>
> And requesting irq commits all memory otherwise all drivers would be
> broken,
So I think we might be talking about different issues:
1) Whether request_irq() commits the previous setups: I think the
answer is yes, since the spin_unlock of desc->lock (release) can
guarantee this, though there seems to be no documentation around
request_irq() that says so.
And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
using smp_wmb() before the request_irq().
And even if the write is ordered, we still need the read to be
ordered to pair with it.
> if it doesn't it just needs to be fixed, not worked around in
> virtio.
2) virtio drivers might do a lot of setup between request_irq() and
virtio_device_ready():
request_irq()
driver specific setups
virtio_device_ready()
CPU 0 probe) request_irq()
CPU 1 IRQ handler) read the uninitialized variable
CPU 0 probe) driver specific setups
CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
CPU 1 IRQ handler) read irq_soft_enable as true
CPU 1 IRQ handler) use the uninitialized variable
Thanks
>
>
> > >
> > > > We use smp_store_relase()
> > > > to make sure the driver commits the setup before enabling the irq. It
> > > > means the read needs to be ordered as well in vring_interrupt().
> > > >
> > > > >
> > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > which surprises me.
> > > > >
> > > > > CC Paul to help make sure I'm right.
> > > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > hardening is disabled by default.
> > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > are any buffers in any queues?
> > > > > >
> > > > > >
> > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > >
> > > > > > "
> > > > > >
> > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > the callback could race with driver-specific initialization.
> > > > > > "
> > > > > >
> > > > > > If this is only for config interrupt, I can remove the above log.
> > > > >
> > > > >
> > > > > This is only for config interrupt.
> > > >
> > > > Ok.
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > expensive.
> > > > > > > >
> > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > >
> > > > > > > > ---
> > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > #include <linux/of.h>
> > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > +static bool irq_hardening = false;
> > > > > > > > +
> > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > > +
> > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > * */
> > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > {
> > > > > > > > + /*
> > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > + * interrupt for this line arriving after
> > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > + * irq_soft_enabled == false.
> > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > though it's most likely is ...
> > > > > >
> > > > > >
> > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > >
> > > > > > """
> > > > > >
> > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > * and NMI handlers.
> > > > > > """
> > > > > >
> > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > >
> > > > > > And it has the comment for explain the barrier:
> > > > > >
> > > > > > """
> > > > > >
> > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > """
> > > > > >
> > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > > irq_soft_enabled as false.
> > > > > >
> > > > >
> > > > > You are right. So then
> > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > READ_ONCE should do.
> > > >
> > > > See above.
> > > >
> > > > > 2. isn't synchronize_irq also doing the same thing?
> > > >
> > > >
> > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > >
> > > > >
> > > > >
> > > > > > >
> > > > > > > > + */
> > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > + synchronize_rcu();
> > > > > > > > +
> > > > > > > > dev->config->reset(dev);
> > > > > > > > }
> > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > let's not add useless overhead to the boot sequence.
> > > > > >
> > > > > >
> > > > > > Ok.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > dev->config_enabled = false;
> > > > > > > > dev->config_change_pending = false;
> > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > +
> > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > devices. this flag defeats the purpose.
> > > > > >
> > > > > >
> > > > > > Do you mean:
> > > > > >
> > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > interrupt handlers
> > > > >
> > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > users at the moment, even less than probe.
> > > >
> > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > spinlock or others.
> > > >
> > > > >
> > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > for old hypervisors
> > > > >
> > > > >
> > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > >
> > > > Probably not, we have devices that accept random inputs from outside,
> > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > codes. They look all fine since day0.
> > > >
> > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > in the debug build).
> > > >
> > > > This looks like a hardening of the driver in the core instead of the
> > > > device. I think it can be done but in a separate series.
> > > >
> > > > >
> > > > > And going down from there, how about we cache status in the
> > > > > device? Then we don't need to keep re-reading it every time,
> > > > > speeding boot up a tiny bit.
> > > >
> > > > I don't fully understand here, actually spec requires status to be
> > > > read back for validation in many cases.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > }
> > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > {
> > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > + return IRQ_NONE;
> > > > > > > > + }
> > > > > > > > +
> > > > > > > > if (!more_used(vq)) {
> > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > return IRQ_NONE;
> > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > * @dev: underlying device.
> > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > bool failed;
> > > > > > > > bool config_enabled;
> > > > > > > > bool config_change_pending;
> > > > > > > > + bool irq_soft_check;
> > > > > > > > + bool irq_soft_enabled;
> > > > > > > > spinlock_t config_lock;
> > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > struct device dev;
> > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > }
> > > > > > > > +/*
> > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > + * @vdev: the device
> > > > > > > > + */
> > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > +{
> > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > + return true;
> > > > > > > > +
> > > > > > > > + /*
> > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > paired
> > > > > >
> > > > > >
> > > > > > Will fix.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > + * virtio_reset_device().
> > > > > > > > + */
> > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > /**
> > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > * @vdev: the device
> > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > + /*
> > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > + */
> > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > +
> > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > }
> > > > > > > > --
> > > > > > > > 2.25.1
> > > > >
> > >
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply	[flat|nested] 1651+ messages in thread

* Re:
2022-03-28 4:56 ` Re: Jason Wang
@ 2022-03-28 5:59 ` Michael S. Tsirkin
2022-03-28 6:18 ` Re: Jason Wang
0 siblings, 1 reply; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-28 5:59 UTC (permalink / raw)
To: Jason Wang
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > Bcc:
> > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > > Reply-To:
> > > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > > >
> > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > >
> > > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > > >
> > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > > that is used by some device such as virtio-blk
> > > > > > > > > 2) done only for PCI transport
> > > > > > > > >
> > > > > > > > > In this patch, we tries to borrow the idea from the INTX IRQ hardening
> > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
> > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > > > use IRQ. With this, vring_interrupt() can return check and early if
> > > > > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to be used
> > > > > > > > > but the cost should be acceptable.
> > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > > >
> > > > > > >
> > > > > > > Even if we allow the transport driver to synchornize through
> > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > >
> > > > > > > We do something like the following previously:
> > > > > > >
> > > > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > return IRQ_NONE;
> > > > > > >
> > > > > > > But it looks like a bug since speculative read can be done before the check
> > > > > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > > > > the driver.
> > > > > >
> > > > > > I don't think so - if you sync after setting the value then
> > > > > > you are guaranteed that any handler running afterwards
> > > > > > will see the new value.
> > > > >
> > > > > The problem is not disabled but the enable.
> > > >
> > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > imo.
> > >
> > > It's the interrupt raised before setting irq_soft_enabled to true:
> > >
> > > CPU 0 probe) driver specific setup (not commited)
> > > CPU 1 IRQ handler) read the uninitialized variable
> > > CPU 0 probe) set irq_soft_enabled to true
> > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > CPU 1 IRQ handler) use the uninitialized variable
> > >
> > > Thanks
> >
> > Yea, it hurts if you do it. So do not do it then ;).
> >
> > irq_soft_enabled (I think driver_ok or status is a better name)
>
> I can change it to driver_ok.
>
> > should be initialized to false *before* irq is requested.
> >
> > And requesting irq commits all memory otherwise all drivers would be
> > broken,
>
> So I think we might talk different issues:
>
> 1) Whether request_irq() commits the previous setups, I think the
> answer is yes, since the spin_unlock of desc->lock (release) can
> guarantee this though there seems no documentation around
> request_irq() to say this.
>
> And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> using smp_wmb() before the request_irq().
>
> And even if write is ordered we still need read to be ordered to be
> paired with that.
>
> > if it doesn't it just needs to be fixed, not worked around in
> > virtio.
>
> 2) virtio drivers might do a lot of setups between request_irq() and
> virtio_device_ready():
>
> request_irq()
> driver specific setups
> virtio_device_ready()
>
> CPU 0 probe) request_irq()
> CPU 1 IRQ handler) read the uninitialized variable
> CPU 0 probe) driver specific setups
> CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> CPU 1 IRQ handler) read irq_soft_enable as true
> CPU 1 IRQ handler) use the uninitialized variable
>
> Thanks
As I said, virtio_device_ready needs to do synchronize_irq.
That will guarantee all setup is visible to the specific IRQ
handler; that is its whole point.
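The ordering Michael suggests can be sketched in kernel-style pseudocode. The synchronize_cbs config hook below is a hypothetical name for a transport-specific callback (PCI could implement it with synchronize_irq() per vector); it is a reconstruction of the suggested shape, not code from this patchset:

```
/* Sketch only: hedged reconstruction of the suggested teardown
 * ordering.  driver_ok replaces irq_soft_enabled, per the naming
 * discussion above. */
void virtio_reset_device(struct virtio_device *dev)
{
	/* Stop soft IRQ processing first... */
	WRITE_ONCE(dev->driver_ok, false);

	/* ...then wait for in-flight handlers.  A transport-specific
	 * hook lets e.g. PCI use synchronize_irq() on each vector
	 * instead of synchronize_rcu(). */
	if (dev->config->synchronize_cbs)
		dev->config->synchronize_cbs(dev);

	dev->config->reset(dev);
}
```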
> >
> >
> > > >
> > > > > We use smp_store_relase()
> > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > >
> > > > > >
> > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > which surprises me.
> > > > > >
> > > > > > CC Paul to help make sure I'm right.
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > hardening is disabled by default.
> > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > are any buffers in any queues?
> > > > > > >
> > > > > > >
> > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > >
> > > > > > > "
> > > > > > >
> > > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > > the callback could race with driver-specific initialization.
> > > > > > > "
> > > > > > >
> > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > >
> > > > > >
> > > > > > This is only for config interrupt.
> > > > >
> > > > > Ok.
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > expensive.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > >
> > > > > > > > > ---
> > > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > #include <linux/of.h>
> > > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > +
> > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > +
> > > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > * */
> > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > {
> > > > > > > > > + /*
> > > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > > + * interrupt for this line arriving after
> > > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > + * irq_soft_enabled == false.
> > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > though it's most likely is ...
> > > > > > >
> > > > > > >
> > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > >
> > > > > > > """
> > > > > > >
> > > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > > * and NMI handlers.
> > > > > > > """
> > > > > > >
> > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > >
> > > > > > > And it has the comment for explain the barrier:
> > > > > > >
> > > > > > > """
> > > > > > >
> > > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > > """
> > > > > > >
> > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > > > irq_soft_enabled as false.
> > > > > > >
> > > > > >
> > > > > > You are right. So then
> > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > READ_ONCE should do.
> > > > >
> > > > > See above.
> > > > >
> > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > >
> > > > >
> > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > >
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > > > + */
> > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > + synchronize_rcu();
> > > > > > > > > +
> > > > > > > > > dev->config->reset(dev);
> > > > > > > > > }
> > > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > >
> > > > > > >
> > > > > > > Ok.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > > dev->config_enabled = false;
> > > > > > > > > dev->config_change_pending = false;
> > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > +
> > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > devices. this flag defeats the purpose.
> > > > > > >
> > > > > > >
> > > > > > > Do you mean:
> > > > > > >
> > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > interrupt handlers
> > > > > >
> > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > users at the moment, even less than probe.
> > > > >
> > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > spinlock or others.
> > > > >
> > > > > >
> > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > for old hypervisors
> > > > > >
> > > > > >
> > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > >
> > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > codes. They look all fine since day0.
> > > > >
> > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > > in the debug build).
> > > > >
> > > > > This looks like a hardening of the driver in the core instead of the
> > > > > device. I think it can be done but in a separate series.
> > > > >
> > > > > >
> > > > > > And going down from there, how about we cache status in the
> > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > speeding boot up a tiny bit.
> > > > >
> > > > > I don't fully understand here, actually spec requires status to be
> > > > > read back for validation in many cases.
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > }
> > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > {
> > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > + return IRQ_NONE;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > > if (!more_used(vq)) {
> > > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > return IRQ_NONE;
> > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > > * @dev: underlying device.
> > > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > bool failed;
> > > > > > > > > bool config_enabled;
> > > > > > > > > bool config_change_pending;
> > > > > > > > > + bool irq_soft_check;
> > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > spinlock_t config_lock;
> > > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > struct device dev;
> > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > > }
> > > > > > > > > +/*
> > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > + * @vdev: the device
> > > > > > > > > + */
> > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > +{
> > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > + return true;
> > > > > > > > > +
> > > > > > > > > + /*
> > > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > > paired
> > > > > > >
> > > > > > >
> > > > > > > Will fix.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > + * virtio_reset_device().
> > > > > > > > > + */
> > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > /**
> > > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > * @vdev: the device
> > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > > + /*
> > > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > > + */
> > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > +
> > > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > }
> > > > > > > > > --
> > > > > > > > > 2.25.1
> > > > > >
> > > >
> >
^ permalink raw reply	[flat|nested] 1651+ messages in thread

* Re:
2022-03-28 5:59 ` Re: Michael S. Tsirkin
@ 2022-03-28 6:18 ` Jason Wang
2022-03-28 10:40 ` Re: Michael S. Tsirkin
0 siblings, 1 reply; 1651+ messages in thread
From: Jason Wang @ 2022-03-28 6:18 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Mon, Mar 28, 2022 at 1:59 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> > On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > Bcc:
> > > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > > > Reply-To:
> > > > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > > > >
> > > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > > >
> > > > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > > > >
> > > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > > > that is used by some device such as virtio-blk
> > > > > > > > > > 2) done only for PCI transport
> > > > > > > > > >
> > > > > > > > > > In this patch, we tries to borrow the idea from the INTX IRQ hardening
> > > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > > virtio_device. Then we can to toggle it during
> > > > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchornize_rcu() is
> > > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > > > > use IRQ. With this, vring_interrupt() can return check and early if
> > > > > > > > > > irq_soft_enabled is false. This lead to smp_load_acquire() to be used
> > > > > > > > > > but the cost should be acceptable.
> > > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > > > >
> > > > > > > >
> > > > > > > > Even if we allow the transport driver to synchronize through
> > > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > > >
> > > > > > > > We did something like the following previously:
> > > > > > > >
> > > > > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > > return IRQ_NONE;
> > > > > > > >
> > > > > > > > But it looks like a bug, since a speculative read can be done before
> > > > > > > > the check, in which case the interrupt handler can't see the
> > > > > > > > uncommitted setup done by the driver.
> > > > > > >
> > > > > > > I don't think so - if you sync after setting the value then
> > > > > > > you are guaranteed that any handler running afterwards
> > > > > > > will see the new value.
> > > > > >
> > > > > > The problem is not the disable path but the enable path.
> > > > >
> > > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > > imo.
> > > >
> > > > It's the interrupt raised before setting irq_soft_enabled to true:
> > > >
> > > > CPU 0 probe) driver specific setup (not committed)
> > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > CPU 0 probe) set irq_soft_enabled to true
> > > > CPU 1 IRQ handler) read irq_soft_enabled as true
> > > > CPU 1 IRQ handler) use the uninitialized variable
> > > >
> > > > Thanks
> > >
> > > Yea, it hurts if you do it. So do not do it then ;).
> > >
> > > irq_soft_enabled (I think driver_ok or status is a better name)
> >
> > I can change it to driver_ok.
> >
> > > should be initialized to false *before* irq is requested.
> > >
> > > And requesting irq commits all memory otherwise all drivers would be
> > > broken,
> >
> > So I think we might be talking about different issues:
> >
> > 1) Whether request_irq() commits the previous setups: I think the
> > answer is yes, since the spin_unlock of desc->lock (release) can
> > guarantee this, though there seems to be no documentation around
> > request_irq() saying so.
> >
> > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > using smp_wmb() before the request_irq().
> >
> > And even if the write is ordered, we still need the read to be ordered
> > to pair with it.
> >
> > > if it doesn't it just needs to be fixed, not worked around in
> > > virtio.
> >
> > 2) virtio drivers might do a lot of setups between request_irq() and
> > virtio_device_ready():
> >
> > request_irq()
> > driver specific setups
> > virtio_device_ready()
> >
> > CPU 0 probe) request_irq()
> > CPU 1 IRQ handler) read the uninitialized variable
> > CPU 0 probe) driver specific setups
> > CPU 0 probe) smp_store_release(irq_soft_enabled, true), commit the setups
> > CPU 1 IRQ handler) read irq_soft_enabled as true
> > CPU 1 IRQ handler) use the uninitialized variable
> >
> > Thanks
>
>
> As I said, virtio_device_ready needs to do synchronize_irq.
> That will guarantee all setup is visible to the specific IRQ,
Only the interrupt after synchronize_irq() returns.
> this
> is what its point is.
What happens if an interrupt is raised in the middle like:
smp_store_release(dev->irq_soft_enabled, true)
IRQ handler
synchronize_irq()
If we don't enforce a reading order, the IRQ handler may still see the
uninitialized variable.
Thanks
>
>
> > >
> > >
> > > > >
> > > > > > We use smp_store_release()
> > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > >
> > > > > > >
> > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > which surprises me.
> > > > > > >
> > > > > > > CC Paul to help make sure I'm right.
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > To avoid breaking legacy devices which can send an IRQ before
> > > > > > > > > > DRIVER_OK, a module parameter is introduced to enable the hardening,
> > > > > > > > > > so the hardening is disabled by default.
> > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > are any buffers in any queues?
> > > > > > > >
> > > > > > > >
> > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > >
> > > > > > > > "
> > > > > > > >
> > > > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > the callback could race with driver-specific initialization.
> > > > > > > > "
> > > > > > > >
> > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > >
> > > > > > >
> > > > > > > This is only for config interrupt.
> > > > > >
> > > > > > Ok.
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > expensive.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > >
> > > > > > > > > > ---
> > > > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > #include <linux/of.h>
> > > > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > +
> > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > +
> > > > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > * */
> > > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > {
> > > > > > > > > > + /*
> > > > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > > > + * interrupt for this line arriving after
> > > > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > + * irq_soft_enabled == false.
> > > > > > > > > News to me, I did not know synchronize_rcu has anything to do
> > > > > > > > > with interrupts. Did you not intend to use synchronize_irq?
> > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier,
> > > > > > > > > though it most likely is ...
> > > > > > > >
> > > > > > > >
> > > > > > > > According to the comment above the tree RCU version of synchronize_rcu():
> > > > > > > >
> > > > > > > > """
> > > > > > > >
> > > > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > * and NMI handlers.
> > > > > > > > """
> > > > > > > >
> > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > >
> > > > > > > > And it has this comment explaining the barrier:
> > > > > > > >
> > > > > > > > """
> > > > > > > >
> > > > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > > > """
> > > > > > > >
> > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need
> > > > > > > > the barrier: if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > irq_soft_enabled as false.
> > > > > > > >
> > > > > > >
> > > > > > > You are right. So then
> > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > READ_ONCE should do.
> > > > > >
> > > > > > See above.
> > > > > >
> > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > >
> > > > > >
> > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > > + */
> > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > +
> > > > > > > > > > dev->config->reset(dev);
> > > > > > > > > > }
> > > > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > >
> > > > > > > >
> > > > > > > > Ok.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > > > dev->config_enabled = false;
> > > > > > > > > > dev->config_change_pending = false;
> > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > +
> > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > >
> > > > > > > >
> > > > > > > > Do you mean:
> > > > > > > >
> > > > > > > > 1) we need something like config_enable? This seems hard to implement
> > > > > > > > without obvious overhead, mainly synchronizing with the interrupt
> > > > > > > > handlers
> > > > > > >
> > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > users at the moment, even less than probe.
> > > > > >
> > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > spinlock or others.
> > > > > >
> > > > > > >
> > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > for old hypervisors
> > > > > > >
> > > > > > >
> > > > > > > The risk is if there's a driver adding buffers without setting DRIVER_OK.
> > > > > >
> > > > > > Probably not, we have devices that accept random inputs from outside:
> > > > > > net, console, input etc. I've done a round of audits of the QEMU
> > > > > > code. It all looks fine since day 0.
> > > > > >
> > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > > > in the debug build).
> > > > > >
> > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > device. I think it can be done but in a separate series.
> > > > > >
> > > > > > >
> > > > > > > And going down from there, how about we cache status in the
> > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > speeding boot up a tiny bit.
> > > > > >
> > > > > > I don't fully understand here; actually the spec requires the status to
> > > > > > be read back for validation in many cases.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > }
> > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > {
> > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > + return IRQ_NONE;
> > > > > > > > > > + }
> > > > > > > > > > +
> > > > > > > > > > if (!more_used(vq)) {
> > > > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > return IRQ_NONE;
> > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > > > * @dev: underlying device.
> > > > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > bool failed;
> > > > > > > > > > bool config_enabled;
> > > > > > > > > > bool config_change_pending;
> > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > spinlock_t config_lock;
> > > > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > struct device dev;
> > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > }
> > > > > > > > > > +/*
> > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > + * @vdev: the device
> > > > > > > > > > + */
> > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > +{
> > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > + return true;
> > > > > > > > > > +
> > > > > > > > > > + /*
> > > > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > > > paired
> > > > > > > >
> > > > > > > >
> > > > > > > > Will fix.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > + * virtio_reset_device().
> > > > > > > > > > + */
> > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > > /**
> > > > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > * @vdev: the device
> > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > > > + /*
> > > > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > > > + */
> > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > +
> > > > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > }
> > > > > > > > > > --
> > > > > > > > > > 2.25.1
> > > > > > >
> > > > >
> > >
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply	[flat|nested] 1651+ messages in thread
* Re:
2022-03-28 6:18 ` Re: Jason Wang
@ 2022-03-28 10:40 ` Michael S. Tsirkin
2022-03-29 7:12 ` Re: Jason Wang
2022-03-29 8:35 ` Re: Thomas Gleixner
0 siblings, 2 replies; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-28 10:40 UTC (permalink / raw)
To: Jason Wang
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> On Mon, Mar 28, 2022 at 1:59 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> > > On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > >
> > > > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > >
> > > > > > > > Bcc:
> > > > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > > > > Reply-To:
> > > > > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > > > > >
> > > > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > > > >
> > > > > > > > > On 2022/3/24 7:03 PM, Michael S. Tsirkin wrote:
> > > > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > > > > >
> > > > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > > > > that is used by some device such as virtio-blk
> > > > > > > > > > > 2) done only for PCI transport
> > > > > > > > > > >
> > > > > > > > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > > > virtio_device. Then we can toggle it during
> > > > > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > > > > > use IRQ. With this, vring_interrupt() can check and return early if
> > > > > > > > > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > > > > > > > > but the cost should be acceptable.
> > > > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Even if we allow the transport driver to synchronize through
> > > > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > > > >
> > > > > > > > > We did something like the following previously:
> > > > > > > > >
> > > > > > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > > > return IRQ_NONE;
> > > > > > > > >
> > > > > > > > > But it looks like a bug, since a speculative read can be done before
> > > > > > > > > the check, in which case the interrupt handler can't see the
> > > > > > > > > uncommitted setup done by the driver.
> > > > > > > >
> > > > > > > > I don't think so - if you sync after setting the value then
> > > > > > > > you are guaranteed that any handler running afterwards
> > > > > > > > will see the new value.
> > > > > > >
> > > > > > > The problem is not the disable path but the enable path.
> > > > > >
> > > > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > > > imo.
> > > > >
> > > > > It's the interrupt raised before setting irq_soft_enabled to true:
> > > > >
> > > > > CPU 0 probe) driver specific setup (not committed)
> > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > CPU 0 probe) set irq_soft_enabled to true
> > > > > CPU 1 IRQ handler) read irq_soft_enabled as true
> > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > >
> > > > > Thanks
> > > >
> > > > Yea, it hurts if you do it. So do not do it then ;).
> > > >
> > > > irq_soft_enabled (I think driver_ok or status is a better name)
> > >
> > > I can change it to driver_ok.
> > >
> > > > should be initialized to false *before* irq is requested.
> > > >
> > > > And requesting irq commits all memory otherwise all drivers would be
> > > > broken,
> > >
> > > So I think we might be talking about different issues:
> > >
> > > 1) Whether request_irq() commits the previous setups: I think the
> > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > guarantee this, though there seems to be no documentation around
> > > request_irq() saying so.
> > >
> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > using smp_wmb() before the request_irq().
> > >
> > > And even if the write is ordered, we still need the read to be ordered
> > > to pair with it.
IMO it synchronizes with the CPU to which the irq is
delivered. Otherwise basically all drivers would be broken,
wouldn't they?
I don't know whether it's correct on all platforms, but if not
we need to fix request_irq.
> > >
> > > > if it doesn't it just needs to be fixed, not worked around in
> > > > virtio.
> > >
> > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > virtio_device_ready():
> > >
> > > request_irq()
> > > driver specific setups
> > > virtio_device_ready()
> > >
> > > CPU 0 probe) request_irq()
> > > CPU 1 IRQ handler) read the uninitialized variable
> > > CPU 0 probe) driver specific setups
> > > CPU 0 probe) smp_store_release(irq_soft_enabled, true), commit the setups
> > > CPU 1 IRQ handler) read irq_soft_enabled as true
> > > CPU 1 IRQ handler) use the uninitialized variable
> > >
> > > Thanks
> >
> >
> > As I said, virtio_device_ready needs to do synchronize_irq.
> > That will guarantee all setup is visible to the specific IRQ,
>
> Only the interrupt after synchronize_irq() returns.
Anything else is a buggy device though.
> > this
> > is what its point is.
>
> What happens if an interrupt is raised in the middle like:
>
> smp_store_release(dev->irq_soft_enabled, true)
> IRQ handler
> synchronize_irq()
>
> If we don't enforce a reading order, the IRQ handler may still see the
> uninitialized variable.
>
> Thanks
IMHO variables should be initialized before request_irq
to a value meaning "not a valid interrupt".
Specifically driver_ok = false.
The handler in the scenario you describe will then see !driver_ok
and exit immediately.
> >
> >
> > > >
> > > >
> > > > > >
> > > > > > > We use smp_store_release()
> > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > >
> > > > > > > >
> > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > which surprises me.
> > > > > > > >
> > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > To avoid breaking legacy devices which can send an IRQ before
> > > > > > > > > > > DRIVER_OK, a module parameter is introduced to enable the hardening,
> > > > > > > > > > > so the hardening is disabled by default.
> > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > are any buffers in any queues?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > >
> > > > > > > > > "
> > > > > > > > >
> > > > > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > the callback could race with driver-specific initialization.
> > > > > > > > > "
> > > > > > > > >
> > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > >
> > > > > > > >
> > > > > > > > This is only for config interrupt.
> > > > > > >
> > > > > > > Ok.
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > expensive.
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > >
> > > > > > > > > > > ---
> > > > > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > #include <linux/of.h>
> > > > > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > +
> > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > +
> > > > > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > * */
> > > > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > {
> > > > > > > > > > > + /*
> > > > > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > + * interrupt for this line arriving after
> > > > > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > + * irq_soft_enabled == false.
> > > > > > > > > > News to me, I did not know synchronize_rcu has anything to do
> > > > > > > > > > with interrupts. Did you not intend to use synchronize_irq?
> > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier,
> > > > > > > > > > though it most likely is ...
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > According to the comment above the tree RCU version of synchronize_rcu():
> > > > > > > > >
> > > > > > > > > """
> > > > > > > > >
> > > > > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > * and NMI handlers.
> > > > > > > > > """
> > > > > > > > >
> > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > >
> > > > > > > > > And it has this comment explaining the barrier:
> > > > > > > > >
> > > > > > > > > """
> > > > > > > > >
> > > > > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > > > > """
> > > > > > > > >
> > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need
> > > > > > > > > the barrier: if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > > irq_soft_enabled as false.
> > > > > > > > >
> > > > > > > >
> > > > > > > > You are right. So then
> > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > READ_ONCE should do.
> > > > > > >
> > > > > > > See above.
> > > > > > >
> > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > >
> > > > > > >
> > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > + */
> > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > +
> > > > > > > > > > > dev->config->reset(dev);
> > > > > > > > > > > }
> > > > > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Ok.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > > > > dev->config_enabled = false;
> > > > > > > > > > > dev->config_change_pending = false;
> > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > +
> > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Do you mean:
> > > > > > > > >
> > > > > > > > > 1) we need something like config_enable? This seems hard to
> > > > > > > > > implement without obvious overhead, mainly synchronizing with the
> > > > > > > > > interrupt handlers
> > > > > > > >
> > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > users at the moment, even less than probe.
> > > > > > >
> > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > spinlock or others.
> > > > > > >
> > > > > > > >
> > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > for old hypervisors
> > > > > > > >
> > > > > > > >
> > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > >
> > > > > > > Probably not, we have devices that accept random inputs from outside:
> > > > > > > net, console, input etc. I've done a round of audits of the QEMU
> > > > > > > code. It all looks fine since day 0.
> > > > > > >
> > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > > > > in the debug build).
> > > > > > >
> > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > device. I think it can be done but in a separate series.
> > > > > > >
> > > > > > > >
> > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > speeding boot up a tiny bit.
> > > > > > >
> > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > read back for validation in many cases.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > }
> > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > {
> > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > + return IRQ_NONE;
> > > > > > > > > > > + }
> > > > > > > > > > > +
> > > > > > > > > > > if (!more_used(vq)) {
> > > > > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > return IRQ_NONE;
> > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > > > > * @dev: underlying device.
> > > > > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > bool failed;
> > > > > > > > > > > bool config_enabled;
> > > > > > > > > > > bool config_change_pending;
> > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > spinlock_t config_lock;
> > > > > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > struct device dev;
> > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > }
> > > > > > > > > > > +/*
> > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > + */
> > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > +{
> > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > + return true;
> > > > > > > > > > > +
> > > > > > > > > > > + /*
> > > > > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > > > > paired
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Will fix.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > + * virtio_reset_device().
> > > > > > > > > > > + */
> > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > +}
> > > > > > > > > > > +
> > > > > > > > > > > /**
> > > > > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > * @vdev: the device
> > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > > > > + /*
> > > > > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > > > > + */
> > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > +
> > > > > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > }
> > > > > > > > > > > --
> > > > > > > > > > > 2.25.1
> > > > > > > >
> > > > > >
> > > >
> >
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-28 10:40 ` Re: Michael S. Tsirkin
@ 2022-03-29 7:12 ` Jason Wang
2022-03-29 14:08 ` Re: Michael S. Tsirkin
2022-03-29 8:35 ` Re: Thomas Gleixner
1 sibling, 1 reply; 1651+ messages in thread
From: Jason Wang @ 2022-03-29 7:12 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Mon, Mar 28, 2022 at 6:41 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> > On Mon, Mar 28, 2022 at 1:59 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Mar 28, 2022 at 12:56:41PM +0800, Jason Wang wrote:
> > > > On Fri, Mar 25, 2022 at 6:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Fri, Mar 25, 2022 at 05:20:19PM +0800, Jason Wang wrote:
> > > > > > On Fri, Mar 25, 2022 at 5:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Fri, Mar 25, 2022 at 03:52:00PM +0800, Jason Wang wrote:
> > > > > > > > On Fri, Mar 25, 2022 at 2:31 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > Bcc:
> > > > > > > > > Subject: Re: [PATCH 3/3] virtio: harden vring IRQ
> > > > > > > > > Message-ID: <20220325021422-mutt-send-email-mst@kernel.org>
> > > > > > > > > Reply-To:
> > > > > > > > > In-Reply-To: <f7046303-7d7d-e39f-3c71-3688126cc812@redhat.com>
> > > > > > > > >
> > > > > > > > > On Fri, Mar 25, 2022 at 11:04:08AM +0800, Jason Wang wrote:
> > > > > > > > > >
> > > > > > > > > > 在 2022/3/24 下午7:03, Michael S. Tsirkin 写道:
> > > > > > > > > > > On Thu, Mar 24, 2022 at 04:40:04PM +0800, Jason Wang wrote:
> > > > > > > > > > > > This is a rework on the previous IRQ hardening that is done for
> > > > > > > > > > > > virtio-pci where several drawbacks were found and were reverted:
> > > > > > > > > > > >
> > > > > > > > > > > > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> > > > > > > > > > > > that is used by some device such as virtio-blk
> > > > > > > > > > > > 2) done only for PCI transport
> > > > > > > > > > > >
> > > > > > > > > > > > In this patch, we try to borrow the idea from the INTX IRQ hardening
> > > > > > > > > > > > in the reverted commit 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> > > > > > > > > > > > by introducing a global irq_soft_enabled variable for each
> > > > > > > > > > > > virtio_device. Then we can toggle it during
> > > > > > > > > > > > virtio_reset_device()/virtio_device_ready(). A synchronize_rcu() is
> > > > > > > > > > > > used in virtio_reset_device() to synchronize with the IRQ handlers. In
> > > > > > > > > > > > the future, we may provide config_ops for the transport that doesn't
> > > > > > > > > > > > use IRQ. With this, vring_interrupt() can check and return early if
> > > > > > > > > > > > irq_soft_enabled is false. This leads to smp_load_acquire() being used,
> > > > > > > > > > > > but the cost should be acceptable.
> > > > > > > > > > > Maybe it should be but is it? Can't we use synchronize_irq instead?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Even if we allow the transport driver to synchronize through
> > > > > > > > > > synchronize_irq() we still need a check in the vring_interrupt().
> > > > > > > > > >
> > > > > > > > > > We do something like the following previously:
> > > > > > > > > >
> > > > > > > > > > if (!READ_ONCE(vp_dev->intx_soft_enabled))
> > > > > > > > > > return IRQ_NONE;
> > > > > > > > > >
> > > > > > > > > > But it looks like a bug since speculative read can be done before the check
> > > > > > > > > > where the interrupt handler can't see the uncommitted setup which is done by
> > > > > > > > > > the driver.
> > > > > > > > >
> > > > > > > > > I don't think so - if you sync after setting the value then
> > > > > > > > > you are guaranteed that any handler running afterwards
> > > > > > > > > will see the new value.
> > > > > > > >
> > > > > > > > The problem is not the disable path but the enable path.
> > > > > > >
> > > > > > > So a misbehaving device can lose interrupts? That's not a problem at all
> > > > > > > imo.
> > > > > >
> > > > > > It's the interrupt raised before setting irq_soft_enabled to true:
> > > > > >
> > > > > > CPU 0 probe) driver specific setup (not committed)
> > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > CPU 0 probe) set irq_soft_enabled to true
> > > > > > CPU 1 IRQ handler) read irq_soft_enabled as true
> > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > Yea, it hurts if you do it. So do not do it then ;).
> > > > >
> > > > > irq_soft_enabled (I think driver_ok or status is a better name)
> > > >
> > > > I can change it to driver_ok.
> > > >
> > > > > should be initialized to false *before* irq is requested.
> > > > >
> > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > broken,
> > > >
> > > > So I think we might talk different issues:
> > > >
> > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > guarantee this though there seems no documentation around
> > > > request_irq() to say this.
> > > >
> > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > using smp_wmb() before the request_irq().
> > > >
> > > > And even if write is ordered we still need read to be ordered to be
> > > > paired with that.
>
> IMO it synchronizes with the CPU to which irq is
> delivered. Otherwise basically all drivers would be broken,
> wouldn't they be?
I guess it's because most drivers don't care much about
buggy/malicious devices. And most devices may require an extra
step to enable the device IRQ after request_irq(). Or it's the
responsibility of the driver to do the synchronization.
> I don't know whether it's correct on all platforms, but if not
> we need to fix request_irq.
>
> > > >
> > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > virtio.
> > > >
> > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > virtio_device_ready():
> > > >
> > > > request_irq()
> > > > driver specific setups
> > > > virtio_device_ready()
> > > >
> > > > CPU 0 probe) request_irq()
> > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > CPU 0 probe) driver specific setups
> > > > CPU 0 probe) smp_store_release(irq_soft_enabled, true), commit the setups
> > > > CPU 1 IRQ handler) read irq_soft_enabled as true
> > > > CPU 1 IRQ handler) use the uninitialized variable
> > > >
> > > > Thanks
> > >
> > >
> > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > That will guarantee all setup is visible to the specific IRQ,
> >
> > Only the interrupt after synchronize_irq() returns.
>
> Anything else is a buggy device though.
Yes, but the goal of this patch is to prevent possible attacks from
buggy (malicious) devices.
>
> > >this
> > > is what it's point is.
> >
> > What happens if an interrupt is raised in the middle like:
> >
> > smp_store_release(dev->irq_soft_enabled, true)
> > IRQ handler
> > synchronize_irq()
> >
> > If we don't enforce a reading order, the IRQ handler may still see the
> > uninitialized variable.
> >
> > Thanks
>
> IMHO variables should be initialized before request_irq
> to a value meaning "not a valid interrupt".
> Specifically driver_ok = false.
> Handler in the scenario you describe will then see !driver_ok
> and exit immediately.
So just to make sure we're on the same page.
1) virtio_reset_device() will set driver_ok to false;
2) virtio_device_ready() will set driver_ok to true
So a virtio driver typically does:
1) virtio_reset_device()
2) find_vqs() which will call request_irq()
3) other driver specific setups
4) virtio_device_ready()
In virtio_device_ready(), the patch currently performs:
smp_store_release(driver_ok, true);
set_status(DRIVER_OK);
Per your suggestion, we add synchronize_irq() after the
smp_store_release(), so we have:
smp_store_release(driver_ok, true);
synchronize_irq();
set_status(DRIVER_OK);
Suppose an interrupt is raised before the synchronize_irq(); if we do:
if (READ_ONCE(driver_ok)) {
vq->callback()
}
it will see driver_ok as true, but how can we make sure
vq->callback() sees the driver-specific setups in (3) above?
And an example is virtio_scsi():
virtio_reset_device()
virtscsi_probe()
virtscsi_init()
virtio_find_vqs()
...
virtscsi_init_vq(&vscsi->event_vq, vqs[1])
....
virtio_device_ready()
In virtscsi_event_done():
virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
We need to make sure virtscsi_event_done() reads driver_ok before
reading vscsi->event_vq.
Thanks
>
>
> > >
> > >
> > > > >
> > > > >
> > > > > > >
> > > > > > > > We use smp_store_relase()
> > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > which surprises me.
> > > > > > > > >
> > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > >
> > > > > > > > > > "
> > > > > > > > > >
> > > > > > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > the callback could race with driver-specific initialization.
> > > > > > > > > > "
> > > > > > > > > >
> > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > This is only for config interrupt.
> > > > > > > >
> > > > > > > > Ok.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > expensive.
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > >
> > > > > > > > > > > > ---
> > > > > > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > #include <linux/of.h>
> > > > > > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > +
> > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > + "Disable IRQ software processing when it is not expected");
> > > > > > > > > > > > +
> > > > > > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > * */
> > > > > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > {
> > > > > > > > > > > > + /*
> > > > > > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > + * interrupt for this line arriving after
> > > > > > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > + * irq_soft_enabled == false.
> > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > though it's most likely is ...
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > According to the comment above tree RCU version of synchronize_rcu():
> > > > > > > > > >
> > > > > > > > > > """
> > > > > > > > > >
> > > > > > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > * and NMI handlers.
> > > > > > > > > > """
> > > > > > > > > >
> > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > >
> > > > > > > > > > And it has this comment explaining the barrier:
> > > > > > > > > >
> > > > > > > > > > """
> > > > > > > > > >
> > > > > > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > > > > > """
> > > > > > > > > >
> > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > barrier; if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > You are right. So then
> > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > READ_ONCE should do.
> > > > > > > >
> > > > > > > > See above.
> > > > > > > >
> > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > >
> > > > > > > >
> > > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > + */
> > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > +
> > > > > > > > > > > > dev->config->reset(dev);
> > > > > > > > > > > > }
> > > > > > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ok.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > dev->config_enabled = false;
> > > > > > > > > > > > dev->config_change_pending = false;
> > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > +
> > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Do you mean:
> > > > > > > > > >
> > > > > > > > > > 1) we need something like config_enable? This seems hard to
> > > > > > > > > > implement without obvious overhead, mainly synchronizing with the
> > > > > > > > > > interrupt handlers
> > > > > > > > >
> > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > users at the moment, even less than probe.
> > > > > > > >
> > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > spinlock or others.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > for old hypervisors
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > The risk if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > >
> > > > > > > > Probably not, we have devices that accept random inputs from outside:
> > > > > > > > net, console, input etc. I've done a round of audits of the QEMU
> > > > > > > > code. It all looks fine since day 0.
> > > > > > > >
> > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > > > > > in the debug build).
> > > > > > > >
> > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > speeding boot up a tiny bit.
> > > > > > > >
> > > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > > read back for validation in many cases.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > }
> > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > {
> > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > + return IRQ_NONE;
> > > > > > > > > > > > + }
> > > > > > > > > > > > +
> > > > > > > > > > > > if (!more_used(vq)) {
> > > > > > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > return IRQ_NONE;
> > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > > > > > * @dev: underlying device.
> > > > > > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > bool failed;
> > > > > > > > > > > > bool config_enabled;
> > > > > > > > > > > > bool config_change_pending;
> > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > spinlock_t config_lock;
> > > > > > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > struct device dev;
> > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > }
> > > > > > > > > > > > +/*
> > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > + */
> > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > +{
> > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > + return true;
> > > > > > > > > > > > +
> > > > > > > > > > > > + /*
> > > > > > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > > > > > paired
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Will fix.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > + * virtio_reset_device().
> > > > > > > > > > > > + */
> > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > > /**
> > > > > > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > * @vdev: the device
> > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > > > > > + /*
> > > > > > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > > > > > + */
> > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > +
> > > > > > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > }
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.25.1
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-29 7:12 ` Re: Jason Wang
@ 2022-03-29 14:08 ` Michael S. Tsirkin
2022-03-30 2:40 ` Re: Jason Wang
0 siblings, 1 reply; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-29 14:08 UTC (permalink / raw)
To: Jason Wang
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > broken,
> > > > >
> > > > > So I think we might talk different issues:
> > > > >
> > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > guarantee this though there seems no documentation around
> > > > > request_irq() to say this.
> > > > >
> > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > using smp_wmb() before the request_irq().
> > > > >
> > > > > And even if write is ordered we still need read to be ordered to be
> > > > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they be?
>
> I guess it's because most of the drivers don't care much about the
> buggy/malicious device. And most of the devices may require an extra
> step to enable device IRQ after request_irq(). Or it's the
> responsibility of the driver to do the synchronization.
It is true that the use-case of malicious devices is somewhat boutique.
But I think most drivers do want their hotplug routines to be
robust, yes.
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
> >
> > > > >
> > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > virtio.
> > > > >
> > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > virtio_device_ready():
> > > > >
> > > > > request_irq()
> > > > > driver specific setups
> > > > > virtio_device_ready()
> > > > >
> > > > > CPU 0 probe) request_irq()
> > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > CPU 0 probe) driver specific setups
> > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > >
> > > > > Thanks
> > > >
> > > >
> > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > That will guarantee all setup is visible to the specific IRQ,
> > >
> > > Only the interrupt after synchronize_irq() returns.
> >
> > Anything else is a buggy device though.
>
> Yes, but the goal of this patch is to prevent the possible attack from
> buggy(malicious) devices.
Right. However if a driver of a *buggy* device somehow sees driver_ok =
false even though it's actually initialized, that is not a deal breaker
as that does not open us up to an attack.
> >
> > > > this
> > > > is what its point is.
> > >
> > > What happens if an interrupt is raised in the middle like:
> > >
> > > smp_store_release(dev->irq_soft_enabled, true)
> > > IRQ handler
> > > synchronize_irq()
> > >
> > > If we don't enforce a reading order, the IRQ handler may still see the
> > > uninitialized variable.
> > >
> > > Thanks
> >
> > IMHO variables should be initialized before request_irq
> > to a value meaning "not a valid interrupt".
> > Specifically driver_ok = false.
> > Handler in the scenario you describe will then see !driver_ok
> > and exit immediately.
>
> So just to make sure we're on the same page.
>
> 1) virtio_reset_device() will set the driver_ok to false;
> 2) virtio_device_ready() will set the driver_ok to true
>
> So for virtio drivers, it often did:
>
> 1) virtio_reset_device()
> 2) find_vqs() which will call request_irq()
> 3) other driver specific setups
> 4) virtio_device_ready()
>
> In virtio_device_ready(), the patch currently performs the following:
>
> smp_store_release(driver_ok, true);
> set_status(DRIVER_OK);
>
> Per your suggestion, to add synchronize_irq() after
> smp_store_release() so we had
>
> smp_store_release(driver_ok, true);
> synchronize_irq()
> set_status(DRIVER_OK)
>
> Suppose there's an interrupt raised before the synchronize_irq(), if we do:
>
> if (READ_ONCE(driver_ok)) {
> vq->callback()
> }
>
> It will see the driver_ok as true but how can we make sure
> vq->callback sees the driver specific setups (3) above?
>
> And an example is virtio_scsi():
>
> virtio_reset_device()
> virtscsi_probe()
> virtscsi_init()
> virtio_find_vqs()
> ...
> virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> ....
> virtio_device_ready()
>
> In virtscsi_event_done():
>
> virtscsi_event_done():
> virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
>
> We need to make sure the event_done reads driver_ok before reading vscsi->event_vq.
>
> Thanks
See response by Thomas. A simple if (!dev->driver_ok) should be enough,
it's all under a lock.
> >
> >
> > > >
> > > >
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > > > We use smp_store_relase()
> > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > which surprises me.
> > > > > > > > > >
> > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > >
> > > > > > > > > > > "
> > > > > > > > > > >
> > > > > > > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > the callback could race with driver-specific initialization.
> > > > > > > > > > > "
> > > > > > > > > > >
> > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > This is only for config interrupt.
> > > > > > > > >
> > > > > > > > > Ok.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > expensive.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > >
> > > > > > > > > > > > > ---
> > > > > > > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > #include <linux/of.h>
> > > > > > > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > +
> > > > > > > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > * */
> > > > > > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > {
> > > > > > > > > > > > > + /*
> > > > > > > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > + * interrupt for this line arriving after
> > > > > > > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > + * irq_soft_enabled == false.
> > > > > > > > > > > > News to me, I did not know synchronize_rcu has anything to do
> > > > > > > > > > > > with interrupts. Did you not intend to use synchronize_irq?
> > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > > though it most likely is ...
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > According to the comment above the tree RCU version of synchronize_rcu():
> > > > > > > > > > >
> > > > > > > > > > > """
> > > > > > > > > > >
> > > > > > > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > * and NMI handlers.
> > > > > > > > > > > """
> > > > > > > > > > >
> > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > >
> > > > > > > > > > > And it has a comment explaining the barrier:
> > > > > > > > > > >
> > > > > > > > > > > """
> > > > > > > > > > >
> > > > > > > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > > > > > > """
> > > > > > > > > > >
> > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > barrier; if the interrupt comes after WRITE_ONCE() it will see the
> > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > You are right. So then
> > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > READ_ONCE should do.
> > > > > > > > >
> > > > > > > > > See above.
> > > > > > > > >
> > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Yes, but it requires a config ops since the IRQ knowledge is transport specific.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > + */
> > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > +
> > > > > > > > > > > > > dev->config->reset(dev);
> > > > > > > > > > > > > }
> > > > > > > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Ok.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > dev->config_enabled = false;
> > > > > > > > > > > > > dev->config_change_pending = false;
> > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Do you mean:
> > > > > > > > > > >
> > > > > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > > > > implemented without obvious overhead, mainly the synchronization with the
> > > > > > > > > > > interrupt handlers
> > > > > > > > > >
> > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > >
> > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > > spinlock or others.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > for old hypervisors
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > The risk is if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > >
> > > > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > > > net, console, input etc. I've done a round of audits of the QEMU
> > > > > > > > > code. It all looks fine since day 0.
> > > > > > > > >
> > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > > > > > > in the debug build).
> > > > > > > > >
> > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > >
> > > > > > > > > I don't fully understand here; actually the spec requires status to be
> > > > > > > > > read back for validation in many cases.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > }
> > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > {
> > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > + return IRQ_NONE;
> > > > > > > > > > > > > + }
> > > > > > > > > > > > > +
> > > > > > > > > > > > > if (!more_used(vq)) {
> > > > > > > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > return IRQ_NONE;
> > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > > > > > > * @dev: underlying device.
> > > > > > > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > > bool failed;
> > > > > > > > > > > > > bool config_enabled;
> > > > > > > > > > > > > bool config_change_pending;
> > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > > spinlock_t config_lock;
> > > > > > > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > > struct device dev;
> > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > > }
> > > > > > > > > > > > > +/*
> > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > + */
> > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > +{
> > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > + return true;
> > > > > > > > > > > > > +
> > > > > > > > > > > > > + /*
> > > > > > > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > > > > > > paired
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Will fix.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > + * virtio_reset_device().
> > > > > > > > > > > > > + */
> > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > +}
> > > > > > > > > > > > > +
> > > > > > > > > > > > > /**
> > > > > > > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > > * @vdev: the device
> > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > > > > > > + /*
> > > > > > > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > > > > > > + */
> > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > +
> > > > > > > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > }
> > > > > > > > > > > > > --
> > > > > > > > > > > > > 2.25.1
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-29 14:08 ` Re: Michael S. Tsirkin
@ 2022-03-30 2:40 ` Jason Wang
2022-03-30 5:14 ` Re: Michael S. Tsirkin
0 siblings, 1 reply; 1651+ messages in thread
From: Jason Wang @ 2022-03-30 2:40 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > > broken,
> > > > > >
> > > > > > So I think we might talk different issues:
> > > > > >
> > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > guarantee this though there seems no documentation around
> > > > > > request_irq() to say this.
> > > > > >
> > > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > using smp_wmb() before the request_irq().
> > > > > >
> > > > > > And even if write is ordered we still need read to be ordered to be
> > > > > > paired with that.
> > >
> > > IMO it synchronizes with the CPU to which irq is
> > > delivered. Otherwise basically all drivers would be broken,
> > > wouldn't they be?
> >
> > I guess it's because most of the drivers don't care much about the
> > buggy/malicious device. And most of the devices may require an extra
> > step to enable device IRQ after request_irq(). Or it's the
> > responsibility of the driver to do the synchronization.
>
> It is true that the use-case of malicious devices is somewhat boutique.
> But I think most drivers do want their hotplug routines to be
> robust, yes.
>
> > > I don't know whether it's correct on all platforms, but if not
> > > we need to fix request_irq.
> > >
> > > > > >
> > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > virtio.
> > > > > >
> > > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > > virtio_device_ready():
> > > > > >
> > > > > > request_irq()
> > > > > > driver specific setups
> > > > > > virtio_device_ready()
> > > > > >
> > > > > > CPU 0 probe) request_irq()
> > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > CPU 0 probe) driver specific setups
> > > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > >
> > > > > > Thanks
> > > > >
> > > > >
> > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > That will guarantee all setup is visible to the specific IRQ,
> > > >
> > > > Only the interrupt after synchronize_irq() returns.
> > >
> > > Anything else is a buggy device though.
> >
> > Yes, but the goal of this patch is to prevent the possible attack from
> > buggy(malicious) devices.
>
> Right. However if a driver of a *buggy* device somehow sees driver_ok =
> false even though it's actually initialized, that is not a deal breaker
> as that does not open us up to an attack.
>
> > >
> > > > > this
> > > > > is what its point is.
> > > >
> > > > What happens if an interrupt is raised in the middle like:
> > > >
> > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > IRQ handler
> > > > synchronize_irq()
> > > >
> > > > If we don't enforce a reading order, the IRQ handler may still see the
> > > > uninitialized variable.
> > > >
> > > > Thanks
> > >
> > > IMHO variables should be initialized before request_irq
> > > to a value meaning "not a valid interrupt".
> > > Specifically driver_ok = false.
> > > Handler in the scenario you describe will then see !driver_ok
> > > and exit immediately.
> >
> > So just to make sure we're on the same page.
> >
> > 1) virtio_reset_device() will set the driver_ok to false;
> > 2) virtio_device_ready() will set the driver_ok to true
> >
> > So for virtio drivers, it often did:
> >
> > 1) virtio_reset_device()
> > 2) find_vqs() which will call request_irq()
> > 3) other driver specific setups
> > 4) virtio_device_ready()
> >
> > In virtio_device_ready(), the patch currently performs the following:
> >
> > smp_store_release(driver_ok, true);
> > set_status(DRIVER_OK);
> >
> > Per your suggestion, to add synchronize_irq() after
> > smp_store_release() so we had
> >
> > smp_store_release(driver_ok, true);
> > synchronize_irq()
> > set_status(DRIVER_OK)
> >
> > Suppose there's an interrupt raised before the synchronize_irq(), if we do:
> >
> > if (READ_ONCE(driver_ok)) {
> > vq->callback()
> > }
> >
> > It will see the driver_ok as true but how can we make sure
> > vq->callback sees the driver specific setups (3) above?
> >
> > And an example is virtio_scsi():
> >
> > virtio_reset_device()
> > virtscsi_probe()
> > virtscsi_init()
> > virtio_find_vqs()
> > ...
> > virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> > ....
> > virtio_device_ready()
> >
> > In virtscsi_event_done():
> >
> > virtscsi_event_done():
> > virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> >
> > We need to make sure the event_done reads driver_ok before reading vscsi->event_vq.
> >
> > Thanks
>
>
> See response by Thomas. A simple if (!dev->driver_ok) should be enough,
> it's all under a lock.
Ordered through ACQUIRE+RELEASE actually since the irq handler is not
running under the lock.
Another question, for synchronize_irq() do you prefer
1) transport specific callbacks
or
2) a simple synchronize_rcu()
Thanks
>
> > >
> > >
> > > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > > We use smp_store_relase()
> > > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > > which surprises me.
> > > > > > > > > > >
> > > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > > >
> > > > > > > > > > > > "
> > > > > > > > > > > >
> > > > > > > > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > > the callback could race with driver-specific initialization.
> > > > > > > > > > > > "
> > > > > > > > > > > >
> > > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > This is only for config interrupt.
> > > > > > > > > >
> > > > > > > > > > Ok.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > > expensive.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > > >
> > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > > #include <linux/of.h>
> > > > > > > > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > > * */
> > > > > > > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > > {
> > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > > + * interrupt for this line arriving after
> > > > > > > > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > > + * irq_soft_enabled == false.
> > > > > > > > > > > > > News to me; I did not know synchronize_rcu had anything to do
> > > > > > > > > > > > > with interrupts. Didn't you intend to use synchronize_irq?
> > > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier,
> > > > > > > > > > > > > though it most likely is ...
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > According to the comment above the tree RCU version of synchronize_rcu():
> > > > > > > > > > > >
> > > > > > > > > > > > """
> > > > > > > > > > > >
> > > > > > > > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > > > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > > * and NMI handlers.
> > > > > > > > > > > > """
> > > > > > > > > > > >
> > > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > > >
> > > > > > > > > > > > And it has this comment to explain the barrier:
> > > > > > > > > > > >
> > > > > > > > > > > > """
> > > > > > > > > > > >
> > > > > > > > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > > > > > > > """
> > > > > > > > > > > >
> > > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > > barrier; if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > You are right. So then
> > > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > > READ_ONCE should do.
> > > > > > > > > >
> > > > > > > > > > See above.
> > > > > > > > > >
> > > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Yes, but it requires a config op since the IRQ knowledge is transport specific.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > dev->config->reset(dev);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Ok.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > > dev->config_enabled = false;
> > > > > > > > > > > > > > dev->config_change_pending = false;
> > > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > > > > > > > One of the points of hardening is that it's also helpful for buggy
> > > > > > > > > > > > > devices. This flag defeats the purpose.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Do you mean:
> > > > > > > > > > > >
> > > > > > > > > > > > 1) we need something like config_enable? This seems hard to implement
> > > > > > > > > > > > without obvious overhead, mainly the synchronization with the
> > > > > > > > > > > > interrupt handlers
> > > > > > > > > > >
> > > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > > >
> > > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > > in virtio_device_ready() and synchronize with the IRQ handlers via a
> > > > > > > > > > spinlock or similar.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > > for old hypervisors
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The risk is if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > > >
> > > > > > > > > > Probably not; we have devices that accept random inputs from outside:
> > > > > > > > > > net, console, input etc. I've done a round of audits of the QEMU
> > > > > > > > > > code. It all looks fine since day 0.
> > > > > > > > > >
> > > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > > > > > > > in the debug build).
> > > > > > > > > >
> > > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > > >
> > > > > > > > > > I don't fully understand here; actually the spec requires the status to
> > > > > > > > > > be read back for validation in many cases.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > > {
> > > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > > + return IRQ_NONE;
> > > > > > > > > > > > > > + }
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > if (!more_used(vq)) {
> > > > > > > > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > > return IRQ_NONE;
> > > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > > > > > > > * @dev: underlying device.
> > > > > > > > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > > > bool failed;
> > > > > > > > > > > > > > bool config_enabled;
> > > > > > > > > > > > > > bool config_change_pending;
> > > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > > > spinlock_t config_lock;
> > > > > > > > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > > > struct device dev;
> > > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > > +/*
> > > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > > +{
> > > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > > + return true;
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > > > > > > > paired
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Will fix.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > > + * virtio_reset_device().
> > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > > +}
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > /**
> > > > > > > > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > > > * @vdev: the device
> > > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > > +
> > > > > > > > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > 2.25.1
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-30 2:40 ` Re: Jason Wang
@ 2022-03-30 5:14 ` Michael S. Tsirkin
2022-03-30 5:53 ` Re: Jason Wang
0 siblings, 1 reply; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-30 5:14 UTC (permalink / raw)
To: Jason Wang
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Wed, Mar 30, 2022 at 10:40:59AM +0800, Jason Wang wrote:
> On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > > > broken,
> > > > > > >
> > > > > > > So I think we might be talking about different issues:
> > > > > > >
> > > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > > guarantee this, though there seems to be no documentation around
> > > > > > > request_irq() saying this.
> > > > > > >
> > > > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > > using smp_wmb() before the request_irq().
> > > > > > >
> > > > > > > And even if the write is ordered, we still need the read to be
> > > > > > > ordered to pair with it.
> > > >
> > > > IMO it synchronizes with the CPU to which irq is
> > > > delivered. Otherwise basically all drivers would be broken,
> > > > wouldn't they be?
> > >
> > > I guess it's because most drivers don't care much about
> > > buggy/malicious devices. And most devices may require an extra
> > > step to enable the device IRQ after request_irq(). Or it's the
> > > driver's job to do the synchronization.
> >
> > It is true that the use-case of malicious devices is somewhat boutique.
> > But I think most drivers do want their hotplug routines to be
> > robust, yes.
> >
> > > > I don't know whether it's correct on all platforms, but if not
> > > > we need to fix request_irq.
> > > >
> > > > > > >
> > > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > > virtio.
> > > > > > >
> > > > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > > > virtio_device_ready():
> > > > > > >
> > > > > > > request_irq()
> > > > > > > driver specific setups
> > > > > > > virtio_device_ready()
> > > > > > >
> > > > > > > CPU 0 probe) request_irq()
> > > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > > CPU 0 probe) driver specific setups
> > > > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > > > CPU 1 IRQ handler) read irq_soft_enable as true
> > > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > > >
> > > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > > That will guarantee all setup is visible to the specific IRQ,
> > > > >
> > > > > Only the interrupt after synchronize_irq() returns.
> > > >
> > > > Anything else is a buggy device though.
> > >
> > > Yes, but the goal of this patch is to prevent possible attacks from
> > > buggy (malicious) devices.
> >
> > Right. However if a driver of a *buggy* device somehow sees driver_ok =
> > false even though it's actually initialized, that is not a deal breaker
> > as that does not open us up to an attack.
> >
> > > >
> > > > > > this
> > > > > > is what its point is.
> > > > >
> > > > > What happens if an interrupt is raised in the middle like:
> > > > >
> > > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > > IRQ handler
> > > > > synchronize_irq()
> > > > >
> > > > > If we don't enforce an ordering on the read, the IRQ handler may still
> > > > > see the uninitialized variable.
> > > > >
> > > > > Thanks
> > > >
> > > > IMHO variables should be initialized before request_irq
> > > > to a value meaning "not a valid interrupt".
> > > > Specifically driver_ok = false.
> > > > Handler in the scenario you describe will then see !driver_ok
> > > > and exit immediately.
> > >
> > > So just to make sure we're on the same page.
> > >
> > > 1) virtio_reset_device() will set the driver_ok to false;
> > > 2) virtio_device_ready() will set the driver_ok to true
> > >
> > > So for virtio drivers, it often did:
> > >
> > > 1) virtio_reset_device()
> > > 2) find_vqs() which will call request_irq()
> > > 3) other driver specific setups
> > > 4) virtio_device_ready()
> > >
> > > In virtio_device_ready(), the patch currently performs the following:
> > >
> > > smp_store_release(driver_ok, true);
> > > set_status(DRIVER_OK);
> > >
> > > Per your suggestion, adding synchronize_irq() after
> > > smp_store_release(), we would have:
> > >
> > > smp_store_release(driver_ok, true);
> > > synchronize_irq()
> > > set_status(DRIVER_OK)
> > >
> > > Suppose there's an interrupt raised before the synchronize_irq(). If we do:
> > >
> > > if (READ_ONCE(driver_ok)) {
> > > vq->callback()
> > > }
> > >
> > > It will see driver_ok as true, but how can we make sure
> > > vq->callback sees the driver-specific setups (3) above?
> > >
> > > And an example is virtio_scsi():
> > >
> > > virtio_reset_device()
> > > virtscsi_probe()
> > > virtscsi_init()
> > > virtio_find_vqs()
> > > ...
> > > virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> > > ....
> > > virtio_device_ready()
> > >
> > > In virtscsi_event_done():
> > >
> > > virtscsi_event_done():
> > > virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> > >
> > > We need to make sure virtscsi_event_done() reads driver_ok before reading vscsi->event_vq.
> > >
> > > Thanks
> >
> >
> > See response by Thomas. A simple if (!dev->driver_ok) should be enough,
> > it's all under a lock.
>
> Ordered through ACQUIRE+RELEASE actually since the irq handler is not
> running under the lock.
>
> Another question, for synchronize_irq() do you prefer
>
> 1) transport specific callbacks
> or
> 2) a simple synchronize_rcu()
>
> Thanks
1), I think, and I'd add a wrapper so we can switch to 2) if we really
want to. But for now, synchronizing the specific IRQ is obviously designed
to make any changes to memory visible to that IRQ. That
seems cleaner and easier to understand than memory-ordering tricks
and relying on side effects of synchronize_rcu, even though
internally this all boils down to memory ordering, since
memory is what's used to implement locks :).
Not to mention, synchronize_irq just scales much better from a performance
POV.
> >
> > > >
> > > >
> > > > > >
> > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > We use smp_store_relase()
> > > > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > > > which surprises me.
> > > > > > > > > > > >
> > > > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > > > >
> > > > > > > > > > > > > "
> > > > > > > > > > > > >
> > > > > > > > > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > > > the callback could race with driver-specific initialization.
> > > > > > > > > > > > > "
> > > > > > > > > > > > >
> > > > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > This is only for config interrupt.
> > > > > > > > > > >
> > > > > > > > > > > Ok.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > > > expensive.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > > > #include <linux/of.h>
> > > > > > > > > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > > > * */
> > > > > > > > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > > > + * interrupt for this line arriving after
> > > > > > > > > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > > > + * irq_soft_enabled == false.
> > > > > > > > > > > > > > News to me; I did not know synchronize_rcu had anything to do
> > > > > > > > > > > > > > with interrupts. Didn't you intend to use synchronize_irq?
> > > > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier,
> > > > > > > > > > > > > > though it most likely is ...
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > According to the comment above the tree RCU version of synchronize_rcu():
> > > > > > > > > > > > >
> > > > > > > > > > > > > """
> > > > > > > > > > > > >
> > > > > > > > > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > > > > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > > > * and NMI handlers.
> > > > > > > > > > > > > """
> > > > > > > > > > > > >
> > > > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > > > >
> > > > > > > > > > > > > And it has this comment to explain the barrier:
> > > > > > > > > > > > >
> > > > > > > > > > > > > """
> > > > > > > > > > > > >
> > > > > > > > > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > > > > > > > > """
> > > > > > > > > > > > >
> > > > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > > > barrier; if the interrupt comes after the WRITE_ONCE() it will see
> > > > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > You are right. So then
> > > > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > > > READ_ONCE should do.
> > > > > > > > > > >
> > > > > > > > > > > See above.
> > > > > > > > > > >
> > > > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Yes, but it requires a config op since the IRQ knowledge is transport specific.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > dev->config->reset(dev);
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ok.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > > > dev->config_enabled = false;
> > > > > > > > > > > > > > > dev->config_change_pending = false;
> > > > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > > > > > > > > One of the points of hardening is that it's also helpful for buggy
> > > > > > > > > > > > > > devices. This flag defeats the purpose.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Do you mean:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1) we need something like config_enable? This seems hard to implement
> > > > > > > > > > > > > without obvious overhead, mainly the synchronization with the
> > > > > > > > > > > > > interrupt handlers
> > > > > > > > > > > >
> > > > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > > > >
> > > > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > > > in virtio_device_ready() and synchronize with the IRQ handlers via a
> > > > > > > > > > > spinlock or similar.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > > > for old hypervisors
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > The risk is if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > > > >
> > > > > > > > > > > Probably not; we have devices that accept random inputs from outside:
> > > > > > > > > > > net, console, input etc. I've done a round of audits of the QEMU
> > > > > > > > > > > code. It all looks fine since day 0.
> > > > > > > > > > >
> > > > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > > > > > > > > in the debug build).
> > > > > > > > > > >
> > > > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > > > >
> > > > > > > > > > > I don't fully understand here; actually the spec requires the status to
> > > > > > > > > > > be read back for validation in many cases.
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > > > + return IRQ_NONE;
> > > > > > > > > > > > > > > + }
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > if (!more_used(vq)) {
> > > > > > > > > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > > > return IRQ_NONE;
> > > > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > > > > > > > > * @dev: underlying device.
> > > > > > > > > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > > > > bool failed;
> > > > > > > > > > > > > > > bool config_enabled;
> > > > > > > > > > > > > > > bool config_change_pending;
> > > > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > > > > spinlock_t config_lock;
> > > > > > > > > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > > > > struct device dev;
> > > > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > +/*
> > > > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > > > +{
> > > > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > > > + return true;
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > > > > > > > > paired
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Will fix.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > > > + * virtio_reset_device().
> > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > > > +}
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > /**
> > > > > > > > > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > > > > * @vdev: the device
> > > > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > 2.25.1
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> >
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-30 5:14 ` Re: Michael S. Tsirkin
@ 2022-03-30 5:53 ` Jason Wang
0 siblings, 0 replies; 1651+ messages in thread
From: Jason Wang @ 2022-03-30 5:53 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Wed, Mar 30, 2022 at 1:14 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Mar 30, 2022 at 10:40:59AM +0800, Jason Wang wrote:
> > On Tue, Mar 29, 2022 at 10:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 29, 2022 at 03:12:14PM +0800, Jason Wang wrote:
> > > > > > > > > And requesting irq commits all memory otherwise all drivers would be
> > > > > > > > > broken,
> > > > > > > >
> > > > > > > > So I think we might talk different issues:
> > > > > > > >
> > > > > > > > 1) Whether request_irq() commits the previous setups, I think the
> > > > > > > > answer is yes, since the spin_unlock of desc->lock (release) can
> > > > > > > > guarantee this though there seems no documentation around
> > > > > > > > request_irq() to say this.
> > > > > > > >
> > > > > > > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> > > > > > > > using smp_wmb() before the request_irq().
> > > > > > > >
> > > > > > > > And even if write is ordered we still need read to be ordered to be
> > > > > > > > paired with that.
> > > > >
> > > > > IMO it synchronizes with the CPU to which irq is
> > > > > delivered. Otherwise basically all drivers would be broken,
> > > > > wouldn't they be?
> > > >
> > > > I guess it's because most of the drivers don't care much about the
> > > > buggy/malicious device. And most of the devices may require an extra
> > > > step to enable the device IRQ after request_irq(). Or it's up to
> > > > the driver to do the synchronization.
> > >
> > > It is true that the use-case of malicious devices is somewhat boutique.
> > > But I think most drivers do want to have their hotplug routines to be
> > > robust, yes.
> > >
> > > > > I don't know whether it's correct on all platforms, but if not
> > > > > we need to fix request_irq.
> > > > >
> > > > > > > >
> > > > > > > > > if it doesn't it just needs to be fixed, not worked around in
> > > > > > > > > virtio.
> > > > > > > >
> > > > > > > > 2) virtio drivers might do a lot of setups between request_irq() and
> > > > > > > > virtio_device_ready():
> > > > > > > >
> > > > > > > > request_irq()
> > > > > > > > driver specific setups
> > > > > > > > virtio_device_ready()
> > > > > > > >
> > > > > > > > CPU 0 probe) request_irq()
> > > > > > > > CPU 1 IRQ handler) read the uninitialized variable
> > > > > > > > CPU 0 probe) driver specific setups
> > > > > > > > CPU 0 probe) smp_store_release(intr_soft_enabled, true), commit the setups
> > > > > > > > CPU 1 IRQ handler) read irq_soft_enabled as true
> > > > > > > > CPU 1 IRQ handler) use the uninitialized variable
> > > > > > > >
> > > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > As I said, virtio_device_ready needs to do synchronize_irq.
> > > > > > > That will guarantee all setup is visible to the specific IRQ,
> > > > > >
> > > > > > Only the interrupt after synchronize_irq() returns.
> > > > >
> > > > > Anything else is a buggy device though.
> > > >
> > > > Yes, but the goal of this patch is to prevent the possible attack from
> > > > buggy(malicious) devices.
> > >
> > > Right. However if a driver of a *buggy* device somehow sees driver_ok =
> > > false even though it's actually initialized, that is not a deal breaker
> > > as that does not open us up to an attack.
> > >
> > > > >
> > > > > > >this
> > > > > > > is what it's point is.
> > > > > >
> > > > > > What happens if an interrupt is raised in the middle like:
> > > > > >
> > > > > > smp_store_release(dev->irq_soft_enabled, true)
> > > > > > IRQ handler
> > > > > > synchronize_irq()
> > > > > >
> > > > > > If we don't enforce a reading order, the IRQ handler may still see the
> > > > > > uninitialized variable.
> > > > > >
> > > > > > Thanks
> > > > >
> > > > > IMHO variables should be initialized before request_irq
> > > > > to a value meaning "not a valid interrupt".
> > > > > Specifically driver_ok = false.
> > > > > Handler in the scenario you describe will then see !driver_ok
> > > > > and exit immediately.
> > > >
> > > > So just to make sure we're on the same page.
> > > >
> > > > 1) virtio_reset_device() will set the driver_ok to false;
> > > > 2) virtio_device_ready() will set the driver_ok to true
> > > >
> > > > So for virtio drivers, it often did:
> > > >
> > > > 1) virtio_reset_device()
> > > > 2) find_vqs() which will call request_irq()
> > > > 3) other driver specific setups
> > > > 4) virtio_device_ready()
> > > >
> > > > In virtio_device_ready(), the patch performs the following currently:
> > > >
> > > > smp_store_release(driver_ok, true);
> > > > set_status(DRIVER_OK);
> > > >
> > > > Per your suggestion, adding synchronize_irq() after
> > > > smp_store_release() gives us:
> > > >
> > > > smp_store_release(driver_ok, true);
> > > > synchronize_irq()
> > > > set_status(DRIVER_OK)
> > > >
> > > > Suppose there's an interrupt raised before the synchronize_irq(), if we do:
> > > >
> > > > if (READ_ONCE(driver_ok)) {
> > > > vq->callback()
> > > > }
> > > >
> > > > It will see the driver_ok as true but how can we make sure
> > > > vq->callback sees the driver specific setups (3) above?
> > > >
> > > > And an example is virtio_scsi():
> > > >
> > > > virtio_reset_device()
> > > > virtscsi_probe()
> > > > virtscsi_init()
> > > > virtio_find_vqs()
> > > > ...
> > > > virtscsi_init_vq(&vscsi->event_vq, vqs[1])
> > > > ....
> > > > virtio_device_ready()
> > > >
> > > > In virtscsi_event_done():
> > > >
> > > > virtscsi_event_done():
> > > > virtscsi_vq_done(vscsi, &vscsi->event_vq, ...);
> > > >
> > > > We need to make sure that virtscsi_event_done() reads driver_ok before reading vscsi->event_vq.
> > > >
> > > > Thanks
> > >
> > >
> > > See response by Thomas. A simple if (!dev->driver_ok) should be enough,
> > > it's all under a lock.
> >
> > Ordered through ACQUIRE+RELEASE actually since the irq handler is not
> > running under the lock.
> >
> > Another question, for synchronize_irq() do you prefer
> >
> > 1) transport specific callbacks
> > or
> > 2) a simple synchronize_rcu()
> >
> > Thanks
>
>
> 1) I think, and I'd add a wrapper so we can switch to 2 if we really
> want to. But for now synchronizing the specific irq is obviously designed to
> make any changes to memory visible to this irq. That
> seems cleaner and easier to understand than memory ordering tricks
> and relying on side effects of synchronize_rcu, even though
> internally this all boils down to memory ordering since
> memory is what's used to implement locks :).
> Not to mention, synchronize_irq just scales much better from performance
> POV.
Ok. Let me try to do that in V2.
Thanks
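For reference, option 1 with a fallback to option 2 could be sketched roughly as below — the op and helper names (synchronize_cbs, virtio_synchronize_cbs) are assumptions for illustration, not the actual V2 patch:

```c
/* Hypothetical wrapper: use a transport-specific callback when the
 * transport knows which IRQ to synchronize against (so it can do
 * synchronize_irq() on the right line), and fall back to
 * synchronize_rcu(), relying on IRQ handlers being RCU read-side
 * critical sections in v5.0 and later. */
static void virtio_synchronize_cbs(struct virtio_device *dev)
{
	if (dev->config->synchronize_cbs)
		dev->config->synchronize_cbs(dev);
	else
		synchronize_rcu();
}
```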
>
>
> > >
> > > > >
> > > > >
> > > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > We use smp_store_relase()
> > > > > > > > > > > > to make sure the driver commits the setup before enabling the irq. It
> > > > > > > > > > > > means the read needs to be ordered as well in vring_interrupt().
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Although I couldn't find anything about this in memory-barriers.txt
> > > > > > > > > > > > > which surprises me.
> > > > > > > > > > > > >
> > > > > > > > > > > > > CC Paul to help make sure I'm right.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > To avoid breaking legacy device which can send IRQ before DRIVER_OK, a
> > > > > > > > > > > > > > > > module parameter is introduced to enable the hardening so function
> > > > > > > > > > > > > > > > hardening is disabled by default.
> > > > > > > > > > > > > > > Which devices are these? How come they send an interrupt before there
> > > > > > > > > > > > > > > are any buffers in any queues?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I copied this from the commit log for 22b7050a024d7
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > "
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This change will also benefit old hypervisors (before 2009)
> > > > > > > > > > > > > > that send interrupts without checking DRIVER_OK: previously,
> > > > > > > > > > > > > > the callback could race with driver-specific initialization.
> > > > > > > > > > > > > > "
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If this is only for config interrupt, I can remove the above log.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is only for config interrupt.
> > > > > > > > > > > >
> > > > > > > > > > > > Ok.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Note that the hardening is only done for vring interrupt since the
> > > > > > > > > > > > > > > > config interrupt hardening is already done in commit 22b7050a024d7
> > > > > > > > > > > > > > > > ("virtio: defer config changed notifications"). But the method that is
> > > > > > > > > > > > > > > > used by config interrupt can't be reused by the vring interrupt
> > > > > > > > > > > > > > > > handler because it uses spinlock to do the synchronization which is
> > > > > > > > > > > > > > > > expensive.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ---
> > > > > > > > > > > > > > > > drivers/virtio/virtio.c | 19 +++++++++++++++++++
> > > > > > > > > > > > > > > > drivers/virtio/virtio_ring.c | 9 ++++++++-
> > > > > > > > > > > > > > > > include/linux/virtio.h | 4 ++++
> > > > > > > > > > > > > > > > include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
> > > > > > > > > > > > > > > > 4 files changed, 56 insertions(+), 1 deletion(-)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > index 8dde44ea044a..85e331efa9cc 100644
> > > > > > > > > > > > > > > > --- a/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio.c
> > > > > > > > > > > > > > > > @@ -7,6 +7,12 @@
> > > > > > > > > > > > > > > > #include <linux/of.h>
> > > > > > > > > > > > > > > > #include <uapi/linux/virtio_ids.h>
> > > > > > > > > > > > > > > > +static bool irq_hardening = false;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > +module_param(irq_hardening, bool, 0444);
> > > > > > > > > > > > > > > > +MODULE_PARM_DESC(irq_hardening,
> > > > > > > > > > > > > > > > + "Disalbe IRQ software processing when it is not expected");
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > /* Unique numbering for virtio devices. */
> > > > > > > > > > > > > > > > static DEFINE_IDA(virtio_index_ida);
> > > > > > > > > > > > > > > > @@ -220,6 +226,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> > > > > > > > > > > > > > > > * */
> > > > > > > > > > > > > > > > void virtio_reset_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > + * The below synchronize_rcu() guarantees that any
> > > > > > > > > > > > > > > > + * interrupt for this line arriving after
> > > > > > > > > > > > > > > > + * synchronize_rcu() has completed is guaranteed to see
> > > > > > > > > > > > > > > > + * irq_soft_enabled == false.
> > > > > > > > > > > > > > > News to me I did not know synchronize_rcu has anything to do
> > > > > > > > > > > > > > > with interrupts. Did not you intend to use synchronize_irq?
> > > > > > > > > > > > > > > I am not even 100% sure synchronize_rcu is by design a memory barrier
> > > > > > > > > > > > > > > though it most likely is ...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > According to the comment above the tree RCU version of synchronize_rcu():
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > * RCU read-side critical sections are delimited by rcu_read_lock()
> > > > > > > > > > > > > > * and rcu_read_unlock(), and may be nested. In addition, but only in
> > > > > > > > > > > > > > * v5.0 and later, regions of code across which interrupts, preemption,
> > > > > > > > > > > > > > * or softirqs have been disabled also serve as RCU read-side critical
> > > > > > > > > > > > > > * sections. This includes hardware interrupt handlers, softirq handlers,
> > > > > > > > > > > > > > * and NMI handlers.
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So interrupt handlers are treated as read-side critical sections.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > And it has this comment explaining the barrier:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > * Note that this guarantee implies further memory-ordering guarantees.
> > > > > > > > > > > > > > * On systems with more than one CPU, when synchronize_rcu() returns,
> > > > > > > > > > > > > > * each CPU is guaranteed to have executed a full memory barrier since
> > > > > > > > > > > > > > * the end of its last RCU read-side critical section whose beginning
> > > > > > > > > > > > > > * preceded the call to synchronize_rcu(). In addition, each CPU having
> > > > > > > > > > > > > > """
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So on SMP it provides a full barrier. And for UP/tiny RCU we don't need the
> > > > > > > > > > > > > > barrier, if the interrupt come after WRITE_ONCE() it will see the
> > > > > > > > > > > > > > irq_soft_enabled as false.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > You are right. So then
> > > > > > > > > > > > > 1. I do not think we need load_acquire - why is it needed? Just
> > > > > > > > > > > > > READ_ONCE should do.
> > > > > > > > > > > >
> > > > > > > > > > > > See above.
> > > > > > > > > > > >
> > > > > > > > > > > > > 2. isn't synchronize_irq also doing the same thing?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, but it requires a config op since the IRQ knowledge is transport specific.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > > + WRITE_ONCE(dev->irq_soft_enabled, false);
> > > > > > > > > > > > > > > > + synchronize_rcu();
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > dev->config->reset(dev);
> > > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > > EXPORT_SYMBOL_GPL(virtio_reset_device);
> > > > > > > > > > > > > > > Please add comment explaining where it will be enabled.
> > > > > > > > > > > > > > > Also, we *really* don't need to synch if it was already disabled,
> > > > > > > > > > > > > > > let's not add useless overhead to the boot sequence.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Ok.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @@ -427,6 +442,10 @@ int register_virtio_device(struct virtio_device *dev)
> > > > > > > > > > > > > > > > spin_lock_init(&dev->config_lock);
> > > > > > > > > > > > > > > > dev->config_enabled = false;
> > > > > > > > > > > > > > > > dev->config_change_pending = false;
> > > > > > > > > > > > > > > > + dev->irq_soft_check = irq_hardening;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > + if (dev->irq_soft_check)
> > > > > > > > > > > > > > > > + dev_info(&dev->dev, "IRQ hardening is enabled\n");
> > > > > > > > > > > > > > > > /* We always start by resetting the device, in case a previous
> > > > > > > > > > > > > > > > * driver messed it up. This also tests that code path a little. */
> > > > > > > > > > > > > > > one of the points of hardening is it's also helpful for buggy
> > > > > > > > > > > > > > > devices. this flag defeats the purpose.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Do you mean:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) we need something like config_enable? This seems not easy to be
> > > > > > > > > > > > > > implemented without obvious overhead, mainly the synchronize with the
> > > > > > > > > > > > > > interrupt handlers
> > > > > > > > > > > > >
> > > > > > > > > > > > > But synchronize is only on tear-down path. That is not critical for any
> > > > > > > > > > > > > users at the moment, even less than probe.
> > > > > > > > > > > >
> > > > > > > > > > > > I meant if we have vq->irq_pending, we need to call vring_interrupt()
> > > > > > > > > > > > in the virtio_device_ready() and synchronize the IRQ handlers with
> > > > > > > > > > > > spinlock or others.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) enable this by default, so I don't object, but this may have some risk
> > > > > > > > > > > > > > for old hypervisors
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > The risk is if there's a driver adding buffers without setting DRIVER_OK.
> > > > > > > > > > > >
> > > > > > > > > > > > Probably not, we have devices that accept random inputs from outside,
> > > > > > > > > > > > net, console, input etc. I've done a round of audits of the Qemu
> > > > > > > > > > > > codes. They look all fine since day0.
> > > > > > > > > > > >
> > > > > > > > > > > > > So with this approach, how about we rename the flag "driver_ok"?
> > > > > > > > > > > > > And then add_buf can actually test it and BUG_ON if not there (at least
> > > > > > > > > > > > > in the debug build).
> > > > > > > > > > > >
> > > > > > > > > > > > This looks like a hardening of the driver in the core instead of the
> > > > > > > > > > > > device. I think it can be done but in a separate series.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > And going down from there, how about we cache status in the
> > > > > > > > > > > > > device? Then we don't need to keep re-reading it every time,
> > > > > > > > > > > > > speeding boot up a tiny bit.
> > > > > > > > > > > >
> > > > > > > > > > > > I don't fully understand here, actually spec requires status to be
> > > > > > > > > > > > read back for validation in many cases.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > index 962f1477b1fa..0170f8c784d8 100644
> > > > > > > > > > > > > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > > > > > > > > > > > > @@ -2144,10 +2144,17 @@ static inline bool more_used(const struct vring_virtqueue *vq)
> > > > > > > > > > > > > > > > return vq->packed_ring ? more_used_packed(vq) : more_used_split(vq);
> > > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > > -irqreturn_t vring_interrupt(int irq, void *_vq)
> > > > > > > > > > > > > > > > +irqreturn_t vring_interrupt(int irq, void *v)
> > > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > > > + struct virtqueue *_vq = v;
> > > > > > > > > > > > > > > > + struct virtio_device *vdev = _vq->vdev;
> > > > > > > > > > > > > > > > struct vring_virtqueue *vq = to_vvq(_vq);
> > > > > > > > > > > > > > > > + if (!virtio_irq_soft_enabled(vdev)) {
> > > > > > > > > > > > > > > > + dev_warn_once(&vdev->dev, "virtio vring IRQ raised before DRIVER_OK");
> > > > > > > > > > > > > > > > + return IRQ_NONE;
> > > > > > > > > > > > > > > > + }
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > if (!more_used(vq)) {
> > > > > > > > > > > > > > > > pr_debug("virtqueue interrupt with no work for %p\n", vq);
> > > > > > > > > > > > > > > > return IRQ_NONE;
> > > > > > > > > > > > > > > > diff --git a/include/linux/virtio.h b/include/linux/virtio.h
> > > > > > > > > > > > > > > > index 5464f398912a..957d6ad604ac 100644
> > > > > > > > > > > > > > > > --- a/include/linux/virtio.h
> > > > > > > > > > > > > > > > +++ b/include/linux/virtio.h
> > > > > > > > > > > > > > > > @@ -95,6 +95,8 @@ dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
> > > > > > > > > > > > > > > > * @failed: saved value for VIRTIO_CONFIG_S_FAILED bit (for restore)
> > > > > > > > > > > > > > > > * @config_enabled: configuration change reporting enabled
> > > > > > > > > > > > > > > > * @config_change_pending: configuration change reported while disabled
> > > > > > > > > > > > > > > > + * @irq_soft_check: whether or not to check @irq_soft_enabled
> > > > > > > > > > > > > > > > + * @irq_soft_enabled: callbacks enabled
> > > > > > > > > > > > > > > > * @config_lock: protects configuration change reporting
> > > > > > > > > > > > > > > > * @dev: underlying device.
> > > > > > > > > > > > > > > > * @id: the device type identification (used to match it with a driver).
> > > > > > > > > > > > > > > > @@ -109,6 +111,8 @@ struct virtio_device {
> > > > > > > > > > > > > > > > bool failed;
> > > > > > > > > > > > > > > > bool config_enabled;
> > > > > > > > > > > > > > > > bool config_change_pending;
> > > > > > > > > > > > > > > > + bool irq_soft_check;
> > > > > > > > > > > > > > > > + bool irq_soft_enabled;
> > > > > > > > > > > > > > > > spinlock_t config_lock;
> > > > > > > > > > > > > > > > spinlock_t vqs_list_lock; /* Protects VQs list access */
> > > > > > > > > > > > > > > > struct device dev;
> > > > > > > > > > > > > > > > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > index dafdc7f48c01..9c1b61f2e525 100644
> > > > > > > > > > > > > > > > --- a/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > +++ b/include/linux/virtio_config.h
> > > > > > > > > > > > > > > > @@ -174,6 +174,24 @@ static inline bool virtio_has_feature(const struct virtio_device *vdev,
> > > > > > > > > > > > > > > > return __virtio_test_bit(vdev, fbit);
> > > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > > +/*
> > > > > > > > > > > > > > > > + * virtio_irq_soft_enabled: whether we can execute callbacks
> > > > > > > > > > > > > > > > + * @vdev: the device
> > > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > > +static inline bool virtio_irq_soft_enabled(const struct virtio_device *vdev)
> > > > > > > > > > > > > > > > +{
> > > > > > > > > > > > > > > > + if (!vdev->irq_soft_check)
> > > > > > > > > > > > > > > > + return true;
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > + * Read irq_soft_enabled before reading other device specific
> > > > > > > > > > > > > > > > + * data. Paried with smp_store_relase() in
> > > > > > > > > > > > > > > paired
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Will fix.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > + * virtio_device_ready() and WRITE_ONCE()/synchronize_rcu() in
> > > > > > > > > > > > > > > > + * virtio_reset_device().
> > > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > > + return smp_load_acquire(&vdev->irq_soft_enabled);
> > > > > > > > > > > > > > > > +}
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > /**
> > > > > > > > > > > > > > > > * virtio_has_dma_quirk - determine whether this device has the DMA quirk
> > > > > > > > > > > > > > > > * @vdev: the device
> > > > > > > > > > > > > > > > @@ -236,6 +254,13 @@ void virtio_device_ready(struct virtio_device *dev)
> > > > > > > > > > > > > > > > if (dev->config->enable_cbs)
> > > > > > > > > > > > > > > > dev->config->enable_cbs(dev);
> > > > > > > > > > > > > > > > + /*
> > > > > > > > > > > > > > > > + * Commit the driver setup before enabling the virtqueue
> > > > > > > > > > > > > > > > + * callbacks. Paried with smp_load_acuqire() in
> > > > > > > > > > > > > > > > + * virtio_irq_soft_enabled()
> > > > > > > > > > > > > > > > + */
> > > > > > > > > > > > > > > > + smp_store_release(&dev->irq_soft_enabled, true);
> > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > > > dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > 2.25.1
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-28 10:40 ` Re: Michael S. Tsirkin
2022-03-29 7:12 ` Re: Jason Wang
@ 2022-03-29 8:35 ` Thomas Gleixner
2022-03-29 14:37 ` Re: Michael S. Tsirkin
2022-04-12 6:55 ` Re: Michael S. Tsirkin
1 sibling, 2 replies; 1651+ messages in thread
From: Thomas Gleixner @ 2022-03-29 8:35 UTC (permalink / raw)
To: Michael S. Tsirkin, Jason Wang
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization
On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
>> > > So I think we might talk different issues:
>> > >
>> > > 1) Whether request_irq() commits the previous setups, I think the
>> > > answer is yes, since the spin_unlock of desc->lock (release) can
>> > > guarantee this though there seems no documentation around
>> > > request_irq() to say this.
>> > >
>> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
>> > > using smp_wmb() before the request_irq().
That's a completely bogus example, especially as there is not a single
smp_rmb() which pairs with the smp_wmb().
>> > > And even if write is ordered we still need read to be ordered to be
>> > > paired with that.
>
> IMO it synchronizes with the CPU to which irq is
> delivered. Otherwise basically all drivers would be broken,
> wouldn't they?
> I don't know whether it's correct on all platforms, but if not
> we need to fix request_irq.
There is nothing to fix:
request_irq()
raw_spin_lock_irq(desc->lock); // ACQUIRE
....
raw_spin_unlock_irq(desc->lock); // RELEASE
interrupt()
raw_spin_lock(desc->lock); // ACQUIRE
set status to IN_PROGRESS
raw_spin_unlock(desc->lock); // RELEASE
invoke handler()
So anything which the driver set up _before_ request_irq() is visible to
the interrupt handler. No?
>> What happens if an interrupt is raised in the middle like:
>>
>> smp_store_release(dev->irq_soft_enabled, true)
>> IRQ handler
>> synchronize_irq()
This is bogus. The obvious order of things is:
dev->ok = false;
request_irq();
moar_setup();
synchronize_irq(); // ACQUIRE + RELEASE
dev->ok = true;
The reverse operation on teardown:
dev->ok = false;
synchronize_irq(); // ACQUIRE + RELEASE
teardown();
So in both cases a simple check in the handler is sufficient:
handler()
if (!dev->ok)
return;
I'm not understanding what you folks are trying to "fix" here. If any
driver does this in the wrong order, then the driver is broken.
Sure, you can do the same with:
dev->ok = false;
request_irq();
moar_setup();
smp_wmb();
dev->ok = true;
for the price of a smp_rmb() in the interrupt handler:
handler()
if (!dev->ok)
return;
smp_rmb();
but that's only working for the setup case correctly and not for
teardown.
Thanks,
tglx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-29 8:35 ` Re: Thomas Gleixner
@ 2022-03-29 14:37 ` Michael S. Tsirkin
2022-03-29 18:13 ` Re: Thomas Gleixner
2022-04-12 6:55 ` Re: Michael S. Tsirkin
1 sibling, 1 reply; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-29 14:37 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization
On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> > On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> >> > > So I think we might be talking about different issues:
> >> > >
> >> > > 1) Whether request_irq() commits the previous setups, I think the
> >> > > answer is yes, since the spin_unlock of desc->lock (release) can
> >> > > guarantee this though there seems no documentation around
> >> > > request_irq() to say this.
> >> > >
> >> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> >> > > using smp_wmb() before the request_irq().
>
> That's a completely bogus example, especially as there is not a single
> smp_rmb() which pairs with the smp_wmb().
>
> >> > > And even if write is ordered we still need read to be ordered to be
> >> > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they?
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
>
> There is nothing to fix:
>
> request_irq()
> raw_spin_lock_irq(desc->lock); // ACQUIRE
> ....
> raw_spin_unlock_irq(desc->lock); // RELEASE
>
> interrupt()
> raw_spin_lock(desc->lock); // ACQUIRE
> set status to IN_PROGRESS
> raw_spin_unlock(desc->lock); // RELEASE
> invoke handler()
>
> So anything which the driver set up _before_ request_irq() is visible to
> the interrupt handler. No?
> >> What happens if an interrupt is raised in the middle like:
> >>
> >> smp_store_release(dev->irq_soft_enabled, true)
> >> IRQ handler
> >> synchronize_irq()
>
> This is bogus. The obvious order of things is:
>
> dev->ok = false;
> request_irq();
>
> moar_setup();
> synchronize_irq(); // ACQUIRE + RELEASE
> dev->ok = true;
>
> The reverse operation on teardown:
>
> dev->ok = false;
> synchronize_irq(); // ACQUIRE + RELEASE
>
> teardown();
>
> So in both cases a simple check in the handler is sufficient:
>
> handler()
> if (!dev->ok)
> return;
Thanks a lot for the analysis Thomas. This is more or less what I was
thinking.
>
> I'm not understanding what you folks are trying to "fix" here.
We are trying to fix the driver since at the moment it does not
have the dev->ok flag at all.
And I suspect virtio is not alone in that.
So it would have been nice if there was a standard flag
replacing the driver-specific dev->ok above, and ideally
would also handle the case of an interrupt triggering
too early by deferring the interrupt until the flag is set.
And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
enable_irq instead of dev->ok = true, except
- it doesn't work with affinity managed IRQs
- it does not work with shared IRQs
So using dev->ok as you propose above seems better at this point.
> If any
> driver does this in the wrong order, then the driver is broken.
I agree, however:
$ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
113
$ git grep -l request_irq drivers/net/|wc -l
397
I suspect there are more drivers which in theory need the
synchronize_irq dance but in practice do not execute it.
> Sure, you can do the same with:
>
> dev->ok = false;
> request_irq();
> moar_setup();
> smp_wmb();
> dev->ok = true;
>
> for the price of a smp_rmb() in the interrupt handler:
>
> handler()
> if (!dev->ok)
> return;
> smp_rmb();
>
> but that's only working for the setup case correctly and not for
> teardown.
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-29 14:37 ` Re: Michael S. Tsirkin
@ 2022-03-29 18:13 ` Thomas Gleixner
2022-03-29 22:04 ` Re: Michael S. Tsirkin
0 siblings, 1 reply; 1651+ messages in thread
From: Thomas Gleixner @ 2022-03-29 18:13 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization
On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> We are trying to fix the driver since at the moment it does not
> have the dev->ok flag at all.
>
> And I suspect virtio is not alone in that.
> So it would have been nice if there was a standard flag
> replacing the driver-specific dev->ok above, and ideally
> would also handle the case of an interrupt triggering
> too early by deferring the interrupt until the flag is set.
>
> And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> enable_irq instead of dev->ok = true, except
> - it doesn't work with affinity managed IRQs
> - it does not work with shared IRQs
>
> So using dev->ok as you propose above seems better at this point.
Unless there is a big enough amount of drivers which could make use of a
generic mechanism for that.
>> If any driver does this in the wrong order, then the driver is
>> broken.
>
> I agree, however:
> $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> 113
> $ git grep -l request_irq drivers/net/|wc -l
> 397
>
> I suspect there are more drivers which in theory need the
> synchronize_irq dance but in practice do not execute it.
That really depends on when the driver requests the interrupt, when
it actually enables the interrupt in the device itself and how the
interrupt service routine works.
So just doing that grep dance does not tell much. You really have to do
a case by case analysis.
Thanks,
tglx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-29 18:13 ` Re: Thomas Gleixner
@ 2022-03-29 22:04 ` Michael S. Tsirkin
2022-03-30 2:38 ` Re: Jason Wang
0 siblings, 1 reply; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-29 22:04 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization
On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > We are trying to fix the driver since at the moment it does not
> > have the dev->ok flag at all.
> >
> > And I suspect virtio is not alone in that.
> > So it would have been nice if there was a standard flag
> > replacing the driver-specific dev->ok above, and ideally
> > would also handle the case of an interrupt triggering
> > too early by deferring the interrupt until the flag is set.
> >
> > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > enable_irq instead of dev->ok = true, except
> > - it doesn't work with affinity managed IRQs
> > - it does not work with shared IRQs
> >
> > So using dev->ok as you propose above seems better at this point.
>
> Unless there is a big enough amount of drivers which could make use of a
> generic mechanism for that.
>
> >> If any driver does this in the wrong order, then the driver is
> >> broken.
> >
> > I agree, however:
> > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > 113
> > $ git grep -l request_irq drivers/net/|wc -l
> > 397
> >
> > I suspect there are more drivers which in theory need the
> > synchronize_irq dance but in practice do not execute it.
>
> That really depends on when the driver requests the interrupt, when
> it actually enables the interrupt in the device itself
This last point does not matter since we are talking about protecting
against buggy/malicious devices. They can inject the interrupt anyway
even if the driver did not configure it.
> and how the
> interrupt service routine works.
>
> So just doing that grep dance does not tell much. You really have to do
> a case by case analysis.
>
> Thanks,
>
> tglx
I agree. In fact, at least for network the standard approach is to
request interrupts in the open call; virtio-net is unusual
in doing it in probe. We should consider changing that.
Jason?
--
MST
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-29 22:04 ` Re: Michael S. Tsirkin
@ 2022-03-30 2:38 ` Jason Wang
2022-03-30 5:09 ` Re: Michael S. Tsirkin
0 siblings, 1 reply; 1651+ messages in thread
From: Jason Wang @ 2022-03-30 2:38 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Wed, Mar 30, 2022 at 6:04 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> > On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > > We are trying to fix the driver since at the moment it does not
> > > have the dev->ok flag at all.
> > >
> > > And I suspect virtio is not alone in that.
> > > So it would have been nice if there was a standard flag
> > > replacing the driver-specific dev->ok above, and ideally
> > > would also handle the case of an interrupt triggering
> > > too early by deferring the interrupt until the flag is set.
> > >
> > > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > > enable_irq instead of dev->ok = true, except
> > > - it doesn't work with affinity managed IRQs
> > > - it does not work with shared IRQs
> > >
> > > So using dev->ok as you propose above seems better at this point.
> >
> > Unless there is a big enough amount of drivers which could make use of a
> > generic mechanism for that.
> >
> > >> If any driver does this in the wrong order, then the driver is
> > >> broken.
> > >
> > > I agree, however:
> > > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > > 113
> > > $ git grep -l request_irq drivers/net/|wc -l
> > > 397
> > >
> > > I suspect there are more drivers which in theory need the
> > > synchronize_irq dance but in practice do not execute it.
> >
> > That really depends on when the driver requests the interrupt, when
> > it actually enables the interrupt in the device itself
>
> This last point does not matter since we are talking about protecting
> against buggy/malicious devices. They can inject the interrupt anyway
> even if the driver did not configure it.
>
> > and how the
> > interrupt service routine works.
> >
> > So just doing that grep dance does not tell much. You really have to do
> > a case by case analysis.
> >
> > Thanks,
> >
> > tglx
>
>
> I agree. In fact, at least for network the standard approach is to
> request interrupts in the open call; virtio-net is unusual
> in doing it in probe. We should consider changing that.
> Jason?
This probably works only for virtio-net, and it looks non-trivial
since we don't have a specific core API to request interrupts.
Thanks
>
> --
> MST
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-30 2:38 ` Re: Jason Wang
@ 2022-03-30 5:09 ` Michael S. Tsirkin
2022-03-30 5:53 ` Re: Jason Wang
0 siblings, 1 reply; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-03-30 5:09 UTC (permalink / raw)
To: Jason Wang
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Wed, Mar 30, 2022 at 10:38:06AM +0800, Jason Wang wrote:
> On Wed, Mar 30, 2022 at 6:04 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> > > On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > > > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > > > We are trying to fix the driver since at the moment it does not
> > > > have the dev->ok flag at all.
> > > >
> > > > And I suspect virtio is not alone in that.
> > > > So it would have been nice if there was a standard flag
> > > > replacing the driver-specific dev->ok above, and ideally
> > > > would also handle the case of an interrupt triggering
> > > > too early by deferring the interrupt until the flag is set.
> > > >
> > > > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > > > enable_irq instead of dev->ok = true, except
> > > > - it doesn't work with affinity managed IRQs
> > > > - it does not work with shared IRQs
> > > >
> > > > So using dev->ok as you propose above seems better at this point.
> > >
> > > Unless there is a big enough amount of drivers which could make use of a
> > > generic mechanism for that.
> > >
> > > >> If any driver does this in the wrong order, then the driver is
> > > >> broken.
> > > >
> > > > I agree, however:
> > > > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > > > 113
> > > > $ git grep -l request_irq drivers/net/|wc -l
> > > > 397
> > > >
> > > > I suspect there are more drivers which in theory need the
> > > > synchronize_irq dance but in practice do not execute it.
> > >
> > > That really depends on when the driver requests the interrupt, when
> > > it actually enables the interrupt in the device itself
> >
> > This last point does not matter since we are talking about protecting
> > against buggy/malicious devices. They can inject the interrupt anyway
> > even if the driver did not configure it.
> >
> > > and how the
> > > interrupt service routine works.
> > >
> > > So just doing that grep dance does not tell much. You really have to do
> > > a case by case analysis.
> > >
> > > Thanks,
> > >
> > > tglx
> >
> >
> > I agree. In fact, at least for network the standard approach is to
> > request interrupts in the open call; virtio-net is unusual
> > in doing it in probe. We should consider changing that.
> > Jason?
>
> This probably works only for virtio-net, and it looks non-trivial
> since we don't have a specific core API to request interrupts.
>
> Thanks
We'll need a new API, for sure. E.g. find vqs with no
callback on probe, and then virtio_request_vq_callbacks separately.
The existing API that specifies callbacks during find vqs
can be used by other drivers.
> >
> > --
> > MST
> >
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-30 5:09 ` Re: Michael S. Tsirkin
@ 2022-03-30 5:53 ` Jason Wang
0 siblings, 0 replies; 1651+ messages in thread
From: Jason Wang @ 2022-03-30 5:53 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization, Thomas Gleixner
On Wed, Mar 30, 2022 at 1:09 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Mar 30, 2022 at 10:38:06AM +0800, Jason Wang wrote:
> > On Wed, Mar 30, 2022 at 6:04 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Tue, Mar 29, 2022 at 08:13:57PM +0200, Thomas Gleixner wrote:
> > > > On Tue, Mar 29 2022 at 10:37, Michael S. Tsirkin wrote:
> > > > > On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> > > > > We are trying to fix the driver since at the moment it does not
> > > > > have the dev->ok flag at all.
> > > > >
> > > > > And I suspect virtio is not alone in that.
> > > > > So it would have been nice if there was a standard flag
> > > > > replacing the driver-specific dev->ok above, and ideally
> > > > > would also handle the case of an interrupt triggering
> > > > > too early by deferring the interrupt until the flag is set.
> > > > >
> > > > > And in fact, it does kind of exist: IRQF_NO_AUTOEN, and you would call
> > > > > enable_irq instead of dev->ok = true, except
> > > > > - it doesn't work with affinity managed IRQs
> > > > > - it does not work with shared IRQs
> > > > >
> > > > > So using dev->ok as you propose above seems better at this point.
> > > >
> > > > Unless there is a big enough amount of drivers which could make use of a
> > > > generic mechanism for that.
> > > >
> > > > >> If any driver does this in the wrong order, then the driver is
> > > > >> broken.
> > > > >
> > > > > I agree, however:
> > > > > $ git grep synchronize_irq `git grep -l request_irq drivers/net/`|wc -l
> > > > > 113
> > > > > $ git grep -l request_irq drivers/net/|wc -l
> > > > > 397
> > > > >
> > > > > I suspect there are more drivers which in theory need the
> > > > > synchronize_irq dance but in practice do not execute it.
> > > >
> > > > That really depends on when the driver requests the interrupt, when
> > > > it actually enables the interrupt in the device itself
> > >
> > > This last point does not matter since we are talking about protecting
> > > against buggy/malicious devices. They can inject the interrupt anyway
> > > even if the driver did not configure it.
> > >
> > > > and how the
> > > > interrupt service routine works.
> > > >
> > > > So just doing that grep dance does not tell much. You really have to do
> > > > a case by case analysis.
> > > >
> > > > Thanks,
> > > >
> > > > tglx
> > >
> > >
> > > I agree. In fact, at least for network the standard approach is to
> > > request interrupts in the open call; virtio-net is unusual
> > > in doing it in probe. We should consider changing that.
> > > Jason?
> >
> > This probably works only for virtio-net, and it looks non-trivial
> > since we don't have a specific core API to request interrupts.
> >
> > Thanks
>
> We'll need a new API, for sure. E.g. find vqs with no
> callback on probe, and then virtio_request_vq_callbacks separately.
>
> The existing API that specifies callbacks during find vqs
> can be used by other drivers.
Ok, I will do it.
Thanks
>
> > >
> > > --
> > > MST
> > >
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-03-29 8:35 ` Re: Thomas Gleixner
2022-03-29 14:37 ` Re: Michael S. Tsirkin
@ 2022-04-12 6:55 ` Michael S. Tsirkin
1 sibling, 0 replies; 1651+ messages in thread
From: Michael S. Tsirkin @ 2022-04-12 6:55 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Paul E. McKenney, Peter Zijlstra, Marc Zyngier, Keir Fraser,
linux-kernel, virtualization
On Tue, Mar 29, 2022 at 10:35:21AM +0200, Thomas Gleixner wrote:
> On Mon, Mar 28 2022 at 06:40, Michael S. Tsirkin wrote:
> > On Mon, Mar 28, 2022 at 02:18:22PM +0800, Jason Wang wrote:
> >> > > So I think we might be talking about different issues:
> >> > >
> >> > > 1) Whether request_irq() commits the previous setups, I think the
> >> > > answer is yes, since the spin_unlock of desc->lock (release) can
> >> > > guarantee this though there seems no documentation around
> >> > > request_irq() to say this.
> >> > >
> >> > > And I can see at least drivers/video/fbdev/omap2/omapfb/dss/dispc.c is
> >> > > using smp_wmb() before the request_irq().
>
> That's a completely bogus example, especially as there is not a single
> smp_rmb() which pairs with the smp_wmb().
>
> >> > > And even if write is ordered we still need read to be ordered to be
> >> > > paired with that.
> >
> > IMO it synchronizes with the CPU to which irq is
> > delivered. Otherwise basically all drivers would be broken,
> > wouldn't they?
> > I don't know whether it's correct on all platforms, but if not
> > we need to fix request_irq.
>
> There is nothing to fix:
>
> request_irq()
> raw_spin_lock_irq(desc->lock); // ACQUIRE
> ....
> raw_spin_unlock_irq(desc->lock); // RELEASE
>
> interrupt()
> raw_spin_lock(desc->lock); // ACQUIRE
> set status to IN_PROGRESS
> raw_spin_unlock(desc->lock); // RELEASE
> invoke handler()
>
> So anything which the driver set up _before_ request_irq() is visible to
> the interrupt handler. No?
>
> >> What happens if an interrupt is raised in the middle like:
> >>
> >> smp_store_release(dev->irq_soft_enabled, true)
> >> IRQ handler
> >> synchronize_irq()
>
> This is bogus. The obvious order of things is:
>
> dev->ok = false;
> request_irq();
>
> moar_setup();
> synchronize_irq(); // ACQUIRE + RELEASE
> dev->ok = true;
>
> The reverse operation on teardown:
>
> dev->ok = false;
> synchronize_irq(); // ACQUIRE + RELEASE
>
> teardown();
>
> So in both cases a simple check in the handler is sufficient:
>
> handler()
> if (!dev->ok)
> return;
Does this need to be if (!READ_ONCE(dev->ok)) ?
> I'm not understanding what you folks are trying to "fix" here. If any
> driver does this in the wrong order, then the driver is broken.
>
> Sure, you can do the same with:
>
> dev->ok = false;
> request_irq();
> moar_setup();
> smp_wmb();
> dev->ok = true;
>
> for the price of a smp_rmb() in the interrupt handler:
>
> handler()
> if (!dev->ok)
> return;
> smp_rmb();
>
> but that's only working for the setup case correctly and not for
> teardown.
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* rcu_sched self-detected stall on CPU
@ 2022-04-05 21:41 Miguel Ojeda
2022-04-06 9:31 ` Zhouyi Zhou
0 siblings, 1 reply; 1651+ messages in thread
From: Miguel Ojeda @ 2022-04-05 21:41 UTC (permalink / raw)
To: linuxppc-dev, rcu
[-- Attachment #1: Type: text/plain, Size: 400 bytes --]
Hi PPC/RCU,
While merging v5.18-rc1 changes I noticed our CI PPC runs broke. I
reproduced the problem in v5.18-rc1 as well as next-20220405, under
both QEMU 4.2.1 and 6.1.0, with `-smp 2`; but I cannot reproduce it in
v5.17 from a few tries.
Sadly, the problem is not deterministic although it is not too hard to
reproduce (1 out of 5?). Please see attached config and QEMU output.
Cheers,
Miguel
[-- Attachment #2: qemu --]
[-- Type: application/octet-stream, Size: 20395 bytes --]
# qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot -smp 2
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-cfpc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-sbbc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ibs=workaround
SLOF **********************************************************************
QEMU Starting
Build Date = Jan 31 2020 20:27:09
FW Version = buildd@ release 20191209
Press "s" to enter Open Firmware.
Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /vdevice/l-lan@71000002
Populating /vdevice/v-scsi@71000003
SCSI: Looking for devices
8200000000000000 CD-ROM : "QEMU QEMU CD-ROM 2.5+"
Populating /pci@800000020000000
No NVRAM common partition, re-initializing...
Scanning USB
Using default console: /vdevice/vty@71000000
Detected RAM kernel at 400000 (121f8f0 bytes)
Welcome to Open Firmware
Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php
Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 5.18.0-rc1 (root@test) (powerpc64le-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #2 SMP Tue Apr 5 21:09:40 UTC 2022
Detected machine type: 0000000000000101
command line:
Max number of cores passed to firmware: 2 (NR_CPUS = 2)
Calling ibm,client-architecture-support...qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-cfpc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-sbbc=workaround
qemu-system-ppc64: warning: TCG doesn't support requested feature, cap-ibs=workaround
SLOF **********************************************************************
QEMU Starting
Build Date = Jan 31 2020 20:27:09
FW Version = buildd@ release 20191209
Press "s" to enter Open Firmware.
Populating /vdevice methods
Populating /vdevice/vty@71000000
Populating /vdevice/nvram@71000001
Populating /vdevice/l-lan@71000002
Populating /vdevice/v-scsi@71000003
SCSI: Looking for devices
8200000000000000 CD-ROM : "QEMU QEMU CD-ROM 2.5+"
Populating /pci@800000020000000
Scanning USB
Using default console: /vdevice/vty@71000000
Detected RAM kernel at 400000 (121f8f0 bytes)
Welcome to Open Firmware
Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php
Booting from memory...
OF stdout device is: /vdevice/vty@71000000
Preparing to boot Linux version 5.18.0-rc1 (root@test) (powerpc64le-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #2 SMP Tue Apr 5 21:09:40 UTC 2022
Detected machine type: 0000000000000101
command line:
Max number of cores passed to firmware: 2 (NR_CPUS = 2)
Calling ibm,client-architecture-support... done
memory layout at init:
memory_limit : 0000000000000000 (16 MB aligned)
alloc_bottom : 0000000001630000
alloc_top : 0000000020000000
alloc_top_hi : 0000000020000000
rmo_top : 0000000020000000
ram_top : 0000000020000000
instantiating rtas at 0x000000001fff0000... done
prom_hold_cpus: skipped
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x0000000001640000 -> 0x0000000001640a77
Device tree struct 0x0000000001650000 -> 0x0000000001660000
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x0000000000400000 ...
[ 0.000000] radix-mmu: Page sizes from device-tree:
[ 0.000000] radix-mmu: Page size shift = 12 AP=0x0
[ 0.000000] radix-mmu: Page size shift = 16 AP=0x5
[ 0.000000] radix-mmu: Page size shift = 21 AP=0x1
[ 0.000000] radix-mmu: Page size shift = 30 AP=0x2
[ 0.000000] Activating Kernel Userspace Access Prevention
[ 0.000000] Activating Kernel Userspace Execution Prevention
[ 0.000000] radix-mmu: Mapped 0x0000000000000000-0x0000000001200000 with 2.00 MiB pages (exec)
[ 0.000000] radix-mmu: Mapped 0x0000000001200000-0x0000000020000000 with 2.00 MiB pages
[ 0.000000] lpar: Using radix MMU under hypervisor
[ 0.000000] Linux version 5.18.0-rc1 (root@test) (powerpc64le-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #2 SMP Tue Apr 5 21:09:40 UTC 2022
[ 0.000000] Using pSeries machine description
[ 0.000000] printk: bootconsole [udbg0] enabled
[ 0.000000] Partition configured for 2 cpus.
[ 0.000000] CPU maps initialized for 1 thread per core
[ 0.000000] -----------------------------------------------------
[ 0.000000] phys_mem_size = 0x20000000
[ 0.000000] dcache_bsize = 0x80
[ 0.000000] icache_bsize = 0x80
[ 0.000000] cpu_features = 0x0001c06b8f4f9187
[ 0.000000] possible = 0x000ffbebcf5fb187
[ 0.000000] always = 0x0000006b8b5c9181
[ 0.000000] cpu_user_features = 0xdc0065c2 0xaef00000
[ 0.000000] mmu_features = 0x3c007641
[ 0.000000] firmware_features = 0x00000085455a445f
[ 0.000000] vmalloc start = 0xc008000000000000
[ 0.000000] IO start = 0xc00a000000000000
[ 0.000000] vmemmap start = 0xc00c000000000000
[ 0.000000] -----------------------------------------------------
[ 0.000000] rfi-flush: fallback displacement flush available
[ 0.000000] rfi-flush: ori type flush available
[ 0.000000] rfi-flush: mttrig type flush available
[ 0.000000] count-cache-flush: software flush enabled.
[ 0.000000] link-stack-flush: software flush enabled.
[ 0.000000] stf-barrier: eieio barrier available
[ 0.000000] PPC64 nvram contains 65536 bytes
[ 0.000000] PV qspinlock hash table entries: 4096 (order: 0, 65536 bytes, linear)
[ 0.000000] barrier-nospec: using ORI speculation barrier
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x0000000000000000-0x000000001fffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000000000000-0x000000001fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x000000001fffffff]
[ 0.000000] percpu: Embedded 2 pages/cpu s32544 r0 d98528 u131072
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 8185
[ 0.000000] Kernel command line:
[ 0.000000] Dentry cache hash table entries: 65536 (order: 3, 524288 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 32768 (order: 2, 262144 bytes, linear)
[ 0.000000] mem auto-init: stack:off, heap alloc:on, heap free:on
[ 0.000000] mem auto-init: clearing system memory may take some time...
[ 0.000000] Memory: 418560K/524288K available (4096K kernel code, 704K rwdata, 768K rodata, 1024K init, 446K bss, 105728K reserved, 0K cma-reserved)
[ 0.000000] random: get_random_u64 called from __kmem_cache_create+0x34/0x520 with crng_init=0
[ 0.000000] SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[ 0.000000] rcu: Hierarchical RCU implementation.
[ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
[ 0.000000] NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16
[ 0.000000] xive: Using IRQ range [0-1]
[ 0.000000] xive: Interrupt handling initialized with spapr backend
[ 0.000000] xive: Using priority 6 for all interrupts
[ 0.000000] xive: Using 64kB queues
[ 0.000061] time_init: 56 bit decrementer (max: 7fffffffffffff)
[ 0.000505] clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns
[ 0.000999] clocksource: timebase mult[1f40000] shift[24] registered
[ 0.007895] Console: colour dummy device 80x25
[ 0.008493] printk: console [hvc0] enabled
[ 0.008493] printk: console [hvc0] enabled
[ 0.010438] printk: bootconsole [udbg0] disabled
[ 0.010438] printk: bootconsole [udbg0] disabled
[ 0.011696] pid_max: default: 32768 minimum: 301
[ 0.012503] Mount-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[ 0.012575] Mountpoint-cache hash table entries: 8192 (order: 0, 65536 bytes, linear)
[ 0.041799] POWER9 performance monitor hardware support registered
[ 0.042735] rcu: Hierarchical SRCU implementation.
[ 0.045220] smp: Bringing up secondary CPUs ...
[ 0.066059] smp: Brought up 1 node, 2 CPUs
[ 0.093530] devtmpfs: initialized
[ 0.100032] PCI host bridge /pci@800000020000000 ranges:
[ 0.100607] IO 0x0000200000000000..0x000020000000ffff -> 0x0000000000000000
[ 0.100734] MEM 0x0000200080000000..0x00002000ffffffff -> 0x0000000080000000
[ 0.100805] MEM 0x0000210000000000..0x000021ffffffffff -> 0x0000210000000000
[ 0.101744] PCI: OF: PROBE_ONLY disabled
[ 0.102198] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[ 0.102371] futex hash table entries: 512 (order: 0, 65536 bytes, linear)
Linux ppc64le
#2 SMP Tue Apr 5
[    0.110196] EEH: pSeries platform initialized
[ 0.115987] software IO TLB: tearing down default memory pool
[ 0.134293] PCI: Probing PCI hardware
[ 0.136684] PCI host bridge to bus 0000:00
[ 0.136999] pci_bus 0000:00: root bus resource [io 0x10000-0x1ffff] (bus address [0x0000-0xffff])
[ 0.137449] pci_bus 0000:00: root bus resource [mem 0x200080000000-0x2000ffffffff] (bus address [0x80000000-0xffffffff])
[ 0.137535] pci_bus 0000:00: root bus resource [mem 0x210000000000-0x21ffffffffff 64bit]
[ 0.137720] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 0.142416] IOMMU table initialized, virtual merging enabled
[ 0.143475] pci_bus 0000:00: resource 4 [io 0x10000-0x1ffff]
[ 0.143541] pci_bus 0000:00: resource 5 [mem 0x200080000000-0x2000ffffffff]
[ 0.143575] pci_bus 0000:00: resource 6 [mem 0x210000000000-0x21ffffffffff 64bit]
[ 0.143754] EEH: No capable adapters found: recovery disabled.
[ 0.166254] vgaarb: loaded
[ 0.168227] clocksource: Switched to clocksource timebase
[ 0.182817] PCI: CLS 0 bytes, default 128
[ 1.790625] workingset: timestamp_bits=62 max_order=13 bucket_order=0
[ 21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
[ 21.187331] rcu: 1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
[ 21.187529] (t=21000 jiffies g=-1183 q=3)
[ 21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
[ 21.187770] rcu: Possible timer handling issue on cpu=1 timer-softirq=1
[ 21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[ 21.188019] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 21.188087] rcu: RCU grace-period kthread stack dump:
[ 21.188196] task:rcu_sched state:I stack: 0 pid: 10 ppid: 2 flags:0x00000800
[ 21.188453] Call Trace:
[ 21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
[ 21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
[ 21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
[ 21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
[ 21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
[ 21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
[ 21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
[ 21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
[ 21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
[ 21.189938] rcu: Stack dump where RCU GP kthread last ran:
[ 21.189992] Task dump for CPU 1:
[ 21.190059] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x00000804
[ 21.190169] Call Trace:
[ 21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
[ 21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
[ 21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
[ 21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
[ 21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
[ 21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
[ 21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[ 21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
[ 21.190941] NIP: c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
[ 21.191031] REGS: c0000000061ef600 TRAP: 0900 Not tainted (5.18.0-rc1)
[ 21.191109] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22000202 XER: 00000000
[ 21.191274] CFAR: 0000000000000000 IRQMASK: 0
[ 21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
[ 21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
[ 21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
[ 21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
[ 21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
[ 21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
[ 21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
[ 21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
[ 21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
[ 21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
[ 21.192118] --- interrupt: 900
[ 21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
[ 21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
[ 21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
[ 21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[ 21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[ 21.192566] NIP: c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
[ 21.192615] REGS: c0000000061efa90 TRAP: 0900 Not tainted (5.18.0-rc1)
[ 21.192659] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28000202 XER: 00000000
[ 21.192755] CFAR: 0000000000000000 IRQMASK: 0
[ 21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
[ 21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
[ 21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
[ 21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
[ 21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
[ 21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
[ 21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
[ 21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
[ 21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
[ 21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
[ 21.193428] --- interrupt: 900
[ 21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
[ 21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
[ 21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
[ 21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
[ 21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
[ 21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
[ 21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
[ 21.194245] Task dump for CPU 1:
[ 21.194284] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x00000804
[ 21.194374] Call Trace:
[ 21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
[ 21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
[ 21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
[ 21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
[ 21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
[ 21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
[ 21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[ 21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
[ 21.195104] NIP: c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
[ 21.195152] REGS: c0000000061ef600 TRAP: 0900 Not tainted (5.18.0-rc1)
[ 21.195199] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22000202 XER: 00000000
[ 21.195296] CFAR: 0000000000000000 IRQMASK: 0
[ 21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
[ 21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
[ 21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
[ 21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
[ 21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
[ 21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
[ 21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
[ 21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
[ 21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
[ 21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
[ 21.196027] --- interrupt: 900
[ 21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
[ 21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
[ 21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
[ 21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[ 21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[ 21.196439] NIP: c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
[ 21.196489] REGS: c0000000061efa90 TRAP: 0900 Not tainted (5.18.0-rc1)
[ 21.196534] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28000202 XER: 00000000
[ 21.196627] CFAR: 0000000000000000 IRQMASK: 0
[ 21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
[ 21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
[ 21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
[ 21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
[ 21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
[ 21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
[ 21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
[ 21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
[ 21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
[ 21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
[ 21.197305] --- interrupt: 900
[ 21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
[ 21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
[ 21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
[ 21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
[ 21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
[ 21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
[ 21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
[-- Attachment #3: config --]
[-- Type: application/octet-stream, Size: 35266 bytes --]
#
# Automatically generated file; DO NOT EDIT.
# Linux/powerpc 5.18.0-rc1 Kernel Configuration
#
CONFIG_CC_VERSION_TEXT="powerpc64le-linux-gnu-gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0"
CONFIG_CC_IS_GCC=y
CONFIG_GCC_VERSION=90400
CONFIG_CLANG_VERSION=0
CONFIG_AS_IS_GNU=y
CONFIG_AS_VERSION=23400
CONFIG_LD_IS_BFD=y
CONFIG_LD_VERSION=23400
CONFIG_LLD_VERSION=0
CONFIG_CC_CAN_LINK=y
CONFIG_CC_CAN_LINK_STATIC=y
CONFIG_CC_HAS_ASM_GOTO=y
CONFIG_CC_HAS_ASM_INLINE=y
CONFIG_CC_HAS_NO_PROFILE_FN_ATTR=y
CONFIG_PAHOLE_VERSION=0
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_TABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y
#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
# CONFIG_COMPILE_TEST is not set
CONFIG_WERROR=y
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_BUILD_SALT=""
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_XZ is not set
CONFIG_DEFAULT_INIT=""
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
# CONFIG_SYSVIPC is not set
# CONFIG_WATCH_QUEUE is not set
# CONFIG_CROSS_MEMORY_ATTACH is not set
# CONFIG_USELIB is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_SHOW_LEVEL=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_HARDIRQS_SW_RESEND=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# end of IRQ subsystem
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_ARCH_HAS_TICK_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CMOS_UPDATE=y
#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set
# end of Timers subsystem
CONFIG_HAVE_EBPF_JIT=y
#
# BPF subsystem
#
# CONFIG_BPF_SYSCALL is not set
# end of BPF subsystem
CONFIG_PREEMPT_VOLUNTARY_BUILD=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
# CONFIG_SCHED_CORE is not set
#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_PSI is not set
# end of CPU/Task time and stats accounting
CONFIG_CPU_ISOLATION=y
#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
# end of RCU Subsystem
# CONFIG_IKCONFIG is not set
# CONFIG_IKHEADERS is not set
CONFIG_LOG_BUF_SHIFT=16
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
#
# Scheduler features
#
# end of Scheduler features
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
CONFIG_CC_HAS_INT128=y
CONFIG_CC_IMPLICIT_FALLTHROUGH="-Wimplicit-fallthrough=5"
# CONFIG_CGROUPS is not set
CONFIG_NAMESPACES=y
# CONFIG_UTS_NS is not set
CONFIG_TIME_NS=y
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_CHECKPOINT_RESTORE is not set
# CONFIG_SCHED_AUTOGROUP is not set
# CONFIG_SYSFS_DEPRECATED is not set
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_RD_GZIP is not set
# CONFIG_RD_BZIP2 is not set
# CONFIG_RD_LZMA is not set
# CONFIG_RD_XZ is not set
# CONFIG_RD_LZO is not set
# CONFIG_RD_LZ4 is not set
# CONFIG_RD_ZSTD is not set
# CONFIG_BOOT_CONFIG is not set
CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION=y
CONFIG_LD_ORPHAN_WARN=y
CONFIG_SYSCTL=y
CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_HAVE_PCSPKR_PLATFORM=y
# CONFIG_EXPERT is not set
CONFIG_MULTIUSER=y
CONFIG_SGETMASK_SYSCALL=y
CONFIG_SYSFS_SYSCALL=y
CONFIG_FHANDLE=y
CONFIG_POSIX_TIMERS=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_PCSPKR_PLATFORM=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_FUTEX_PI=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_IO_URING=y
CONFIG_ADVISE_SYSCALLS=y
CONFIG_MEMBARRIER=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_BASE_RELATIVE=y
# CONFIG_USERFAULTFD is not set
CONFIG_ARCH_HAS_MEMBARRIER_CALLBACKS=y
CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE=y
CONFIG_RSEQ=y
# CONFIG_EMBEDDED is not set
CONFIG_HAVE_PERF_EVENTS=y
#
# Kernel Performance Events And Counters
#
CONFIG_PERF_EVENTS=y
# end of Kernel Performance Events And Counters
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_COMPAT_BRK is not set
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLAB_MERGE_DEFAULT is not set
# CONFIG_SLAB_FREELIST_RANDOM is not set
CONFIG_SLAB_FREELIST_HARDENED=y
# CONFIG_SHUFFLE_PAGE_ALLOCATOR is not set
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_PROFILING is not set
# end of General setup
CONFIG_PPC64=y
#
# Processor support
#
CONFIG_PPC_BOOK3S_64=y
# CONFIG_PPC_BOOK3E_64 is not set
CONFIG_GENERIC_CPU=y
# CONFIG_POWER7_CPU is not set
# CONFIG_POWER8_CPU is not set
# CONFIG_POWER9_CPU is not set
CONFIG_PPC_BOOK3S=y
CONFIG_PPC_FPU_REGS=y
CONFIG_PPC_FPU=y
CONFIG_ALTIVEC=y
CONFIG_VSX=y
CONFIG_PPC_64S_HASH_MMU=y
CONFIG_PPC_RADIX_MMU=y
CONFIG_PPC_RADIX_MMU_DEFAULT=y
CONFIG_PPC_KUEP=y
CONFIG_PPC_KUAP=y
# CONFIG_PPC_KUAP_DEBUG is not set
CONFIG_PPC_PKEY=y
CONFIG_PPC_MM_SLICES=y
CONFIG_PPC_HAVE_PMU_SUPPORT=y
# CONFIG_PMU_SYSFS is not set
CONFIG_PPC_PERF_CTRS=y
CONFIG_FORCE_SMP=y
CONFIG_SMP=y
CONFIG_NR_CPUS=2
CONFIG_PPC_DOORBELL=y
# end of Processor support
# CONFIG_CPU_BIG_ENDIAN is not set
CONFIG_CPU_LITTLE_ENDIAN=y
CONFIG_PPC64_BOOT_WRAPPER=y
CONFIG_64BIT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MAX=29
CONFIG_ARCH_MMAP_RND_BITS_MIN=14
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=13
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=7
CONFIG_NR_IRQS=512
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_PPC=y
CONFIG_PPC_BARRIER_NOSPEC=y
CONFIG_EARLY_PRINTK=y
CONFIG_PANIC_TIMEOUT=-1
# CONFIG_COMPAT is not set
CONFIG_SCHED_OMIT_FRAME_POINTER=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_PPC_UDBG_16550=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_SUSPEND_NONZERO_CPU=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_PPC_DAWR=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_PPC_MSI_BITMAP=y
CONFIG_PPC_XICS=y
CONFIG_PPC_ICP_NATIVE=y
CONFIG_PPC_ICP_HV=y
CONFIG_PPC_ICS_RTAS=y
CONFIG_PPC_XIVE=y
CONFIG_PPC_XIVE_SPAPR=y
#
# Platform support
#
# CONFIG_PPC_POWERNV is not set
CONFIG_PPC_PSERIES=y
CONFIG_PARAVIRT_SPINLOCKS=y
CONFIG_PPC_SPLPAR=y
# CONFIG_PSERIES_ENERGY is not set
CONFIG_IO_EVENT_IRQ=y
# CONFIG_LPARCFG is not set
# CONFIG_PPC_SMLPAR is not set
# CONFIG_HV_PERF_CTRS is not set
CONFIG_IBMVIO=y
# CONFIG_PPC_SVM is not set
CONFIG_PPC_VAS=y
# CONFIG_KVM_GUEST is not set
# CONFIG_EPAPR_PARAVIRT is not set
CONFIG_PPC_OF_BOOT_TRAMPOLINE=y
# CONFIG_PPC_DT_CPU_FTRS is not set
# CONFIG_UDBG_RTAS_CONSOLE is not set
CONFIG_PPC_SMP_MUXED_IPI=y
CONFIG_MPIC=y
# CONFIG_MPIC_MSGR is not set
CONFIG_PPC_I8259=y
CONFIG_PPC_RTAS=y
CONFIG_RTAS_ERROR_LOGGING=y
CONFIG_PPC_RTAS_DAEMON=y
# CONFIG_RTAS_PROC is not set
CONFIG_EEH=y
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
# end of CPU Frequency scaling
#
# CPUIdle driver
#
#
# CPU Idle
#
# CONFIG_CPU_IDLE is not set
# end of CPU Idle
# end of CPUIdle driver
# CONFIG_GEN_RTC is not set
# end of Platform support
#
# Kernel options
#
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
# CONFIG_PPC_TRANSACTIONAL_MEM is not set
CONFIG_HOTPLUG_CPU=y
CONFIG_PPC_QUEUED_SPINLOCKS=y
CONFIG_ARCH_CPU_PROBE_RELEASE=y
# CONFIG_PPC64_SUPPORTS_MEMORY_FAILURE is not set
# CONFIG_KEXEC is not set
CONFIG_RELOCATABLE=y
# CONFIG_RELOCATABLE_TEST is not set
# CONFIG_CRASH_DUMP is not set
# CONFIG_FA_DUMP is not set
# CONFIG_IRQ_ALL_CPUS is not set
# CONFIG_NUMA is not set
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ILLEGAL_POINTER_VALUE=0x5deadbeef0000000
# CONFIG_PPC_4K_PAGES is not set
CONFIG_PPC_64K_PAGES=y
CONFIG_PPC_PAGE_SHIFT=16
CONFIG_THREAD_SHIFT=14
CONFIG_DATA_SHIFT=24
CONFIG_FORCE_MAX_ZONEORDER=9
# CONFIG_PPC_SUBPAGE_PROT is not set
# CONFIG_PPC_PROT_SAO_LPAR is not set
CONFIG_SCHED_SMT=y
CONFIG_PPC_DENORMALISATION=y
CONFIG_CMDLINE=""
CONFIG_EXTRA_TARGETS=""
# CONFIG_SUSPEND is not set
# CONFIG_HIBERNATION is not set
# CONFIG_PM is not set
# CONFIG_PPC_MEM_KEYS is not set
CONFIG_PPC_RTAS_FILTER=y
# end of Kernel options
CONFIG_ISA_DMA_API=y
#
# Bus options
#
CONFIG_GENERIC_ISA_DMA=y
# CONFIG_FSL_LBC is not set
# end of Bus options
CONFIG_NONSTATIC_KERNEL=y
CONFIG_PAGE_OFFSET=0xc000000000000000
CONFIG_KERNEL_START=0xc000000000000000
CONFIG_PHYSICAL_START=0x00000000
CONFIG_ARCH_RANDOM=y
# CONFIG_VIRTUALIZATION is not set
#
# General architecture-dependent options
#
# CONFIG_KPROBES is not set
CONFIG_JUMP_LABEL=y
# CONFIG_STATIC_KEYS_SELFTEST is not set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_ARCH_USE_BUILTIN_BSWAP=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_OPTPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_FUNCTION_ERROR_INJECTION=y
CONFIG_HAVE_NMI=y
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_GENERIC_SMP_IDLE_THREAD=y
CONFIG_ARCH_HAS_FORTIFY_SOURCE=y
CONFIG_ARCH_HAS_SET_MEMORY=y
CONFIG_HAVE_ASM_MODVERSIONS=y
CONFIG_HAVE_REGS_AND_STACK_ACCESS_API=y
CONFIG_HAVE_RSEQ=y
CONFIG_HAVE_HW_BREAKPOINT=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
CONFIG_HAVE_NMI_WATCHDOG=y
CONFIG_HAVE_HARDLOCKUP_DETECTOR_ARCH=y
CONFIG_HAVE_PERF_REGS=y
CONFIG_HAVE_PERF_USER_STACK_DUMP=y
CONFIG_HAVE_ARCH_JUMP_LABEL=y
CONFIG_HAVE_ARCH_JUMP_LABEL_RELATIVE=y
CONFIG_MMU_GATHER_TABLE_FREE=y
CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
CONFIG_MMU_GATHER_PAGE_SIZE=y
CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM=y
CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y
CONFIG_ARCH_WEAK_RELEASE_ACQUIRE=y
CONFIG_ARCH_WANT_IPC_PARSE_VERSION=y
CONFIG_HAVE_ARCH_SECCOMP=y
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
# CONFIG_SECCOMP is not set
CONFIG_HAVE_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_LTO_NONE=y
CONFIG_HAVE_CONTEXT_TRACKING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING=y
CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y
CONFIG_HAVE_IRQ_TIME_ACCOUNTING=y
CONFIG_HAVE_MOVE_PUD=y
CONFIG_HAVE_MOVE_PMD=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_HAVE_ARCH_HUGE_VMALLOC=y
CONFIG_HAVE_ARCH_SOFT_DIRTY=y
CONFIG_HAVE_MOD_ARCH_SPECIFIC=y
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK=y
CONFIG_HAVE_SOFTIRQ_ON_OWN_STACK=y
CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_HAVE_ARCH_MMAP_RND_BITS=y
CONFIG_ARCH_MMAP_RND_BITS=14
CONFIG_PAGE_SIZE_LESS_THAN_256KB=y
CONFIG_HAVE_RELIABLE_STACKTRACE=y
CONFIG_HAVE_ARCH_NVRAM_OPS=y
CONFIG_CLONE_BACKWARDS=y
CONFIG_OLD_SIGSUSPEND=y
# CONFIG_COMPAT_32BIT_TIME is not set
CONFIG_ARCH_OPTIONAL_KERNEL_RWX=y
CONFIG_ARCH_OPTIONAL_KERNEL_RWX_DEFAULT=y
CONFIG_ARCH_HAS_STRICT_KERNEL_RWX=y
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
CONFIG_STRICT_MODULE_RWX=y
CONFIG_ARCH_HAS_PHYS_TO_DMA=y
CONFIG_ARCH_WANT_LD_ORPHAN_WARN=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
#
# GCOV-based kernel profiling
#
CONFIG_ARCH_HAS_GCOV_PROFILE_ALL=y
# end of GCOV-based kernel profiling
CONFIG_HAVE_GCC_PLUGINS=y
# end of General architecture-dependent options
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
# CONFIG_MODULE_SIG is not set
CONFIG_MODULE_COMPRESS_NONE=y
# CONFIG_MODULE_COMPRESS_GZIP is not set
# CONFIG_MODULE_COMPRESS_XZ is not set
# CONFIG_MODULE_COMPRESS_ZSTD is not set
# CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is not set
CONFIG_MODPROBE_PATH="/sbin/modprobe"
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_BLOCK=y
CONFIG_BLOCK_LEGACY_AUTOLOAD=y
# CONFIG_BLK_DEV_BSGLIB is not set
# CONFIG_BLK_DEV_INTEGRITY is not set
# CONFIG_BLK_DEV_ZONED is not set
# CONFIG_BLK_WBT is not set
# CONFIG_BLK_SED_OPAL is not set
# CONFIG_BLK_INLINE_ENCRYPTION is not set
#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_AIX_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
# CONFIG_MSDOS_PARTITION is not set
# CONFIG_LDM_PARTITION is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_KARMA_PARTITION is not set
# CONFIG_EFI_PARTITION is not set
# CONFIG_SYSV68_PARTITION is not set
# CONFIG_CMDLINE_PARTITION is not set
# end of Partition Types
CONFIG_BLK_MQ_PCI=y
#
# IO Schedulers
#
# CONFIG_MQ_IOSCHED_DEADLINE is not set
# CONFIG_MQ_IOSCHED_KYBER is not set
# CONFIG_IOSCHED_BFQ is not set
# end of IO Schedulers
CONFIG_INLINE_SPIN_UNLOCK_IRQ=y
CONFIG_INLINE_READ_UNLOCK=y
CONFIG_INLINE_READ_UNLOCK_IRQ=y
CONFIG_INLINE_WRITE_UNLOCK=y
CONFIG_INLINE_WRITE_UNLOCK_IRQ=y
CONFIG_ARCH_SUPPORTS_ATOMIC_RMW=y
CONFIG_MUTEX_SPIN_ON_OWNER=y
CONFIG_RWSEM_SPIN_ON_OWNER=y
CONFIG_LOCK_SPIN_ON_OWNER=y
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
CONFIG_ARCH_HAS_MMIOWB=y
CONFIG_MMIOWB=y
CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE=y
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_ELFCORE=y
CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
CONFIG_BINFMT_SCRIPT=y
# CONFIG_BINFMT_MISC is not set
CONFIG_COREDUMP=y
# end of Executable file formats
#
# Memory Management options
#
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_HAVE_FAST_GUP=y
CONFIG_ARCH_KEEP_MEMBLOCK=y
CONFIG_EXCLUSIVE_SYSTEM_RAM=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
# CONFIG_MEMORY_HOTPLUG is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK=y
CONFIG_COMPACTION=y
# CONFIG_PAGE_REPORTING is not set
CONFIG_MIGRATION=y
CONFIG_ARCH_ENABLE_THP_MIGRATION=y
CONFIG_PHYS_ADDR_T_64BIT=y
# CONFIG_KSM is not set
CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
CONFIG_TRANSPARENT_HUGEPAGE=y
# CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS is not set
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
# CONFIG_CMA is not set
# CONFIG_ZPOOL is not set
# CONFIG_ZSMALLOC is not set
CONFIG_GENERIC_EARLY_IOREMAP=y
# CONFIG_DEFERRED_STRUCT_PAGE_INIT is not set
# CONFIG_IDLE_PAGE_TRACKING is not set
CONFIG_ARCH_HAS_CURRENT_STACK_POINTER=y
CONFIG_ARCH_HAS_PTE_DEVMAP=y
# CONFIG_PERCPU_STATS is not set
#
# GUP_TEST needs to have DEBUG_FS enabled
#
# CONFIG_READ_ONLY_THP_FOR_FS is not set
CONFIG_ARCH_HAS_PTE_SPECIAL=y
# CONFIG_ANON_VMA_NAME is not set
#
# Data Access Monitoring
#
# CONFIG_DAMON is not set
# end of Data Access Monitoring
# end of Memory Management options
# CONFIG_NET is not set
#
# Device Drivers
#
CONFIG_HAVE_PCI=y
CONFIG_FORCE_PCI=y
CONFIG_PCI=y
CONFIG_PCI_DOMAINS=y
CONFIG_PCI_SYSCALL=y
# CONFIG_PCIEPORTBUS is not set
CONFIG_PCIEASPM=y
CONFIG_PCIEASPM_DEFAULT=y
# CONFIG_PCIEASPM_POWERSAVE is not set
# CONFIG_PCIEASPM_POWER_SUPERSAVE is not set
# CONFIG_PCIEASPM_PERFORMANCE is not set
# CONFIG_PCIE_PTM is not set
CONFIG_PCI_MSI=y
CONFIG_PCI_MSI_IRQ_DOMAIN=y
CONFIG_PCI_MSI_ARCH_FALLBACKS=y
CONFIG_PCI_QUIRKS=y
# CONFIG_PCI_STUB is not set
# CONFIG_PCI_IOV is not set
# CONFIG_PCI_PRI is not set
# CONFIG_PCI_PASID is not set
CONFIG_VGA_ARB=y
CONFIG_VGA_ARB_MAX_GPUS=16
# CONFIG_HOTPLUG_PCI is not set
#
# PCI controller drivers
#
# CONFIG_PCI_FTPCI100 is not set
# CONFIG_PCI_HOST_GENERIC is not set
# CONFIG_PCIE_XILINX is not set
# CONFIG_PCIE_MICROCHIP_HOST is not set
#
# DesignWare PCI Core Support
#
# CONFIG_PCIE_DW_PLAT_HOST is not set
# CONFIG_PCI_MESON is not set
# end of DesignWare PCI Core Support
#
# Mobiveil PCIe Core Support
#
# end of Mobiveil PCIe Core Support
#
# Cadence PCIe controllers support
#
# CONFIG_PCIE_CADENCE_PLAT_HOST is not set
# CONFIG_PCI_J721E_HOST is not set
# end of Cadence PCIe controllers support
# end of PCI controller drivers
#
# PCI Endpoint
#
# CONFIG_PCI_ENDPOINT is not set
# end of PCI Endpoint
#
# PCI switch controller drivers
#
# CONFIG_PCI_SW_SWITCHTEC is not set
# end of PCI switch controller drivers
# CONFIG_CXL_BUS is not set
# CONFIG_PCCARD is not set
# CONFIG_RAPIDIO is not set
#
# Generic Driver Options
#
# CONFIG_UEVENT_HELPER is not set
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# CONFIG_DEVTMPFS_SAFE is not set
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
#
# Firmware loader
#
CONFIG_FW_LOADER=y
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_FW_LOADER_USER_HELPER is not set
# CONFIG_FW_LOADER_COMPRESS is not set
# end of Firmware loader
CONFIG_ALLOW_DEV_COREDUMP=y
# CONFIG_TEST_ASYNC_DRIVER_PROBE is not set
CONFIG_GENERIC_CPU_AUTOPROBE=y
CONFIG_GENERIC_CPU_VULNERABILITIES=y
# end of Generic Driver Options
#
# Bus devices
#
# CONFIG_MHI_BUS is not set
# end of Bus devices
#
# Firmware Drivers
#
#
# ARM System Control and Management Interface Protocol
#
# end of ARM System Control and Management Interface Protocol
# CONFIG_GOOGLE_FIRMWARE is not set
#
# Tegra firmware driver
#
# end of Tegra firmware driver
# end of Firmware Drivers
# CONFIG_GNSS is not set
# CONFIG_MTD is not set
CONFIG_DTC=y
CONFIG_OF=y
# CONFIG_OF_UNITTEST is not set
CONFIG_OF_FLATTREE=y
CONFIG_OF_EARLY_FLATTREE=y
CONFIG_OF_KOBJ=y
CONFIG_OF_DYNAMIC=y
CONFIG_OF_ADDRESS=y
CONFIG_OF_IRQ=y
CONFIG_OF_RESERVED_MEM=y
# CONFIG_OF_OVERLAY is not set
CONFIG_OF_DMA_DEFAULT_COHERENT=y
CONFIG_ARCH_MIGHT_HAVE_PC_PARPORT=y
# CONFIG_PARPORT is not set
# CONFIG_BLK_DEV is not set
#
# NVME Support
#
# CONFIG_BLK_DEV_NVME is not set
# CONFIG_NVME_FC is not set
# end of NVME Support
#
# Misc devices
#
# CONFIG_DUMMY_IRQ is not set
# CONFIG_IBMVMC is not set
# CONFIG_PHANTOM is not set
# CONFIG_TIFM_CORE is not set
# CONFIG_ENCLOSURE_SERVICES is not set
# CONFIG_HP_ILO is not set
# CONFIG_SRAM is not set
# CONFIG_DW_XDATA_PCIE is not set
# CONFIG_PCI_ENDPOINT_TEST is not set
# CONFIG_XILINX_SDFEC is not set
# CONFIG_OPEN_DICE is not set
# CONFIG_C2PORT is not set
#
# EEPROM support
#
# CONFIG_EEPROM_93CX6 is not set
# end of EEPROM support
# CONFIG_CB710_CORE is not set
#
# Texas Instruments shared transport line discipline
#
# end of Texas Instruments shared transport line discipline
#
# Altera FPGA firmware download module (requires I2C)
#
# CONFIG_GENWQE is not set
# CONFIG_ECHO is not set
# CONFIG_BCM_VK is not set
# CONFIG_MISC_ALCOR_PCI is not set
# CONFIG_MISC_RTSX_PCI is not set
# CONFIG_HABANA_AI is not set
# CONFIG_PVPANIC is not set
# end of Misc devices
#
# SCSI device support
#
CONFIG_SCSI_MOD=y
# CONFIG_RAID_ATTRS is not set
# CONFIG_SCSI is not set
# end of SCSI device support
# CONFIG_ATA is not set
# CONFIG_MD is not set
# CONFIG_TARGET_CORE is not set
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_FIREWIRE is not set
# CONFIG_FIREWIRE_NOSY is not set
# end of IEEE 1394 (FireWire) support
# CONFIG_MACINTOSH_DRIVERS is not set
#
# Input device support
#
CONFIG_INPUT=y
# CONFIG_INPUT_FF_MEMLESS is not set
# CONFIG_INPUT_SPARSEKMAP is not set
# CONFIG_INPUT_MATRIXKMAP is not set
#
# Userland interfaces
#
# CONFIG_INPUT_MOUSEDEV is not set
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input Device Drivers
#
# CONFIG_INPUT_KEYBOARD is not set
# CONFIG_INPUT_MOUSE is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TABLET is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
# CONFIG_RMI4_CORE is not set
#
# Hardware I/O ports
#
# CONFIG_SERIO is not set
CONFIG_ARCH_MIGHT_HAVE_PC_SERIO=y
# CONFIG_GAMEPORT is not set
# end of Hardware I/O ports
# end of Input device support
#
# Character devices
#
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_CONSOLE_TRANSLATIONS=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_VT_HW_CONSOLE_BINDING is not set
CONFIG_UNIX98_PTYS=y
# CONFIG_LEGACY_PTYS is not set
# CONFIG_LDISC_AUTOLOAD is not set
#
# Serial drivers
#
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
# CONFIG_SERIAL_8250_16550A_VARIANTS is not set
# CONFIG_SERIAL_8250_FINTEK is not set
CONFIG_SERIAL_8250_CONSOLE=y
# CONFIG_SERIAL_8250_PCI is not set
CONFIG_SERIAL_8250_NR_UARTS=1
CONFIG_SERIAL_8250_RUNTIME_UARTS=1
# CONFIG_SERIAL_8250_EXTENDED is not set
CONFIG_SERIAL_8250_FSL=y
# CONFIG_SERIAL_8250_DW is not set
# CONFIG_SERIAL_8250_RT288X is not set
CONFIG_SERIAL_8250_PERICOM=y
# CONFIG_SERIAL_OF_PLATFORM is not set
#
# Non-8250 serial port support
#
# CONFIG_SERIAL_UARTLITE is not set
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_ICOM is not set
# CONFIG_SERIAL_JSM is not set
# CONFIG_SERIAL_SIFIVE is not set
# CONFIG_SERIAL_SCCNXP is not set
# CONFIG_SERIAL_ALTERA_JTAGUART is not set
# CONFIG_SERIAL_ALTERA_UART is not set
# CONFIG_SERIAL_XILINX_PS_UART is not set
# CONFIG_SERIAL_ARC is not set
# CONFIG_SERIAL_RP2 is not set
# CONFIG_SERIAL_FSL_LPUART is not set
# CONFIG_SERIAL_FSL_LINFLEXUART is not set
# CONFIG_SERIAL_CONEXANT_DIGICOLOR is not set
# end of Serial drivers
# CONFIG_SERIAL_NONSTANDARD is not set
# CONFIG_PPC_EPAPR_HV_BYTECHAN is not set
# CONFIG_NOZOMI is not set
# CONFIG_NULL_TTY is not set
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_CONSOLE=y
# CONFIG_HVC_OLD_HVSI is not set
# CONFIG_HVC_RTAS is not set
# CONFIG_HVC_UDBG is not set
# CONFIG_HVCS is not set
# CONFIG_SERIAL_DEV_BUS is not set
# CONFIG_VIRTIO_CONSOLE is not set
# CONFIG_IBM_BSR is not set
# CONFIG_IPMI_HANDLER is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_APPLICOM is not set
# CONFIG_DEVMEM is not set
# CONFIG_NVRAM is not set
CONFIG_DEVPORT=y
# CONFIG_HANGCHECK_TIMER is not set
# CONFIG_TCG_TPM is not set
# CONFIG_XILLYBUS is not set
# CONFIG_RANDOM_TRUST_CPU is not set
# CONFIG_RANDOM_TRUST_BOOTLOADER is not set
# end of Character devices
#
# I2C support
#
# CONFIG_I2C is not set
# end of I2C support
# CONFIG_I3C is not set
# CONFIG_SPI is not set
# CONFIG_SPMI is not set
# CONFIG_HSI is not set
# CONFIG_PPS is not set
#
# PTP clock support
#
CONFIG_PTP_1588_CLOCK_OPTIONAL=y
#
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
# end of PTP clock support
# CONFIG_PINCTRL is not set
# CONFIG_GPIOLIB is not set
# CONFIG_W1 is not set
# CONFIG_POWER_RESET is not set
# CONFIG_POWER_SUPPLY is not set
# CONFIG_HWMON is not set
# CONFIG_THERMAL is not set
# CONFIG_WATCHDOG is not set
CONFIG_SSB_POSSIBLE=y
# CONFIG_SSB is not set
CONFIG_BCMA_POSSIBLE=y
# CONFIG_BCMA is not set
#
# Multifunction device drivers
#
# CONFIG_MFD_ATMEL_FLEXCOM is not set
# CONFIG_MFD_ATMEL_HLCDC is not set
# CONFIG_MFD_MADERA is not set
# CONFIG_MFD_HI6421_PMIC is not set
# CONFIG_HTC_PASIC3 is not set
# CONFIG_LPC_ICH is not set
# CONFIG_LPC_SCH is not set
# CONFIG_MFD_JANZ_CMODIO is not set
# CONFIG_MFD_KEMPLD is not set
# CONFIG_MFD_MT6397 is not set
# CONFIG_MFD_RDC321X is not set
# CONFIG_MFD_SM501 is not set
# CONFIG_MFD_SYSCON is not set
# CONFIG_MFD_TI_AM335X_TSCADC is not set
# CONFIG_MFD_TQMX86 is not set
# CONFIG_MFD_VX855 is not set
# end of Multifunction device drivers
# CONFIG_REGULATOR is not set
# CONFIG_RC_CORE is not set
#
# CEC support
#
# CONFIG_MEDIA_CEC_SUPPORT is not set
# end of CEC support
# CONFIG_MEDIA_SUPPORT is not set
#
# Graphics support
#
# CONFIG_AGP is not set
# CONFIG_DRM is not set
#
# ARM devices
#
# end of ARM devices
#
# Frame buffer Devices
#
# CONFIG_FB is not set
# end of Frame buffer Devices
#
# Backlight & LCD device support
#
# CONFIG_LCD_CLASS_DEVICE is not set
# CONFIG_BACKLIGHT_CLASS_DEVICE is not set
# end of Backlight & LCD device support
#
# Console display driver support
#
# CONFIG_VGA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y
CONFIG_DUMMY_CONSOLE_COLUMNS=80
CONFIG_DUMMY_CONSOLE_ROWS=25
# end of Console display driver support
# end of Graphics support
# CONFIG_SOUND is not set
#
# HID support
#
# CONFIG_HID is not set
# end of HID support
CONFIG_USB_OHCI_LITTLE_ENDIAN=y
# CONFIG_USB_SUPPORT is not set
# CONFIG_MMC is not set
# CONFIG_MEMSTICK is not set
# CONFIG_NEW_LEDS is not set
# CONFIG_ACCESSIBILITY is not set
CONFIG_EDAC_ATOMIC_SCRUB=y
CONFIG_EDAC_SUPPORT=y
CONFIG_RTC_LIB=y
# CONFIG_RTC_CLASS is not set
# CONFIG_DMADEVICES is not set
#
# DMABUF options
#
# CONFIG_SYNC_FILE is not set
# CONFIG_DMABUF_HEAPS is not set
# end of DMABUF options
# CONFIG_AUXDISPLAY is not set
# CONFIG_UIO is not set
# CONFIG_VFIO is not set
# CONFIG_VIRT_DRIVERS is not set
# CONFIG_VIRTIO_MENU is not set
# CONFIG_VHOST_MENU is not set
#
# Microsoft Hyper-V guest support
#
# end of Microsoft Hyper-V guest support
# CONFIG_GREYBUS is not set
# CONFIG_COMEDI is not set
# CONFIG_STAGING is not set
# CONFIG_GOLDFISH is not set
# CONFIG_COMMON_CLK is not set
# CONFIG_HWSPINLOCK is not set
#
# Clock Source drivers
#
CONFIG_I8253_LOCK=y
CONFIG_CLKBLD_I8253=y
# CONFIG_MICROCHIP_PIT64B is not set
# end of Clock Source drivers
# CONFIG_MAILBOX is not set
# CONFIG_IOMMU_SUPPORT is not set
#
# Remoteproc drivers
#
# CONFIG_REMOTEPROC is not set
# end of Remoteproc drivers
#
# Rpmsg drivers
#
# CONFIG_RPMSG_VIRTIO is not set
# end of Rpmsg drivers
# CONFIG_SOUNDWIRE is not set
#
# SOC (System On Chip) specific Drivers
#
#
# Amlogic SoC drivers
#
# end of Amlogic SoC drivers
#
# Broadcom SoC drivers
#
# end of Broadcom SoC drivers
#
# NXP/Freescale QorIQ SoC drivers
#
# CONFIG_QUICC_ENGINE is not set
# end of NXP/Freescale QorIQ SoC drivers
#
# i.MX SoC drivers
#
# end of i.MX SoC drivers
#
# Enable LiteX SoC Builder specific drivers
#
# CONFIG_LITEX_SOC_CONTROLLER is not set
# end of Enable LiteX SoC Builder specific drivers
#
# Qualcomm SoC drivers
#
# end of Qualcomm SoC drivers
# CONFIG_SOC_TI is not set
#
# Xilinx SoC drivers
#
# end of Xilinx SoC drivers
# end of SOC (System On Chip) specific Drivers
# CONFIG_PM_DEVFREQ is not set
# CONFIG_EXTCON is not set
# CONFIG_MEMORY is not set
# CONFIG_IIO is not set
# CONFIG_NTB is not set
# CONFIG_VME_BUS is not set
# CONFIG_PWM is not set
#
# IRQ chip support
#
CONFIG_IRQCHIP=y
# CONFIG_AL_FIC is not set
# end of IRQ chip support
# CONFIG_IPACK_BUS is not set
# CONFIG_RESET_CONTROLLER is not set
#
# PHY Subsystem
#
# CONFIG_GENERIC_PHY is not set
# CONFIG_PHY_CAN_TRANSCEIVER is not set
#
# PHY drivers for Broadcom platforms
#
# CONFIG_BCM_KONA_USB2_PHY is not set
# end of PHY drivers for Broadcom platforms
# CONFIG_PHY_CADENCE_DPHY is not set
# CONFIG_PHY_CADENCE_DPHY_RX is not set
# CONFIG_PHY_CADENCE_SALVO is not set
# CONFIG_PHY_PXA_28NM_HSIC is not set
# CONFIG_PHY_PXA_28NM_USB2 is not set
# end of PHY Subsystem
# CONFIG_POWERCAP is not set
# CONFIG_MCB is not set
#
# Performance monitor support
#
# end of Performance monitor support
# CONFIG_RAS is not set
# CONFIG_USB4 is not set
#
# Android
#
CONFIG_ANDROID=y
# CONFIG_ANDROID_BINDER_IPC is not set
# end of Android
# CONFIG_DAX is not set
# CONFIG_NVMEM is not set
#
# HW tracing support
#
# CONFIG_STM is not set
# CONFIG_INTEL_TH is not set
# end of HW tracing support
# CONFIG_FPGA is not set
# CONFIG_FSI is not set
# CONFIG_SIOX is not set
# CONFIG_SLIMBUS is not set
# CONFIG_INTERCONNECT is not set
# CONFIG_COUNTER is not set
# CONFIG_PECI is not set
# end of Device Drivers
#
# File systems
#
CONFIG_DCACHE_WORD_ACCESS=y
# CONFIG_VALIDATE_FS_PARSER is not set
# CONFIG_EXT2_FS is not set
# CONFIG_EXT3_FS is not set
# CONFIG_EXT4_FS is not set
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_GFS2_FS is not set
# CONFIG_BTRFS_FS is not set
# CONFIG_NILFS2_FS is not set
# CONFIG_F2FS_FS is not set
CONFIG_EXPORTFS=y
# CONFIG_EXPORTFS_BLOCK_OPS is not set
CONFIG_FILE_LOCKING=y
# CONFIG_FS_ENCRYPTION is not set
# CONFIG_FS_VERITY is not set
# CONFIG_DNOTIFY is not set
# CONFIG_INOTIFY_USER is not set
# CONFIG_FANOTIFY is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS4_FS is not set
# CONFIG_AUTOFS_FS is not set
# CONFIG_FUSE_FS is not set
# CONFIG_OVERLAY_FS is not set
#
# Caches
#
# CONFIG_FSCACHE is not set
# end of Caches
#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set
# end of CD-ROM/DVD Filesystems
#
# DOS/FAT/EXFAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_EXFAT_FS is not set
# CONFIG_NTFS_FS is not set
# CONFIG_NTFS3_FS is not set
# end of DOS/FAT/EXFAT/NT Filesystems
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
# CONFIG_PROC_KCORE is not set
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
# CONFIG_PROC_CHILDREN is not set
CONFIG_KERNFS=y
CONFIG_SYSFS=y
# CONFIG_TMPFS is not set
CONFIG_ARCH_SUPPORTS_HUGETLBFS=y
# CONFIG_HUGETLBFS is not set
CONFIG_ARCH_HAS_GIGANTIC_PAGE=y
# CONFIG_CONFIGFS_FS is not set
# end of Pseudo filesystems
# CONFIG_MISC_FILESYSTEMS is not set
# CONFIG_NLS is not set
# CONFIG_UNICODE is not set
CONFIG_IO_WQ=y
# end of File systems
#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY_DMESG_RESTRICT is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR=y
CONFIG_HARDENED_USERCOPY=y
CONFIG_FORTIFY_SOURCE=y
# CONFIG_STATIC_USERMODEHELPER is not set
CONFIG_DEFAULT_SECURITY_DAC=y
CONFIG_LSM="lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor,bpf"
#
# Kernel hardening options
#
#
# Memory initialization
#
CONFIG_INIT_STACK_NONE=y
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
CONFIG_INIT_ON_FREE_DEFAULT_ON=y
# end of Memory initialization
# end of Kernel hardening options
# end of Security options
# CONFIG_CRYPTO is not set
#
# Library routines
#
# CONFIG_PACKING is not set
CONFIG_BITREVERSE=y
CONFIG_GENERIC_STRNCPY_FROM_USER=y
CONFIG_GENERIC_STRNLEN_USER=y
# CONFIG_CORDIC is not set
# CONFIG_PRIME_NUMBERS is not set
CONFIG_GENERIC_PCI_IOMAP=y
CONFIG_ARCH_USE_CMPXCHG_LOCKREF=y
CONFIG_ARCH_HAS_FAST_MULTIPLIER=y
#
# Crypto library routines
#
CONFIG_CRYPTO_LIB_BLAKE2S_GENERIC=y
# CONFIG_CRYPTO_LIB_CURVE25519 is not set
CONFIG_CRYPTO_LIB_POLY1305_RSIZE=1
# CONFIG_CRYPTO_LIB_POLY1305 is not set
# end of Crypto library routines
# CONFIG_CRC_CCITT is not set
# CONFIG_CRC16 is not set
# CONFIG_CRC_T10DIF is not set
# CONFIG_CRC64_ROCKSOFT is not set
# CONFIG_CRC_ITU_T is not set
CONFIG_CRC32=y
# CONFIG_CRC32_SELFTEST is not set
CONFIG_CRC32_SLICEBY8=y
# CONFIG_CRC32_SLICEBY4 is not set
# CONFIG_CRC32_SARWATE is not set
# CONFIG_CRC32_BIT is not set
# CONFIG_CRC64 is not set
# CONFIG_CRC4 is not set
# CONFIG_CRC7 is not set
# CONFIG_LIBCRC32C is not set
# CONFIG_CRC8 is not set
# CONFIG_RANDOM32_SELFTEST is not set
CONFIG_ZLIB_DEFLATE=y
# CONFIG_XZ_DEC is not set
CONFIG_XARRAY_MULTI=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT_MAP=y
CONFIG_HAS_DMA=y
CONFIG_DMA_OPS=y
CONFIG_DMA_OPS_BYPASS=y
CONFIG_ARCH_HAS_DMA_MAP_DIRECT=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_ARCH_DMA_ADDR_T_64BIT=y
CONFIG_DMA_DECLARE_COHERENT=y
CONFIG_SWIOTLB=y
# CONFIG_DMA_RESTRICTED_POOL is not set
# CONFIG_DMA_API_DEBUG is not set
CONFIG_IOMMU_HELPER=y
# CONFIG_IRQ_POLL is not set
CONFIG_LIBFDT=y
CONFIG_HAVE_GENERIC_VDSO=y
CONFIG_GENERIC_GETTIMEOFDAY=y
CONFIG_GENERIC_VDSO_TIME_NS=y
CONFIG_ARCH_HAS_PMEM_API=y
CONFIG_ARCH_HAS_MEMREMAP_COMPAT_ALIGN=y
CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE=y
CONFIG_ARCH_HAS_COPY_MC=y
CONFIG_ARCH_STACKWALK=y
CONFIG_SBITMAP=y
# end of Library routines
#
# Kernel hacking
#
#
# printk and dmesg options
#
CONFIG_PRINTK_TIME=y
# CONFIG_PRINTK_CALLER is not set
# CONFIG_STACKTRACE_BUILD_ID is not set
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=7
CONFIG_CONSOLE_LOGLEVEL_QUIET=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
# CONFIG_DYNAMIC_DEBUG is not set
# CONFIG_DYNAMIC_DEBUG_CORE is not set
CONFIG_SYMBOLIC_ERRNAME=y
CONFIG_DEBUG_BUGVERBOSE=y
# end of printk and dmesg options
# CONFIG_DEBUG_KERNEL is not set
#
# Compile-time checks and compiler options
#
CONFIG_FRAME_WARN=2048
# CONFIG_STRIP_ASM_SYMS is not set
# CONFIG_HEADERS_INSTALL is not set
# CONFIG_DEBUG_SECTION_MISMATCH is not set
# CONFIG_SECTION_MISMATCH_WARN_ONLY is not set
# end of Compile-time checks and compiler options
#
# Generic Kernel Debugging Instruments
#
# CONFIG_MAGIC_SYSRQ is not set
# CONFIG_DEBUG_FS is not set
CONFIG_HAVE_ARCH_KGDB=y
CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
# CONFIG_UBSAN is not set
# end of Generic Kernel Debugging Instruments
#
# Networking Debugging
#
# end of Networking Debugging
#
# Memory Debugging
#
# CONFIG_PAGE_EXTENSION is not set
# CONFIG_PAGE_POISONING is not set
# CONFIG_DEBUG_RODATA_TEST is not set
CONFIG_ARCH_HAS_DEBUG_WX=y
# CONFIG_DEBUG_WX is not set
CONFIG_GENERIC_PTDUMP=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE=y
# CONFIG_DEBUG_VM_PGTABLE is not set
CONFIG_ARCH_HAS_DEBUG_VIRTUAL=y
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_HAVE_DEBUG_STACKOVERFLOW=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
# end of Memory Debugging
#
# Debug Oops, Lockups and Hangs
#
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
# CONFIG_TEST_LOCKUP is not set
# end of Debug Oops, Lockups and Hangs
#
# Scheduler Debugging
#
# end of Scheduler Debugging
# CONFIG_DEBUG_TIMEKEEPING is not set
#
# Lock Debugging (spinlocks, mutexes, etc...)
#
CONFIG_LOCK_DEBUGGING_SUPPORT=y
# CONFIG_WW_MUTEX_SELFTEST is not set
# end of Lock Debugging (spinlocks, mutexes, etc...)
# CONFIG_DEBUG_IRQFLAGS is not set
# CONFIG_STACKTRACE is not set
# CONFIG_WARN_ALL_UNSEEDED_RANDOM is not set
#
# Debug kernel data structures
#
# CONFIG_BUG_ON_DATA_CORRUPTION is not set
# end of Debug kernel data structures
#
# RCU Debugging
#
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# end of RCU Debugging
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACING_SUPPORT=y
# CONFIG_FTRACE is not set
CONFIG_SAMPLES=y
# CONFIG_SAMPLE_AUXDISPLAY is not set
# CONFIG_SAMPLE_KOBJECT is not set
# CONFIG_SAMPLE_HW_BREAKPOINT is not set
# CONFIG_SAMPLE_KFIFO is not set
# CONFIG_SAMPLE_WATCHDOG is not set
CONFIG_ARCH_HAS_DEVMEM_IS_ALLOWED=y
# CONFIG_STRICT_DEVMEM is not set
#
# powerpc Debugging
#
# CONFIG_PPC_DISABLE_WERROR is not set
CONFIG_PPC_WERROR=y
CONFIG_PRINT_STACK_DEPTH=64
# CONFIG_JUMP_LABEL_FEATURE_CHECKS is not set
# CONFIG_PPC_IRQ_SOFT_MASK_DEBUG is not set
# CONFIG_PPC_RFI_SRR_DEBUG is not set
# CONFIG_BOOTX_TEXT is not set
# CONFIG_PPC_EARLY_DEBUG is not set
# end of powerpc Debugging
#
# Kernel Testing and Coverage
#
# CONFIG_KUNIT is not set
CONFIG_ARCH_HAS_KCOV=y
CONFIG_CC_HAS_SANCOV_TRACE_PC=y
# CONFIG_KCOV is not set
# CONFIG_RUNTIME_TESTING_MENU is not set
CONFIG_ARCH_USE_MEMTEST=y
# CONFIG_MEMTEST is not set
# end of Kernel Testing and Coverage
# end of Kernel hacking
^ permalink raw reply	[flat|nested] 1651+ messages in thread
* Re: rcu_sched self-detected stall on CPU
2022-04-05 21:41 rcu_sched self-detected stall on CPU Miguel Ojeda
@ 2022-04-06 9:31 ` Zhouyi Zhou
2022-04-06 17:00 ` Paul E. McKenney
0 siblings, 1 reply; 1651+ messages in thread
From: Zhouyi Zhou @ 2022-04-06 9:31 UTC (permalink / raw)
To: Miguel Ojeda; +Cc: rcu, linuxppc-dev
Hi
I can reproduce it in a ppc virtual cloud server provided by Oregon
State University. Following is what I do:
1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
-o linux-5.18-rc1.tar.gz
2) tar zxf linux-5.18-rc1.tar.gz
3) cp config linux-5.18-rc1/.config
4) cd linux-5.18-rc1
5) make vmlinux -j 8
6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
-smp 2 (QEMU 4.2.1)
7) after 12 rounds, the bug got reproduced:
(http://154.223.142.244/logs/20220406/qemu.log.txt)
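Step 7 can be automated with a small wrapper that reboots the guest until
the stall shows up in the console log (a sketch only: the 20-round cap and
the 120 s timeout are my guesses, and BOOT_CMD should be the exact qemu
invocation from step 6):

```sh
#!/bin/sh
# Boot the kernel repeatedly; stop once the RCU stall appears.
BOOT_CMD="qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot -smp 2"
for round in $(seq 1 20); do
    timeout 120 $BOOT_CMD > "qemu.log.$round" 2>&1
    if grep -q "self-detected stall" "qemu.log.$round"; then
        echo "stall reproduced on round $round"
        exit 0
    fi
done
echo "no stall seen in 20 rounds"
```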
Cheers ;-)
Zhouyi
On Wed, Apr 6, 2022 at 3:47 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> Hi PPC/RCU,
>
> While merging v5.18-rc1 changes I noticed our CI PPC runs broke. I
> reproduced the problem in v5.18-rc1 as well as next-20220405, under
> both QEMU 4.2.1 and 6.1.0, with `-smp 2`; but I cannot reproduce it in
> v5.17 from a few tries.
>
> Sadly, the problem is not deterministic, although it is not too hard to
> reproduce (1 out of 5?). Please see attached config and QEMU output.
>
> Cheers,
> Miguel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: rcu_sched self-detected stall on CPU
2022-04-06 9:31 ` Zhouyi Zhou
@ 2022-04-06 17:00 ` Paul E. McKenney
2022-04-08 7:23 ` Michael Ellerman
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2022-04-06 17:00 UTC (permalink / raw)
To: Zhouyi Zhou; +Cc: rcu, Miguel Ojeda, linuxppc-dev
On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
> Hi
>
> I can reproduce it in a ppc virtual cloud server provided by Oregon
> State University. Following is what I do:
> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
> -o linux-5.18-rc1.tar.gz
> 2) tar zxf linux-5.18-rc1.tar.gz
> 3) cp config linux-5.18-rc1/.config
> 4) cd linux-5.18-rc1
> 5) make vmlinux -j 8
> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
> -smp 2 (QEMU 4.2.1)
> 7) after 12 rounds, the bug got reproduced:
> (http://154.223.142.244/logs/20220406/qemu.log.txt)
Just to make sure, are you both seeing the same thing? Last I knew,
Zhouyi was chasing an RCU-tasks issue that appears only in kernels
built with CONFIG_PROVE_RCU=y, which Miguel does not have set. Or did
I miss something?
Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
kthread slept for three milliseconds, but did not wake up for more than
20 seconds. This kthread would normally have awakened on CPU 1, but
CPU 1 looks to me to be very unhealthy, as can be seen in your console
output below (but maybe my idea of what is healthy for powerpc systems
is outdated). Please see also the inline annotations.
Thoughts from the PPC guys?
Thanx, Paul
------------------------------------------------------------------------
[ 21.186912] rcu: INFO: rcu_sched self-detected stall on CPU
[ 21.187331] rcu: 1-...!: (4712629 ticks this GP) idle=2c1/0/0x3 softirq=8/8 fqs=0
[ 21.187529] (t=21000 jiffies g=-1183 q=3)
[ 21.187681] rcu: rcu_sched kthread timer wakeup didn't happen for 20997 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402
The grace-period kthread is still asleep (->state=0x402).
This indicates that the three-jiffy timer has somehow been
prevented from expiring for almost a full 21 seconds. Of course,
if timers don't work, RCU cannot work.
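A quick check of the arithmetic in the report (assuming HZ=1000, which is
consistent with 21000 jiffies against the 21-second
CONFIG_RCU_CPU_STALL_TIMEOUT in this config):

```python
# Numbers taken from the stall report above.
HZ = 1000                            # assumed; consistent with the report
stall_jiffies = 21000                # "(t=21000 jiffies ...)"
fqs_wait_jiffies = 3                 # the GP kthread's three-jiffy FQS timer
print(stall_jiffies / HZ)            # seconds the CPU stalled: 21.0
print(fqs_wait_jiffies * 1000 / HZ)  # intended sleep in ms: 3.0
```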
[ 21.187770] rcu: Possible timer handling issue on cpu=1 timer-softirq=1
[ 21.187927] rcu: rcu_sched kthread starved for 21001 jiffies! g-1183 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1
[ 21.188019] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 21.188087] rcu: RCU grace-period kthread stack dump:
[ 21.188196] task:rcu_sched state:I stack: 0 pid: 10 ppid: 2 flags:0x00000800
[ 21.188453] Call Trace:
[ 21.188525] [c0000000061e78a0] [c0000000061e78e0] 0xc0000000061e78e0 (unreliable)
[ 21.188900] [c0000000061e7a90] [c000000000017210] __switch_to+0x250/0x310
[ 21.189210] [c0000000061e7b00] [c0000000003ed660] __schedule+0x210/0x660
[ 21.189315] [c0000000061e7b80] [c0000000003edb14] schedule+0x64/0x110
[ 21.189387] [c0000000061e7bb0] [c0000000003f6648] schedule_timeout+0x1d8/0x390
[ 21.189473] [c0000000061e7c80] [c00000000011111c] rcu_gp_fqs_loop+0x2dc/0x3d0
[ 21.189555] [c0000000061e7d30] [c0000000001144ec] rcu_gp_kthread+0x13c/0x160
[ 21.189633] [c0000000061e7dc0] [c0000000000c1770] kthread+0x110/0x120
[ 21.189714] [c0000000061e7e10] [c00000000000c9e4] ret_from_kernel_thread+0x5c/0x64
The above stack trace is expected behavior when the RCU
grace-period kthread is waiting to do its next FQS scan.
[ 21.189938] rcu: Stack dump where RCU GP kthread last ran:
And here is the stalled CPU, which also happens to be the CPU
that RCU last ran on:
[ 21.189992] Task dump for CPU 1:
[ 21.190059] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x00000804
[ 21.190169] Call Trace:
[ 21.190194] [c0000000061ef2d0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
[ 21.190278] [c0000000061ef340] [c000000000116ca0] rcu_check_gp_kthread_starvation+0x16c/0x19c
[ 21.190370] [c0000000061ef3c0] [c000000000114f7c] rcu_sched_clock_irq+0x7ec/0xaf0
[ 21.190448] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
[ 21.190524] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
[ 21.190608] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
[ 21.190699] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[ 21.190837] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
Up through this point is just the stack trace of the
code doing the stack dump that the RCU CPU stall warning code
asked for.
[ 21.190941] NIP: c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
This NIP does not look at all good to me. But I freely confess
that I am out of date on what Power machines do.
[ 21.191031] REGS: c0000000061ef600 TRAP: 0900 Not tainted (5.18.0-rc1)
[ 21.191109] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22000202 XER: 00000000
[ 21.191274] CFAR: 0000000000000000 IRQMASK: 0
[ 21.191274] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
[ 21.191274] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
[ 21.191274] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
[ 21.191274] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
[ 21.191274] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
[ 21.191274] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
[ 21.191274] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
[ 21.191274] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
[ 21.191932] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
[ 21.192024] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
[ 21.192118] --- interrupt: 900
[ 21.192158] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
[ 21.192227] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
[ 21.192307] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
[ 21.192397] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[ 21.192495] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[ 21.192566] NIP: c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
[ 21.192615] REGS: c0000000061efa90 TRAP: 0900 Not tainted (5.18.0-rc1)
[ 21.192659] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28000202 XER: 00000000
[ 21.192755] CFAR: 0000000000000000 IRQMASK: 0
[ 21.192755] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
[ 21.192755] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
[ 21.192755] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
[ 21.192755] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
[ 21.192755] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
[ 21.192755] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
[ 21.192755] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
[ 21.192755] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
[ 21.193290] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
[ 21.193363] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
[ 21.193428] --- interrupt: 900
[ 21.193457] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
[ 21.193512] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
[ 21.193590] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
[ 21.193679] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
[ 21.193747] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
[ 21.193901] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
[ 21.194002] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
[ 21.194245] Task dump for CPU 1:
[ 21.194284] task:swapper/1 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x00000804
[ 21.194374] Call Trace:
[ 21.194400] [c0000000061ef2b0] [c0000000000c9a40] sched_show_task+0x180/0x1c0 (unreliable)
[ 21.194479] [c0000000061ef320] [c000000000116df8] rcu_dump_cpu_stacks+0x128/0x188
[ 21.194567] [c0000000061ef3c0] [c000000000114f9c] rcu_sched_clock_irq+0x80c/0xaf0
[ 21.194642] [c0000000061ef4b0] [c000000000120fdc] update_process_times+0xbc/0x140
[ 21.194712] [c0000000061ef4f0] [c000000000136a24] tick_nohz_handler+0xf4/0x1b0
[ 21.194828] [c0000000061ef540] [c00000000001c828] timer_interrupt+0x148/0x2d0
[ 21.194942] [c0000000061ef590] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[ 21.195035] --- interrupt: 900 at arch_local_irq_restore+0x168/0x170
[ 21.195104] NIP: c000000000013608 LR: c0000000003f8114 CTR: c0000000000dc630
[ 21.195152] REGS: c0000000061ef600 TRAP: 0900 Not tainted (5.18.0-rc1)
[ 21.195199] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 22000202 XER: 00000000
[ 21.195296] CFAR: 0000000000000000 IRQMASK: 0
[ 21.195296] GPR00: c00000000009c368 c0000000061ef8a0 c00000000116a700 0000000000000000
[ 21.195296] GPR04: 0000000000000000 0000000000000000 000000001ee30000 ffffffffffffffff
[ 21.195296] GPR08: 000000001ee30000 0000000000000000 0000000000008002 7265677368657265
[ 21.195296] GPR12: c0000000000dc630 c00000001ffe5800 0000000000000000 0000000000000000
[ 21.195296] GPR16: 0000000000000282 0000000000000000 0000000000000000 c0000000061eff00
[ 21.195296] GPR20: 0000000000000000 0000000000000001 c0000000061b9f80 c000000001195a10
[ 21.195296] GPR24: c000000001193a00 00000000fffb6cc4 000000000000000a c0000000010721e8
[ 21.195296] GPR28: c000000001076800 c000000001070380 c0000000010716d8 c0000000061b9f80
[ 21.195850] NIP [c000000000013608] arch_local_irq_restore+0x168/0x170
[ 21.195944] LR [c0000000003f8114] __do_softirq+0xd4/0x2ec
[ 21.196027] --- interrupt: 900
[ 21.196056] [c0000000061ef8a0] [c0000000061b9f80] 0xc0000000061b9f80 (unreliable)
[ 21.196119] [c0000000061ef9b0] [c00000000009c368] irq_exit+0xc8/0x110
[ 21.196192] [c0000000061ef9d0] [c00000000001c858] timer_interrupt+0x178/0x2d0
[ 21.196282] [c0000000061efa20] [c0000000000098e8] decrementer_common_virt+0x208/0x210
[ 21.196373] --- interrupt: 900 at plpar_hcall_norets_notrace+0x18/0x2c
[ 21.196439] NIP: c000000000072988 LR: c000000000074fa8 CTR: c000000000074f10
[ 21.196489] REGS: c0000000061efa90 TRAP: 0900 Not tainted (5.18.0-rc1)
[ 21.196534] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 28000202 XER: 00000000
[ 21.196627] CFAR: 0000000000000000 IRQMASK: 0
[ 21.196627] GPR00: 0000000028000202 c0000000061efd30 c00000000116a700 0000000000000000
[ 21.196627] GPR04: c00000001fea0280 ffffffffffffffff 0000000001f40000 000000019d088fcf
[ 21.196627] GPR08: 000000001ee30000 c00000001ffe5400 0000000000000001 0000000100000000
[ 21.196627] GPR12: c000000000074f10 c00000001ffe5800 0000000000000000 0000000000000000
[ 21.196627] GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000061eff00
[ 21.196627] GPR20: c00000000003d440 0000000000000001 c000000001195b30 c000000001195a10
[ 21.196627] GPR24: 0000000000080000 c0000000061ba000 c000000001195a98 0000000000000001
[ 21.196627] GPR28: 0000000000000001 c0000000010716d0 c0000000010716d8 c0000000010716d0
[ 21.197168] NIP [c000000000072988] plpar_hcall_norets_notrace+0x18/0x2c
[ 21.197239] LR [c000000000074fa8] pseries_lpar_idle+0x98/0x1b0
[ 21.197305] --- interrupt: 900
[ 21.197333] [c0000000061efd30] [0000000000000001] 0x1 (unreliable)
[ 21.197390] [c0000000061efdb0] [c000000000018b54] arch_cpu_idle+0x44/0x180
[ 21.197470] [c0000000061efde0] [c0000000003f75bc] default_idle_call+0x4c/0x7c
[ 21.197556] [c0000000061efe00] [c0000000000e1384] do_idle+0x114/0x1e0
[ 21.197620] [c0000000061efe60] [c0000000000e1664] cpu_startup_entry+0x34/0x40
[ 21.197696] [c0000000061efe90] [c00000000003f044] start_secondary+0x624/0xa00
[ 21.197820] [c0000000061eff90] [c00000000000cd54] start_secondary_prolog+0x10/0x14
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: rcu_sched self-detected stall on CPU
2022-04-06 17:00 ` Paul E. McKenney
@ 2022-04-08 7:23 ` Michael Ellerman
2022-04-08 14:42 ` Michael Ellerman
0 siblings, 1 reply; 1651+ messages in thread
From: Michael Ellerman @ 2022-04-08 7:23 UTC (permalink / raw)
To: paulmck, Zhouyi Zhou; +Cc: rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin
"Paul E. McKenney" <paulmck@kernel.org> writes:
> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
>> Hi
>>
>> I can reproduce it in a ppc virtual cloud server provided by Oregon
>> State University. Following is what I do:
>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
>> -o linux-5.18-rc1.tar.gz
>> 2) tar zxf linux-5.18-rc1.tar.gz
>> 3) cp config linux-5.18-rc1/.config
>> 4) cd linux-5.18-rc1
>> 5) make vmlinux -j 8
>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
>> -smp 2 (QEMU 4.2.1)
>> 7) after 12 rounds, the bug got reproduced:
>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
>
> Just to make sure, are you both seeing the same thing? Last I knew,
> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
> built with CONFIG_PROVE_RCU=y, which Miguel does not have set. Or did
> I miss something?
>
> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
> kthread slept for three milliseconds, but did not wake up for more than
> 20 seconds. This kthread would normally have awakened on CPU 1, but
> CPU 1 looks to me to be very unhealthy, as can be seen in your console
> output below (but maybe my idea of what is healthy for powerpc systems
> is outdated). Please see also the inline annotations.
>
> Thoughts from the PPC guys?
I haven't seen it in my testing. But using Miguel's config I can
reproduce it seemingly on every boot.
For me it bisects to:
35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
Which seems plausible.
Reverting that on mainline makes the bug go away.
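For reference, the bisect session amounts to roughly this (good/bad
endpoints from this thread; the test at each step is building with Miguel's
config and booting under qemu):

```sh
git bisect start
git bisect bad v5.18-rc1     # stall reproduces
git bisect good v5.17        # no stall in several tries
# At each step: build vmlinux with Miguel's config, boot under
# qemu-system-ppc64, and mark good/bad depending on whether
# "rcu_sched self-detected stall" appears, until it lands on
# 35de589cb879 ("powerpc/time: improve decrementer clockevent processing").
```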
I don't see an obvious bug in the diff, but I could be wrong, or the old
code was papering over an existing bug?
I'll try and work out what it is about Miguel's config that exposes
this vs our defconfig, that might give us a clue.
cheers
^ permalink raw reply	[flat|nested] 1651+ messages in thread
* Re: rcu_sched self-detected stall on CPU
2022-04-08 7:23 ` Michael Ellerman
@ 2022-04-08 14:42 ` Michael Ellerman
2022-04-13 5:11 ` Nicholas Piggin
0 siblings, 1 reply; 1651+ messages in thread
From: Michael Ellerman @ 2022-04-08 14:42 UTC (permalink / raw)
To: paulmck, Zhouyi Zhou; +Cc: rcu, Miguel Ojeda, linuxppc-dev, Nicholas Piggin
Michael Ellerman <mpe@ellerman.id.au> writes:
> "Paul E. McKenney" <paulmck@kernel.org> writes:
>> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
>>> Hi
>>>
>>> I can reproduce it in a ppc virtual cloud server provided by Oregon
>>> State University. Following is what I do:
>>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
>>> -o linux-5.18-rc1.tar.gz
>>> 2) tar zxf linux-5.18-rc1.tar.gz
>>> 3) cp config linux-5.18-rc1/.config
>>> 4) cd linux-5.18-rc1
>>> 5) make vmlinux -j 8
>>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
>>> -smp 2 (QEMU 4.2.1)
>>> 7) after 12 rounds, the bug got reproduced:
>>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
>>
>> Just to make sure, are you both seeing the same thing? Last I knew,
>> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
>> built with CONFIG_PROVE_RCU=y, which Miguel does not have set. Or did
>> I miss something?
>>
>> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
>> kthread slept for three milliseconds, but did not wake up for more than
>> 20 seconds. This kthread would normally have awakened on CPU 1, but
>> CPU 1 looks to me to be very unhealthy, as can be seen in your console
>> output below (but maybe my idea of what is healthy for powerpc systems
>> is outdated). Please see also the inline annotations.
>>
>> Thoughts from the PPC guys?
>
> I haven't seen it in my testing. But using Miguel's config I can
> reproduce it seemingly on every boot.
>
> For me it bisects to:
>
> 35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>
> Which seems plausible.
>
> Reverting that on mainline makes the bug go away.
>
> I don't see an obvious bug in the diff, but I could be wrong, or the old
> code was papering over an existing bug?
>
> I'll try and work out what it is about Miguel's config that exposes
> this vs our defconfig, that might give us a clue.
It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.
I can reproduce just with:
$ make ppc64le_guest_defconfig
$ ./scripts/config -d HIGH_RES_TIMERS
We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
realise you could disable it TBH :)
The Rust CI has it disabled because I copied that from the x86 defconfig
they were using back when I added the Rust support. I think that was
meant to be a stripped down fast config for CI, but the result is it's
just using a badly tested combination which is not helpful.
So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
can debug this further without blocking them.
cheers
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2022-04-08 14:42 ` Michael Ellerman
@ 2022-04-13 5:11 ` Nicholas Piggin
2022-04-22 15:53 ` Re: Thomas Gleixner
0 siblings, 1 reply; 1651+ messages in thread
From: Nicholas Piggin @ 2022-04-13 5:11 UTC (permalink / raw)
To: Michael Ellerman, paulmck, Zhouyi Zhou
Cc: Viresh Kumar, Daniel Lezcano, linux-kernel, rcu, Miguel Ojeda,
Thomas Gleixner, linuxppc-dev
+Daniel, Thomas, Viresh
Subject: Re: rcu_sched self-detected stall on CPU
Excerpts from Michael Ellerman's message of April 9, 2022 12:42 am:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>> "Paul E. McKenney" <paulmck@kernel.org> writes:
>>> On Wed, Apr 06, 2022 at 05:31:10PM +0800, Zhouyi Zhou wrote:
>>>> Hi
>>>>
>>>> I can reproduce it in a ppc virtual cloud server provided by Oregon
>>>> State University. Following is what I do:
>>>> 1) curl -l https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-5.18-rc1.tar.gz
>>>> -o linux-5.18-rc1.tar.gz
>>>> 2) tar zxf linux-5.18-rc1.tar.gz
>>>> 3) cp config linux-5.18-rc1/.config
>>>> 4) cd linux-5.18-rc1
>>>> 5) make vmlinux -j 8
>>>> 6) qemu-system-ppc64 -kernel vmlinux -nographic -vga none -no-reboot
>>>> -smp 2 (QEMU 4.2.1)
>>>> 7) after 12 rounds, the bug got reproduced:
>>>> (http://154.223.142.244/logs/20220406/qemu.log.txt)
>>>
>>> Just to make sure, are you both seeing the same thing? Last I knew,
>>> Zhouyi was chasing an RCU-tasks issue that appears only in kernels
>>> built with CONFIG_PROVE_RCU=y, which Miguel does not have set. Or did
>>> I miss something?
>>>
>>> Miguel is instead seeing an RCU CPU stall warning where RCU's grace-period
>>> kthread slept for three milliseconds, but did not wake up for more than
>>> 20 seconds. This kthread would normally have awakened on CPU 1, but
>>> CPU 1 looks to me to be very unhealthy, as can be seen in your console
>>> output below (but maybe my idea of what is healthy for powerpc systems
>>> is outdated). Please see also the inline annotations.
>>>
>>> Thoughts from the PPC guys?
>>
>> I haven't seen it in my testing. But using Miguel's config I can
>> reproduce it seemingly on every boot.
>>
>> For me it bisects to:
>>
>> 35de589cb879 ("powerpc/time: improve decrementer clockevent processing")
>>
>> Which seems plausible.
>>
>> Reverting that on mainline makes the bug go away.
>>
>> I don't see an obvious bug in the diff, but I could be wrong, or the old
>> code was papering over an existing bug?
>>
>> I'll try and work out what it is about Miguel's config that exposes
>> this vs our defconfig, that might give us a clue.
>
> It's CONFIG_HIGH_RES_TIMERS=n which triggers the stall.
>
> I can reproduce just with:
>
> $ make ppc64le_guest_defconfig
> $ ./scripts/config -d HIGH_RES_TIMERS
>
> We have no defconfigs that disable HIGH_RES_TIMERS, I didn't even
> realise you could disable it TBH :)
>
> The Rust CI has it disabled because I copied that from the x86 defconfig
> they were using back when I added the Rust support. I think that was
> meant to be a stripped down fast config for CI, but the result is it's
> just using a badly tested combination which is not helpful.
>
> So I'll send a patch to turn HIGH_RES_TIMERS on for the Rust CI, and we
> can debug this further without blocking them.
So we traced the problem down to possibly a misunderstanding between
decrementer clock event device and core code.
The decrementer is only oneshot*ish*. It actually needs to either be
reprogrammed or shut down otherwise it just continues to cause
interrupts.
Before commit 35de589cb879, it was sort of two-shot. The initial
interrupt at the programmed time would set its internal next_tb variable
to ~0 and call the ->event_handler(). If that did not set_next_event or
stop the timer, the interrupt will fire again immediately, notice
next_tb is ~0, and only then stop the decrementer interrupt.
So that was already kind of ugly, this patch just turned it into a hang.
The problem happens when the tick is stopped with an event still
pending, then tick_nohz_handler() is called, but it bails out because
tick_stopped == 1 so the device never gets programmed again, and so it
keeps firing.
How to fix it? Before commit a7cba02deced, powerpc's decrementer was
really oneshot, but we would like to avoid doing that because it requires
additional programming of the hardware on each timer interrupt. We have
the ONESHOT_STOPPED state which seems to be just about what we want.
Did the ONESHOT_STOPPED patch just miss this case, or is there a reason
we don't stop it here? This patch seems to fix the hang (not heavily
tested though).
Thanks,
Nick
---
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 2d76c91b85de..7e13a55b6b71 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1364,9 +1364,11 @@ static void tick_nohz_handler(struct clock_event_device *dev)
tick_sched_do_timer(ts, now);
tick_sched_handle(ts, regs);
- /* No need to reprogram if we are running tickless */
- if (unlikely(ts->tick_stopped))
+ if (unlikely(ts->tick_stopped)) {
+ /* If we are tickless, change the clock event to stopped */
+ tick_program_event(KTIME_MAX, 1);
return;
+ }
hrtimer_forward(&ts->sched_timer, now, TICK_NSEC);
tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2022-04-13 5:11 ` Nicholas Piggin
@ 2022-04-22 15:53 ` Thomas Gleixner
0 siblings, 0 replies; 1651+ messages in thread
From: Thomas Gleixner @ 2022-04-22 15:53 UTC (permalink / raw)
To: Nicholas Piggin, Michael Ellerman, paulmck, Zhouyi Zhou
Cc: Viresh Kumar, Daniel Lezcano, linux-kernel, rcu, Miguel Ojeda,
linuxppc-dev
On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
> So we traced the problem down to possibly a misunderstanding between
> decrementer clock event device and core code.
>
> The decrementer is only oneshot*ish*. It actually needs to either be
> reprogrammed or shut down otherwise it just continues to cause
> interrupts.
I always thought that PPC had sane timers. That's really disillusioning.
> Before commit 35de589cb879, it was sort of two-shot. The initial
> interrupt at the programmed time would set its internal next_tb variable
> to ~0 and call the ->event_handler(). If that did not set_next_event or
> stop the timer, the interrupt will fire again immediately, notice
> next_tb is ~0, and only then stop the decrementer interrupt.
>
> So that was already kind of ugly, this patch just turned it into a hang.
>
> The problem happens when the tick is stopped with an event still
> pending, then tick_nohz_handler() is called, but it bails out because
> tick_stopped == 1 so the device never gets programmed again, and so it
> keeps firing.
>
> How to fix it? Before commit a7cba02deced, powerpc's decrementer was
> really oneshot, but we would like to avoid doing that because it requires
> additional programming of the hardware on each timer interrupt. We have
> the ONESHOT_STOPPED state which seems to be just about what we want.
>
> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason
> we don't stop it here? This patch seems to fix the hang (not heavily
> tested though).
This was definitely overlooked, but it's arguable it is not required
for real oneshot clockevent devices. This should only handle the case
where the interrupt was already pending.
The ONESHOT_STOPPED state was introduced to handle the case where the
last timer gets canceled, so the already programmed event does not fire.
It was not necessarily meant to "fix" clockevent devices which are
pretending to be ONESHOT, but keep firing over and over.
That said, I'm fine with the change along with a big fat comment why
this is required.
Thanks,
tglx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-04-22 15:53 ` Re: Thomas Gleixner
@ 2022-04-23 2:29 ` Nicholas Piggin
0 siblings, 0 replies; 1651+ messages in thread
From: Nicholas Piggin @ 2022-04-23 2:29 UTC (permalink / raw)
To: Michael Ellerman, paulmck, Thomas Gleixner, Zhouyi Zhou
Cc: Viresh Kumar, Daniel Lezcano, linux-kernel, rcu, Miguel Ojeda,
linuxppc-dev
Excerpts from Thomas Gleixner's message of April 23, 2022 1:53 am:
> On Wed, Apr 13 2022 at 15:11, Nicholas Piggin wrote:
>> So we traced the problem down to possibly a misunderstanding between
>> decrementer clock event device and core code.
>>
>> The decrementer is only oneshot*ish*. It actually needs to either be
>> reprogrammed or shut down otherwise it just continues to cause
>> interrupts.
>
> I always thought that PPC had sane timers. That's really disillusioning.
My comment was probably a somewhat misleading explanation of the whole
situation. This weirdness is actually in software, in the powerpc
clock event driver, due to a recent change I made assuming the clock
event goes to oneshot-stopped.
The hardware is relatively sane, I think: a globally synchronized,
constant-rate, high-frequency clock distributed to the CPUs, so reads
don't go off-core, and a per-CPU "decrementer" event interrupt at the
same frequency as the clock -- program it to a positive value and it
decrements until zero, then raises what is basically a level-triggered
interrupt.
Before my change, the decrementer interrupt would always clear the
interrupt at entry. The event_handler usually programs another
timer, so I tried to avoid that first clear, counting on the
oneshot_stopped callback to clear the interrupt if there was no
other timer.
>> Before commit 35de589cb879, it was sort of two-shot. The initial
>> interrupt at the programmed time would set its internal next_tb variable
>> to ~0 and call the ->event_handler(). If that did not set_next_event or
>> stop the timer, the interrupt will fire again immediately, notice
>> next_tb is ~0, and only then stop the decrementer interrupt.
>>
>> So that was already kind of ugly, this patch just turned it into a hang.
>>
>> The problem happens when the tick is stopped with an event still
>> pending, then tick_nohz_handler() is called, but it bails out because
>> tick_stopped == 1 so the device never gets programmed again, and so it
>> keeps firing.
>>
>> How to fix it? Before commit a7cba02deced, powerpc's decrementer was
>> really oneshot, but we would like to avoid doing that because it requires
>> additional programming of the hardware on each timer interrupt. We have
>> the ONESHOT_STOPPED state which seems to be just about what we want.
>>
>> Did the ONESHOT_STOPPED patch just miss this case, or is there a reason
>> we don't stop it here? This patch seems to fix the hang (not heavily
>> tested though).
>
> This was definitely overlooked, but it's arguable it is not required
> for real oneshot clockevent devices. This should only handle the case
> where the interrupt was already pending.
>
> The ONESHOT_STOPPED state was introduced to handle the case where the
> last timer gets canceled, so the already programmed event does not fire.
>
> It was not necessarily meant to "fix" clockevent devices which are
> pretending to be ONESHOT, but keep firing over and over.
>
> That said, I'm fine with the change along with a big fat comment why
> this is required.
Thanks for taking a look and confirming. I just sent a patch with a
comment and what looks like another missed case. Hopefully it's okay.
Thanks,
Nick
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-04-06 7:51 Christian König
2022-04-06 12:59 ` Daniel Vetter
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2022-04-06 7:51 UTC (permalink / raw)
To: daniel.vetter, dri-devel
Hi Daniel,
rebased on top of all the changes in drm-misc-next now and hopefully
ready for 5.19.
I think I addressed all concerns, but there was a bunch of rebase fallout
from i915, so better to double check that once more.
Regards,
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-04-06 7:51 Christian König
@ 2022-04-06 12:59 ` Daniel Vetter
0 siblings, 0 replies; 1651+ messages in thread
From: Daniel Vetter @ 2022-04-06 12:59 UTC (permalink / raw)
To: DMA-resvusage; +Cc: daniel.vetter, dri-devel
On Wed, Apr 06, 2022 at 09:51:16AM +0200, Christian König wrote:
> Hi Daniel,
>
> rebased on top of all the changes in drm-misc-next now and hopefully
> ready for 5.19.
>
> I think I addressed all concerns, but there was a bunch of rebase fallout
> from i915, so better to double check that once more.
No idea what you managed to do with this series, but
- cover letter isn't showing up in archives
- you have Reply-To: DMA-resvusage sprinkled all over, which means my
replies are bouncing in funny ways
Please fix for next time around.
Also the split-up patches are a bit lacking in Cc.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus
@ 2022-04-17 17:43 Richard Henderson
2022-04-17 17:43 ` [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 to bool Richard Henderson
0 siblings, 1 reply; 1651+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm
Supersedes: 20220412003326.588530-1-richard.henderson@linaro.org
("target/arm: 8 new features, A76 and N1")
Changes for v3:
* More field updates for H.a. This is not nearly complete, but what
I've encountered so far as I've begun implementing SME.
* Use bool instead of uint32_t for env->{aarch64,thumb}.
I had anticipated other changes for implementing PSTATE.{SM,FA},
but dropped those; these seemed worth keeping.
* Use tcg_constant_* more -- got stuck on this while working on...
* Lots of cleanups to ARMCPRegInfo.
* Discard unreachable cpregs when ELx not available.
* Transform EL2 regs to RES0 when EL3 present but EL2 isn't.
This greatly simplifies registration of cpregs for new features.
Changes contextidr_el2, minimal_ras_reginfo, scxtnum_reginfo
within this patch set; other uses coming for SME.
r~
Richard Henderson (60):
tcg: Add tcg_constant_ptr
target/arm: Update ISAR fields for ARMv8.8
target/arm: Update SCR_EL3 bits to ARMv8.8
target/arm: Update SCTLR bits to ARMv9.2
target/arm: Change DisasContext.aarch64 to bool
target/arm: Change CPUArchState.aarch64 to bool
target/arm: Extend store_cpu_offset to take field size
target/arm: Change DisasContext.thumb to bool
target/arm: Change CPUArchState.thumb to bool
target/arm: Remove fpexc32_access
target/arm: Split out set_btype_raw
target/arm: Split out gen_rebuild_hflags
target/arm: Use tcg_constant in translate-a64.c
target/arm: Simplify GEN_SHIFT in translate.c
target/arm: Simplify gen_sar
target/arm: Simplify aa32 DISAS_WFI
target/arm: Use tcg_constant in translate.c
target/arm: Use tcg_constant in translate-m-nocp.c
target/arm: Use tcg_constant in translate-neon.c
target/arm: Use smin/smax for do_sat_addsub_32
target/arm: Use tcg_constant in translate-sve.c
target/arm: Use tcg_constant in translate-vfp.c
target/arm: Use tcg_constant_i32 in translate.h
target/arm: Split out cpregs.h
target/arm: Reorg CPAccessResult and access_check_cp_reg
target/arm: Replace sentinels with ARRAY_SIZE in cpregs.h
target/arm: Make some more cpreg data static const
target/arm: Reorg ARMCPRegInfo type field bits
target/arm: Change cpreg access permissions to enum
target/arm: Name CPState type
target/arm: Name CPSecureState type
target/arm: Update sysreg fields when redirecting for E2H
target/arm: Store cpregs key in the hash table directly
target/arm: Cleanup add_cpreg_to_hashtable
target/arm: Handle cpreg registration for missing EL
target/arm: Drop EL3 no EL2 fallbacks
target/arm: Merge zcr reginfo
target/arm: Add isar predicates for FEAT_Debugv8p2
target/arm: Adjust definition of CONTEXTIDR_EL2
target/arm: Move cortex impdef sysregs to cpu_tcg.c
target/arm: Update qemu-system-arm -cpu max to cortex-a57
target/arm: Set ID_DFR0.PerfMon for qemu-system-arm -cpu max
target/arm: Split out aa32_max_features
target/arm: Annotate arm_max_initfn with FEAT identifiers
target/arm: Use field names for manipulating EL2 and EL3 modes
target/arm: Enable FEAT_Debugv8p2 for -cpu max
target/arm: Enable FEAT_Debugv8p4 for -cpu max
target/arm: Add isar_feature_{aa64,any}_ras
target/arm: Add minimal RAS registers
target/arm: Enable SCR and HCR bits for RAS
target/arm: Implement virtual SError exceptions
target/arm: Implement ESB instruction
target/arm: Enable FEAT_RAS for -cpu max
target/arm: Enable FEAT_IESB for -cpu max
target/arm: Enable FEAT_CSV2 for -cpu max
target/arm: Enable FEAT_CSV2_2 for -cpu max
target/arm: Enable FEAT_CSV3 for -cpu max
target/arm: Enable FEAT_DGH for -cpu max
target/arm: Define cortex-a76
target/arm: Define neoverse-n1
docs/system/arm/emulation.rst | 10 +
docs/system/arm/virt.rst | 2 +
include/tcg/tcg.h | 2 +
target/arm/cpregs.h | 459 +++++++++++++++++
target/arm/cpu.h | 475 ++++--------------
target/arm/helper.h | 1 +
target/arm/internals.h | 16 +
target/arm/syndrome.h | 5 +
target/arm/translate-a32.h | 13 +-
target/arm/translate.h | 17 +-
target/arm/a32.decode | 16 +-
target/arm/t32.decode | 18 +-
hw/arm/pxa2xx.c | 2 +-
hw/arm/pxa2xx_pic.c | 2 +-
hw/arm/sbsa-ref.c | 2 +
hw/arm/virt.c | 2 +
hw/intc/arm_gicv3_cpuif.c | 6 +-
hw/intc/arm_gicv3_kvm.c | 3 +-
linux-user/arm/cpu_loop.c | 2 +-
target/arm/cpu.c | 88 ++--
target/arm/cpu64.c | 349 +++++++------
target/arm/cpu_tcg.c | 232 ++++++---
target/arm/gdbstub.c | 5 +-
target/arm/helper-a64.c | 4 +-
target/arm/helper.c | 897 ++++++++++++++++++----------------
target/arm/hvf/hvf.c | 2 +-
target/arm/m_helper.c | 6 +-
target/arm/op_helper.c | 113 +++--
target/arm/translate-a64.c | 395 ++++++---------
target/arm/translate-m-nocp.c | 12 +-
target/arm/translate-neon.c | 21 +-
target/arm/translate-sve.c | 207 +++-----
target/arm/translate-vfp.c | 76 +--
target/arm/translate.c | 400 +++++++--------
34 files changed, 2026 insertions(+), 1834 deletions(-)
create mode 100644 target/arm/cpregs.h
--
2.25.1
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 to bool
2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
@ 2022-04-17 17:43 ` Richard Henderson
2022-04-19 11:17 ` Alex Bennée
0 siblings, 1 reply; 1651+ messages in thread
From: Richard Henderson @ 2022-04-17 17:43 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm
Bool is a more appropriate type for this value.
Adjust the assignments to use true/false.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/cpu.h | 2 +-
target/arm/cpu.c | 2 +-
target/arm/helper-a64.c | 4 ++--
target/arm/helper.c | 2 +-
target/arm/hvf/hvf.c | 2 +-
5 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9ae9c935a2..a61a52e2f6 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -258,7 +258,7 @@ typedef struct CPUArchState {
* all other bits are stored in their correct places in env->pstate
*/
uint32_t pstate;
- uint32_t aarch64; /* 1 if CPU is in aarch64 state; inverse of PSTATE.nRW */
+ bool aarch64; /* True if CPU is in aarch64 state; inverse of PSTATE.nRW */
/* Cached TBFLAGS state. See below for which bits are included. */
CPUARMTBFlags hflags;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5d4ca7a227..30e0d16ad4 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -189,7 +189,7 @@ static void arm_cpu_reset(DeviceState *dev)
if (arm_feature(env, ARM_FEATURE_AARCH64)) {
/* 64 bit CPUs always start in 64 bit mode */
- env->aarch64 = 1;
+ env->aarch64 = true;
#if defined(CONFIG_USER_ONLY)
env->pstate = PSTATE_MODE_EL0t;
/* Userspace expects access to DC ZVA, CTL_EL0 and the cache ops */
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index 7cf953b1e6..77a8502b6b 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -952,7 +952,7 @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
qemu_mutex_unlock_iothread();
if (!return_to_aa64) {
- env->aarch64 = 0;
+ env->aarch64 = false;
/* We do a raw CPSR write because aarch64_sync_64_to_32()
* will sort the register banks out for us, and we've already
* caught all the bad-mode cases in el_from_spsr().
@@ -975,7 +975,7 @@ void HELPER(exception_return)(CPUARMState *env, uint64_t new_pc)
} else {
int tbii;
- env->aarch64 = 1;
+ env->aarch64 = true;
spsr &= aarch64_pstate_valid_mask(&env_archcpu(env)->isar);
pstate_write(env, spsr);
if (!arm_singlestep_active(env)) {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 7d14650615..47fe790854 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10182,7 +10182,7 @@ static void arm_cpu_do_interrupt_aarch64(CPUState *cs)
}
pstate_write(env, PSTATE_DAIF | new_mode);
- env->aarch64 = 1;
+ env->aarch64 = true;
aarch64_restore_sp(env, new_el);
helper_rebuild_hflags_a64(env, new_el);
diff --git a/target/arm/hvf/hvf.c b/target/arm/hvf/hvf.c
index 8c34f86792..11176ef252 100644
--- a/target/arm/hvf/hvf.c
+++ b/target/arm/hvf/hvf.c
@@ -565,7 +565,7 @@ int hvf_arch_init_vcpu(CPUState *cpu)
hv_return_t ret;
int i;
- env->aarch64 = 1;
+ env->aarch64 = true;
asm volatile("mrs %0, cntfrq_el0" : "=r"(arm_cpu->gt_cntfrq_hz));
/* Allocate enough space for our sysreg sync */
--
2.25.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr
@ 2022-05-15 20:36 Jiri Olsa
2023-05-20 9:47 ` Ze Gao
0 siblings, 1 reply; 1651+ messages in thread
From: Jiri Olsa @ 2022-05-15 20:36 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Masami Hiramatsu, Paul E. McKenney
Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Steven Rostedt
Make arch_cpu_idle and rcu_idle_exit noinstr. Both functions run
in rcu 'not watching' context, and if there's a tracer attached to
them which uses rcu (e.g. the kprobe multi interface), it will hit
an RCU warning like:
[ 3.017540] WARNING: suspicious RCU usage
...
[ 3.018363] kprobe_multi_link_handler+0x68/0x1c0
[ 3.018364] ? kprobe_multi_link_handler+0x3e/0x1c0
[ 3.018366] ? arch_cpu_idle_dead+0x10/0x10
[ 3.018367] ? arch_cpu_idle_dead+0x10/0x10
[ 3.018371] fprobe_handler.part.0+0xab/0x150
[ 3.018374] 0xffffffffa00080c8
[ 3.018393] ? arch_cpu_idle+0x5/0x10
[ 3.018398] arch_cpu_idle+0x5/0x10
[ 3.018399] default_idle_call+0x59/0x90
[ 3.018401] do_idle+0x1c3/0x1d0
The call path is the following:
default_idle_call
rcu_idle_enter
arch_cpu_idle
rcu_idle_exit
arch_cpu_idle and rcu_idle_exit are the only functions in the above
path that are traceable and cause this problem on my setup.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
arch/x86/kernel/process.c | 2 +-
kernel/rcu/tree.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b370767f5b19..1345cb0124a6 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -720,7 +720,7 @@ void arch_cpu_idle_dead(void)
/*
* Called from the generic idle code.
*/
-void arch_cpu_idle(void)
+void noinstr arch_cpu_idle(void)
{
x86_idle();
}
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index a4b8189455d5..20d529722f51 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -896,7 +896,7 @@ static void noinstr rcu_eqs_exit(bool user)
* If you add or remove a call to rcu_idle_exit(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
-void rcu_idle_exit(void)
+void noinstr rcu_idle_exit(void)
{
unsigned long flags;
--
2.35.3
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* (no subject)
2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
@ 2023-05-20 9:47 ` Ze Gao
2023-05-21 3:58 ` Yonghong Song
2023-05-21 8:08 ` Re: Jiri Olsa
0 siblings, 2 replies; 1651+ messages in thread
From: Ze Gao @ 2023-05-20 9:47 UTC (permalink / raw)
To: jolsa
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
Hi Jiri,
Would you like to consider adding an rcu_is_watching check
to solve this from the viewpoint of kprobe_multi_link_prog_run
itself? And accounting of missed runs can be added as well
to improve observability.
Regards,
Ze
-----------------
From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
From: Ze Gao <zegao@tencent.com>
Date: Sat, 20 May 2023 17:32:05 +0800
Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
From the perspective of kprobe_multi_link_prog_run, any traceable
function can be attached, while bpf progs need special care and
ought to be under rcu protection. To solve the likely rcu lockdep
warnings once and for all, when (future) functions in the idle path are
attached accidentally, we had better pay some cost to check at least
on the kernel side, and return when rcu is not watching, which helps
to avoid any unpredictable results.
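The guard proposed in the patch below can be modeled in a few lines of userspace C. This is a simplified sketch, not the real kprobe_multi_link_prog_run(): `bpf_prog_active` stands in for the per-CPU recursion counter and `rcu_watching` for rcu_is_watching().

```c
#include <assert.h>
#include <stdbool.h>

static int  bpf_prog_active;      /* stand-in for the per-CPU counter */
static bool rcu_watching = true;  /* stand-in for rcu_is_watching()   */

static bool rcu_is_watching_sim(void) { return rcu_watching; }

/* Returns true if the (simulated) bpf prog ran, false if the call was
 * skipped, mirroring the "err = 0; goto out;" path in the patch. */
static bool prog_run_sim(void)
{
    bool ran = false;

    if (++bpf_prog_active != 1 || !rcu_is_watching_sim())
        goto out;
    ran = true;   /* the bpf prog would run here */
out:
    --bpf_prog_active;
    return ran;
}
```

The prog is skipped both on recursion and when RCU is not watching; either way the counter is balanced on exit.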
Signed-off-by: Ze Gao <zegao@tencent.com>
---
kernel/trace/bpf_trace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 9a050e36dc6c..3e6ea7274765 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
struct bpf_run_ctx *old_run_ctx;
int err;
- if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
+ if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
err = 0;
goto out;
}
--
2.40.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2023-05-20 9:47 ` Ze Gao
@ 2023-05-21 3:58 ` Yonghong Song
2023-05-21 15:10 ` Re: Ze Gao
2023-05-21 8:08 ` Re: Jiri Olsa
1 sibling, 1 reply; 1651+ messages in thread
From: Yonghong Song @ 2023-05-21 3:58 UTC (permalink / raw)
To: Ze Gao, jolsa
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On 5/20/23 2:47 AM, Ze Gao wrote:
>
> Hi Jiri,
>
> Would you like to consider adding an rcu_is_watching check
> to solve this from the viewpoint of kprobe_multi_link_prog_run
> itself? And accounting of missed runs can be added as well
> to improve observability.
>
> Regards,
> Ze
>
>
> -----------------
> From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
>
> From the perspective of kprobe_multi_link_prog_run, any traceable
> function can be attached, while bpf progs need special care and
> ought to be under rcu protection. To solve the likely rcu lockdep
> warnings once and for all, when (future) functions in the idle path are
> attached accidentally, we had better pay some cost to check at least
> on the kernel side, and return when rcu is not watching, which helps
> to avoid any unpredictable results.
kprobe_multi/fprobe share the same set of attachments with fentry.
Currently, fentry does not filter with !rcu_is_watching, maybe
because this is an extreme corner case. Not sure whether it is
worthwhile or not.
Maybe you can give a concrete example (e.g., an attachment point)
with the current code base to show what issue you encountered;
that will make it easier to judge whether adding !rcu_is_watching()
is necessary or not.
>
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
> kernel/trace/bpf_trace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> struct bpf_run_ctx *old_run_ctx;
> int err;
>
> - if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> + if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> err = 0;
> goto out;
> }
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-21 3:58 ` Yonghong Song
@ 2023-05-21 15:10 ` Ze Gao
2023-05-21 20:26 ` Re: Jiri Olsa
0 siblings, 1 reply; 1651+ messages in thread
From: Ze Gao @ 2023-05-21 15:10 UTC (permalink / raw)
To: Yonghong Song
Cc: jolsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
kpsingh, netdev, paulmck, songliubraving, Ze Gao
> kprobe_multi/fprobe share the same set of attachments with fentry.
> Currently, fentry does not filter with !rcu_is_watching, maybe
> because this is an extreme corner case. Not sure whether it is
> worthwhile or not.
Agreed, it's rare, especially after Peter's patches, which narrow
down the rcu eqs regions
in the idle path and reduce the chance of any traceable functions
happening in between.
However, from RCU's perspective, we ought to check if rcu_is_watching
theoretically
when there's a chance our code will run in the idle path and also we
need rcu to be alive,
And also we cannot simply make assumptions for any future changes in
the idle path.
You know, just like what was hit in the thread.
> Maybe you can give a concrete example (e.g., an attachment point)
> with the current code base to show what issue you encountered;
> that will make it easier to judge whether adding !rcu_is_watching()
> is necessary or not.
I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
traceable but not on the latest version
so far. But as I state above, in theory we need it. So here is a
gentle ping :) .
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-21 15:10 ` Re: Ze Gao
@ 2023-05-21 20:26 ` Jiri Olsa
2023-05-22 1:36 ` Re: Masami Hiramatsu
2023-05-22 2:07 ` Re: Ze Gao
0 siblings, 2 replies; 1651+ messages in thread
From: Jiri Olsa @ 2023-05-21 20:26 UTC (permalink / raw)
To: Ze Gao
Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > kprobe_multi/fprobe share the same set of attachments with fentry.
> > Currently, fentry does not filter with !rcu_is_watching, maybe
> > because this is an extreme corner case. Not sure whether it is
> > worthwhile or not.
>
> Agreed, it's rare, especially after Peter's patches, which narrow
> down the rcu eqs regions
> in the idle path and reduce the chance of any traceable functions
> happening in between.
>
> However, from RCU's perspective, we ought to check if rcu_is_watching
> theoretically
> when there's a chance our code will run in the idle path and also we
> need rcu to be alive,
> And also we cannot simply make assumptions for any future changes in
> the idle path.
> You know, just like what was hit in the thread.
>
> > Maybe you can give a concrete example (e.g., an attachment point)
> > with the current code base to show what issue you encountered;
> > that will make it easier to judge whether adding !rcu_is_watching()
> > is necessary or not.
>
> I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> traceable but not on the latest version
> so far. But as I state above, in theory we need it. So here is a
> gentle ping :) .
hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]
I might be missing something, but it seems like we don't need another
rcu_is_watching call on kprobe_multi level
jirka
[1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
[2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/
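The layering Jiri describes can be sketched as a small userspace model. This is our simplified illustration, not the real API: `trylock_sim` plays the role of ftrace_test_recursion_trylock(), which since commit d099dbfd3306 also fails when RCU is not watching, so the bpf prog is never reached and a second check at the kprobe_multi level would be redundant.

```c
#include <assert.h>
#include <stdbool.h>

static bool rcu_watching = true;  /* stand-in for rcu_is_watching() */
static int  recursion_depth;

/* Model of ftrace_test_recursion_trylock(): fails on recursion or
 * when RCU is not watching. */
static int trylock_sim(void)
{
    if (!rcu_watching || recursion_depth > 0)
        return -1;
    recursion_depth++;
    return 0;   /* "recursion bit" acquired */
}

static void unlock_sim(void) { recursion_depth--; }

/* Model of fprobe_handler(): bails out before the prog is reached. */
static bool fprobe_handler_sim(void)
{
    if (trylock_sim() < 0)
        return false;   /* prog never runs; no extra check needed */
    /* kprobe_multi_link_prog_run() would run here */
    unlock_sim();
    return true;
}
```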
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-21 20:26 ` Re: Jiri Olsa
@ 2023-05-22 1:36 ` Masami Hiramatsu
2023-05-22 2:07 ` Re: Ze Gao
1 sibling, 0 replies; 1651+ messages in thread
From: Masami Hiramatsu @ 2023-05-22 1:36 UTC (permalink / raw)
To: Jiri Olsa
Cc: Ze Gao, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Sun, 21 May 2023 22:26:37 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:
> On Sun, May 21, 2023 at 11:10:16PM +0800, Ze Gao wrote:
> > > kprobe_multi/fprobe share the same set of attachments with fentry.
> > > Currently, fentry does not filter with !rcu_is_watching, maybe
> > > because this is an extreme corner case. Not sure whether it is
> > > worthwhile or not.
> >
> > Agreed, it's rare, especially after Peter's patches, which narrow
> > down the rcu eqs regions
> > in the idle path and reduce the chance of any traceable functions
> > happening in between.
> >
> > However, from RCU's perspective, we ought to check if rcu_is_watching
> > theoretically
> > when there's a chance our code will run in the idle path and also we
> > need rcu to be alive,
> > And also we cannot simply make assumptions for any future changes in
> > the idle path.
> > You know, just like what was hit in the thread.
> >
> > > Maybe you can give a concrete example (e.g., an attachment point)
> > > with the current code base to show what issue you encountered;
> > > that will make it easier to judge whether adding !rcu_is_watching()
> > > is necessary or not.
> >
> > I can reproduce likely warnings on v6.1.18 where arch_cpu_idle is
> > traceable but not on the latest version
> > so far. But as I state above, in theory we need it. So here is a
> > gentle ping :) .
>
> hum, this change [1] added rcu_is_watching check to ftrace_test_recursion_trylock,
> which we use in fprobe_handler and is coming to fprobe_exit_handler in [2]
>
> I might be missing something, but it seems like we don't need another
> rcu_is_watching call on kprobe_multi level
Good point! OK, then it seems we don't need it. The rethook continues to
use the rcu_is_watching() because it is also used from kprobes, but the
kprobe_multi doesn't need it.
Thank you,
>
> jirka
>
>
> [1] d099dbfd3306 cpuidle: tracing: Warn about !rcu_is_watching()
> [2] https://lore.kernel.org/bpf/20230517034510.15639-4-zegao@tencent.com/
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-21 20:26 ` Re: Jiri Olsa
2023-05-22 1:36 ` Re: Masami Hiramatsu
@ 2023-05-22 2:07 ` Ze Gao
2023-05-23 4:38 ` Re: Yonghong Song
2023-05-23 5:30 ` Re: Masami Hiramatsu
1 sibling, 2 replies; 1651+ messages in thread
From: Ze Gao @ 2023-05-22 2:07 UTC (permalink / raw)
To: Jiri Olsa
Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
Oops, I missed that. Thanks for pointing that out; I thought it was
a conditional use of rcu_is_watching before.
One last point, I think we should double check on this
"fentry does not filter with !rcu_is_watching"
as quoted from Yonghong and argue whether it needs
the same check for fentry as well.
Regards,
Ze
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-22 2:07 ` Re: Ze Gao
@ 2023-05-23 4:38 ` Yonghong Song
2023-05-23 5:30 ` Re: Masami Hiramatsu
1 sibling, 0 replies; 1651+ messages in thread
From: Yonghong Song @ 2023-05-23 4:38 UTC (permalink / raw)
To: Ze Gao, Jiri Olsa
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On 5/21/23 7:07 PM, Ze Gao wrote:
> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
>
> One last point, I think we should double check on this
> "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.
I would suggest that we address rcu_is_watching issue for fentry
only if we do have a reproducible case to show something goes wrong...
>
> Regards,
> Ze
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-22 2:07 ` Re: Ze Gao
2023-05-23 4:38 ` Re: Yonghong Song
@ 2023-05-23 5:30 ` Masami Hiramatsu
2023-05-23 6:59 ` Re: Paul E. McKenney
1 sibling, 1 reply; 1651+ messages in thread
From: Masami Hiramatsu @ 2023-05-23 5:30 UTC (permalink / raw)
To: Ze Gao
Cc: Jiri Olsa, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
Daniel Borkmann, Hao Luo, John Fastabend, KP Singh,
Martin KaFai Lau, Masami Hiramatsu, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Mon, 22 May 2023 10:07:42 +0800
Ze Gao <zegao2021@gmail.com> wrote:
> Oops, I missed that. Thanks for pointing that out, which I thought is
> conditional use of rcu_is_watching before.
>
> One last point, I think we should double check on this
> "fentry does not filter with !rcu_is_watching"
> as quoted from Yonghong and argue whether it needs
> the same check for fentry as well.
rcu_is_watching() comment says;
* if the current CPU is not in its idle loop or is in an interrupt or
* NMI handler, return true.
Thus it returns *fault* if the current CPU is in the idle loop and not in
any interrupt (including NMI) context. This means if any traceable function
is called from the idle loop, it can be !rcu_is_watching(). I meant, this is a
'context' based check, thus fentry cannot filter out that some commonly
used functions are called from that context, but it can be detected.
Thank you,
>
> Regards,
> Ze
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-23 5:30 ` Re: Masami Hiramatsu
@ 2023-05-23 6:59 ` Paul E. McKenney
2023-05-25 0:13 ` Re: Masami Hiramatsu
0 siblings, 1 reply; 1651+ messages in thread
From: Paul E. McKenney @ 2023-05-23 6:59 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
Ze Gao
On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> On Mon, 22 May 2023 10:07:42 +0800
> Ze Gao <zegao2021@gmail.com> wrote:
>
> > Oops, I missed that. Thanks for pointing that out, which I thought is
> > conditional use of rcu_is_watching before.
> >
> > One last point, I think we should double check on this
> > "fentry does not filter with !rcu_is_watching"
> > as quoted from Yonghong and argue whether it needs
> > the same check for fentry as well.
>
> rcu_is_watching() comment says;
>
> * if the current CPU is not in its idle loop or is in an interrupt or
> * NMI handler, return true.
>
> Thus it returns *fault* if the current CPU is in the idle loop and not in
> any interrupt (including NMI) context. This means if any traceable function
> is called from the idle loop, it can be !rcu_is_watching(). I meant, this is a
> 'context' based check, thus fentry cannot filter out that some commonly
> used functions are called from that context, but it can be detected.
It really does return false (rather than faulting?) if the current CPU
is deep within the idle loop.
In addition, the recent x86/entry rework (thank you Peter and
Thomas!) mean that the "idle loop" is quite restricted, as can be
seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
For example, in default_idle_call(), these are immediately before and
after the call to arch_cpu_idle().
Would the following help? Or am I missing your point?
Thanx, Paul
------------------------------------------------------------------------
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1449cb69a0e0..fae9b4e29c93 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
/**
* rcu_is_watching - see if RCU thinks that the current CPU is not idle
*
- * Return true if RCU is watching the running CPU, which means that this
- * CPU can safely enter RCU read-side critical sections. In other words,
- * if the current CPU is not in its idle loop or is in an interrupt or
- * NMI handler, return true.
+ * Return @true if RCU is watching the running CPU and @false otherwise.
+ * An @true return means that this CPU can safely enter RCU read-side
+ * critical sections.
+ *
+ * More specifically, if the current CPU is not deep within its idle
+ * loop, return @true. Note that rcu_is_watching() will return @true if
+ * invoked from an interrupt or NMI handler, even if that interrupt or
+ * NMI interrupted the CPU while it was deep within its idle loop.
*
* Make notrace because it can be called by the internal functions of
* ftrace, and making this notrace removes unnecessary recursion calls.
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2023-05-23 6:59 ` Re: Paul E. McKenney
@ 2023-05-25 0:13 ` Masami Hiramatsu
0 siblings, 0 replies; 1651+ messages in thread
From: Masami Hiramatsu @ 2023-05-25 0:13 UTC (permalink / raw)
To: paulmck
Cc: Ze Gao, Jiri Olsa, Yonghong Song, Alexei Starovoitov,
Andrii Nakryiko, Daniel Borkmann, Hao Luo, John Fastabend,
KP Singh, Martin KaFai Lau, Song Liu, Stanislav Fomichev,
Steven Rostedt, Yonghong Song, bpf, linux-kernel,
linux-trace-kernel, kafai, kpsingh, netdev, songliubraving,
Ze Gao
On Mon, 22 May 2023 23:59:28 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:
> On Tue, May 23, 2023 at 01:30:19PM +0800, Masami Hiramatsu wrote:
> > On Mon, 22 May 2023 10:07:42 +0800
> > Ze Gao <zegao2021@gmail.com> wrote:
> >
> > > Oops, I missed that. Thanks for pointing that out, which I thought is
> > > conditional use of rcu_is_watching before.
> > >
> > > One last point, I think we should double check on this
> > > "fentry does not filter with !rcu_is_watching"
> > > as quoted from Yonghong and argue whether it needs
> > > the same check for fentry as well.
> >
> > rcu_is_watching() comment says;
> >
> > * if the current CPU is not in its idle loop or is in an interrupt or
> > * NMI handler, return true.
> >
> > Thus it returns *fault* if the current CPU is in the idle loop and not in
> > any interrupt (including NMI) context. This means if any traceable function
> > is called from the idle loop, it can be !rcu_is_watching(). I meant, this is a
> > 'context' based check, thus fentry cannot filter out that some commonly
> > used functions are called from that context, but it can be detected.
>
> It really does return false (rather than faulting?) if the current CPU
> is deep within the idle loop.
>
> In addition, the recent x86/entry rework (thank you Peter and
> Thomas!) mean that the "idle loop" is quite restricted, as can be
> seen by the invocations of ct_cpuidle_enter() and ct_cpuidle_exit().
> For example, in default_idle_call(), these are immediately before and
> after the call to arch_cpu_idle().
Thanks! I also found that default_idle_call() is small enough, and
it seems this does not happen with fentry because there are no commonly used
functions on that path.
>
> Would the following help? Or am I missing your point?
Yes, thank you for the update!
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 1449cb69a0e0..fae9b4e29c93 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -679,10 +679,14 @@ static void rcu_disable_urgency_upon_qs(struct rcu_data *rdp)
> /**
> * rcu_is_watching - see if RCU thinks that the current CPU is not idle
> *
> - * Return true if RCU is watching the running CPU, which means that this
> - * CPU can safely enter RCU read-side critical sections. In other words,
> - * if the current CPU is not in its idle loop or is in an interrupt or
> - * NMI handler, return true.
> + * Return @true if RCU is watching the running CPU and @false otherwise.
> + * An @true return means that this CPU can safely enter RCU read-side
> + * critical sections.
> + *
> + * More specifically, if the current CPU is not deep within its idle
> + * loop, return @true. Note that rcu_is_watching() will return @true if
> + * invoked from an interrupt or NMI handler, even if that interrupt or
> + * NMI interrupted the CPU while it was deep within its idle loop.
> *
> * Make notrace because it can be called by the internal functions of
> * ftrace, and making this notrace removes unnecessary recursion calls.
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-20 9:47 ` Ze Gao
2023-05-21 3:58 ` Yonghong Song
@ 2023-05-21 8:08 ` Jiri Olsa
2023-05-21 10:09 ` Re: Masami Hiramatsu
1 sibling, 1 reply; 1651+ messages in thread
From: Jiri Olsa @ 2023-05-21 8:08 UTC (permalink / raw)
To: Ze Gao
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Hao Luo,
John Fastabend, KP Singh, Martin KaFai Lau, Masami Hiramatsu,
Song Liu, Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
>
> Hi Jiri,
>
> Would you like to consider adding an rcu_is_watching check
> to solve this from the viewpoint of kprobe_multi_link_prog_run
I think this was discussed in here:
https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
and was considered a bug, there's fix mentioned later in the thread
there's also this recent patchset:
https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
that solves related problems
> itself? And accounting of missed runs can be added as well
> to improve observability.
right, we count fprobe->nmissed but it's not exposed; we should allow
getting 'missed' stats from both fprobe and kprobe_multi later, which
is missing now, will check
thanks,
jirka
>
> Regards,
> Ze
>
>
> -----------------
> From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> From: Ze Gao <zegao@tencent.com>
> Date: Sat, 20 May 2023 17:32:05 +0800
> Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
>
> From the perspective of kprobe_multi_link_prog_run, any traceable
> function can be attached, while bpf progs need special care and
> ought to be under rcu protection. To solve the likely rcu lockdep
> warnings once and for all, when (future) functions in the idle path are
> attached accidentally, we had better pay some cost to check at least
> on the kernel side, and return when rcu is not watching, which helps
> to avoid any unpredictable results.
>
> Signed-off-by: Ze Gao <zegao@tencent.com>
> ---
> kernel/trace/bpf_trace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 9a050e36dc6c..3e6ea7274765 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> struct bpf_run_ctx *old_run_ctx;
> int err;
>
> - if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> + if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> err = 0;
> goto out;
> }
> --
> 2.40.1
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-21 8:08 ` Re: Jiri Olsa
@ 2023-05-21 10:09 ` Masami Hiramatsu
2023-05-21 14:19 ` Re: Ze Gao
0 siblings, 1 reply; 1651+ messages in thread
From: Masami Hiramatsu @ 2023-05-21 10:09 UTC (permalink / raw)
To: Jiri Olsa
Cc: Ze Gao, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau,
Masami Hiramatsu, Song Liu, Stanislav Fomichev, Steven Rostedt,
Yonghong Song, bpf, linux-kernel, linux-trace-kernel, kafai,
kpsingh, netdev, paulmck, songliubraving, Ze Gao
On Sun, 21 May 2023 10:08:46 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:
> On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> >
> > Hi Jiri,
> >
> > Would you like to consider adding an rcu_is_watching check
> > to solve this from the viewpoint of kprobe_multi_link_prog_run
>
> I think this was discussed in here:
> https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
>
> and was considered a bug, there's fix mentioned later in the thread
>
> there's also this recent patchset:
> https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
>
> that solves related problems
I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
check is required if the kprobe_multi_link_prog_run() uses any RCU API.
E.g. rethook_try_get() also checks rcu_is_watching() because it uses
call_rcu().
Thank you,
>
> > itself? And accounting of missed runs can be added as well
> > to improve observability.
>
> right, we count fprobe->nmissed but it's not exposed, we should allow
> to get 'missed' stats from both fprobe and kprobe_multi later, which
> is missing now, will check
>
> thanks,
> jirka
>
> >
> > Regards,
> > Ze
> >
> >
> > -----------------
> > From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> > From: Ze Gao <zegao@tencent.com>
> > Date: Sat, 20 May 2023 17:32:05 +0800
> > Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> >
> > From the perspective of kprobe_multi_link_prog_run, any traceable
> > function can be attached, while bpf progs need special care and
> > ought to be under rcu protection. To solve the likely rcu lockdep
> > warnings once and for all, when (future) functions in the idle path are
> > attached accidentally, we had better pay some cost to check at least
> > on the kernel side, and return when rcu is not watching, which helps
> > to avoid any unpredictable results.
> >
> > Signed-off-by: Ze Gao <zegao@tencent.com>
> > ---
> > kernel/trace/bpf_trace.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 9a050e36dc6c..3e6ea7274765 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> > struct bpf_run_ctx *old_run_ctx;
> > int err;
> >
> > - if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > + if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> > err = 0;
> > goto out;
> > }
> > --
> > 2.40.1
> >
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-05-21 10:09 ` Re: Masami Hiramatsu
@ 2023-05-21 14:19 ` Ze Gao
0 siblings, 0 replies; 1651+ messages in thread
From: Ze Gao @ 2023-05-21 14:19 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Jiri Olsa, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
Hao Luo, John Fastabend, KP Singh, Martin KaFai Lau, Song Liu,
Stanislav Fomichev, Steven Rostedt, Yonghong Song, bpf,
linux-kernel, linux-trace-kernel, kafai, kpsingh, netdev, paulmck,
songliubraving, Ze Gao
On Sun, May 21, 2023 at 6:09 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Sun, 21 May 2023 10:08:46 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > On Sat, May 20, 2023 at 05:47:24PM +0800, Ze Gao wrote:
> > >
> > > Hi Jiri,
> > >
> > > Would you like to consider adding an rcu_is_watching check
> > > to solve this from the viewpoint of kprobe_multi_link_prog_run
> >
> > I think this was discussed in here:
> > https://lore.kernel.org/bpf/20230321020103.13494-1-laoar.shao@gmail.com/
> >
> > and was considered a bug, there's fix mentioned later in the thread
> >
> > there's also this recent patchset:
> > https://lore.kernel.org/bpf/20230517034510.15639-3-zegao@tencent.com/
> >
> > that solves related problems
>
> I think this rcu_is_watching() is a bit different issue. This rcu_is_watching()
> check is required if the kprobe_multi_link_prog_run() uses any RCU API.
> E.g. rethook_try_get() also checks rcu_is_watching() because it uses
> call_rcu().
Yes, that's my point!
Regards,
Ze
>
> >
> > > itself? And accounting of missed runs can be added as well
> > > to improve observability.
> >
> > right, we count fprobe->nmissed but it's not exposed, we should allow
> > to get 'missed' stats from both fprobe and kprobe_multi later, which
> > is missing now, will check
> >
> > thanks,
> > jirka
> >
> > >
> > > Regards,
> > > Ze
> > >
> > >
> > > -----------------
> > > From 29fd3cd713e65461325c2703cf5246a6fae5d4fe Mon Sep 17 00:00:00 2001
> > > From: Ze Gao <zegao@tencent.com>
> > > Date: Sat, 20 May 2023 17:32:05 +0800
> > > Subject: [PATCH] bpf: kprobe_multi runs bpf progs only when rcu_is_watching
> > >
> > > From the perspective of kprobe_multi_link_prog_run, any traceable
> > > function can be attached, while bpf progs need special care and
> > > ought to be under rcu protection. To solve the likely rcu lockdep
> > > warnings once and for all, when (future) functions in the idle path are
> > > attached accidentally, we had better pay some cost to check at least
> > > on the kernel side, and return when rcu is not watching, which helps
> > > to avoid any unpredictable results.
> > >
> > > Signed-off-by: Ze Gao <zegao@tencent.com>
> > > ---
> > > kernel/trace/bpf_trace.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > > index 9a050e36dc6c..3e6ea7274765 100644
> > > --- a/kernel/trace/bpf_trace.c
> > > +++ b/kernel/trace/bpf_trace.c
> > > @@ -2622,7 +2622,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
> > > struct bpf_run_ctx *old_run_ctx;
> > > int err;
> > >
> > > - if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
> > > + if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1 || !rcu_is_watching())) {
> > > err = 0;
> > > goto out;
> > > }
> > > --
> > > 2.40.1
> > >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-05-19 9:54 Christian König
2022-05-19 10:50 ` Matthew Auld
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2022-05-19 9:54 UTC (permalink / raw)
To: intel-gfx; +Cc: matthew.william.auld, dri-devel
Just sending that out once more to intel-gfx to let the CI systems take
a look.
No functional change compared to the last version.
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-05-19 9:54 Christian König
@ 2022-05-19 10:50 ` Matthew Auld
2022-05-20 7:11 ` Re: Christian König
0 siblings, 1 reply; 1651+ messages in thread
From: Matthew Auld @ 2022-05-19 10:50 UTC (permalink / raw)
To: Christian König; +Cc: Intel Graphics Development, ML dri-devel
On Thu, 19 May 2022 at 10:55, Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Just sending that out once more to intel-gfx to let the CI systems take
> a look.
If all went well it should normally appear at [1][2], if CI was able
to pick up the series.
Since it's not currently there, I assume it's temporarily stuck in the
moderation queue, assuming you are not subscribed to intel-gfx ml? If
so, perhaps consider subscribing at [3] and then disable receiving any
mail from the ml, so you get full use of CI without getting spammed.
[1] https://intel-gfx-ci.01.org/queue/index.html
[2] https://patchwork.freedesktop.org/project/intel-gfx/series/
[3] https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> No functional change compared to the last version.
>
> Christian.
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-05-19 10:50 ` Matthew Auld
@ 2022-05-20 7:11 ` Christian König
0 siblings, 0 replies; 1651+ messages in thread
From: Christian König @ 2022-05-20 7:11 UTC (permalink / raw)
To: Matthew Auld; +Cc: Intel Graphics Development, ML dri-devel
Am 19.05.22 um 12:50 schrieb Matthew Auld:
> On Thu, 19 May 2022 at 10:55, Christian König
> <ckoenig.leichtzumerken@gmail.com> wrote:
>> Just sending that out once more to intel-gfx to let the CI systems take
>> a look.
> If all went well it should normally appear at [1][2], if CI was able
> to pick up the series.
>
> Since it's not currently there, I assume it's temporarily stuck in the
> moderation queue, assuming you are not subscribed to intel-gfx ml?
Ah! Well I am subscribed, just not with the e-Mail address I use to send
out those patches.
Going to fix that ASAP!
Thanks,
Christian.
> If
> so, perhaps consider subscribing at [3] and then disable receiving any
> mail from the ml, so you get full use of CI without getting spammed.
>
> [1] https://intel-gfx-ci.01.org/queue/index.html
> [2] https://patchwork.freedesktop.org/project/intel-gfx/series/
> [3] https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>> No functional change compared to the last version.
>>
>> Christian.
>>
>>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-06-06 5:33 Fenil Jain
2022-06-06 5:51 ` Greg Kroah-Hartman
0 siblings, 1 reply; 1651+ messages in thread
From: Fenil Jain @ 2022-06-06 5:33 UTC (permalink / raw)
To: Greg Kroah-Hartman; +Cc: Shuah Khan, stable
On Fri, Jun 03, 2022 at 07:43:01PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.18.2 release.
> There are 67 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 05 Jun 2022 17:38:05 +0000.
> Anything received after that time might be too late.
Hey Greg,
Ran tests and boot tested on my system, no regression found
Tested-by: Fenil Jain<fkjainco@gmail.com>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-06-06 5:33 Fenil Jain
@ 2022-06-06 5:51 ` Greg Kroah-Hartman
0 siblings, 0 replies; 1651+ messages in thread
From: Greg Kroah-Hartman @ 2022-06-06 5:51 UTC (permalink / raw)
To: Fenil Jain; +Cc: Shuah Khan, stable
On Mon, Jun 06, 2022 at 11:03:24AM +0530, Fenil Jain wrote:
> On Fri, Jun 03, 2022 at 07:43:01PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.18.2 release.
> > There are 67 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sun, 05 Jun 2022 17:38:05 +0000.
> > Anything received after that time might be too late.
>
> Hey Greg,
>
> Ran tests and boot tested on my system, no regression found
>
> Tested-by: Fenil Jain<fkjainco@gmail.com>
Thanks for the testing, but something went wrong with your email client
and it lost the Subject: line, making this impossible to be picked up by
our tools.
Also, please include an extra ' ' before the '<' character in your
tested-by line.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-08-26 22:03 Zach O'Keefe
2022-08-31 21:47 ` Yang Shi
0 siblings, 1 reply; 1651+ messages in thread
From: Zach O'Keefe @ 2022-08-26 22:03 UTC (permalink / raw)
To: linux-mm
Cc: Andrew Morton, linux-api, Axel Rasmussen, James Houghton,
Hugh Dickins, Yang Shi, Miaohe Lin, David Hildenbrand,
David Rientjes, Matthew Wilcox, Pasha Tatashin, Peter Xu,
Rongwei Wang, SeongJae Park, Song Liu, Vlastimil Babka,
Chris Kennelly, Kirill A. Shutemov, Minchan Kim, Patrick Xia,
Zach O'Keefe
Subject: [PATCH mm-unstable v2 0/9] mm: add file/shmem support to MADV_COLLAPSE
v2 Forward
Mostly a RESEND: rebase on latest mm-unstable + minor bug fixes from
kernel test robot.
--------------------------------
This series builds on top of the previous "mm: userspace hugepage collapse"
series which introduced the MADV_COLLAPSE madvise mode and added support
for private, anonymous mappings[1], by adding support for file and shmem
backed memory to CONFIG_READ_ONLY_THP_FOR_FS=y kernels.
File and shmem support have been added with effort to align with existing
MADV_COLLAPSE semantics and policy decisions[2]. Collapse of shmem-backed
memory ignores kernel-guiding directives and heuristics including all
sysfs settings (transparent_hugepage/shmem_enabled), and tmpfs huge= mount
options (shmem always supports large folios). Like anonymous mappings, on
successful return of MADV_COLLAPSE on file/shmem memory, the contents of
memory mapped by the addresses provided will be synchronously pmd-mapped
THPs.
This functionality unlocks two important uses:
(1) Immediately back executable text by THPs. Current support provided
by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large
system which might impair services from serving at their full rated
load after (re)starting. Tricks like mremap(2)'ing text onto
anonymous memory to immediately realize iTLB performance prevent
page sharing and demand paging, both of which increase steady state
memory footprint. Now, we can have the best of both worlds: Peak
upfront performance and lower RAM footprints.
(2) userfaultfd-based live migration of virtual machines satisfies UFFD
faults by fetching native-sized pages over the network (to avoid
latency of transferring an entire hugepage). However, after guest
memory has been fully copied to the new host, MADV_COLLAPSE can
be used to immediately increase guest performance.
khugepaged has received a small improvement by association and can now
detect and collapse pte-mapped THPs. However, there is still work to be
done along the file collapse path. Compound pages of arbitrary order still
need to be supported and THP collapse needs to be converted to using
folios in general. Eventually, we'd like to move away from the read-only
and executable-mapped constraints currently imposed on eligible files and
support any inode claiming huge folio support. That said, I think the
series as-is covers enough to claim that MADV_COLLAPSE supports file/shmem
memory.
Patches 1-3 Implement the guts of the series.
Patch 4 Is a tracepoint for debugging.
Patches 5-8 Refactor existing khugepaged selftests to work with new
memory types.
Patch 9 Adds a userfaultfd selftest mode to mimic a functional test
of UFFDIO_REGISTER_MODE_MINOR+MADV_COLLAPSE live migration.
Applies against mm-unstable.
[1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
[2] https://lore.kernel.org/linux-mm/YtBmhaiPHUTkJml8@google.com/
v1 -> v2:
- Add missing definition for khugepaged_add_pte_mapped_thp() in
!CONFIG_SHEM builds, in "mm/khugepaged: attempt to map
file/shmem-backed pte-mapped THPs by pmds"
- Minor bugfixes in "mm/madvise: add file and shmem support to
MADV_COLLAPSE" for !CONFIG_SHMEM, !CONFIG_TRANSPARENT_HUGEPAGE and some
compiler settings.
- Rebased on latest mm-unstable
Zach O'Keefe (9):
mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()
mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by
pmds
mm/madvise: add file and shmem support to MADV_COLLAPSE
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
selftests/vm: dedup THP helpers
selftests/vm: modularize thp collapse memory operations
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: add thp collapse shmem testing
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
include/linux/khugepaged.h | 13 +-
include/linux/shmem_fs.h | 10 +-
include/trace/events/huge_memory.h | 36 +
kernel/events/uprobes.c | 2 +-
mm/huge_memory.c | 2 +-
mm/khugepaged.c | 289 ++++--
mm/shmem.c | 18 +-
tools/testing/selftests/vm/Makefile | 2 +
tools/testing/selftests/vm/khugepaged.c | 828 ++++++++++++------
tools/testing/selftests/vm/soft-dirty.c | 2 +-
.../selftests/vm/split_huge_page_test.c | 12 +-
tools/testing/selftests/vm/userfaultfd.c | 171 +++-
tools/testing/selftests/vm/vm_util.c | 36 +-
tools/testing/selftests/vm/vm_util.h | 5 +-
14 files changed, 1040 insertions(+), 386 deletions(-)
--
2.37.2.672.g94769d06f0-goog
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-08-26 22:03 Zach O'Keefe
@ 2022-08-31 21:47 ` Yang Shi
2022-09-01 0:24 ` Re: Zach O'Keefe
0 siblings, 1 reply; 1651+ messages in thread
From: Yang Shi @ 2022-08-31 21:47 UTC (permalink / raw)
To: Zach O'Keefe
Cc: linux-mm, Andrew Morton, linux-api, Axel Rasmussen,
James Houghton, Hugh Dickins, Miaohe Lin, David Hildenbrand,
David Rientjes, Matthew Wilcox, Pasha Tatashin, Peter Xu,
Rongwei Wang, SeongJae Park, Song Liu, Vlastimil Babka,
Chris Kennelly, Kirill A. Shutemov, Minchan Kim, Patrick Xia
Hi Zach,
I did a quick look at the series, basically no show stopper to me. But
I didn't find time to review them thoroughly yet, quite busy on
something else. Just a heads up, I didn't mean to ignore you. I will
review them when I find some time.
Thanks,
Yang
On Fri, Aug 26, 2022 at 3:03 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> Subject: [PATCH mm-unstable v2 0/9] mm: add file/shmem support to MADV_COLLAPSE
>
> v2 Forward
>
> Mostly a RESEND: rebase on latest mm-unstable + minor bug fixes from
> kernel test robot.
> --------------------------------
>
> This series builds on top of the previous "mm: userspace hugepage collapse"
> series which introduced the MADV_COLLAPSE madvise mode and added support
> for private, anonymous mappings[1], by adding support for file and shmem
> backed memory to CONFIG_READ_ONLY_THP_FOR_FS=y kernels.
>
> File and shmem support have been added with effort to align with existing
> MADV_COLLAPSE semantics and policy decisions[2]. Collapse of shmem-backed
> memory ignores kernel-guiding directives and heuristics including all
> sysfs settings (transparent_hugepage/shmem_enabled), and tmpfs huge= mount
> options (shmem always supports large folios). Like anonymous mappings, on
> successful return of MADV_COLLAPSE on file/shmem memory, the contents of
> memory mapped by the addresses provided will be synchronously pmd-mapped
> THPs.
>
> This functionality unlocks two important uses:
>
> (1) Immediately back executable text by THPs. Current support provided
> by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large
> system which might impair services from serving at their full rated
> load after (re)starting. Tricks like mremap(2)'ing text onto
> anonymous memory to immediately realize iTLB performance prevent
> page sharing and demand paging, both of which increase steady state
> memory footprint. Now, we can have the best of both worlds: Peak
> upfront performance and lower RAM footprints.
>
> (2) userfaultfd-based live migration of virtual machines satisfies UFFD
> faults by fetching native-sized pages over the network (to avoid
> latency of transferring an entire hugepage). However, after guest
> memory has been fully copied to the new host, MADV_COLLAPSE can
> be used to immediately increase guest performance.
>
> khugepaged has received a small improvement by association and can now
> detect and collapse pte-mapped THPs. However, there is still work to be
> done along the file collapse path. Compound pages of arbitrary order still
> need to be supported and THP collapse needs to be converted to using
> folios in general. Eventually, we'd like to move away from the read-only
> and executable-mapped constraints currently imposed on eligible files and
> support any inode claiming huge folio support. That said, I think the
> series as-is covers enough to claim that MADV_COLLAPSE supports file/shmem
> memory.
>
> Patches 1-3 Implement the guts of the series.
> Patch 4 Is a tracepoint for debugging.
> Patches 5-8 Refactor existing khugepaged selftests to work with new
> memory types.
> Patch 9 Adds a userfaultfd selftest mode to mimic a functional test
> of UFFDIO_REGISTER_MODE_MINOR+MADV_COLLAPSE live migration.
>
> Applies against mm-unstable.
>
> [1] https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
> [2] https://lore.kernel.org/linux-mm/YtBmhaiPHUTkJml8@google.com/
>
> v1 -> v2:
> - Add missing definition for khugepaged_add_pte_mapped_thp() in
> !CONFIG_SHEM builds, in "mm/khugepaged: attempt to map
> file/shmem-backed pte-mapped THPs by pmds"
> - Minor bugfixes in "mm/madvise: add file and shmem support to
> MADV_COLLAPSE" for !CONFIG_SHMEM, !CONFIG_TRANSPARENT_HUGEPAGE and some
> compiler settings.
> - Rebased on latest mm-unstable
>
> Zach O'Keefe (9):
> mm/shmem: add flag to enforce shmem THP in hugepage_vma_check()
> mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by
> pmds
> mm/madvise: add file and shmem support to MADV_COLLAPSE
> mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
> selftests/vm: dedup THP helpers
> selftests/vm: modularize thp collapse memory operations
> selftests/vm: add thp collapse file and tmpfs testing
> selftests/vm: add thp collapse shmem testing
> selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
>
> include/linux/khugepaged.h | 13 +-
> include/linux/shmem_fs.h | 10 +-
> include/trace/events/huge_memory.h | 36 +
> kernel/events/uprobes.c | 2 +-
> mm/huge_memory.c | 2 +-
> mm/khugepaged.c | 289 ++++--
> mm/shmem.c | 18 +-
> tools/testing/selftests/vm/Makefile | 2 +
> tools/testing/selftests/vm/khugepaged.c | 828 ++++++++++++------
> tools/testing/selftests/vm/soft-dirty.c | 2 +-
> .../selftests/vm/split_huge_page_test.c | 12 +-
> tools/testing/selftests/vm/userfaultfd.c | 171 +++-
> tools/testing/selftests/vm/vm_util.c | 36 +-
> tools/testing/selftests/vm/vm_util.h | 5 +-
> 14 files changed, 1040 insertions(+), 386 deletions(-)
>
> --
> 2.37.2.672.g94769d06f0-goog
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-08-31 21:47 ` Yang Shi
@ 2022-09-01 0:24 ` Zach O'Keefe
0 siblings, 0 replies; 1651+ messages in thread
From: Zach O'Keefe @ 2022-09-01 0:24 UTC (permalink / raw)
To: Yang Shi
Cc: linux-mm, Andrew Morton, linux-api, Axel Rasmussen,
James Houghton, Hugh Dickins, Miaohe Lin, David Hildenbrand,
David Rientjes, Matthew Wilcox, Pasha Tatashin, Peter Xu,
Rongwei Wang, SeongJae Park, Song Liu, Vlastimil Babka,
Chris Kennelly, Kirill A. Shutemov, Minchan Kim, Patrick Xia
On Wed, Aug 31, 2022 at 2:47 PM Yang Shi <shy828301@gmail.com> wrote:
>
> Hi Zach,
>
> I did a quick look at the series, basically no show stopper to me. But
> I didn't find time to review them thoroughly yet, quite busy on
> something else. Just a heads up, I didn't mean to ignore you. I will
> review them when I find some time.
>
> Thanks,
> Yang
Hey Yang,
Thanks for taking the time to look through, and thanks for giving me a
heads up, and no rush!
In the last day or so, while porting this series around, I encountered
some subtle edge cases I wanted to clean up / address - so it's good
you didn't do a thorough review yet. I was *hoping* to have a v3 out
last night (which evidently did not happen) and it does not seem like
it will happen today, so I'll leave this message as a request for
reviewers to hold off on a thorough review until v3.
Thanks for your time as always,
Zach
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-08-28 21:01 Nick Neumann
2022-09-01 17:44 ` Nick Neumann
0 siblings, 1 reply; 1651+ messages in thread
From: Nick Neumann @ 2022-08-28 21:01 UTC (permalink / raw)
To: fio
I've filed the issue on github, but just thought I'd mention here too.
In real-world use it appears to be intermittent. I'm not yet sure how
intermittent, but I could see it being used in production and not
caught right away. I got lucky and stumbled on it when looking at
graphs of runs and noticed 15 seconds of no activity.
https://github.com/axboe/fio/issues/1457
With the null ioengine, I can make it reproduce very reliably, which
is encouraging as I move to debug.
I had just moved to using log compression as it is really powerful,
and the only way to store per I/O logs for a long run without pushing
up against the amount of physical memory in a system.
(Without compression, a GB of sequential writes at 128K block size is
on the order of 245KB of memory per log, so a TB is 245MB per log. Now
run a job to fill a 20TB drive and you're at 4.9GB for one log file.
If you record all 3 latency numbers too, you're talking close to
20GB.)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-08-28 21:01 Nick Neumann
@ 2022-09-01 17:44 ` Nick Neumann
0 siblings, 0 replies; 1651+ messages in thread
From: Nick Neumann @ 2022-09-01 17:44 UTC (permalink / raw)
To: fio
PR for this is up now.
On Sun, Aug 28, 2022 at 4:01 PM Nick Neumann <nick@pcpartpicker.com> wrote:
>
> I've filed the issue on github, but just thought I'd mention here too.
> In real-world use it appears to be intermittent. I'm not yet sure how
> intermittent, but I could see it being used in production and not
> caught right away. I got lucky and stumbled on it when looking at
> graphs of runs and noticed 15 seconds of no activity.
>
> https://github.com/axboe/fio/issues/1457
>
> With the null ioengine, I can make it reproduce very reliably, which
> is encouraging as I move to debug.
>
> I had just moved to using log compression as it is really powerful,
> and the only way to store per I/O logs for a long run without pushing
> up against the amount of physical memory in a system.
>
> (Without compression, a GB of sequential writes at 128K block size is
> on the order of 245KB of memory per log, so a TB is 245MB per log. Now
> run a job to fill a 20TB drive and you're at 4.9GB for one log file.
> If you record all 3 latency numbers too, you're talking close to
> 20GB.)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-09-12 12:36 Christian König
2022-09-13 2:04 ` Alex Deucher
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2022-09-12 12:36 UTC (permalink / raw)
To: alexander.deucher, amd-gfx
Hey Alex,
I've decided to split this patch set into two because we still can't
figure out where the VCN regressions come from.
Ruijing tested them and confirmed that they don't regress VCN.
Can you and maybe Felix take a look and review them?
Thanks,
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-09-12 12:36 Christian König
@ 2022-09-13 2:04 ` Alex Deucher
0 siblings, 0 replies; 1651+ messages in thread
From: Alex Deucher @ 2022-09-13 2:04 UTC (permalink / raw)
To: Christian König; +Cc: alexander.deucher, amd-gfx
On Mon, Sep 12, 2022 at 8:36 AM Christian König
<ckoenig.leichtzumerken@gmail.com> wrote:
>
> Hey Alex,
>
> I've decided to split this patch set into two because we still can't
> figure out where the VCN regressions come from.
>
> Ruijing tested them and confirmed that they don't regress VCN.
>
> Can you and maybe Felix take a look and review them?
Looks good to me. Series is:
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
> Thanks,
> Christian.
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-09-14 13:12 Amjad Ouled-Ameur
2022-09-14 13:18 ` Re: Amjad Ouled-Ameur
0 siblings, 1 reply; 1651+ messages in thread
From: Amjad Ouled-Ameur @ 2022-09-14 13:12 UTC (permalink / raw)
To: Rob Herring
Cc: Amjad Ouled-Ameur, Krzysztof Kozlowski, Matthias Brugger,
devicetree, linux-arm-kernel, linux-mediatek, linux-kernel
Subject: [PATCH] arm64: dts: mediatek: mt8183: remove thermal zones without
trips.
Thermal zones without trip point are not registered by thermal core.
tzts1 ~ tzts6 zones of mt8183 were initially introduced for test-purpose
only but are not supposed to remain on DT.
Remove the zones above and keep only cpu_thermal.
Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
---
arch/arm64/boot/dts/mediatek/mt8183.dtsi | 57 ------------------------
1 file changed, 57 deletions(-)
diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 9d32871973a2..f65fae8939de 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -1182,63 +1182,6 @@ THERMAL_NO_LIMIT
};
};
};
-
- /* The tzts1 ~ tzts6 don't need to polling */
- /* The tzts1 ~ tzts6 don't need to thermal throttle */
-
- tzts1: tzts1 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 1>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tzts2: tzts2 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 2>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tzts3: tzts3 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 3>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tzts4: tzts4 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 4>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tzts5: tzts5 {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 5>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
-
- tztsABB: tztsABB {
- polling-delay-passive = <0>;
- polling-delay = <0>;
- thermal-sensors = <&thermal 6>;
- sustainable-power = <5000>;
- trips {};
- cooling-maps {};
- };
};
pwm0: pwm@1100e000 {
--
2.37.3
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2022-09-14 13:12 Amjad Ouled-Ameur
@ 2022-09-14 13:18 ` Amjad Ouled-Ameur
0 siblings, 0 replies; 1651+ messages in thread
From: Amjad Ouled-Ameur @ 2022-09-14 13:18 UTC (permalink / raw)
To: Rob Herring
Cc: Krzysztof Kozlowski, Matthias Brugger, devicetree,
linux-arm-kernel, linux-mediatek, linux-kernel
Hi,
The subject has not been parsed correctly, I resent a proper patch here:
https://patchwork.kernel.org/project/linux-mediatek/patch/20220914131339.18348-1-aouledameur@baylibre.com/
Sorry for the noise.
Regards,
Amjad
On 9/14/22 15:12, Amjad Ouled-Ameur wrote:
> Subject: [PATCH] arm64: dts: mediatek: mt8183: remove thermal zones without
> trips.
>
> Thermal zones without trip point are not registered by thermal core.
>
> tzts1 ~ tzts6 zones of mt8183 were initially introduced for test-purpose
> only but are not supposed to remain on DT.
>
> Remove the zones above and keep only cpu_thermal.
>
> Signed-off-by: Amjad Ouled-Ameur <aouledameur@baylibre.com>
> ---
> arch/arm64/boot/dts/mediatek/mt8183.dtsi | 57 ------------------------
> 1 file changed, 57 deletions(-)
>
> diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> index 9d32871973a2..f65fae8939de 100644
> --- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> +++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
> @@ -1182,63 +1182,6 @@ THERMAL_NO_LIMIT
> };
> };
> };
> -
> - /* The tzts1 ~ tzts6 don't need to polling */
> - /* The tzts1 ~ tzts6 don't need to thermal throttle */
> -
> - tzts1: tzts1 {
> - polling-delay-passive = <0>;
> - polling-delay = <0>;
> - thermal-sensors = <&thermal 1>;
> - sustainable-power = <5000>;
> - trips {};
> - cooling-maps {};
> - };
> -
> - tzts2: tzts2 {
> - polling-delay-passive = <0>;
> - polling-delay = <0>;
> - thermal-sensors = <&thermal 2>;
> - sustainable-power = <5000>;
> - trips {};
> - cooling-maps {};
> - };
> -
> - tzts3: tzts3 {
> - polling-delay-passive = <0>;
> - polling-delay = <0>;
> - thermal-sensors = <&thermal 3>;
> - sustainable-power = <5000>;
> - trips {};
> - cooling-maps {};
> - };
> -
> - tzts4: tzts4 {
> - polling-delay-passive = <0>;
> - polling-delay = <0>;
> - thermal-sensors = <&thermal 4>;
> - sustainable-power = <5000>;
> - trips {};
> - cooling-maps {};
> - };
> -
> - tzts5: tzts5 {
> - polling-delay-passive = <0>;
> - polling-delay = <0>;
> - thermal-sensors = <&thermal 5>;
> - sustainable-power = <5000>;
> - trips {};
> - cooling-maps {};
> - };
> -
> - tztsABB: tztsABB {
> - polling-delay-passive = <0>;
> - polling-delay = <0>;
> - thermal-sensors = <&thermal 6>;
> - sustainable-power = <5000>;
> - trips {};
> - cooling-maps {};
> - };
> };
>
> pwm0: pwm@1100e000 {
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-11-09 14:34 Denis Arefev
2022-11-09 14:44 ` Greg Kroah-Hartman
0 siblings, 1 reply; 1651+ messages in thread
From: Denis Arefev @ 2022-11-09 14:34 UTC (permalink / raw)
To: David Airlie, Daniel Vetter, Greg Kroah-Hartman, stable
Cc: Alexey Khoroshilov, ldv-project, trufanov, vfh
Date: Wed, 9 Nov 2022 16:52:17 +0300
Subject: [PATCH 5.10] nbio_v7_4: Add pointer check
Return value of a function 'amdgpu_ras_find_obj' is dereferenced at nbio_v7_4.c:325 without checking for null
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Denis Arefev <arefev@swemel.ru>
---
drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
index eadc9526d33f..d2627a610e48 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
@@ -303,6 +303,9 @@ static void nbio_v7_4_handle_ras_controller_intr_no_bifring(struct amdgpu_device
struct ras_manager *obj = amdgpu_ras_find_obj(adev, adev->nbio.ras_if);
struct ras_err_data err_data = {0, 0, 0, NULL};
struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
+ if (!obj)
+ return;
bif_doorbell_intr_cntl = RREG32_SOC15(NBIO, 0, mmBIF_DOORBELL_INT_CNTL);
if (REG_GET_FIELD(bif_doorbell_intr_cntl,
--
2.25.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2022-11-09 14:34 Denis Arefev
@ 2022-11-09 14:44 ` Greg Kroah-Hartman
0 siblings, 0 replies; 1651+ messages in thread
From: Greg Kroah-Hartman @ 2022-11-09 14:44 UTC (permalink / raw)
To: Denis Arefev
Cc: David Airlie, Daniel Vetter, stable, Alexey Khoroshilov,
ldv-project, trufanov, vfh
On Wed, Nov 09, 2022 at 05:34:13PM +0300, Denis Arefev wrote:
> Date: Wed, 9 Nov 2022 16:52:17 +0300
> Subject: [PATCH 5.10] nbio_v7_4: Add pointer check
>
> Return value of a function 'amdgpu_ras_find_obj' is dereferenced at nbio_v7_4.c:325 without checking for null
>
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
>
> Signed-off-by: Denis Arefev <arefev@swemel.ru>
> ---
> drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> index eadc9526d33f..d2627a610e48 100644
> --- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> +++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c
> @@ -303,6 +303,9 @@ static void nbio_v7_4_handle_ras_controller_intr_no_bifring(struct amdgpu_device
> struct ras_manager *obj = amdgpu_ras_find_obj(adev, adev->nbio.ras_if);
> struct ras_err_data err_data = {0, 0, 0, NULL};
> struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
>
> + if (!obj)
> + return;
>
> bif_doorbell_intr_cntl = RREG32_SOC15(NBIO, 0, mmBIF_DOORBELL_INT_CNTL);
> if (REG_GET_FIELD(bif_doorbell_intr_cntl,
> --
> 2.25.1
>
<formletter>
This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.
</formletter>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-11-18 2:00 Jiamei Xie
2022-11-18 7:47 ` Michal Orzel
0 siblings, 1 reply; 1651+ messages in thread
From: Jiamei Xie @ 2022-11-18 2:00 UTC (permalink / raw)
To: xen-devel
Cc: Jiamei Xie, Stefano Stabellini, Julien Grall, Bertrand Marquis,
Volodymyr Babchuk, Wei Chen
Date: Thu, 17 Nov 2022 11:07:12 +0800
Subject: [PATCH] xen/arm: vpl011: Make access to DMACR write-ignore
When the guest kernel enables DMA engine with "CONFIG_DMA_ENGINE=y",
Linux SBSA PL011 driver will access PL011 DMACR register in some
functions. As chapter "B Generic UART" in "ARM Server Base System
Architecture"[1] documentation describes, SBSA UART doesn't support
DMA. In current code, when the kernel tries to access DMACR register,
Xen will inject a data abort:
Unhandled fault at 0xffffffc00944d048
Mem abort info:
ESR = 0x96000000
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x00: ttbr address size fault
Data abort info:
ISV = 0, ISS = 0x00000000
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000020e2e000
[ffffffc00944d048] pgd=100000003ffff803, p4d=100000003ffff803, pud=100000003ffff803, pmd=100000003fffa803, pte=006800009c090f13
Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
...
Call trace:
pl011_stop_rx+0x70/0x80
tty_port_shutdown+0x7c/0xb4
tty_port_close+0x60/0xcc
uart_close+0x34/0x8c
tty_release+0x144/0x4c0
__fput+0x78/0x220
____fput+0x1c/0x30
task_work_run+0x88/0xc0
do_notify_resume+0x8d0/0x123c
el0_svc+0xa8/0xc0
el0t_64_sync_handler+0xa4/0x130
el0t_64_sync+0x1a0/0x1a4
Code: b9000083 b901f001 794038a0 8b000042 (b9000041)
---[ end trace 83dd93df15c3216f ]---
note: bootlogd[132] exited with preempt_count 1
/etc/rcS.d/S07bootlogd: line 47: 132 Segmentation fault start-stop-daemon
As discussed in [2], this commit makes the access to DMACR register
write-ignore as an improvement.
[1] https://developer.arm.com/documentation/den0094/c/?lang=en
[2] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2211161552420.4020@ubuntu-linux-20-04-desktop/
Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
---
xen/arch/arm/vpl011.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
index 43522d48fd..80d00b3052 100644
--- a/xen/arch/arm/vpl011.c
+++ b/xen/arch/arm/vpl011.c
@@ -463,6 +463,7 @@ static int vpl011_mmio_write(struct vcpu *v,
case FR:
case RIS:
case MIS:
+ case DMACR:
goto write_ignore;
case IMSC:
--
2.25.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2022-11-18 2:00 Jiamei Xie
@ 2022-11-18 7:47 ` Michal Orzel
2022-11-18 9:02 ` Re: Julien Grall
0 siblings, 1 reply; 1651+ messages in thread
From: Michal Orzel @ 2022-11-18 7:47 UTC (permalink / raw)
To: Jiamei Xie, xen-devel
Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis,
Volodymyr Babchuk, Wei Chen
Hi Jiamei,
On 18/11/2022 03:00, Jiamei Xie wrote:
>
>
> Date: Thu, 17 Nov 2022 11:07:12 +0800
> Subject: [PATCH] xen/arm: vpl011: Make access to DMACR write-ignore
>
> When the guest kernel enables DMA engine with "CONFIG_DMA_ENGINE=y",
> Linux SBSA PL011 driver will access PL011 DMACR register in some
> functions. As chapter "B Generic UART" in "ARM Server Base System
> Architecture"[1] documentation describes, SBSA UART doesn't support
> DMA. In current code, when the kernel tries to access DMACR register,
> Xen will inject a data abort:
> Unhandled fault at 0xffffffc00944d048
> Mem abort info:
> ESR = 0x96000000
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x00: ttbr address size fault
> Data abort info:
> ISV = 0, ISS = 0x00000000
> CM = 0, WnR = 0
> swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000020e2e000
> [ffffffc00944d048] pgd=100000003ffff803, p4d=100000003ffff803, pud=100000003ffff803, pmd=100000003fffa803, pte=006800009c090f13
> Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
> ...
> Call trace:
> pl011_stop_rx+0x70/0x80
> tty_port_shutdown+0x7c/0xb4
> tty_port_close+0x60/0xcc
> uart_close+0x34/0x8c
> tty_release+0x144/0x4c0
> __fput+0x78/0x220
> ____fput+0x1c/0x30
> task_work_run+0x88/0xc0
> do_notify_resume+0x8d0/0x123c
> el0_svc+0xa8/0xc0
> el0t_64_sync_handler+0xa4/0x130
> el0t_64_sync+0x1a0/0x1a4
> Code: b9000083 b901f001 794038a0 8b000042 (b9000041)
> ---[ end trace 83dd93df15c3216f ]---
> note: bootlogd[132] exited with preempt_count 1
> /etc/rcS.d/S07bootlogd: line 47: 132 Segmentation fault start-stop-daemon
>
> As discussed in [2], this commit makes the access to DMACR register
> write-ignore as an improvement.
As discussed earlier, if we decide to improve vpl011 (for now only Stefano shared his opinion),
then we need to mark *all* the PL011 registers that are not part of SBSA as RAZ/WI. So handling
DMACR and only for writes is not beneficial (it is only fixing current Linux issue, but what we
really want is to improve the code in general).
>
> [1] https://developer.arm.com/documentation/den0094/c/?lang=en
> [2] https://lore.kernel.org/xen-devel/alpine.DEB.2.22.394.2211161552420.4020@ubuntu-linux-20-04-desktop/
>
> Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
> ---
> xen/arch/arm/vpl011.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/xen/arch/arm/vpl011.c b/xen/arch/arm/vpl011.c
> index 43522d48fd..80d00b3052 100644
> --- a/xen/arch/arm/vpl011.c
> +++ b/xen/arch/arm/vpl011.c
> @@ -463,6 +463,7 @@ static int vpl011_mmio_write(struct vcpu *v,
> case FR:
> case RIS:
> case MIS:
> + case DMACR:
> goto write_ignore;
>
> case IMSC:
> --
> 2.25.1
>
>
~Michal
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2022-11-18 7:47 ` Michal Orzel
@ 2022-11-18 9:02 ` Julien Grall
0 siblings, 0 replies; 1651+ messages in thread
From: Julien Grall @ 2022-11-18 9:02 UTC (permalink / raw)
To: Michal Orzel, Jiamei Xie, xen-devel
Cc: Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk, Wei Chen
On 18/11/2022 07:47, Michal Orzel wrote:
> On 18/11/2022 03:00, Jiamei Xie wrote:
>>
>>
>> Date: Thu, 17 Nov 2022 11:07:12 +0800
>> Subject: [PATCH] xen/arm: vpl011: Make access to DMACR write-ignore
>>
>> When the guest kernel enables DMA engine with "CONFIG_DMA_ENGINE=y",
>> Linux SBSA PL011 driver will access PL011 DMACR register in some
>> functions. As chapter "B Generic UART" in "ARM Server Base System
>> Architecture"[1] documentation describes, SBSA UART doesn't support
>> DMA. In current code, when the kernel tries to access DMACR register,
>> Xen will inject a data abort:
>> Unhandled fault at 0xffffffc00944d048
>> Mem abort info:
>> ESR = 0x96000000
>> EC = 0x25: DABT (current EL), IL = 32 bits
>> SET = 0, FnV = 0
>> EA = 0, S1PTW = 0
>> FSC = 0x00: ttbr address size fault
>> Data abort info:
>> ISV = 0, ISS = 0x00000000
>> CM = 0, WnR = 0
>> swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000020e2e000
>> [ffffffc00944d048] pgd=100000003ffff803, p4d=100000003ffff803, pud=100000003ffff803, pmd=100000003fffa803, pte=006800009c090f13
>> Internal error: ttbr address size fault: 96000000 [#1] PREEMPT SMP
>> ...
>> Call trace:
>> pl011_stop_rx+0x70/0x80
>> tty_port_shutdown+0x7c/0xb4
>> tty_port_close+0x60/0xcc
>> uart_close+0x34/0x8c
>> tty_release+0x144/0x4c0
>> __fput+0x78/0x220
>> ____fput+0x1c/0x30
>> task_work_run+0x88/0xc0
>> do_notify_resume+0x8d0/0x123c
>> el0_svc+0xa8/0xc0
>> el0t_64_sync_handler+0xa4/0x130
>> el0t_64_sync+0x1a0/0x1a4
>> Code: b9000083 b901f001 794038a0 8b000042 (b9000041)
>> ---[ end trace 83dd93df15c3216f ]---
>> note: bootlogd[132] exited with preempt_count 1
>> /etc/rcS.d/S07bootlogd: line 47: 132 Segmentation fault start-stop-daemon
>>
>> As discussed in [2], this commit makes the access to DMACR register
>> write-ignore as an improvement.
> As discussed earlier, if we decide to improve vpl011 (for now only Stefano shared his opinion),
> then we need to mark *all* the PL011 registers that are not part of SBSA as RAZ/WI.
I would be fine with that. But I would like us to print a message using
XENLOG_G_DEBUG to catch any OS that would touch those registers.
Cheers,
--
Julien Grall
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2022-11-21 11:11 Denis Arefev
2022-11-21 14:28 ` Jason Yan
0 siblings, 1 reply; 1651+ messages in thread
From: Denis Arefev @ 2022-11-21 11:11 UTC (permalink / raw)
To: Anil Gurumurthy
Cc: Sudarsana Kalluru, James E.J. Bottomley, Martin K. Petersen,
linux-scsi, linux-kernel, trufanov, vfh
Date: Mon, 21 Nov 2022 13:29:03 +0300
Subject: [PATCH] scsi:bfa: Eliminated buffer overflow
Buffer 'cmd->adapter_hwpath' of size 32 accessed at
bfad_bsg.c:101:103 can overflow, since its index 'i'
can take the value 32, which is out of range.
Signed-off-by: Denis Arefev <arefev@swemel.ru>
---
drivers/scsi/bfa/bfad_bsg.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c
index be8dfbe13e90..78615ffc62ef 100644
--- a/drivers/scsi/bfa/bfad_bsg.c
+++ b/drivers/scsi/bfa/bfad_bsg.c
@@ -98,9 +98,9 @@ bfad_iocmd_ioc_get_info(struct bfad_s *bfad, void *cmd)
/* set adapter hw path */
strcpy(iocmd->adapter_hwpath, bfad->pci_name);
- for (i = 0; iocmd->adapter_hwpath[i] != ':' && i < BFA_STRING_32; i++)
+ for (i = 0; iocmd->adapter_hwpath[i] != ':' && i < BFA_STRING_32-2; i++)
;
- for (; iocmd->adapter_hwpath[++i] != ':' && i < BFA_STRING_32; )
+ for (; iocmd->adapter_hwpath[++i] != ':' && i < BFA_STRING_32-1; )
;
iocmd->adapter_hwpath[i] = '\0';
iocmd->status = BFA_STATUS_OK;
--
2.25.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2022-11-21 11:11 Denis Arefev
@ 2022-11-21 14:28 ` Jason Yan
0 siblings, 0 replies; 1651+ messages in thread
From: Jason Yan @ 2022-11-21 14:28 UTC (permalink / raw)
To: Denis Arefev, Anil Gurumurthy
Cc: Sudarsana Kalluru, James E.J. Bottomley, Martin K. Petersen,
linux-scsi, linux-kernel, trufanov, vfh
You need a real subject line, not subject text placed inside the email body.
type "git help send-email" if you don't know how to use it.
On 2022/11/21 19:11, Denis Arefev wrote:
> Date: Mon, 21 Nov 2022 13:29:03 +0300
> Subject: [PATCH] scsi:bfa: Eliminated buffer overflow
>
> Buffer 'cmd->adapter_hwpath' of size 32 accessed at
> bfad_bsg.c:101:103 can overflow, since its index 'i'
> can have value 32 that is out of range.
>
> Signed-off-by: Denis Arefev <arefev@swemel.ru>
> ---
> drivers/scsi/bfa/bfad_bsg.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/scsi/bfa/bfad_bsg.c b/drivers/scsi/bfa/bfad_bsg.c
> index be8dfbe13e90..78615ffc62ef 100644
> --- a/drivers/scsi/bfa/bfad_bsg.c
> +++ b/drivers/scsi/bfa/bfad_bsg.c
> @@ -98,9 +98,9 @@ bfad_iocmd_ioc_get_info(struct bfad_s *bfad, void *cmd)
>
> /* set adapter hw path */
> strcpy(iocmd->adapter_hwpath, bfad->pci_name);
> - for (i = 0; iocmd->adapter_hwpath[i] != ':' && i < BFA_STRING_32; i++)
> + for (i = 0; iocmd->adapter_hwpath[i] != ':' && i < BFA_STRING_32-2; i++)
> ;
> - for (; iocmd->adapter_hwpath[++i] != ':' && i < BFA_STRING_32; )
> + for (; iocmd->adapter_hwpath[++i] != ':' && i < BFA_STRING_32-1; )
> ;
> iocmd->adapter_hwpath[i] = '\0';
> iocmd->status = BFA_STATUS_OK;
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
@ 2023-01-18 20:59 alison.schofield
2023-01-27 1:59 ` Dan Williams
0 siblings, 1 reply; 1651+ messages in thread
From: alison.schofield @ 2023-01-18 20:59 UTC (permalink / raw)
To: Dan Williams, Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky,
Steven Rostedt
Cc: Alison Schofield, linux-cxl, linux-kernel
From: Alison Schofield <alison.schofield@intel.com>
**RESENDING this cover letter previously mis-threaded.
Changes in v5:
- Rebase on cxl/next
- Use struct_size() to calc mbox cmd payload .min_out
- s/INTERNAL/INJECTED mocked poison record source
- Added Jonathan Reviewed-by tag on Patch 3
Link to v4:
https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
Add support for retrieving device poison lists and store the returned
error records as kernel trace events.
The handling of the poison list is guided by the CXL 3.0 Specification
Section 8.2.9.8.4.1. [1]
Example, triggered by memdev:
$ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
Example, triggered by region:
$ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
[1]: https://www.computeexpresslink.org/download-the-specification
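The trace lines above are plain key=value text, so they can be post-processed with standard tools. A sketch (using one of the example lines from this cover letter as canned input; on a live system the line would come from the tracing buffer, e.g. /sys/kernel/tracing/trace, after enabling the cxl_poison event) of pulling out the dpa and length fields:

```shell
# Canned cxl_poison record, copied from the memdev-triggered example.
line='cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0'

# Split the record into one key=value token per line, keep the fields
# of interest, and strip the "key=" prefix.
dpa=$(printf '%s\n' "$line" | tr ' ' '\n' | grep '^dpa=' | cut -d= -f2)
len=$(printf '%s\n' "$line" | tr ' ' '\n' | grep '^length=' | cut -d= -f2)

echo "dpa=$dpa length=$len"
```

This prints "dpa=0x0 length=0x40" for the record above; the same split-and-grep pattern extracts any other field (source=, flags=, overflow_time=).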
Alison Schofield (5):
cxl/mbox: Add GET_POISON_LIST mailbox command
cxl/trace: Add TRACE support for CXL media-error records
cxl/memdev: Add trigger_poison_list sysfs attribute
cxl/region: Add trigger_poison_list sysfs attribute
tools/testing/cxl: Mock support for Get Poison List
Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
drivers/cxl/core/mbox.c | 78 +++++++++++++++++++++++
drivers/cxl/core/memdev.c | 45 ++++++++++++++
drivers/cxl/core/region.c | 33 ++++++++++
drivers/cxl/core/trace.h | 83 +++++++++++++++++++++++++
drivers/cxl/cxlmem.h | 69 +++++++++++++++++++-
drivers/cxl/pci.c | 4 ++
tools/testing/cxl/test/mem.c | 42 +++++++++++++
8 files changed, 381 insertions(+), 1 deletion(-)
base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
--
2.37.3
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2023-01-18 20:59 [PATCH v5 0/5] CXL Poison List Retrieval & Tracing alison.schofield
@ 2023-01-27 1:59 ` Dan Williams
2023-01-27 16:10 ` Alison Schofield
0 siblings, 1 reply; 1651+ messages in thread
From: Dan Williams @ 2023-01-27 1:59 UTC (permalink / raw)
To: alison.schofield, Dan Williams, Ira Weiny, Vishal Verma,
Dave Jiang, Ben Widawsky, Steven Rostedt
Cc: Alison Schofield, linux-cxl, linux-kernel
alison.schofield@ wrote:
> From: Alison Schofield <alison.schofield@intel.com>
>
> Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
>
> Changes in v5:
> - Rebase on cxl/next
> - Use struct_size() to calc mbox cmd payload .min_out
> - s/INTERNAL/INJECTED mocked poison record source
> - Added Jonathan Reviewed-by tag on Patch 3
>
> Link to v4:
> https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
>
> Add support for retrieving device poison lists and store the returned
> error records as kernel trace events.
>
> The handling of the poison list is guided by the CXL 3.0 Specification
> Section 8.2.9.8.4.1. [1]
>
> Example, triggered by memdev:
> $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
I think the pcidev= field wants to be called something like "host" or
"parent", because there is no strict requirement that a 'struct
cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
"cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
think all CXL device events should be emitting the PCIe serial number
for the memdev.
I will look in the implementation, but do region= and region_uuid= get
populated when mem3 is a member of the region?
>
> Example, triggered by region:
> $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
>
> [1]: https://www.computeexpresslink.org/download-the-specification
>
> Alison Schofield (5):
> cxl/mbox: Add GET_POISON_LIST mailbox command
> cxl/trace: Add TRACE support for CXL media-error records
> cxl/memdev: Add trigger_poison_list sysfs attribute
> cxl/region: Add trigger_poison_list sysfs attribute
> tools/testing/cxl: Mock support for Get Poison List
>
> Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> drivers/cxl/core/mbox.c | 78 +++++++++++++++++++++++
> drivers/cxl/core/memdev.c | 45 ++++++++++++++
> drivers/cxl/core/region.c | 33 ++++++++++
> drivers/cxl/core/trace.h | 83 +++++++++++++++++++++++++
> drivers/cxl/cxlmem.h | 69 +++++++++++++++++++-
> drivers/cxl/pci.c | 4 ++
> tools/testing/cxl/test/mem.c | 42 +++++++++++++
> 8 files changed, 381 insertions(+), 1 deletion(-)
>
>
> base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> --
> 2.37.3
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-01-27 1:59 ` Dan Williams
@ 2023-01-27 16:10 ` Alison Schofield
2023-01-27 19:16 ` Re: Dan Williams
0 siblings, 1 reply; 1651+ messages in thread
From: Alison Schofield @ 2023-01-27 16:10 UTC (permalink / raw)
To: Dan Williams
Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky, Steven Rostedt,
linux-cxl, linux-kernel
On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> alison.schofield@ wrote:
> > From: Alison Schofield <alison.schofield@intel.com>
> >
> > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> >
> > Changes in v5:
> > - Rebase on cxl/next
> > - Use struct_size() to calc mbox cmd payload .min_out
> > - s/INTERNAL/INJECTED mocked poison record source
> > - Added Jonathan Reviewed-by tag on Patch 3
> >
> > Link to v4:
> > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> >
> > Add support for retrieving device poison lists and store the returned
> > error records as kernel trace events.
> >
> > The handling of the poison list is guided by the CXL 3.0 Specification
> > Section 8.2.9.8.4.1. [1]
> >
> > Example, triggered by memdev:
> > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
>
> I think the pcidev= field wants to be called something like "host" or
> "parent", because there is no strict requirement that a 'struct
> cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> think all CXL device events should be emitting the PCIe serial number
> for the memdev.
]
Will do, 'host' and add PCIe serial no.
>
> I will look in the implementation, but do region= and region_uuid= get
> populated when mem3 is a member of the region?
Not always.
In the case above, where the trigger was by memdev, no.
Region= and region_uuid= (and in the follow-on patch, hpa=) only get
populated if the poison was triggered by region, like the case below.
It could be looked up for the by memdev cases. Is that wanted?
Thanks for the reviews Dan!
>
> >
> > Example, triggered by region:
> > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> >
> > [1]: https://www.computeexpresslink.org/download-the-specification
> >
> > Alison Schofield (5):
> > cxl/mbox: Add GET_POISON_LIST mailbox command
> > cxl/trace: Add TRACE support for CXL media-error records
> > cxl/memdev: Add trigger_poison_list sysfs attribute
> > cxl/region: Add trigger_poison_list sysfs attribute
> > tools/testing/cxl: Mock support for Get Poison List
> >
> > Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> > drivers/cxl/core/mbox.c | 78 +++++++++++++++++++++++
> > drivers/cxl/core/memdev.c | 45 ++++++++++++++
> > drivers/cxl/core/region.c | 33 ++++++++++
> > drivers/cxl/core/trace.h | 83 +++++++++++++++++++++++++
> > drivers/cxl/cxlmem.h | 69 +++++++++++++++++++-
> > drivers/cxl/pci.c | 4 ++
> > tools/testing/cxl/test/mem.c | 42 +++++++++++++
> > 8 files changed, 381 insertions(+), 1 deletion(-)
> >
> >
> > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > --
> > 2.37.3
> >
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-01-27 16:10 ` Alison Schofield
@ 2023-01-27 19:16 ` Dan Williams
2023-01-27 21:36 ` Re: Alison Schofield
0 siblings, 1 reply; 1651+ messages in thread
From: Dan Williams @ 2023-01-27 19:16 UTC (permalink / raw)
To: Alison Schofield, Dan Williams
Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky, Steven Rostedt,
linux-cxl, linux-kernel
Alison Schofield wrote:
> On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > alison.schofield@ wrote:
> > > From: Alison Schofield <alison.schofield@intel.com>
> > >
> > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > >
> > > Changes in v5:
> > > - Rebase on cxl/next
> > > - Use struct_size() to calc mbox cmd payload .min_out
> > > - s/INTERNAL/INJECTED mocked poison record source
> > > - Added Jonathan Reviewed-by tag on Patch 3
> > >
> > > Link to v4:
> > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > >
> > > Add support for retrieving device poison lists and store the returned
> > > error records as kernel trace events.
> > >
> > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > Section 8.2.9.8.4.1. [1]
> > >
> > > Example, triggered by memdev:
> > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> >
> > I think the pcidev= field wants to be called something like "host" or
> > "parent", because there is no strict requirement that a 'struct
> > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > think all CXL device events should be emitting the PCIe serial number
> > for the memdev.
> ]
>
> Will do, 'host' and add PCIe serial no.
>
> >
> > I will look in the implementation, but do region= and region_uuid= get
> > populated when mem3 is a member of the region?
>
> Not always.
> In the case above, where the trigger was by memdev, no.
> Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> populated if the poison was triggered by region, like the case below.
>
> It could be looked up for the by memdev cases. Is that wanted?
Just trying to understand the semantics. However, I do think it makes sense
for a memdev trigger to lookup information on all impacted regions
across all of the device's DPA and the region trigger makes sense to
lookup all memdevs, but bounded by the DPA that contributes to that
region. I just want to avoid someone having to trigger the region to get
extra information that was readily available from a memdev listing.
>
> Thanks for the reviews Dan!
> >
> > >
> > > Example, triggered by region:
> > > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > >
> > > [1]: https://www.computeexpresslink.org/download-the-specification
> > >
> > > Alison Schofield (5):
> > > cxl/mbox: Add GET_POISON_LIST mailbox command
> > > cxl/trace: Add TRACE support for CXL media-error records
> > > cxl/memdev: Add trigger_poison_list sysfs attribute
> > > cxl/region: Add trigger_poison_list sysfs attribute
> > > tools/testing/cxl: Mock support for Get Poison List
> > >
> > > Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> > > drivers/cxl/core/mbox.c | 78 +++++++++++++++++++++++
> > > drivers/cxl/core/memdev.c | 45 ++++++++++++++
> > > drivers/cxl/core/region.c | 33 ++++++++++
> > > drivers/cxl/core/trace.h | 83 +++++++++++++++++++++++++
> > > drivers/cxl/cxlmem.h | 69 +++++++++++++++++++-
> > > drivers/cxl/pci.c | 4 ++
> > > tools/testing/cxl/test/mem.c | 42 +++++++++++++
> > > 8 files changed, 381 insertions(+), 1 deletion(-)
> > >
> > >
> > > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > > --
> > > 2.37.3
> > >
> >
> >
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-01-27 19:16 ` Re: Dan Williams
@ 2023-01-27 21:36 ` Alison Schofield
2023-01-27 22:04 ` Re: Dan Williams
0 siblings, 1 reply; 1651+ messages in thread
From: Alison Schofield @ 2023-01-27 21:36 UTC (permalink / raw)
To: Dan Williams
Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky, Steven Rostedt,
linux-cxl, linux-kernel
On Fri, Jan 27, 2023 at 11:16:49AM -0800, Dan Williams wrote:
> Alison Schofield wrote:
> > On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > > alison.schofield@ wrote:
> > > > From: Alison Schofield <alison.schofield@intel.com>
> > > >
> > > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > > >
> > > > Changes in v5:
> > > > - Rebase on cxl/next
> > > > - Use struct_size() to calc mbox cmd payload .min_out
> > > > - s/INTERNAL/INJECTED mocked poison record source
> > > > - Added Jonathan Reviewed-by tag on Patch 3
> > > >
> > > > Link to v4:
> > > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > > >
> > > > Add support for retrieving device poison lists and store the returned
> > > > error records as kernel trace events.
> > > >
> > > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > > Section 8.2.9.8.4.1. [1]
> > > >
> > > > Example, triggered by memdev:
> > > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > >
> > > I think the pcidev= field wants to be called something like "host" or
> > > "parent", because there is no strict requirement that a 'struct
> > > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > > think all CXL device events should be emitting the PCIe serial number
> > > for the memdev.
> > ]
> >
> > Will do, 'host' and add PCIe serial no.
> >
> > >
> > > I will look in the implementation, but do region= and region_uuid= get
> > > populated when mem3 is a member of the region?
> >
> > Not always.
> > In the case above, where the trigger was by memdev, no.
> > Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> > populated if the poison was triggered by region, like the case below.
> >
> > It could be looked up for the by memdev cases. Is that wanted?
>
> Just trying to understand the semantics. However, I do think it makes sense
> for a memdev trigger to lookup information on all impacted regions
> across all of the device's DPA and the region trigger makes sense to
> lookup all memdevs, but bounded by the DPA that contributes to that
> region. I just want to avoid someone having to trigger the region to get
> extra information that was readily available from a memdev listing.
>
Dan -
Confirming my take-away from this email, and our chat:
Remove the by-region trigger_poison_list option entirely. User space
needs to trigger by-memdev the memdevs participating in the region and
filter those events by region.
Add the region info (region name, uuid) to the TRACE_EVENTs when the
poisoned DPA is part of any region.
Alison
> >
> > Thanks for the reviews Dan!
> > >
> > > >
> > > > Example, triggered by region:
> > > > $ echo 1 > /sys/bus/cxl/devices/region5/trigger_poison_list
> > > > cxl_poison: memdev=mem0 pcidev=cxl_mem.0 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > > cxl_poison: memdev=mem1 pcidev=cxl_mem.1 region=region5 region_uuid=bfcb7a29-890e-4a41-8236-fe22221fc75c dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > >
> > > > [1]: https://www.computeexpresslink.org/download-the-specification
> > > >
> > > > Alison Schofield (5):
> > > > cxl/mbox: Add GET_POISON_LIST mailbox command
> > > > cxl/trace: Add TRACE support for CXL media-error records
> > > > cxl/memdev: Add trigger_poison_list sysfs attribute
> > > > cxl/region: Add trigger_poison_list sysfs attribute
> > > > tools/testing/cxl: Mock support for Get Poison List
> > > >
> > > > Documentation/ABI/testing/sysfs-bus-cxl | 28 +++++++++
> > > > drivers/cxl/core/mbox.c | 78 +++++++++++++++++++++++
> > > > drivers/cxl/core/memdev.c | 45 ++++++++++++++
> > > > drivers/cxl/core/region.c | 33 ++++++++++
> > > > drivers/cxl/core/trace.h | 83 +++++++++++++++++++++++++
> > > > drivers/cxl/cxlmem.h | 69 +++++++++++++++++++-
> > > > drivers/cxl/pci.c | 4 ++
> > > > tools/testing/cxl/test/mem.c | 42 +++++++++++++
> > > > 8 files changed, 381 insertions(+), 1 deletion(-)
> > > >
> > > >
> > > > base-commit: 589c3357370a596ef7c99c00baca8ac799fce531
> > > > --
> > > > 2.37.3
> > > >
> > >
> > >
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-01-27 21:36 ` Re: Alison Schofield
@ 2023-01-27 22:04 ` Dan Williams
0 siblings, 0 replies; 1651+ messages in thread
From: Dan Williams @ 2023-01-27 22:04 UTC (permalink / raw)
To: Alison Schofield, Dan Williams
Cc: Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky, Steven Rostedt,
linux-cxl, linux-kernel
Alison Schofield wrote:
> On Fri, Jan 27, 2023 at 11:16:49AM -0800, Dan Williams wrote:
> > Alison Schofield wrote:
> > > On Thu, Jan 26, 2023 at 05:59:03PM -0800, Dan Williams wrote:
> > > > alison.schofield@ wrote:
> > > > > From: Alison Schofield <alison.schofield@intel.com>
> > > > >
> > > > > Subject: [PATCH v5 0/5] CXL Poison List Retrieval & Tracing
> > > > >
> > > > > Changes in v5:
> > > > > - Rebase on cxl/next
> > > > > - Use struct_size() to calc mbox cmd payload .min_out
> > > > > - s/INTERNAL/INJECTED mocked poison record source
> > > > > - Added Jonathan Reviewed-by tag on Patch 3
> > > > >
> > > > > Link to v4:
> > > > > https://lore.kernel.org/linux-cxl/cover.1671135967.git.alison.schofield@intel.com/
> > > > >
> > > > > Add support for retrieving device poison lists and store the returned
> > > > > error records as kernel trace events.
> > > > >
> > > > > The handling of the poison list is guided by the CXL 3.0 Specification
> > > > > Section 8.2.9.8.4.1. [1]
> > > > >
> > > > > Example, triggered by memdev:
> > > > > $ echo 1 > /sys/bus/cxl/devices/mem3/trigger_poison_list
> > > > > cxl_poison: memdev=mem3 pcidev=cxl_mem.3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40 source=Internal flags= overflow_time=0
> > > >
> > > > I think the pcidev= field wants to be called something like "host" or
> > > > "parent", because there is no strict requirement that a 'struct
> > > > cxl_memdev' is related to a 'struct pci_dev'. In fact in that example
> > > > "cxl_mem.3" is a 'struct platform_device'. Now that I think about it, I
> > > > think all CXL device events should be emitting the PCIe serial number
> > > > for the memdev.
> > > ]
> > >
> > > Will do, 'host' and add PCIe serial no.
> > >
> > > >
> > > > I will look in the implementation, but do region= and region_uuid= get
> > > > populated when mem3 is a member of the region?
> > >
> > > Not always.
> > > In the case above, where the trigger was by memdev, no.
> > > Region= and region_uuid= (and in the follow-on patch, hpa=) only get
> > > populated if the poison was triggered by region, like the case below.
> > >
> > > It could be looked up for the by memdev cases. Is that wanted?
> >
> > Just trying to understand the semantics. However, I do think it makes sense
> > for a memdev trigger to look up information on all impacted regions
> > across all of the device's DPA, and for a region trigger to look up all
> > memdevs, bounded by the DPA that contributes to that region. I just
> > want to avoid someone having to trigger the region to get extra
> > information that was readily available from a memdev listing.
> >
>
> Dan -
>
> Confirming my take-away from this email, and our chat:
>
> Remove the by-region trigger_poison_list option entirely. User space
> needs to trigger by-memdev the memdevs participating in the region and
> filter those events by region.
>
> Add the region info (region name, uuid) to the TRACE_EVENTs when the
> poisoned DPA is part of any region.
That's what I was thinking, yes. So the internals of
cxl_mem_get_poison() will take the cxl_region_rwsem for read and compare
the device's endpoint decoder settings against the media error records
to do the region (and later HPA) lookup.
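The flow agreed above, trigger the poison list per memdev and then filter the resulting trace events by region in user space, can be sketched as below. This is an illustrative helper only: the event layout is copied from the example earlier in the thread, and the sample lines, the region name "region0", and the helper name are assumptions, not real tooling.

```shell
# Hypothetical user-space filter: keep only cxl_poison events for one region.
# (The region name "region0" and the second sample line are invented.)
filter_poison_by_region() {
    # $1: region name; stdin: cxl_poison trace event lines
    grep "region=$1 "
}

# Sample events shaped like the cover letter's example output:
sample='cxl_poison: memdev=mem3 region= region_uuid=00000000-0000-0000-0000-000000000000 dpa=0x0 length=0x40
cxl_poison: memdev=mem4 region=region0 region_uuid=11111111-2222-3333-4444-555555555555 dpa=0x40 length=0x40'

# Prints only the mem4 line, since mem3's event has an empty region= field.
printf '%s\n' "$sample" | filter_poison_by_region region0
```

In practice the event lines would come from the trace buffer after writing 1 to each memdev's trigger_poison_list attribute, as in the example at the top of the thread.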
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <20230122193117.GA28689@Debian-50-lenny-64-minimal>]
* Re:
[not found] <20230122193117.GA28689@Debian-50-lenny-64-minimal>
@ 2023-01-22 21:42 ` Alejandro Colomar
2023-01-24 20:01 ` Re: Helge Kreutzmann
0 siblings, 1 reply; 1651+ messages in thread
From: Alejandro Colomar @ 2023-01-22 21:42 UTC (permalink / raw)
To: Helge Kreutzmann; +Cc: mario.blaettermann, linux-man
[-- Attachment #1.1: Type: text/plain, Size: 205 bytes --]
Hi Helge,
On 1/22/23 20:31, Helge Kreutzmann wrote:
> Without further ado, the following was found:
Empty report. An accident? :)
Cheers,
Alex
>
--
<http://www.alejandro-colomar.es/>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2023-01-22 21:42 ` Re: Alejandro Colomar
@ 2023-01-24 20:01 ` Helge Kreutzmann
0 siblings, 0 replies; 1651+ messages in thread
From: Helge Kreutzmann @ 2023-01-24 20:01 UTC (permalink / raw)
To: Alejandro Colomar; +Cc: mario.blaettermann, linux-man
[-- Attachment #1: Type: text/plain, Size: 661 bytes --]
Hello Alex,
On Sun, Jan 22, 2023 at 10:42:54PM +0100, Alejandro Colomar wrote:
> Hi Helge,
>
> On 1/22/23 20:31, Helge Kreutzmann wrote:
> > Without further ado, the following was found:
>
> Empty report. An accident? :)
I tried to figure out what happened, but I don't know.
Sorry for the empty report; please disregard.
Greetings
Helge
--
Dr. Helge Kreutzmann debian@helgefjell.de
Dipl.-Phys. http://www.helgefjell.de/debian.php
64bit GNU powered gpg signed mail preferred
Help keep free software "libre": http://www.ffii.de/
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH v2] uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
@ 2023-03-12 6:52 Greg Kroah-Hartman
2023-03-27 13:54 ` Yaroslav Furman
0 siblings, 1 reply; 1651+ messages in thread
From: Greg Kroah-Hartman @ 2023-03-12 6:52 UTC (permalink / raw)
To: Yaroslav Furman; +Cc: Alan Stern, linux-usb, usb-storage, linux-kernel
On Sat, Mar 11, 2023 at 07:12:26PM +0200, Yaroslav Furman wrote:
> Just like other JMicron JMS5xx enclosures, it chokes on report-opcodes,
> let's avoid them.
>
> Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
> ---
> drivers/usb/storage/unusual_uas.h | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/drivers/usb/storage/unusual_uas.h b/drivers/usb/storage/unusual_uas.h
> index c7b763d6d102..1f8c9b16a0fb 100644
> --- a/drivers/usb/storage/unusual_uas.h
> +++ b/drivers/usb/storage/unusual_uas.h
> @@ -111,6 +111,13 @@ UNUSUAL_DEV(0x152d, 0x0578, 0x0000, 0x9999,
> USB_SC_DEVICE, USB_PR_DEVICE, NULL,
> US_FL_BROKEN_FUA),
>
> +/* Reported by: Yaroslav Furman <yaro330@gmail.com> */
> +UNUSUAL_DEV(0x152d, 0x0583, 0x0000, 0x9999,
> + "JMicron",
> + "JMS583Gen 2",
> + USB_SC_DEVICE, USB_PR_DEVICE, NULL,
> + US_FL_NO_REPORT_OPCODES),
> +
> /* Reported-by: Thinh Nguyen <thinhn@synopsys.com> */
> UNUSUAL_DEV(0x154b, 0xf00b, 0x0000, 0x9999,
> "PNY",
> --
> 2.39.2
>
Hi,
This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him
a patch that has triggered this response. He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created. Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.
You are receiving this message because of the following common error(s)
as indicated below:
- This looks like a new version of a previously submitted patch, but you
did not list below the --- line any changes from the previous version.
Please read the section entitled "The canonical patch format" in the
kernel file, Documentation/process/submitting-patches.rst for what
needs to be done here to properly describe this.
If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.
thanks,
greg k-h's patch email bot
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2023-05-11 12:58 Ryan Roberts
2023-05-11 13:13 ` Ryan Roberts
0 siblings, 1 reply; 1651+ messages in thread
From: Ryan Roberts @ 2023-05-11 12:58 UTC (permalink / raw)
To: Andrew Morton, Matthew Wilcox (Oracle), Kirill A. Shutemov,
SeongJae Park
Cc: Ryan Roberts, linux-kernel, linux-mm, damon
Date: Thu, 11 May 2023 11:38:28 +0100
Subject: [PATCH v1 0/5] Encapsulate PTE contents from non-arch code
Hi All,
This series improves the encapsulation of pte entries by disallowing non-arch
code from directly dereferencing pte_t pointers. Instead code must use a new
helper, `pte_t ptep_deref(pte_t *ptep)`. By default, this helper does a direct
dereference of the pointer, so generated code should be exactly the same. But
its presence sets us up for arch code being able to override the default to
"virtualize" the ptes without needing to maintain a shadow table.
I intend to take advantage of this for arm64 to enable use of its "contiguous
bit" to coalesce multiple ptes into a single tlb entry, reducing pressure and
improving performance. I have an RFC for the first part of this work at [1]. The
cover letter there also explains the second part, which this series is enabling.
I intend to post an RFC for the contpte changes in due course, but it would be
good to get the ball rolling on this enabler.
There are 2 reasons that I need the encapsulation:
- Prevent leaking the arch-private PTE_CONT bit to the core code. If the core
code reads a pte that contains this bit, it could end up calling
set_pte_at() with the bit set which would confuse the implementation. So we
can always clear PTE_CONT in ptep_deref() (and ptep_get()) to avoid a leaky
abstraction.
- Contiguous ptes have a single access and dirty bit for the contiguous range.
So we need to "mix-in" those bits when the core is dereferencing a pte that
lies in the contig range. There is code that dereferences the pte then takes
different actions based on access/dirty (see e.g. write_protect_page()).
While ptep_get() and ptep_get_lockless() already exist, both of them are
implemented using READ_ONCE() by default. While we could use ptep_get() instead
of the new ptep_deref(), I didn't want to risk performance regression.
Alternatively, all call sites that currently use ptep_get() that need the
lockless behaviour could be upgraded to ptep_get_lockless() and ptep_get() could
be downgraded to a simple dereference. That would be cleanest, but is a much
bigger (and likely error prone) change because all the arch code would need to
be updated for the new definitions of ptep_get().
The series is split up as follows:
patches 1-2: Fix bugs where code was _setting_ ptes directly, rather than using
set_pte_at() and friends.
patch 3: Fix highmem unmapping issue I spotted while doing the work.
patch 4: Introduce the new ptep_deref() helper with default implementation.
patch 5: Convert all direct dereferences to use ptep_deref().
[1] https://lore.kernel.org/linux-mm/20230414130303.2345383-1-ryan.roberts@arm.com/
Thanks,
Ryan
Ryan Roberts (5):
mm: vmalloc must set pte via arch code
mm: damon must atomically clear young on ptes and pmds
mm: Fix failure to unmap pte on highmem systems
mm: Add new ptep_deref() helper to fully encapsulate pte_t
mm: ptep_deref() conversion
.../drm/i915/gem/selftests/i915_gem_mman.c | 8 +-
drivers/misc/sgi-gru/grufault.c | 2 +-
drivers/vfio/vfio_iommu_type1.c | 7 +-
drivers/xen/privcmd.c | 2 +-
fs/proc/task_mmu.c | 33 +++---
fs/userfaultfd.c | 6 +-
include/linux/hugetlb.h | 2 +-
include/linux/mm_inline.h | 2 +-
include/linux/pgtable.h | 13 ++-
kernel/events/uprobes.c | 2 +-
mm/damon/ops-common.c | 18 ++-
mm/damon/ops-common.h | 4 +-
mm/damon/paddr.c | 6 +-
mm/damon/vaddr.c | 14 ++-
mm/filemap.c | 2 +-
mm/gup.c | 21 ++--
mm/highmem.c | 12 +-
mm/hmm.c | 2 +-
mm/huge_memory.c | 4 +-
mm/hugetlb.c | 2 +-
mm/hugetlb_vmemmap.c | 6 +-
mm/kasan/init.c | 9 +-
mm/kasan/shadow.c | 10 +-
mm/khugepaged.c | 24 ++--
mm/ksm.c | 22 ++--
mm/madvise.c | 6 +-
mm/mapping_dirty_helpers.c | 4 +-
mm/memcontrol.c | 4 +-
mm/memory-failure.c | 6 +-
mm/memory.c | 103 +++++++++---------
mm/mempolicy.c | 6 +-
mm/migrate.c | 14 ++-
mm/migrate_device.c | 14 ++-
mm/mincore.c | 2 +-
mm/mlock.c | 6 +-
mm/mprotect.c | 8 +-
mm/mremap.c | 2 +-
mm/page_table_check.c | 4 +-
mm/page_vma_mapped.c | 26 +++--
mm/pgtable-generic.c | 2 +-
mm/rmap.c | 32 +++---
mm/sparse-vmemmap.c | 8 +-
mm/swap_state.c | 4 +-
mm/swapfile.c | 16 +--
mm/userfaultfd.c | 4 +-
mm/vmalloc.c | 11 +-
mm/vmscan.c | 14 ++-
virt/kvm/kvm_main.c | 9 +-
48 files changed, 302 insertions(+), 236 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2023-05-11 12:58 Ryan Roberts
@ 2023-05-11 13:13 ` Ryan Roberts
0 siblings, 0 replies; 1651+ messages in thread
From: Ryan Roberts @ 2023-05-11 13:13 UTC (permalink / raw)
To: Andrew Morton, Matthew Wilcox (Oracle), Kirill A. Shutemov,
SeongJae Park
Cc: linux-kernel, linux-mm, damon
My apologies for the noise: a blank line between Cc and Subject has broken the
subject and grouping in lore.
Please ignore this; I will resend.
On 11/05/2023 13:58, Ryan Roberts wrote:
> Date: Thu, 11 May 2023 11:38:28 +0100
> Subject: [PATCH v1 0/5] Encapsulate PTE contents from non-arch code
>
> Hi All,
>
> This series improves the encapsulation of pte entries by disallowing non-arch
> code from directly dereferencing pte_t pointers. Instead code must use a new
> helper, `pte_t ptep_deref(pte_t *ptep)`. By default, this helper does a direct
> dereference of the pointer, so generated code should be exactly the same. But
> its presence sets us up for arch code being able to override the default to
> "virtualize" the ptes without needing to maintain a shadow table.
>
> I intend to take advantage of this for arm64 to enable use of its "contiguous
> bit" to coalesce multiple ptes into a single tlb entry, reducing pressure and
> improving performance. I have an RFC for the first part of this work at [1]. The
> cover letter there also explains the second part, which this series is enabling.
>
> I intend to post an RFC for the contpte changes in due course, but it would be
> good to get the ball rolling on this enabler.
>
> There are 2 reasons that I need the encapsulation:
>
> - Prevent leaking the arch-private PTE_CONT bit to the core code. If the core
> code reads a pte that contains this bit, it could end up calling
> set_pte_at() with the bit set which would confuse the implementation. So we
> can always clear PTE_CONT in ptep_deref() (and ptep_get()) to avoid a leaky
> abstraction.
> - Contiguous ptes have a single access and dirty bit for the contiguous range.
> So we need to "mix-in" those bits when the core is dereferencing a pte that
> lies in the contig range. There is code that dereferences the pte then takes
> different actions based on access/dirty (see e.g. write_protect_page()).
>
> While ptep_get() and ptep_get_lockless() already exist, both of them are
> implemented using READ_ONCE() by default. While we could use ptep_get() instead
> of the new ptep_deref(), I didn't want to risk performance regression.
> Alternatively, all call sites that currently use ptep_get() that need the
> lockless behaviour could be upgraded to ptep_get_lockless() and ptep_get() could
> be downgraded to a simple dereference. That would be cleanest, but is a much
> bigger (and likely error prone) change because all the arch code would need to
> be updated for the new definitions of ptep_get().
>
> The series is split up as follows:
>
> patches 1-2: Fix bugs where code was _setting_ ptes directly, rather than using
> set_pte_at() and friends.
> patch 3: Fix highmem unmapping issue I spotted while doing the work.
> patch 4: Introduce the new ptep_deref() helper with default implementation.
> patch 5: Convert all direct dereferences to use ptep_deref().
>
> [1] https://lore.kernel.org/linux-mm/20230414130303.2345383-1-ryan.roberts@arm.com/
>
> Thanks,
> Ryan
>
>
> Ryan Roberts (5):
> mm: vmalloc must set pte via arch code
> mm: damon must atomically clear young on ptes and pmds
> mm: Fix failure to unmap pte on highmem systems
> mm: Add new ptep_deref() helper to fully encapsulate pte_t
> mm: ptep_deref() conversion
>
> .../drm/i915/gem/selftests/i915_gem_mman.c | 8 +-
> drivers/misc/sgi-gru/grufault.c | 2 +-
> drivers/vfio/vfio_iommu_type1.c | 7 +-
> drivers/xen/privcmd.c | 2 +-
> fs/proc/task_mmu.c | 33 +++---
> fs/userfaultfd.c | 6 +-
> include/linux/hugetlb.h | 2 +-
> include/linux/mm_inline.h | 2 +-
> include/linux/pgtable.h | 13 ++-
> kernel/events/uprobes.c | 2 +-
> mm/damon/ops-common.c | 18 ++-
> mm/damon/ops-common.h | 4 +-
> mm/damon/paddr.c | 6 +-
> mm/damon/vaddr.c | 14 ++-
> mm/filemap.c | 2 +-
> mm/gup.c | 21 ++--
> mm/highmem.c | 12 +-
> mm/hmm.c | 2 +-
> mm/huge_memory.c | 4 +-
> mm/hugetlb.c | 2 +-
> mm/hugetlb_vmemmap.c | 6 +-
> mm/kasan/init.c | 9 +-
> mm/kasan/shadow.c | 10 +-
> mm/khugepaged.c | 24 ++--
> mm/ksm.c | 22 ++--
> mm/madvise.c | 6 +-
> mm/mapping_dirty_helpers.c | 4 +-
> mm/memcontrol.c | 4 +-
> mm/memory-failure.c | 6 +-
> mm/memory.c | 103 +++++++++---------
> mm/mempolicy.c | 6 +-
> mm/migrate.c | 14 ++-
> mm/migrate_device.c | 14 ++-
> mm/mincore.c | 2 +-
> mm/mlock.c | 6 +-
> mm/mprotect.c | 8 +-
> mm/mremap.c | 2 +-
> mm/page_table_check.c | 4 +-
> mm/page_vma_mapped.c | 26 +++--
> mm/pgtable-generic.c | 2 +-
> mm/rmap.c | 32 +++---
> mm/sparse-vmemmap.c | 8 +-
> mm/swap_state.c | 4 +-
> mm/swapfile.c | 16 +--
> mm/userfaultfd.c | 4 +-
> mm/vmalloc.c | 11 +-
> mm/vmscan.c | 14 ++-
> virt/kvm/kvm_main.c | 9 +-
> 48 files changed, 302 insertions(+), 236 deletions(-)
>
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CAKEZqKKdQ9EhRobSmq0sV76arfpk6m5XqA-=XQP_M3VRG=M-eg@mail.gmail.com>]
[parent not found: <010d01d999f4$257ae020$7070a060$@mirroredgenetworks.com>]
* (no subject)
@ 2023-06-27 11:10 Alvaro a-m
2023-06-27 11:15 ` Michael Kjörling
0 siblings, 1 reply; 1651+ messages in thread
From: Alvaro a-m @ 2023-06-27 11:10 UTC (permalink / raw)
To: cryptsetup
Hi everyone,
My name is Álvaro. I'm writing this email about LUKS. I have a USB drive
with 3 partitions (/efi, /boot, and /data with LVMs encrypted by LUKS).
I have a problem entering the password, as I don't have a keyboard and
I need to enter the password with a touch screen. I tried to search
for information about this, but without success.
Do you know any solution for this? Can I enable the touch screen
before LUKS gets up?
Thank you, have a nice day.
Kind regards,
Álvaro
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-06-27 11:10 Alvaro a-m
@ 2023-06-27 11:15 ` Michael Kjörling
0 siblings, 0 replies; 1651+ messages in thread
From: Michael Kjörling @ 2023-06-27 11:15 UTC (permalink / raw)
To: cryptsetup
On 27 Jun 2023 13:10 +0200, from alvaroam007@gmail.com (Alvaro a-m):
> Do you know any solution for this? Can I enable the touch screen
> before LUKS gets up?
That is a distribution issue; not a LUKS, dm-crypt or cryptsetup
issue. You should ask your question in a forum more geared toward
whatever distribution you are using.
I do hope that you will be able to find an answer, but the cryptsetup
mailing list is the wrong forum for your question.
--
Michael Kjörling 🔗 https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <64b09dbb.630a0220.e80b9.e2ed@mx.google.com>]
* Re:
[not found] <64b09dbb.630a0220.e80b9.e2ed@mx.google.com>
@ 2023-07-14 8:05 ` Andy Shevchenko
0 siblings, 0 replies; 1651+ messages in thread
From: Andy Shevchenko @ 2023-07-14 8:05 UTC (permalink / raw)
To: luoruihong
Cc: ilpo.jarvinen, gregkh, jirislaby, linux-kernel, linux-serial,
luoruihong, weipengliang, wengjinfei
On Fri, Jul 14, 2023 at 08:58:29AM +0800, luoruihong wrote:
> On Thu, Jul 13, 2023 at 07:51:14PM +0300, Andy Shevchenko wrote:
> > On Thu, Jul 13, 2023 at 08:42:36AM +0800, Ruihong Luo wrote:
> > > Preserve the original value of the Divisor Latch Fraction (DLF) register.
> > > When the DLF register is modified without preservation, it can disrupt
> > > the baudrate settings established by firmware or bootloader, leading to
> > > data corruption and the generation of unreadable or distorted characters.
> >
> > You forgot to add my tag. Why? Do you think the name of variable warrants this?
> > Whatever,
> > Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> >
> > Next time if you don't pick up somebody's tag, care to explain in the changelog
> > why.
> >
> > > Fixes: 701c5e73b296 ("serial: 8250_dw: add fractional divisor support")
> > > Signed-off-by: Ruihong Luo <colorsu1922@gmail.com>
>
> I'm sorry, I didn't know about this rule. Thank you for helping me add
> the missing tags back and for all your previous kind assistance.
For now, no need to do anything; just wait for Ilpo's and/or Greg's answer(s).
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <TXJgqLzlM6oCfTXKSqrSBk@txt.att.net>]
[parent not found: <c8d43894-7e66-4a01-88fc-10708dc53b6b@amd.com>]
* (no subject)
@ 2023-11-11 4:21 Andrew Worsley
2023-11-11 8:22 ` Javier Martinez Canillas
0 siblings, 1 reply; 1651+ messages in thread
From: Andrew Worsley @ 2023-11-11 4:21 UTC (permalink / raw)
To: Thomas Zimmermann, Javier Martinez Canillas, Maarten Lankhorst,
Maxime Ripard, David Airlie, Daniel Vetter,
open list:DRM DRIVER FOR FIRMWARE FRAMEBUFFERS, open list
This patch fixes the failure of the frame buffer driver on my Asahi kernel
which prevented X11 from starting on my Apple M1 laptop. It seems like a
straightforward failure to follow the procedure described in drivers/video/aperture.c
to remove the earlier driver. This patch is very simple and minimal. Very likely
there are better ways to fix this, and very likely there are other drivers
which have the same problem, but I submit this so at least there is
an interim fix for my problem.
Thanks
Andrew Worsley
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2023-11-11 4:21 Andrew Worsley
@ 2023-11-11 8:22 ` Javier Martinez Canillas
0 siblings, 0 replies; 1651+ messages in thread
From: Javier Martinez Canillas @ 2023-11-11 8:22 UTC (permalink / raw)
To: Andrew Worsley, Thomas Zimmermann, Maarten Lankhorst,
Maxime Ripard, David Airlie, Daniel Vetter,
open list:DRM DRIVER FOR FIRMWARE FRAMEBUFFERS, open list
Andrew Worsley <amworsley@gmail.com> writes:
Hello Andrew,
> This patch fixes the failure of the frame buffer driver on my Asahi kernel
> which prevented X11 from starting on my Apple M1 laptop. It seems like a
> straightforward failure to follow the procedure described in drivers/video/aperture.c
> to remove the earlier driver. This patch is very simple and minimal. Very likely
> there are better ways to fix this, and very likely there are other drivers
> which have the same problem, but I submit this so at least there is
> an interim fix for my problem.
>
Which patch? I think you forgot to include it in your email.
> Thanks
>
> Andrew Worsley
>
--
Best regards,
Javier Martinez Canillas
Core Platforms
Red Hat
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [NDCTL PATCH v3 2/2] cxl: Add check for regions before disabling memdev
@ 2023-11-30 21:51 Dave Jiang
2024-04-17 6:46 ` Yao Xingtao
0 siblings, 1 reply; 1651+ messages in thread
From: Dave Jiang @ 2023-11-30 21:51 UTC (permalink / raw)
To: linux-cxl, nvdimm; +Cc: vishal.l.verma, caoqq
Add a check for memdev disable to see if there are active regions present
before disabling the device. This is necessary now that regions are present,
fulfilling the TODO that was left there. The best way to determine if a
region is active is to see if there are decoders enabled for the mem
device. This is also best effort, as the state is only a snapshot the
kernel provides and is not atomic WRT the memdev disable operation. The
expectation is that the admin issuing the command has full control of the mem
device and there are no other agents also attempting to control the device.
Reviewed-by: Quanquan Cao <caoqq@fujitsu.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
v3:
- Add emission of warning for forcing operation. (Quanquan)
v2:
- Warn if active region regardless of -f. (Alison)
- Expound on -f behavior in man page. (Vishal)
---
Documentation/cxl/cxl-disable-memdev.txt | 4 +++-
cxl/memdev.c | 20 +++++++++++++++++---
2 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/Documentation/cxl/cxl-disable-memdev.txt b/Documentation/cxl/cxl-disable-memdev.txt
index c4edb93ee94a..34b720288705 100644
--- a/Documentation/cxl/cxl-disable-memdev.txt
+++ b/Documentation/cxl/cxl-disable-memdev.txt
@@ -27,7 +27,9 @@ include::bus-option.txt[]
a device if the tool determines the memdev is in active usage. Recall
that CXL memory ranges might have been established by platform
firmware and disabling an active device is akin to force removing
- memory from a running system.
+ memory from a running system. Going down this path does not offline
+ active memory if they are currently online. User is recommended to
+ offline and disable the appropriate regions before disabling the memdevs.
-v::
Turn on verbose debug messages in the library (if libcxl was built with
diff --git a/cxl/memdev.c b/cxl/memdev.c
index 2dd2e7fcc4dd..ed962d478048 100644
--- a/cxl/memdev.c
+++ b/cxl/memdev.c
@@ -437,14 +437,28 @@ static int action_free_dpa(struct cxl_memdev *memdev,
static int action_disable(struct cxl_memdev *memdev, struct action_context *actx)
{
+ struct cxl_endpoint *ep;
+ struct cxl_port *port;
+
if (!cxl_memdev_is_enabled(memdev))
return 0;
- if (!param.force) {
- /* TODO: actually detect rather than assume active */
+ ep = cxl_memdev_get_endpoint(memdev);
+ if (!ep)
+ return -ENODEV;
+
+ port = cxl_endpoint_get_port(ep);
+ if (!port)
+ return -ENODEV;
+
+ if (cxl_port_decoders_committed(port)) {
log_err(&ml, "%s is part of an active region\n",
cxl_memdev_get_devname(memdev));
- return -EBUSY;
+ if (!param.force)
+ return -EBUSY;
+
+ log_err(&ml, "Forcing %s disable with an active region!\n",
+ cxl_memdev_get_devname(memdev));
}
return cxl_memdev_disable_invalidate(memdev);
^ permalink raw reply related [flat|nested] 1651+ messages in thread* (no subject)
2023-11-30 21:51 [NDCTL PATCH v3 2/2] cxl: Add check for regions before disabling memdev Dave Jiang
@ 2024-04-17 6:46 ` Yao Xingtao
2024-04-17 18:14 ` Verma, Vishal L
0 siblings, 1 reply; 1651+ messages in thread
From: Yao Xingtao @ 2024-04-17 6:46 UTC (permalink / raw)
To: dave.jiang; +Cc: caoqq, linux-cxl, nvdimm, vishal.l.verma
Hi Dave,
I have applied this patch in my env and done a lot of testing; this
feature is currently working fine.
But it is not merged into the master branch yet; are there any updates
on this feature?
Associated patches:
https://lore.kernel.org/linux-cxl/170112921107.2687457.2741231995154639197.stgit@djiang5-mobl3/
https://lore.kernel.org/linux-cxl/170120423159.2725915.14670830315829916850.stgit@djiang5-mobl3/
Thanks
Xingtao
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-04-17 6:46 ` Yao Xingtao
@ 2024-04-17 18:14 ` Verma, Vishal L
2024-04-22 7:26 ` Re: Xingtao Yao (Fujitsu)
0 siblings, 1 reply; 1651+ messages in thread
From: Verma, Vishal L @ 2024-04-17 18:14 UTC (permalink / raw)
To: Jiang, Dave, yaoxt.fnst@fujitsu.com
Cc: caoqq@fujitsu.com, linux-cxl@vger.kernel.org,
nvdimm@lists.linux.dev
On Wed, 2024-04-17 at 02:46 -0400, Yao Xingtao wrote:
>
> Hi Dave,
> I have applied this patch in my env, and done a lot of testing,
> this
> feature is currently working fine.
> But it is not merged into master branch yet, are there any updates
> on this feature?
Hi Xingtao,
Turns out that I had applied this to a branch but forgot to merge and
push it. Thanks for the ping - done now, and pushed to pending.
>
> Associated patches:
> https://lore.kernel.org/linux-cxl/170112921107.2687457.2741231995154639197.stgit@djiang5-mobl3/
> https://lore.kernel.org/linux-cxl/170120423159.2725915.14670830315829916850.stgit@djiang5-mobl3/
>
> Thanks
> Xingtao
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE: Re:
2024-04-17 18:14 ` Verma, Vishal L
@ 2024-04-22 7:26 ` Xingtao Yao (Fujitsu)
0 siblings, 0 replies; 1651+ messages in thread
From: Xingtao Yao (Fujitsu) @ 2024-04-22 7:26 UTC (permalink / raw)
To: Verma, Vishal L, Jiang, Dave
Cc: Quanquan Cao (Fujitsu), linux-cxl@vger.kernel.org,
nvdimm@lists.linux.dev
> -----Original Message-----
> From: Verma, Vishal L <vishal.l.verma@intel.com>
> Sent: Thursday, April 18, 2024 2:14 AM
> To: Jiang, Dave <dave.jiang@intel.com>; Yao, Xingtao/姚 幸涛
> <yaoxt.fnst@fujitsu.com>
> Cc: Cao, Quanquan/曹 全全 <caoqq@fujitsu.com>; linux-cxl@vger.kernel.org;
> nvdimm@lists.linux.dev
> Subject: Re:
>
> On Wed, 2024-04-17 at 02:46 -0400, Yao Xingtao wrote:
> >
> > Hi Dave,
> > I have applied this patch in my env, and done a lot of testing,
> > this
> > feature is currently working fine.
> > But it is not merged into master branch yet, are there any updates
> > on this feature?
>
> Hi Xingtao,
>
> Turns out that I had applied this to a branch but forgot to merge and
> push it. Thanks for the ping - done now, and pushed to pending.
Awesome, many thanks!!!
>
> >
> > Associated patches:
> >
> https://lore.kernel.org/linux-cxl/170112921107.2687457.2741231995154639197.st
> git@djiang5-mobl3/
> >
> https://lore.kernel.org/linux-cxl/170120423159.2725915.14670830315829916850.s
> tgit@djiang5-mobl3/
> >
> > Thanks
> > Xingtao
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2023-12-07 4:40 Emma Tebibyte
2023-12-07 5:00 ` Christoph Anton Mitterer
0 siblings, 1 reply; 1651+ messages in thread
From: Emma Tebibyte @ 2023-12-07 4:40 UTC (permalink / raw)
To: dash
Hi,
I found a bug in dash version 0.5.12 where, when shifting more than $#,
the shell exits before evaluating a logical OR operator.
To reproduce this issue:
$ set -- 1
$ shift 2 || printf 'meow\n'
dash: 2: shift: can't shift that many
$
This also occurs using if:
$ set -- 1
$ if shift 2; then
> true
> else
> printf 'meow\n'
> fi
dash: 2: shift: can't shift that many
$
Thanks,
Emma Tebibyte
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-12-07 4:40 Emma Tebibyte
@ 2023-12-07 5:00 ` Christoph Anton Mitterer
2023-12-07 5:29 ` Re: Lawrence Velázquez
0 siblings, 1 reply; 1651+ messages in thread
From: Christoph Anton Mitterer @ 2023-12-07 5:00 UTC (permalink / raw)
To: Emma Tebibyte, dash
On Wed, 2023-12-06 at 21:40 -0700, Emma Tebibyte wrote:
> I found a bug in dash version 0.5.12 where when shifting more than
> $#,
> the shell exits before evaluating a logical OR operator.
AFAIU from POSIX this is perfectly valid behaviour:
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#shift
> EXIT STATUS
> If the n operand is invalid or is greater than "$#", this may be
> considered a syntax error and a non-interactive shell may exit; if
> the shell does not exit in this case, a non-zero exit status shall
> be returned. Otherwise, zero shall be returned.
Cheers,
Chris.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2023-12-07 5:00 ` Christoph Anton Mitterer
@ 2023-12-07 5:29 ` Lawrence Velázquez
0 siblings, 0 replies; 1651+ messages in thread
From: Lawrence Velázquez @ 2023-12-07 5:29 UTC (permalink / raw)
To: Christoph Anton Mitterer, Emma Tebibyte; +Cc: dash
On Thu, Dec 7, 2023, at 12:00 AM, Christoph Anton Mitterer wrote:
> On Wed, 2023-12-06 at 21:40 -0700, Emma Tebibyte wrote:
>> I found a bug in dash version 0.5.12 where when shifting more than
>> $#,
>> the shell exits before evaluating a logical OR operator.
>
> AFAIU from POSIX this is perfectly valid behaviour:
>
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#shift
>
>> EXIT STATUS
>> If the n operand is invalid or is greater than "$#", this may be
>> considered a syntax error and a non-interactive shell may exit; if
>> the shell does not exit in this case, a non-zero exit status shall
>> be returned. Otherwise, zero shall be returned.
See also Section 2.8.1 [*], which states that interactive shells
shall not exit on special built-in utility errors and that:
In all of the cases shown in the table where an interactive
shell is required not to exit, the shell shall not perform
any further processing of the command in which the error
occurred.
[*] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_08_01
--
vq
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-01-16 6:46 meir elisha
2024-01-16 7:05 ` Dan Carpenter
0 siblings, 1 reply; 1651+ messages in thread
From: meir elisha @ 2024-01-16 6:46 UTC (permalink / raw)
To: linux-staging
subscribe linux-staging
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [RFC] [PATCH 0/3] xfs: use large folios for buffers
@ 2024-01-18 22:19 Dave Chinner
2024-01-22 10:13 ` Andi Kleen
0 siblings, 1 reply; 1651+ messages in thread
From: Dave Chinner @ 2024-01-18 22:19 UTC (permalink / raw)
To: linux-xfs; +Cc: willy, linux-mm
The XFS buffer cache supports metadata buffers up to 64kB, and it does so by
aggregating multiple pages into a single contiguous memory region using
vmapping. This is expensive (both the setup and the runtime TLB mapping cost),
and would be unnecessary if we could allocate large contiguous memory regions
for the buffers in the first place.
Enter multi-page folios.
This patchset converts the buffer cache to use the folio API, then enhances it
to optimistically use large folios where possible. It retains the old "vmap an
array of single page folios" functionality as a fallback when large folio
allocation fails. This means that, like page cache support for large folios, we
aren't dependent on large folio allocation succeeding all the time.
This relegates the single page array allocation mechanism to a "slow path"
whose performance we no longer have to care so much about. This
might allow us to simplify it a bit in future.
One of the issues with the folio conversion is that we use a couple of APIs that
take struct page ** (i.e. pointers to page pointer arrays) and there aren't
folio counterparts. These are the bulk page allocator and vm_map_ram(). In the
cases where they are used, we cast &bp->b_folios[] to (struct page **) knowing
that this array will only contain single page folios and that single page folios
and struct page are the same structure and so have the same address. This is a
bit of a hack (hence the RFC) but I'm not sure that it's worth adding folio
versions of these interfaces right now. We don't need to use the bulk page
allocator so much any more, because that's now a slow path and we could probably
just call folio_alloc() in a loop like we used to. What to do about vm_map_ram()
is a little less clear....
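[Editor's illustration] The cast hack described above only works because a single-page folio and its struct page share an address. A minimal user-space model (simplified stand-in structs, not the real kernel layouts) shows the property being relied on:

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: the real
 * layouts are far larger). The property the cover letter relies on is
 * that a single-page struct folio overlays struct page, so a pointer to
 * one is a valid pointer to the other. */
struct page { unsigned long flags; };
struct folio { struct page page; };   /* struct page is the first member */

/* The cast described above: treat an array of single-page folio
 * pointers as an array of page pointers. Only safe while every entry
 * really is a single-page folio. */
static struct page **folios_as_pages(struct folio **folios)
{
	return (struct page **)folios;
}
```

This is exactly why the trick breaks down once large folios appear in the array, which is why it is confined to the single-page fallback path.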
The other issue I tripped over in doing this conversion is that the
discontiguous buffer straddling code in the buf log item dirty region tracking
is broken. We don't actually exercise that code on existing configurations, and
I tripped over it when tracking down a bug in the folio conversion. I fixed it
and short-circuited the check for contiguous buffers, but that didn't fix the
failure I was seeing (which was not handling bp->b_offset and large folios
properly when building bios).
Apart from those issues, the conversion and enhancement is relatively
straightforward. It passes fstests on both 512 and 4096 byte sector size storage (512
byte sectors exercise the XBF_KMEM path which has non-zero bp->b_offset values)
and doesn't appear to cause any problems with large directory buffers, though I
haven't done any real testing on those yet. Large folio allocations are
definitely being exercised, though, as all the inode cluster buffers are 16kB on
a 512 byte inode V5 filesystem.
Thoughts, comments, etc?
Note: this patchset is on top of the NOFS removal patchset I sent a
few days ago. That can be pulled from this git branch:
https://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git xfs-kmem-cleanup
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2024-01-18 22:19 [RFC] [PATCH 0/3] xfs: use large folios for buffers Dave Chinner
@ 2024-01-22 10:13 ` Andi Kleen
2024-01-22 11:53 ` Dave Chinner
0 siblings, 1 reply; 1651+ messages in thread
From: Andi Kleen @ 2024-01-22 10:13 UTC (permalink / raw)
To: linux-xfs, david, linux-mm
Dave Chinner <david@fromorbit.com> writes:
> Thoughts, comments, etc?
The interesting part is whether it will cause additional tail latencies
when allocating under fragmentation, with direct reclaim, compaction,
etc. being triggered before it falls back to the base page path.
In fact it is highly likely it will; the question is just how bad it is.
Unfortunately, benchmarking for that isn't easy: it needs artificial
memory fragmentation, then some high-stress workload, and then
instrumenting the transactions for individual latencies.
I would in any case add a tunable for it in case people run into this.
Tail latencies are a common concern on many IO workloads.
-Andi
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-01-22 10:13 ` Andi Kleen
@ 2024-01-22 11:53 ` Dave Chinner
0 siblings, 0 replies; 1651+ messages in thread
From: Dave Chinner @ 2024-01-22 11:53 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-xfs, linux-mm
On Mon, Jan 22, 2024 at 02:13:23AM -0800, Andi Kleen wrote:
> Dave Chinner <david@fromorbit.com> writes:
>
> > Thoughts, comments, etc?
>
> The interesting part is if it will cause additional tail latencies
> allocating under fragmentation with direct reclaim, compaction
> etc. being triggered before it falls back to the base page path.
It's not like I don't know these problems exist with memory
allocation. Go have a look at xlog_kvmalloc() which is an open coded
kvmalloc() that allows the high order kmalloc allocations to
fail-fast without triggering all the expensive and unnecessary
direct reclaim overhead (e.g. compaction!) because we can fall back
to vmalloc without huge concerns. When high order allocations start
to fail, then we fall back to vmalloc and then we hit the long
standing vmalloc scalability problems before anything else in XFS or
the IO path becomes a bottleneck.
IOWs, we already know that fail-fast high-order allocation is a more
efficient and effective fast path than using vmalloc/vmap_ram() all
the time. As this is an RFC, I haven't implemented stuff like this
yet - I haven't seen anything in the profiles indicating that high
order folio allocation is failing and causing lots of reclaim
overhead, so I simply haven't added fail-fast behaviour yet...
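[Editor's illustration] The xlog_kvmalloc() pattern Dave describes can be modeled in user-space C (helper name and cutoff are invented for illustration; the kernel version uses kmalloc with __GFP_NORETRY and falls back to vmalloc):

```c
#include <stdlib.h>

#define FAILFAST_MAX_CONTIG (64 * 1024)   /* assumed "high order" cutoff */

/* Model of the fail-fast allocation pattern: attempt the cheap
 * contiguous allocation first (which in the kernel fails fast instead
 * of triggering reclaim/compaction), then fall back to the slower but
 * reliable path (vmalloc in the kernel). */
static void *failfast_alloc(size_t size, int *used_fallback)
{
	void *p = NULL;

	if (size <= FAILFAST_MAX_CONTIG)
		p = malloc(size);     /* stand-in for the fail-fast kmalloc */

	*used_fallback = (p == NULL);
	if (!p)
		p = malloc(size);     /* stand-in for the vmalloc fallback */
	return p;
}
```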
> In fact it is highly likely it will, the question is just how bad it is.
>
> Unfortunately benchmarking for that isn't that easy, it needs artificial
> memory fragmentation and then some high stress workload, and then
> instrumenting the transactions for individual latencies.
I stress test and measure XFS metadata performance under sustained
memory pressure all the time. This change has not caused any
obvious regressions in the short time I've been testing it.
I still need to do perf testing on large directory block sizes. That
is where high-order allocations will get stressed - that's where
xlog_kvmalloc() starts dominating the profiles as it trips over
vmalloc scalability issues...
> I would in any case add a tunable for it in case people run into this.
No tunables. It either works or it doesn't. If we can't make
it work reliably by default, we throw it in the dumpster, light it
on fire and walk away.
> Tail latencies are a common concern on many IO workloads.
Yes, for user data operations it's a common concern. For metadata,
not so much - there's so many far worse long tail latencies in
metadata operations (like waiting for journal space) that memory
allocation latencies in the metadata IO path are largely noise....
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v3 0/7] net/gve: RSS Support for GVE Driver
@ 2024-01-24 0:14 Joshua Washington
[not found] ` <20240126173317.2779230-1-joshwash@google.com>
0 siblings, 1 reply; 1651+ messages in thread
From: Joshua Washington @ 2024-01-24 0:14 UTC (permalink / raw)
Cc: dev, Ferruh Yigit, Rushil Gupta, Joshua Washington
This patch series introduces RSS support for the GVE poll-mode driver.
This series includes implementations of the following eth_dev_ops:
1) rss_hash_update
2) rss_hash_conf_get
3) reta_query
4) reta_update
In rss_hash_update, the GVE driver supports the following RSS hash
types:
* RTE_ETH_RSS_IPV4
* RTE_ETH_RSS_NONFRAG_IPV4_TCP
* RTE_ETH_RSS_NONFRAG_IPV4_UDP
* RTE_ETH_RSS_IPV6
* RTE_ETH_RSS_IPV6_EX
* RTE_ETH_RSS_NONFRAG_IPV6_TCP
* RTE_ETH_RSS_NONFRAG_IPV6_UDP
* RTE_ETH_RSS_IPV6_TCP_EX
* RTE_ETH_RSS_IPV6_UDP_EX
The hash key is 40B, and the lookup table has 128 entries. These values
are not configurable in this implementation.
In general, the DPDK driver expects the RSS hash configuration to be set
with a key before the redirection table is set up. When the RSS hash is
configured, a default redirection table is generated based on the number
of queues. When the device is re-configured, the redirection table is
reset to the default value based on the queue count.
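[Editor's illustration] A hypothetical sketch (the helper name is invented, not the driver's) of the default redirection table described above: entries spread round-robin across the configured queues, regenerated whenever the queue count changes on reconfiguration.

```c
#include <stdint.h>
#include <stddef.h>

/* Build a round-robin default redirection table over nb_queues receive
 * queues; with 128 entries per the cover letter. */
static void gve_default_reta(uint32_t *reta, size_t entries, uint16_t nb_queues)
{
	for (size_t i = 0; i < entries; i++)
		reta[i] = (uint32_t)(i % nb_queues);
}
```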
An important note is that the gVNIC device expects 32-bit integers for
RSS redirection table entries, while the RTE API uses 16-bit integers.
However, this is unlikely to be an issue, as these values represent
receive queues, and the gVNIC device does not support anywhere near 64K
queues.
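[Editor's illustration] The width mismatch noted above amounts to a lossless zero-extension plus a bounds check; a hypothetical helper (not the driver's actual code) could look like:

```c
#include <stdint.h>
#include <stddef.h>

/* Widen 16-bit RTE redirection-table entries to the 32-bit entries the
 * device expects. Entries name receive queues, and the device supports
 * nowhere near 64K queues, so the widening cannot lose information. */
static int gve_widen_reta(const uint16_t *rte_reta, uint32_t *dev_reta,
			  size_t entries, uint16_t nb_queues)
{
	for (size_t i = 0; i < entries; i++) {
		if (rte_reta[i] >= nb_queues)
			return -1;        /* entry names a nonexistent queue */
		dev_reta[i] = rte_reta[i];
	}
	return 0;
}
```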
This series also updates the corresponding feature matrix entries and
documentation as it pertains to RSS support in the GVE driver.
v2:
Add commit messages to patches that were missing them, and other
checkpatch fixes.
Note: there is a checkpatch warning about complex macros needing to be
parenthesized that does not seem well-founded.
v3:
Fix build warnings that come up on certain distros.
Joshua Washington (7):
net/gve: fully expose RSS offload support in dev_info
net/gve: RSS adminq command changes
net/gve: add gve_rss library for handling RSS-related behaviors
net/gve: RSS configuration update support
net/gve: RSS redirection table update support
net/gve: update gve.ini with RSS capabilities
net/gve: update GVE documentation with RSS support
doc/guides/nics/features/gve.ini | 3 +
doc/guides/nics/gve.rst | 16 ++-
drivers/net/gve/base/gve.h | 15 ++
drivers/net/gve/base/gve_adminq.c | 59 ++++++++
drivers/net/gve/base/gve_adminq.h | 21 +++
drivers/net/gve/gve_ethdev.c | 231 +++++++++++++++++++++++++++++-
drivers/net/gve/gve_ethdev.h | 17 +++
drivers/net/gve/gve_rss.c | 206 ++++++++++++++++++++++++++
drivers/net/gve/gve_rss.h | 107 ++++++++++++++
drivers/net/gve/meson.build | 1 +
10 files changed, 667 insertions(+), 9 deletions(-)
create mode 100644 drivers/net/gve/gve_rss.c
create mode 100644 drivers/net/gve/gve_rss.h
--
2.43.0.429.g432eaa2c6b-goog
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-03-07 6:07 KR Kim
2024-03-07 8:01 ` Miquel Raynal
0 siblings, 1 reply; 1651+ messages in thread
From: KR Kim @ 2024-03-07 6:07 UTC (permalink / raw)
To: miquel.raynal, richard, vigneshr, mmkurbanov, ddrokosov,
gch981213
Cc: kr.kim, michael, broonie, mika.westerberg, acelan.kao,
linux-kernel, linux-mtd, moh.sardi, changsub.shim
Feat: Add SkyHigh Memory Patch code
Add SPI Nand Patch code of SkyHigh Memory
- Add company dependent code with 'skyhigh.c'
- Insert into 'core.c' so that 'always ECC on'
commit 6061b97a830af8cb5fd0917e833e779451f9046a (HEAD -> master)
Author: KR Kim <kr.kim@skyhighmemory.com>
Date: Thu Mar 7 13:24:11 2024 +0900
SPI Nand Patch code of SkyHigh Memory
Signed-off-by: KR Kim <kr.kim@skyhighmemory.com>
From 6061b97a830af8cb5fd0917e833e779451f9046a Mon Sep 17 00:00:00 2001
From: KR Kim <kr.kim@skyhighmemory.com>
Date: Thu, 7 Mar 2024 13:24:11 +0900
Subject: [PATCH] SPI Nand Patch code of SkyHigh Memory
---
drivers/mtd/nand/spi/Makefile | 2 +-
drivers/mtd/nand/spi/core.c | 7 +-
drivers/mtd/nand/spi/skyhigh.c | 155 +++++++++++++++++++++++++++++++++
include/linux/mtd/spinand.h | 3 +
4 files changed, 165 insertions(+), 2 deletions(-)
mode change 100644 => 100755 drivers/mtd/nand/spi/Makefile
mode change 100644 => 100755 drivers/mtd/nand/spi/core.c
create mode 100644 drivers/mtd/nand/spi/skyhigh.c
mode change 100644 => 100755 include/linux/mtd/spinand.h
diff --git a/drivers/mtd/nand/spi/Makefile b/drivers/mtd/nand/spi/Makefile
old mode 100644
new mode 100755
index 19cc77288ebb..1e61ab21893a
--- a/drivers/mtd/nand/spi/Makefile
+++ b/drivers/mtd/nand/spi/Makefile
@@ -1,4 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
spinand-objs := core.o alliancememory.o ato.o esmt.o foresee.o gigadevice.o macronix.o
-spinand-objs += micron.o paragon.o toshiba.o winbond.o xtx.o
+spinand-objs += micron.o paragon.o skyhigh.o toshiba.o winbond.o xtx.o
obj-$(CONFIG_MTD_SPI_NAND) += spinand.o
diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
old mode 100644
new mode 100755
index e0b6715e5dfe..e3f0a7544ba4
--- a/drivers/mtd/nand/spi/core.c
+++ b/drivers/mtd/nand/spi/core.c
@@ -34,7 +34,7 @@ static int spinand_read_reg_op(struct spinand_device *spinand, u8 reg, u8 *val)
return 0;
}
-static int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val)
+int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val)
{
struct spi_mem_op op = SPINAND_SET_FEATURE_OP(reg,
spinand->scratchbuf);
@@ -196,6 +196,10 @@ static int spinand_init_quad_enable(struct spinand_device *spinand)
static int spinand_ecc_enable(struct spinand_device *spinand,
bool enable)
{
+ /* SHM : always ECC enable */
+ if (spinand->flags & SPINAND_ON_DIE_ECC_MANDATORY)
+ return 0;
+
return spinand_upd_cfg(spinand, CFG_ECC_ENABLE,
enable ? CFG_ECC_ENABLE : 0);
}
@@ -945,6 +949,7 @@ static const struct spinand_manufacturer *spinand_manufacturers[] = {
&macronix_spinand_manufacturer,
&micron_spinand_manufacturer,
&paragon_spinand_manufacturer,
+ &skyhigh_spinand_manufacturer,
&toshiba_spinand_manufacturer,
&winbond_spinand_manufacturer,
&xtx_spinand_manufacturer,
diff --git a/drivers/mtd/nand/spi/skyhigh.c b/drivers/mtd/nand/spi/skyhigh.c
new file mode 100644
index 000000000000..92e7572094ff
--- /dev/null
+++ b/drivers/mtd/nand/spi/skyhigh.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 SkyHigh Memory Limited
+ *
+ * Author: Takahiro Kuwano <takahiro.kuwano@infineon.com>
+ */
+
+#include <linux/device.h>
+#include <linux/kernel.h>
+#include <linux/mtd/spinand.h>
+
+#define SPINAND_MFR_SKYHIGH 0x01
+
+#define SKYHIGH_STATUS_ECC_1TO2_BITFLIPS (1 << 4)
+#define SKYHIGH_STATUS_ECC_3TO6_BITFLIPS (2 << 4)
+#define SKYHIGH_STATUS_ECC_UNCOR_ERROR (3 << 4)
+
+#define SKYHIGH_CONFIG_PROTECT_EN BIT(1)
+
+static SPINAND_OP_VARIANTS(read_cache_variants,
+ SPINAND_PAGE_READ_FROM_CACHE_QUADIO_OP(0, 4, NULL, 0),
+ SPINAND_PAGE_READ_FROM_CACHE_X4_OP(0, 1, NULL, 0),
+ SPINAND_PAGE_READ_FROM_CACHE_DUALIO_OP(0, 2, NULL, 0),
+ SPINAND_PAGE_READ_FROM_CACHE_X2_OP(0, 1, NULL, 0),
+ SPINAND_PAGE_READ_FROM_CACHE_OP(true, 0, 1, NULL, 0),
+ SPINAND_PAGE_READ_FROM_CACHE_OP(false, 0, 1, NULL, 0));
+
+static SPINAND_OP_VARIANTS(write_cache_variants,
+ SPINAND_PROG_LOAD_X4(true, 0, NULL, 0),
+ SPINAND_PROG_LOAD(true, 0, NULL, 0));
+
+static SPINAND_OP_VARIANTS(update_cache_variants,
+ SPINAND_PROG_LOAD_X4(false, 0, NULL, 0),
+ SPINAND_PROG_LOAD(false, 0, NULL, 0));
+
+static int skyhigh_spinand_ooblayout_ecc(struct mtd_info *mtd, int section,
+ struct mtd_oob_region *region)
+{
+ if (section)
+ return -ERANGE;
+
+ /* SkyHigh's ecc parity is stored in the internal hidden area and is not needed for them. */
+ region->length = 0;
+ region->offset = mtd->oobsize;
+
+ return 0;
+}
+
+static int skyhigh_spinand_ooblayout_free(struct mtd_info *mtd, int section,
+ struct mtd_oob_region *region)
+{
+ if (section)
+ return -ERANGE;
+
+ region->length = mtd->oobsize - 2;
+ region->offset = 2;
+
+ return 0;
+}
+
+static const struct mtd_ooblayout_ops skyhigh_spinand_ooblayout = {
+ .ecc = skyhigh_spinand_ooblayout_ecc,
+ .free = skyhigh_spinand_ooblayout_free,
+};
+
+static int skyhigh_spinand_ecc_get_status(struct spinand_device *spinand,
+ u8 status)
+{
+ /* SHM
+ * 00 : No bit-flip
+ * 01 : 1-2 errors corrected
+ * 10 : 3-6 errors corrected
+ * 11 : uncorrectable
+ */
+
+ switch (status & STATUS_ECC_MASK) {
+ case STATUS_ECC_NO_BITFLIPS:
+ return 0;
+
+ case SKYHIGH_STATUS_ECC_1TO2_BITFLIPS:
+ return 2;
+
+ case SKYHIGH_STATUS_ECC_3TO6_BITFLIPS:
+ return 6;
+
+ case SKYHIGH_STATUS_ECC_UNCOR_ERROR:
+ return -EBADMSG;
+
+ default:
+ break;
+ }
+
+ return -EINVAL;
+}
+
+static const struct spinand_info skyhigh_spinand_table[] = {
+ SPINAND_INFO("S35ML01G301",
+ SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x15),
+ NAND_MEMORG(1, 2048, 64, 64, 1024, 20, 1, 1, 1),
+ NAND_ECCREQ(6, 32),
+ SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
+ &write_cache_variants,
+ &update_cache_variants),
+ SPINAND_ON_DIE_ECC_MANDATORY,
+ SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
+ skyhigh_spinand_ecc_get_status)),
+ SPINAND_INFO("S35ML01G300",
+ SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x14),
+ NAND_MEMORG(1, 2048, 128, 64, 1024, 20, 1, 1, 1),
+ NAND_ECCREQ(6, 32),
+ SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
+ &write_cache_variants,
+ &update_cache_variants),
+ SPINAND_ON_DIE_ECC_MANDATORY,
+ SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
+ skyhigh_spinand_ecc_get_status)),
+ SPINAND_INFO("S35ML02G300",
+ SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x25),
+ NAND_MEMORG(1, 2048, 128, 64, 2048, 40, 2, 1, 1),
+ NAND_ECCREQ(6, 32),
+ SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
+ &write_cache_variants,
+ &update_cache_variants),
+ SPINAND_ON_DIE_ECC_MANDATORY,
+ SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
+ skyhigh_spinand_ecc_get_status)),
+ SPINAND_INFO("S35ML04G300",
+ SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x35),
+ NAND_MEMORG(1, 2048, 128, 64, 4096, 80, 2, 1, 1),
+ NAND_ECCREQ(6, 32),
+ SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
+ &write_cache_variants,
+ &update_cache_variants),
+ SPINAND_ON_DIE_ECC_MANDATORY,
+ SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
+ skyhigh_spinand_ecc_get_status)),
+};
+
+static int skyhigh_spinand_init(struct spinand_device *spinand)
+{
+ return spinand_write_reg_op(spinand, REG_BLOCK_LOCK,
+ SKYHIGH_CONFIG_PROTECT_EN);
+}
+
+static const struct spinand_manufacturer_ops skyhigh_spinand_manuf_ops = {
+ .init = skyhigh_spinand_init,
+ };
+
+const struct spinand_manufacturer skyhigh_spinand_manufacturer = {
+ .id = SPINAND_MFR_SKYHIGH,
+ .name = "SkyHigh",
+ .chips = skyhigh_spinand_table,
+ .nchips = ARRAY_SIZE(skyhigh_spinand_table),
+ .ops = &skyhigh_spinand_manuf_ops,
+};
diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h
old mode 100644
new mode 100755
index badb4c1ac079..0e135076df24
--- a/include/linux/mtd/spinand.h
+++ b/include/linux/mtd/spinand.h
@@ -268,6 +268,7 @@ extern const struct spinand_manufacturer gigadevice_spinand_manufacturer;
extern const struct spinand_manufacturer macronix_spinand_manufacturer;
extern const struct spinand_manufacturer micron_spinand_manufacturer;
extern const struct spinand_manufacturer paragon_spinand_manufacturer;
+extern const struct spinand_manufacturer skyhigh_spinand_manufacturer;
extern const struct spinand_manufacturer toshiba_spinand_manufacturer;
extern const struct spinand_manufacturer winbond_spinand_manufacturer;
extern const struct spinand_manufacturer xtx_spinand_manufacturer;
@@ -312,6 +313,7 @@ struct spinand_ecc_info {
#define SPINAND_HAS_QE_BIT BIT(0)
#define SPINAND_HAS_CR_FEAT_BIT BIT(1)
+#define SPINAND_ON_DIE_ECC_MANDATORY BIT(2) /* SHM */
/**
* struct spinand_ondie_ecc_conf - private SPI-NAND on-die ECC engine structure
@@ -518,5 +520,6 @@ int spinand_match_and_init(struct spinand_device *spinand,
int spinand_upd_cfg(struct spinand_device *spinand, u8 mask, u8 val);
int spinand_select_target(struct spinand_device *spinand, unsigned int target);
+int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val);
#endif /* __LINUX_MTD_SPINAND_H */
--
2.34.1
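[Editor's illustration] The two-bit ECC status decode in skyhigh_spinand_ecc_get_status() above can be modeled as plain C (constants copied from the patch; STATUS_ECC_MASK is assumed to be the bits [5:4] field the kernel defines):

```c
#include <errno.h>

#define STATUS_ECC_MASK                  (3 << 4)  /* bits [5:4] */
#define STATUS_ECC_NO_BITFLIPS           (0 << 4)
#define SKYHIGH_STATUS_ECC_1TO2_BITFLIPS (1 << 4)
#define SKYHIGH_STATUS_ECC_3TO6_BITFLIPS (2 << 4)
#define SKYHIGH_STATUS_ECC_UNCOR_ERROR   (3 << 4)

/* Mirrors the patch's decode: return the maximum number of bitflips
 * corrected, or a negative errno, as the MTD core expects. */
static int skyhigh_ecc_status(unsigned char status)
{
	switch (status & STATUS_ECC_MASK) {
	case STATUS_ECC_NO_BITFLIPS:
		return 0;
	case SKYHIGH_STATUS_ECC_1TO2_BITFLIPS:
		return 2;
	case SKYHIGH_STATUS_ECC_3TO6_BITFLIPS:
		return 6;
	case SKYHIGH_STATUS_ECC_UNCOR_ERROR:
		return -EBADMSG;
	default:
		return -EINVAL;  /* unreachable: a 2-bit field has 4 values */
	}
}
```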
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2024-03-07 6:07 KR Kim
@ 2024-03-07 8:01 ` Miquel Raynal
2024-03-08 1:27 ` Re: Kyeongrho.Kim
[not found] ` <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>
0 siblings, 2 replies; 1651+ messages in thread
From: Miquel Raynal @ 2024-03-07 8:01 UTC (permalink / raw)
To: KR Kim
Cc: richard, vigneshr, mmkurbanov, ddrokosov, gch981213, michael,
broonie, mika.westerberg, acelan.kao, linux-kernel, linux-mtd,
moh.sardi, changsub.shim
Hi,
kr.kim@skyhighmemory.com wrote on Thu, 7 Mar 2024 15:07:29 +0900:
> Feat: Add SkyHigh Memory Patch code
>
> Add SPI Nand Patch code of SkyHigh Memory
> - Add company dependent code with 'skyhigh.c'
> - Insert into 'core.c' so that 'always ECC on'
Patch formatting is still messed up.
> commit 6061b97a830af8cb5fd0917e833e779451f9046a (HEAD -> master)
> Author: KR Kim <kr.kim@skyhighmemory.com>
> Date: Thu Mar 7 13:24:11 2024 +0900
>
> SPI Nand Patch code of SkyHigh Momory
>
> Signed-off-by: KR Kim <kr.kim@skyhighmemory.com>
>
> From 6061b97a830af8cb5fd0917e833e779451f9046a Mon Sep 17 00:00:00 2001
> From: KR Kim <kr.kim@skyhighmemory.com>
> Date: Thu, 7 Mar 2024 13:24:11 +0900
> Subject: [PATCH] SPI Nand Patch code of SkyHigh Memory
>
> ---
> drivers/mtd/nand/spi/Makefile | 2 +-
> drivers/mtd/nand/spi/core.c | 7 +-
> drivers/mtd/nand/spi/skyhigh.c | 155 +++++++++++++++++++++++++++++++++
> include/linux/mtd/spinand.h | 3 +
> 4 files changed, 165 insertions(+), 2 deletions(-)
> mode change 100644 => 100755 drivers/mtd/nand/spi/Makefile
> mode change 100644 => 100755 drivers/mtd/nand/spi/core.c
> create mode 100644 drivers/mtd/nand/spi/skyhigh.c
> mode change 100644 => 100755 include/linux/mtd/spinand.h
>
> diff --git a/drivers/mtd/nand/spi/Makefile b/drivers/mtd/nand/spi/Makefile
> old mode 100644
> new mode 100755
> index 19cc77288ebb..1e61ab21893a
> --- a/drivers/mtd/nand/spi/Makefile
> +++ b/drivers/mtd/nand/spi/Makefile
> @@ -1,4 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0
> spinand-objs := core.o alliancememory.o ato.o esmt.o foresee.o gigadevice.o macronix.o
> -spinand-objs += micron.o paragon.o toshiba.o winbond.o xtx.o
> +spinand-objs += micron.o paragon.o skyhigh.o toshiba.o winbond.o xtx.o
> obj-$(CONFIG_MTD_SPI_NAND) += spinand.o
> diff --git a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c
> old mode 100644
> new mode 100755
> index e0b6715e5dfe..e3f0a7544ba4
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -34,7 +34,7 @@ static int spinand_read_reg_op(struct spinand_device *spinand, u8 reg, u8 *val)
> return 0;
> }
>
> -static int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val)
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val)
Please do this in a separate commit.
> {
> struct spi_mem_op op = SPINAND_SET_FEATURE_OP(reg,
> spinand->scratchbuf);
> @@ -196,6 +196,10 @@ static int spinand_init_quad_enable(struct spinand_device *spinand)
> static int spinand_ecc_enable(struct spinand_device *spinand,
> bool enable)
> {
> + /* SHM : always ECC enable */
> + if (spinand->flags & SPINAND_ON_DIE_ECC_MANDATORY)
> + return 0;
Silently always enabling ECC is not possible. If you cannot disable the
on-die engine, then:
- you should prevent any other engine type to be used
- you should error out if a raw access is requested
- these chips are broken, IMO
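[Editor's illustration] A minimal sketch of the first two suggestions (flag and helper names here are hypothetical, not the kernel API): refuse an ECC-off request outright instead of silently reporting success.

```c
#include <errno.h>

#define ON_DIE_ECC_CANNOT_DISABLE 0x4   /* hypothetical flag */

/* If the on-die engine cannot be turned off, enabling ECC is a no-op
 * and disabling it (e.g. for a raw access) is an error, rather than
 * both being silently "successful". */
static int ecc_enable_checked(unsigned int chip_flags, int enable)
{
	if (chip_flags & ON_DIE_ECC_CANNOT_DISABLE)
		return enable ? 0 : -EOPNOTSUPP;
	return 0;   /* stand-in for spinand_upd_cfg(CFG_ECC_ENABLE, ...) */
}
```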
> +
> return spinand_upd_cfg(spinand, CFG_ECC_ENABLE,
> enable ? CFG_ECC_ENABLE : 0);
> }
> @@ -945,6 +949,7 @@ static const struct spinand_manufacturer *spinand_manufacturers[] = {
> &macronix_spinand_manufacturer,
> &micron_spinand_manufacturer,
> &paragon_spinand_manufacturer,
> + &skyhigh_spinand_manufacturer,
> &toshiba_spinand_manufacturer,
> &winbond_spinand_manufacturer,
> &xtx_spinand_manufacturer,
> diff --git a/drivers/mtd/nand/spi/skyhigh.c b/drivers/mtd/nand/spi/skyhigh.c
> new file mode 100644
> index 000000000000..92e7572094ff
> --- /dev/null
> +++ b/drivers/mtd/nand/spi/skyhigh.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 SkyHigh Memory Limited
> + *
> + * Author: Takahiro Kuwano <takahiro.kuwano@infineon.com>
> + */
> +
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/mtd/spinand.h>
> +
> +#define SPINAND_MFR_SKYHIGH 0x01
> +
> +#define SKYHIGH_STATUS_ECC_1TO2_BITFLIPS (1 << 4)
> +#define SKYHIGH_STATUS_ECC_3TO6_BITFLIPS (2 << 4)
> +#define SKYHIGH_STATUS_ECC_UNCOR_ERROR (3 << 4)
> +
> +#define SKYHIGH_CONFIG_PROTECT_EN BIT(1)
> +
> +static SPINAND_OP_VARIANTS(read_cache_variants,
> + SPINAND_PAGE_READ_FROM_CACHE_QUADIO_OP(0, 4, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_X4_OP(0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_DUALIO_OP(0, 2, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_X2_OP(0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_OP(true, 0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_OP(false, 0, 1, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(write_cache_variants,
> + SPINAND_PROG_LOAD_X4(true, 0, NULL, 0),
> + SPINAND_PROG_LOAD(true, 0, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(update_cache_variants,
> + SPINAND_PROG_LOAD_X4(false, 0, NULL, 0),
> + SPINAND_PROG_LOAD(false, 0, NULL, 0));
> +
> +static int skyhigh_spinand_ooblayout_ecc(struct mtd_info *mtd, int section,
> + struct mtd_oob_region *region)
> +{
> + if (section)
> + return -ERANGE;
> +
> + /* SkyHigh's ecc parity is stored in the internal hidden area and is not needed for them. */
"ecc" should be "ECC", and "needed" is wrong here. Just stop after "area".
> + region->length = 0;
> + region->offset = mtd->oobsize;
> +
> + return 0;
> +}
> +
> +static int skyhigh_spinand_ooblayout_free(struct mtd_info *mtd, int section,
> + struct mtd_oob_region *region)
> +{
> + if (section)
> + return -ERANGE;
> +
> + region->length = mtd->oobsize - 2;
> + region->offset = 2;
> +
> + return 0;
> +}
> +
> +static const struct mtd_ooblayout_ops skyhigh_spinand_ooblayout = {
> + .ecc = skyhigh_spinand_ooblayout_ecc,
> + .free = skyhigh_spinand_ooblayout_free,
> +};
> +
> +static int skyhigh_spinand_ecc_get_status(struct spinand_device *spinand,
> + u8 status)
> +{
> + /* SHM
> + * 00 : No bit-flip
> + * 01 : 1-2 errors corrected
> + * 10 : 3-6 errors corrected
> + * 11 : uncorrectable
> + */
Thanks for the comment but the switch case looks rather
straightforward, it is self-sufficient in this case.
> +
> + switch (status & STATUS_ECC_MASK) {
> + case STATUS_ECC_NO_BITFLIPS:
> + return 0;
> +
> + case SKYHIGH_STATUS_ECC_1TO2_BITFLIPS:
> + return 2;
> +
> + case SKYHIGH_STATUS_ECC_3TO6_BITFLIPS:
> + return 6;
> +
> + case SKYHIGH_STATUS_ECC_UNCOR_ERROR:
> + return -EBADMSG;;
> +
> + default:
> + break;
I guess you can directly call return -EINVAL here?
> + }
> +
> + return -EINVAL;
> +}
> +
> +static const struct spinand_info skyhigh_spinand_table[] = {
> + SPINAND_INFO("S35ML01G301",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x15),
> + NAND_MEMORG(1, 2048, 64, 64, 1024, 20, 1, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML01G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x14),
> + NAND_MEMORG(1, 2048, 128, 64, 1024, 20, 1, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML02G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x25),
> + NAND_MEMORG(1, 2048, 128, 64, 2048, 40, 2, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML04G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x35),
> + NAND_MEMORG(1, 2048, 128, 64, 4096, 80, 2, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> +};
> +
> +static int skyhigh_spinand_init(struct spinand_device *spinand)
> +{
> + return spinand_write_reg_op(spinand, REG_BLOCK_LOCK,
> + SKYHIGH_CONFIG_PROTECT_EN);
Is this really relevant? Isn't there an API for the block lock
mechanism?
> +}
> +
> +static const struct spinand_manufacturer_ops skyhigh_spinand_manuf_ops = {
> + .init = skyhigh_spinand_init,
> + };
> +
> +const struct spinand_manufacturer skyhigh_spinand_manufacturer = {
> + .id = SPINAND_MFR_SKYHIGH,
> + .name = "SkyHigh",
> + .chips = skyhigh_spinand_table,
> + .nchips = ARRAY_SIZE(skyhigh_spinand_table),
> + .ops = &skyhigh_spinand_manuf_ops,
> +};
> diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h
> old mode 100644
> new mode 100755
> index badb4c1ac079..0e135076df24
> --- a/include/linux/mtd/spinand.h
> +++ b/include/linux/mtd/spinand.h
> @@ -268,6 +268,7 @@ extern const struct spinand_manufacturer gigadevice_spinand_manufacturer;
> extern const struct spinand_manufacturer macronix_spinand_manufacturer;
> extern const struct spinand_manufacturer micron_spinand_manufacturer;
> extern const struct spinand_manufacturer paragon_spinand_manufacturer;
> +extern const struct spinand_manufacturer skyhigh_spinand_manufacturer;
> extern const struct spinand_manufacturer toshiba_spinand_manufacturer;
> extern const struct spinand_manufacturer winbond_spinand_manufacturer;
> extern const struct spinand_manufacturer xtx_spinand_manufacturer;
> @@ -312,6 +313,7 @@ struct spinand_ecc_info {
>
> #define SPINAND_HAS_QE_BIT BIT(0)
> #define SPINAND_HAS_CR_FEAT_BIT BIT(1)
> +#define SPINAND_ON_DIE_ECC_MANDATORY BIT(2) /* SHM */
If we go this route, then "mandatory" is not relevant here, we shall
convey the fact that the on-die ECC engine cannot be disabled and as
mentioned above, there are other impacts.
>
> /**
> * struct spinand_ondie_ecc_conf - private SPI-NAND on-die ECC engine structure
> @@ -518,5 +520,6 @@ int spinand_match_and_init(struct spinand_device *spinand,
>
> int spinand_upd_cfg(struct spinand_device *spinand, u8 mask, u8 val);
> int spinand_select_target(struct spinand_device *spinand, unsigned int target);
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8 val);
>
> #endif /* __LINUX_MTD_SPINAND_H */
Thanks,
Miquèl
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE: Re:
2024-03-07 8:01 ` Miquel Raynal
@ 2024-03-08 1:27 ` Kyeongrho.Kim
[not found] ` <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>
1 sibling, 0 replies; 1651+ messages in thread
From: Kyeongrho.Kim @ 2024-03-08 1:27 UTC (permalink / raw)
To: Miquel Raynal
Cc: richard@nod.at, vigneshr@ti.com, mmkurbanov@salutedevices.com,
ddrokosov@sberdevices.ru, gch981213@gmail.com, michael@walle.cc,
broonie@kernel.org, mika.westerberg@linux.intel.com,
acelan.kao@canonical.com, linux-kernel@vger.kernel.org,
linux-mtd@lists.infradead.org, Mohamed Sardi, Changsub.Shim
Hi Miquel,
Thank you for your comment.
I tried to match the patch format, but it seems it is still not quite right.
Could you point me to a good example of the expected patch format?
Thanks,
KR
-----Original Message-----
From: Miquel Raynal <miquel.raynal@bootlin.com>
Sent: Thursday, March 7, 2024 5:01 PM
To: Kyeongrho.Kim <kr.kim@skyhighmemory.com>
Cc: richard@nod.at; vigneshr@ti.com; mmkurbanov@salutedevices.com; ddrokosov@sberdevices.ru; gch981213@gmail.com; michael@walle.cc; broonie@kernel.org; mika.westerberg@linux.intel.com; acelan.kao@canonical.com; linux-kernel@vger.kernel.org; linux-mtd@lists.infradead.org; Mohamed Sardi <moh.sardi@skyhighmemory.com>; Changsub.Shim <changsub.shim@skyhighmemory.com>
Subject: Re:
Hi,
kr.kim@skyhighmemory.com wrote on Thu, 7 Mar 2024 15:07:29 +0900:
> Feat: Add SkyHigh Memory Patch code
>
> Add SPI Nand Patch code of SkyHigh Memory
> - Add company dependent code with 'skyhigh.c'
> - Insert into 'core.c' so that 'always ECC on'
Patch formatting is still messed up.
> commit 6061b97a830af8cb5fd0917e833e779451f9046a (HEAD -> master)
> Author: KR Kim <kr.kim@skyhighmemory.com>
> Date: Thu Mar 7 13:24:11 2024 +0900
>
> SPI Nand Patch code of SkyHigh Momory
>
> Signed-off-by: KR Kim <kr.kim@skyhighmemory.com>
>
> From 6061b97a830af8cb5fd0917e833e779451f9046a Mon Sep 17 00:00:00 2001
> From: KR Kim <kr.kim@skyhighmemory.com>
> Date: Thu, 7 Mar 2024 13:24:11 +0900
> Subject: [PATCH] SPI Nand Patch code of SkyHigh Memory
>
> ---
> drivers/mtd/nand/spi/Makefile | 2 +-
> drivers/mtd/nand/spi/core.c | 7 +-
> drivers/mtd/nand/spi/skyhigh.c | 155 +++++++++++++++++++++++++++++++++
> include/linux/mtd/spinand.h | 3 +
> 4 files changed, 165 insertions(+), 2 deletions(-) mode change
> 100644 => 100755 drivers/mtd/nand/spi/Makefile mode change 100644 =>
> 100755 drivers/mtd/nand/spi/core.c create mode 100644
> drivers/mtd/nand/spi/skyhigh.c mode change 100644 => 100755
> include/linux/mtd/spinand.h
>
> diff --git a/drivers/mtd/nand/spi/Makefile
> b/drivers/mtd/nand/spi/Makefile old mode 100644 new mode 100755 index
> 19cc77288ebb..1e61ab21893a
> --- a/drivers/mtd/nand/spi/Makefile
> +++ b/drivers/mtd/nand/spi/Makefile
> @@ -1,4 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0
> spinand-objs := core.o alliancememory.o ato.o esmt.o foresee.o
> gigadevice.o macronix.o -spinand-objs += micron.o paragon.o toshiba.o
> winbond.o xtx.o
> +spinand-objs += micron.o paragon.o skyhigh.o toshiba.o winbond.o
> +xtx.o
> obj-$(CONFIG_MTD_SPI_NAND) += spinand.o diff --git
> a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c old mode
> 100644 new mode 100755 index e0b6715e5dfe..e3f0a7544ba4
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -34,7 +34,7 @@ static int spinand_read_reg_op(struct spinand_device *spinand, u8 reg, u8 *val)
> return 0;
> }
>
> -static int spinand_write_reg_op(struct spinand_device *spinand, u8
> reg, u8 val)
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8
> +val)
Please do this in a separate commit.
> {
> struct spi_mem_op op = SPINAND_SET_FEATURE_OP(reg,
> spinand->scratchbuf);
> @@ -196,6 +196,10 @@ static int spinand_init_quad_enable(struct
> spinand_device *spinand) static int spinand_ecc_enable(struct spinand_device *spinand,
> bool enable)
> {
> + /* SHM : always ECC enable */
> + if (spinand->flags & SPINAND_ON_DIE_ECC_MANDATORY)
> + return 0;
Silently always enabling ECC is not possible. If you cannot disable the on-die engine, then:
- you should prevent any other engine type to be used
- you should error out if a raw access is requested
- these chips are broken, IMO
> +
> return spinand_upd_cfg(spinand, CFG_ECC_ENABLE,
> enable ? CFG_ECC_ENABLE : 0); } @@ -945,6 +949,7 @@ static
> const struct spinand_manufacturer *spinand_manufacturers[] = {
> &macronix_spinand_manufacturer,
> &micron_spinand_manufacturer,
> &paragon_spinand_manufacturer,
> + &skyhigh_spinand_manufacturer,
> &toshiba_spinand_manufacturer,
> &winbond_spinand_manufacturer,
> &xtx_spinand_manufacturer,
> diff --git a/drivers/mtd/nand/spi/skyhigh.c
> b/drivers/mtd/nand/spi/skyhigh.c new file mode 100644 index
> 000000000000..92e7572094ff
> --- /dev/null
> +++ b/drivers/mtd/nand/spi/skyhigh.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 SkyHigh Memory Limited
> + *
> + * Author: Takahiro Kuwano <takahiro.kuwano@infineon.com> */
> +
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/mtd/spinand.h>
> +
> +#define SPINAND_MFR_SKYHIGH 0x01
> +
> +#define SKYHIGH_STATUS_ECC_1TO2_BITFLIPS (1 << 4)
> +#define SKYHIGH_STATUS_ECC_3TO6_BITFLIPS (2 << 4)
> +#define SKYHIGH_STATUS_ECC_UNCOR_ERROR (3 << 4)
> +
> +#define SKYHIGH_CONFIG_PROTECT_EN BIT(1)
> +
> +static SPINAND_OP_VARIANTS(read_cache_variants,
> + SPINAND_PAGE_READ_FROM_CACHE_QUADIO_OP(0, 4, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_X4_OP(0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_DUALIO_OP(0, 2, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_X2_OP(0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_OP(true, 0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_OP(false, 0, 1, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(write_cache_variants,
> + SPINAND_PROG_LOAD_X4(true, 0, NULL, 0),
> + SPINAND_PROG_LOAD(true, 0, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(update_cache_variants,
> + SPINAND_PROG_LOAD_X4(false, 0, NULL, 0),
> + SPINAND_PROG_LOAD(false, 0, NULL, 0));
> +
> +static int skyhigh_spinand_ooblayout_ecc(struct mtd_info *mtd, int section,
> + struct mtd_oob_region *region)
> +{
> + if (section)
> + return -ERANGE;
> +
> + /* SkyHigh's ecc parity is stored in the internal hidden area and is
> +not needed for them. */
Typo: "ecc" should be "ECC".
"needed" is wrong here. Just stop after "area".
> + region->length = 0;
> + region->offset = mtd->oobsize;
> +
> + return 0;
> +}
> +
> +static int skyhigh_spinand_ooblayout_free(struct mtd_info *mtd, int section,
> + struct mtd_oob_region *region) {
> + if (section)
> + return -ERANGE;
> +
> + region->length = mtd->oobsize - 2;
> + region->offset = 2;
> +
> + return 0;
> +}
> +
> +static const struct mtd_ooblayout_ops skyhigh_spinand_ooblayout = {
> + .ecc = skyhigh_spinand_ooblayout_ecc,
> + .free = skyhigh_spinand_ooblayout_free, };
> +
> +static int skyhigh_spinand_ecc_get_status(struct spinand_device *spinand,
> + u8 status)
> +{
> + /* SHM
> + * 00 : No bit-flip
> + * 01 : 1-2 errors corrected
> + * 10 : 3-6 errors corrected
> + * 11 : uncorrectable
> + */
Thanks for the comment but the switch case looks rather straightforward, it is self-sufficient in this case.
> +
> + switch (status & STATUS_ECC_MASK) {
> + case STATUS_ECC_NO_BITFLIPS:
> + return 0;
> +
> + case SKYHIGH_STATUS_ECC_1TO2_BITFLIPS:
> + return 2;
> +
> + case SKYHIGH_STATUS_ECC_3TO6_BITFLIPS:
> + return 6;
> +
> + case SKYHIGH_STATUS_ECC_UNCOR_ERROR:
> + return -EBADMSG;
> +
> + default:
> + break;
I guess you can directly call return -EINVAL here?
> + }
> +
> + return -EINVAL;
> +}
> +
> +static const struct spinand_info skyhigh_spinand_table[] = {
> + SPINAND_INFO("S35ML01G301",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x15),
> + NAND_MEMORG(1, 2048, 64, 64, 1024, 20, 1, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML01G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x14),
> + NAND_MEMORG(1, 2048, 128, 64, 1024, 20, 1, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML02G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x25),
> + NAND_MEMORG(1, 2048, 128, 64, 2048, 40, 2, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML04G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x35),
> + NAND_MEMORG(1, 2048, 128, 64, 4096, 80, 2, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)), };
> +
> +static int skyhigh_spinand_init(struct spinand_device *spinand) {
> + return spinand_write_reg_op(spinand, REG_BLOCK_LOCK,
> + SKYHIGH_CONFIG_PROTECT_EN);
Is this really relevant? Isn't there an API for the block lock mechanism?
> +}
> +
> +static const struct spinand_manufacturer_ops skyhigh_spinand_manuf_ops = {
> + .init = skyhigh_spinand_init,
> + };
> +
> +const struct spinand_manufacturer skyhigh_spinand_manufacturer = {
> + .id = SPINAND_MFR_SKYHIGH,
> + .name = "SkyHigh",
> + .chips = skyhigh_spinand_table,
> + .nchips = ARRAY_SIZE(skyhigh_spinand_table),
> + .ops = &skyhigh_spinand_manuf_ops,
> +};
> diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h
> old mode 100644 new mode 100755 index badb4c1ac079..0e135076df24
> --- a/include/linux/mtd/spinand.h
> +++ b/include/linux/mtd/spinand.h
> @@ -268,6 +268,7 @@ extern const struct spinand_manufacturer
> gigadevice_spinand_manufacturer; extern const struct
> spinand_manufacturer macronix_spinand_manufacturer; extern const
> struct spinand_manufacturer micron_spinand_manufacturer; extern const
> struct spinand_manufacturer paragon_spinand_manufacturer;
> +extern const struct spinand_manufacturer
> +skyhigh_spinand_manufacturer;
> extern const struct spinand_manufacturer
> toshiba_spinand_manufacturer; extern const struct
> spinand_manufacturer winbond_spinand_manufacturer; extern const
> struct spinand_manufacturer xtx_spinand_manufacturer; @@ -312,6 +313,7
> @@ struct spinand_ecc_info {
>
> #define SPINAND_HAS_QE_BIT BIT(0)
> #define SPINAND_HAS_CR_FEAT_BIT BIT(1)
> +#define SPINAND_ON_DIE_ECC_MANDATORY BIT(2) /* SHM */
If we go this route, then "mandatory" is not relevant here, we shall convey the fact that the on-die ECC engine cannot be disabled and as mentioned above, there are other impacts.
>
> /**
> * struct spinand_ondie_ecc_conf - private SPI-NAND on-die ECC engine
> structure @@ -518,5 +520,6 @@ int spinand_match_and_init(struct
> spinand_device *spinand,
>
> int spinand_upd_cfg(struct spinand_device *spinand, u8 mask, u8 val);
> int spinand_select_target(struct spinand_device *spinand, unsigned int
> target);
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8
> +val);
>
> #endif /* __LINUX_MTD_SPINAND_H */
Thanks,
Miquèl
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>]
* RE: Re:
[not found] ` <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>
@ 2024-03-29 4:41 ` Kyeongrho.Kim
0 siblings, 0 replies; 1651+ messages in thread
From: Kyeongrho.Kim @ 2024-03-29 4:41 UTC (permalink / raw)
To: Miquel Raynal
Cc: richard@nod.at, vigneshr@ti.com, mmkurbanov@salutedevices.com,
ddrokosov@sberdevices.ru, gch981213@gmail.com, michael@walle.cc,
broonie@kernel.org, mika.westerberg@linux.intel.com,
acelan.kao@canonical.com, linux-kernel@vger.kernel.org,
linux-mtd@lists.infradead.org, Mohamed Sardi, Changsub.Shim
(I am sending this mail again as plain text, not HTML.)
Dear Miquel,
Please see my reply in below email.
And please comment if you have any others.
Thanks,
KR
-----Original Message-----
From: Miquel Raynal <miquel.raynal@bootlin.com>
Sent: Thursday, March 7, 2024 5:01 PM
To: Kyeongrho.Kim <kr.kim@skyhighmemory.com>
Cc: richard@nod.at; vigneshr@ti.com; mmkurbanov@salutedevices.com; ddrokosov@sberdevices.ru; gch981213@gmail.com; michael@walle.cc; broonie@kernel.org; mika.westerberg@linux.intel.com; acelan.kao@canonical.com; linux-kernel@vger.kernel.org; linux-mtd@lists.infradead.org; Mohamed Sardi <moh.sardi@skyhighmemory.com>; Changsub.Shim <changsub.shim@skyhighmemory.com>
Subject: Re:
Hi,
kr.kim@skyhighmemory.com wrote on Thu, 7 Mar 2024 15:07:29 +0900:
> Feat: Add SkyHigh Memory Patch code
>
> Add SPI Nand Patch code of SkyHigh Memory
> - Add company dependent code with 'skyhigh.c'
> - Insert into 'core.c' so that 'always ECC on'
Patch formatting is still messed up.
> commit 6061b97a830af8cb5fd0917e833e779451f9046a (HEAD -> master)
> Author: KR Kim <kr.kim@skyhighmemory.com>
> Date: Thu Mar 7 13:24:11 2024 +0900
>
> SPI Nand Patch code of SkyHigh Momory
>
> Signed-off-by: KR Kim <kr.kim@skyhighmemory.com>
>
> From 6061b97a830af8cb5fd0917e833e779451f9046a Mon Sep 17 00:00:00 2001
> From: KR Kim <kr.kim@skyhighmemory.com>
> Date: Thu, 7 Mar 2024 13:24:11 +0900
> Subject: [PATCH] SPI Nand Patch code of SkyHigh Memory
>
> ---
> drivers/mtd/nand/spi/Makefile | 2 +-
> drivers/mtd/nand/spi/core.c | 7 +-
> drivers/mtd/nand/spi/skyhigh.c | 155 +++++++++++++++++++++++++++++++++
> include/linux/mtd/spinand.h | 3 +
> 4 files changed, 165 insertions(+), 2 deletions(-) mode change
> 100644 => 100755 drivers/mtd/nand/spi/Makefile mode change 100644 =>
> 100755 drivers/mtd/nand/spi/core.c create mode 100644
> drivers/mtd/nand/spi/skyhigh.c mode change 100644 => 100755
> include/linux/mtd/spinand.h
>
> diff --git a/drivers/mtd/nand/spi/Makefile
> b/drivers/mtd/nand/spi/Makefile old mode 100644 new mode 100755 index
> 19cc77288ebb..1e61ab21893a
> --- a/drivers/mtd/nand/spi/Makefile
> +++ b/drivers/mtd/nand/spi/Makefile
> @@ -1,4 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0
> spinand-objs := core.o alliancememory.o ato.o esmt.o foresee.o
> gigadevice.o macronix.o -spinand-objs += micron.o paragon.o toshiba.o
> winbond.o xtx.o
> +spinand-objs += micron.o paragon.o skyhigh.o toshiba.o winbond.o
> +xtx.o
> obj-$(CONFIG_MTD_SPI_NAND) += spinand.o diff --git
> a/drivers/mtd/nand/spi/core.c b/drivers/mtd/nand/spi/core.c old mode
> 100644 new mode 100755 index e0b6715e5dfe..e3f0a7544ba4
> --- a/drivers/mtd/nand/spi/core.c
> +++ b/drivers/mtd/nand/spi/core.c
> @@ -34,7 +34,7 @@ static int spinand_read_reg_op(struct spinand_device *spinand, u8 reg, u8 *val)
> return 0;
> }
>
> -static int spinand_write_reg_op(struct spinand_device *spinand, u8
> reg, u8 val)
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8
> +val)
Please do this in a separate commit.
[SHM] May I know why we need to do this in a separate commit?
Please elaborate on the reason.
> {
> struct spi_mem_op op = SPINAND_SET_FEATURE_OP(reg,
> spinand->scratchbuf);
> @@ -196,6 +196,10 @@ static int spinand_init_quad_enable(struct
> spinand_device *spinand) static int spinand_ecc_enable(struct spinand_device *spinand,
> bool enable)
> {
> + /* SHM : always ECC enable */
> + if (spinand->flags & SPINAND_ON_DIE_ECC_MANDATORY)
> + return 0;
Silently always enabling ECC is not possible. If you cannot disable the on-die engine, then:
- you should prevent any other engine type to be used
- you should error out if a raw access is requested
- these chips are broken, IMO
[SHM] I understand your concern.
We have already reviewed 'always ECC on' from many aspects to see if there was any problem, and confirmed that there was none.
> +
> return spinand_upd_cfg(spinand, CFG_ECC_ENABLE,
> enable ? CFG_ECC_ENABLE : 0); } @@ -945,6 +949,7 @@ static
> const struct spinand_manufacturer *spinand_manufacturers[] = {
> &macronix_spinand_manufacturer,
> &micron_spinand_manufacturer,
> &paragon_spinand_manufacturer,
> + &skyhigh_spinand_manufacturer,
> &toshiba_spinand_manufacturer,
> &winbond_spinand_manufacturer,
> &xtx_spinand_manufacturer,
> diff --git a/drivers/mtd/nand/spi/skyhigh.c
> b/drivers/mtd/nand/spi/skyhigh.c new file mode 100644 index
> 000000000000..92e7572094ff
> --- /dev/null
> +++ b/drivers/mtd/nand/spi/skyhigh.c
> @@ -0,0 +1,155 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2022 SkyHigh Memory Limited
> + *
> + * Author: Takahiro Kuwano <takahiro.kuwano@infineon.com> */
> +
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/mtd/spinand.h>
> +
> +#define SPINAND_MFR_SKYHIGH 0x01
> +
> +#define SKYHIGH_STATUS_ECC_1TO2_BITFLIPS (1 << 4)
> +#define SKYHIGH_STATUS_ECC_3TO6_BITFLIPS (2 << 4)
> +#define SKYHIGH_STATUS_ECC_UNCOR_ERROR (3 << 4)
> +
> +#define SKYHIGH_CONFIG_PROTECT_EN BIT(1)
> +
> +static SPINAND_OP_VARIANTS(read_cache_variants,
> + SPINAND_PAGE_READ_FROM_CACHE_QUADIO_OP(0, 4, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_X4_OP(0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_DUALIO_OP(0, 2, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_X2_OP(0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_OP(true, 0, 1, NULL, 0),
> + SPINAND_PAGE_READ_FROM_CACHE_OP(false, 0, 1, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(write_cache_variants,
> + SPINAND_PROG_LOAD_X4(true, 0, NULL, 0),
> + SPINAND_PROG_LOAD(true, 0, NULL, 0));
> +
> +static SPINAND_OP_VARIANTS(update_cache_variants,
> + SPINAND_PROG_LOAD_X4(false, 0, NULL, 0),
> + SPINAND_PROG_LOAD(false, 0, NULL, 0));
> +
> +static int skyhigh_spinand_ooblayout_ecc(struct mtd_info *mtd, int section,
> + struct mtd_oob_region *region)
> +{
> + if (section)
> + return -ERANGE;
> +
> + /* SkyHigh's ecc parity is stored in the internal hidden area and is
> +not needed for them. */
Typo: "ecc" should be "ECC".
"needed" is wrong here. Just stop after "area".
> + region->length = 0;
> + region->offset = mtd->oobsize;
> +
> + return 0;
> +}
> +
> +static int skyhigh_spinand_ooblayout_free(struct mtd_info *mtd, int section,
> + struct mtd_oob_region *region) {
> + if (section)
> + return -ERANGE;
> +
> + region->length = mtd->oobsize - 2;
> + region->offset = 2;
> +
> + return 0;
> +}
> +
> +static const struct mtd_ooblayout_ops skyhigh_spinand_ooblayout = {
> + .ecc = skyhigh_spinand_ooblayout_ecc,
> + .free = skyhigh_spinand_ooblayout_free, };
> +
> +static int skyhigh_spinand_ecc_get_status(struct spinand_device *spinand,
> + u8 status)
> +{
> + /* SHM
> + * 00 : No bit-flip
> + * 01 : 1-2 errors corrected
> + * 10 : 3-6 errors corrected
> + * 11 : uncorrectable
> + */
Thanks for the comment but the switch case looks rather straightforward, it is self-sufficient in this case.
> +
> + switch (status & STATUS_ECC_MASK) {
> + case STATUS_ECC_NO_BITFLIPS:
> + return 0;
> +
> + case SKYHIGH_STATUS_ECC_1TO2_BITFLIPS:
> + return 2;
> +
> + case SKYHIGH_STATUS_ECC_3TO6_BITFLIPS:
> + return 6;
> +
> + case SKYHIGH_STATUS_ECC_UNCOR_ERROR:
> + return -EBADMSG;
> +
> + default:
> + break;
I guess you can directly call return -EINVAL here?
> + }
> +
> + return -EINVAL;
> +}
> +
> +static const struct spinand_info skyhigh_spinand_table[] = {
> + SPINAND_INFO("S35ML01G301",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x15),
> + NAND_MEMORG(1, 2048, 64, 64, 1024, 20, 1, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML01G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x14),
> + NAND_MEMORG(1, 2048, 128, 64, 1024, 20, 1, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML02G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x25),
> + NAND_MEMORG(1, 2048, 128, 64, 2048, 40, 2, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)),
> + SPINAND_INFO("S35ML04G300",
> + SPINAND_ID(SPINAND_READID_METHOD_OPCODE_DUMMY, 0x35),
> + NAND_MEMORG(1, 2048, 128, 64, 4096, 80, 2, 1, 1),
> + NAND_ECCREQ(6, 32),
> + SPINAND_INFO_OP_VARIANTS(&read_cache_variants,
> + &write_cache_variants,
> + &update_cache_variants),
> + SPINAND_ON_DIE_ECC_MANDATORY,
> + SPINAND_ECCINFO(&skyhigh_spinand_ooblayout,
> + skyhigh_spinand_ecc_get_status)), };
> +
> +static int skyhigh_spinand_init(struct spinand_device *spinand) {
> + return spinand_write_reg_op(spinand, REG_BLOCK_LOCK,
> + SKYHIGH_CONFIG_PROTECT_EN);
Is this really relevant? Isn't there an API for the block lock mechanism?
[SHM] The SHM device requires 'Config Protect Enable' to be set first in order to unlock.
I changed to use the 'spinand_lock_block' function instead of the 'spinand_write_reg_op' function.
> +}
> +
> +static const struct spinand_manufacturer_ops skyhigh_spinand_manuf_ops = {
> + .init = skyhigh_spinand_init,
> + };
> +
> +const struct spinand_manufacturer skyhigh_spinand_manufacturer = {
> + .id = SPINAND_MFR_SKYHIGH,
> + .name = "SkyHigh",
> + .chips = skyhigh_spinand_table,
> + .nchips = ARRAY_SIZE(skyhigh_spinand_table),
> + .ops = &skyhigh_spinand_manuf_ops,
> +};
> diff --git a/include/linux/mtd/spinand.h b/include/linux/mtd/spinand.h
> old mode 100644 new mode 100755 index badb4c1ac079..0e135076df24
> --- a/include/linux/mtd/spinand.h
> +++ b/include/linux/mtd/spinand.h
> @@ -268,6 +268,7 @@ extern const struct spinand_manufacturer
> gigadevice_spinand_manufacturer; extern const struct
> spinand_manufacturer macronix_spinand_manufacturer; extern const
> struct spinand_manufacturer micron_spinand_manufacturer; extern const
> struct spinand_manufacturer paragon_spinand_manufacturer;
> +extern const struct spinand_manufacturer
> +skyhigh_spinand_manufacturer;
> extern const struct spinand_manufacturer
> toshiba_spinand_manufacturer; extern const struct
> spinand_manufacturer winbond_spinand_manufacturer; extern const
> struct spinand_manufacturer xtx_spinand_manufacturer; @@ -312,6 +313,7
> @@ struct spinand_ecc_info {
>
> #define SPINAND_HAS_QE_BIT BIT(0)
> #define SPINAND_HAS_CR_FEAT_BIT BIT(1)
> +#define SPINAND_ON_DIE_ECC_MANDATORY BIT(2) /* SHM */
If we go this route, then "mandatory" is not relevant here, we shall convey the fact that the on-die ECC engine cannot be disabled and as mentioned above, there are other impacts.
[SHM] Please elaborate more specifically on what I should do.
>
> /**
> * struct spinand_ondie_ecc_conf - private SPI-NAND on-die ECC engine
> structure @@ -518,5 +520,6 @@ int spinand_match_and_init(struct
> spinand_device *spinand,
>
> int spinand_upd_cfg(struct spinand_device *spinand, u8 mask, u8 val);
> int spinand_select_target(struct spinand_device *spinand, unsigned int
> target);
> +int spinand_write_reg_op(struct spinand_device *spinand, u8 reg, u8
> +val);
>
> #endif /* __LINUX_MTD_SPINAND_H */
Thanks,
Miquèl
______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-04-19 15:46 George Guo
2024-04-23 16:48 ` Greg KH
0 siblings, 1 reply; 1651+ messages in thread
From: George Guo @ 2024-04-19 15:46 UTC (permalink / raw)
To: gregkh, tom.zanussi; +Cc: stable
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 3602 bytes --]
Subject: [PATCH 4.19.y v6 0/2] Double-free bug discovery on testing trigger-field-variable-support.tc
1) About v4-0001-tracing-Remove-hist-trigger-synth_var_refs.patch:
The reason I am backporting this patch is that no one had found the
double-free bug at the time; the code was later removed upstream, but
4.19-stable still has the bug.
This is tested via
"./ftracetest test.d/trigger/inter-event/trigger-field-variable-support.tc"
==================================================================
BUG: KASAN: use-after-free in destroy_hist_field+0x115/0x140
Read of size 4 at addr ffff888012e95318 by task ftracetest/1858
CPU: 1 PID: 1858 Comm: ftracetest Kdump: loaded Tainted: GE 4.19.90-89 #24
Source Version: Unknown
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0
Call Trace:
dump_stack+0xcb/0x10b
print_address_description.cold+0x54/0x249
kasan_report_error.cold+0x63/0xab
? destroy_hist_field+0x115/0x140
__asan_report_load4_noabort+0x8d/0xa0
? destroy_hist_field+0x115/0x140
destroy_hist_field+0x115/0x140
destroy_hist_data+0x4e4/0x9a0
event_hist_trigger_free+0x212/0x2f0
? update_cond_flag+0x128/0x170
? event_hist_trigger_func+0x2880/0x2880
hist_unregister_trigger+0x2f2/0x4f0
event_hist_trigger_func+0x168c/0x2880
? tracing_map_read_var_once+0xd0/0xd0
? create_key_field+0x520/0x520
? __mutex_lock_slowpath+0x10/0x10
event_trigger_write+0x2f4/0x490
? trigger_start+0x180/0x180
? __fget_light+0x369/0x5d0
? count_memcg_event_mm+0x104/0x2b0
? trigger_start+0x180/0x180
__vfs_write+0x81/0x100
vfs_write+0x1e1/0x540
ksys_write+0x12a/0x290
? __ia32_sys_read+0xb0/0xb0
? __close_fd+0x1d3/0x280
do_syscall_64+0xe3/0x2d0
entry_SYSCALL_64_after_hwframe+0x5c/0xc1
RIP: 0033:0x7efdd342ee04
Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 48
8d 05 39 34 0c 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff
ff 77 54 f3 c3 66 90 41 54 55 49 89 d4 53 48 89 f5
RSP: 002b:00007ffda01f5e08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00000000000000b4 RCX: 00007efdd342ee04
RDX: 00000000000000b4 RSI: 000055c5b41b1e90 RDI: 0000000000000001
RBP: 000055c5b41b1e90 R08: 000000000000000a R09: 0000000000000000
R10: 000000000000000a R11: 0000000000000246 R12: 00007efdd34ed5c0
R13: 00000000000000b4 R14: 00007efdd34ed7c0 R15: 00000000000000b4
==================================================================
2) About v4-0002-tracing-Use-var_refs-for-hist-trigger-reference-c.patch:
Only v4-0001-tracing-Remove-hist-trigger-synth_var_refs.patch will lead
to compilation errors:
../kernel/trace/trace_events_hist.c: In function ‘find_var_ref’:
../kernel/trace/trace_events_hist.c:1364:36: error: ‘struct hist_trigger_data’ has no member named ‘n_synth_var_refs’; did you mean ‘n_var_refs’?
1364 | for (i = 0; i < hist_data->n_synth_var_refs; i++) {
| ^~~~~~~~~~~~~~~~
| n_var_refs
../kernel/trace/trace_events_hist.c:1365:41: error: ‘struct hist_trigger_data’ has no member named ‘synth_var_refs’; did you mean ‘n_var_refs’?
1365 | hist_field = hist_data->synth_var_refs[i];
| ^~~~~~~~~~~~~~
| n_var_refs
Tom Zanussi (2):
tracing: Remove hist trigger synth_var_refs
tracing: Use var_refs[] for hist trigger reference checking
kernel/trace/trace_events_hist.c | 86 ++++----------------------------
1 file changed, 11 insertions(+), 75 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-04-19 15:46 George Guo
@ 2024-04-23 16:48 ` Greg KH
0 siblings, 0 replies; 1651+ messages in thread
From: Greg KH @ 2024-04-23 16:48 UTC (permalink / raw)
To: George Guo; +Cc: tom.zanussi, stable
On Fri, Apr 19, 2024 at 11:46:56PM +0800, George Guo wrote:
> Subject: [PATCH 4.19.y v6 0/2] Double-free bug discovery on testing trigger-field-variable-support.tc
>
> 1) About v4-0001-tracing-Remove-hist-trigger-synth_var_refs.patch:
>
> The reason I am backporting this patch is that no one had found the
> double-free bug at the time; the code was later removed upstream, but
> 4.19-stable still has the bug.
Both now queued up, thanks
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH dovetail] x86: irq_pipeline: Add missing definition for !CONFIG_IRQ_PIPELINE
@ 2024-04-23 14:21 Philippe Gerum
2024-04-24 8:58 ` Fabian Scheler
0 siblings, 1 reply; 1651+ messages in thread
From: Philippe Gerum @ 2024-04-23 14:21 UTC (permalink / raw)
To: Florian Bezdeka; +Cc: Fabian Scheler, Roman Hodek, xenomai
Florian Bezdeka <florian.bezdeka@siemens.com> writes:
> On Tue, 2024-04-23 at 14:59 +0200, Philippe Gerum wrote:
>> Merged, thanks.
>
> Don't know if that is important for you Philippe, but the patch was
> missing a proper "From: " tag in the body.
>
> Fabian is now the author, but the signed-off-by tag is from Roman (the
> initial author).
>
> Fabian, when re-sending patches (maybe because the original author has
> no proper mail setup) make sure to add your signed-off-by and a "From:
> " line as first line of the body.
>
> git format-patch --from should do it.
>
> Best regards,
> Florian
>
Let's say that the signed-off-by tag is what matters the most to me
along with a source channel I can trust (in this case siemens.com), but
having fully consistent information would be better, indeed.
--
Philippe.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2024-04-23 14:21 [PATCH dovetail] x86: irq_pipeline: Add missing definition for !CONFIG_IRQ_PIPELINE Philippe Gerum
@ 2024-04-24 8:58 ` Fabian Scheler
2024-04-24 9:02 ` Scheler, Fabian
0 siblings, 1 reply; 1651+ messages in thread
From: Fabian Scheler @ 2024-04-24 8:58 UTC (permalink / raw)
To: xenomai; +Cc: roman.hodek
As suggested by Florian, I revised the patch so that the correct author is stated in the commit.
Ciao
Fabian
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-04-24 8:58 ` Fabian Scheler
@ 2024-04-24 9:02 ` Scheler, Fabian
0 siblings, 0 replies; 1651+ messages in thread
From: Scheler, Fabian @ 2024-04-24 9:02 UTC (permalink / raw)
To: xenomai
Am 24.04.2024 um 10:58 schrieb Fabian Scheler:
>
> As suggested by Florian I revised the patch so that the correct author is stated in the commit.
>
> Ciao
> Fabian
OK, something went wrong here - this is simply additional information
for the revised patch.
Ciao
Fabian
--
With best regards,
Dr. Fabian Scheler
Siemens AG
T CED EDC-DE
Hertha-Sponer-Weg 3
91058 Erlangen, Germany
Phone: +49 (1522) 1702973
Mobile: +49 (1522) 1702973
mailto:fabian.scheler@siemens.com
www.siemens.com
Siemens Aktiengesellschaft: Chairman of the Supervisory Board: Jim
Hagemann Snabe; Managing Board: Roland Busch, Chairman, President and
Chief Executive Officer; Cedrik Neike, Matthias Rebellius, Ralf P.
Thomas, Judith Wiese; Registered offices: Berlin and Munich, Germany;
Commercial registries: Berlin-Charlottenburg, HRB 12300, Munich, HRB
6684; WEEE-Reg.-No. DE 23691322
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CGME20240520102002epcas2p3d0944968114a664556cbd74d53beddee@epcas2p3.samsung.com>]
* (no subject)
@ 2024-06-11 16:54 Jacob Pan
2024-06-12 2:04 ` Sean Christopherson
0 siblings, 1 reply; 1651+ messages in thread
From: Jacob Pan @ 2024-06-11 16:54 UTC (permalink / raw)
To: X86 Kernel, LKML, Thomas Gleixner, Dave Hansen, H. Peter Anvin,
Ingo Molnar, Borislav Petkov, linux-perf-users, Peter Zijlstra
Cc: Andi Kleen, Xin Li
From e52010df700cde894633c45c0b364847e63a9819 Mon Sep 17 00:00:00 2001
From: Jacob Pan <jacob.jun.pan@linux.intel.com>
Date: Tue, 11 Jun 2024 09:49:17 -0700
Subject: Subject: [PATCH v2 0/6] Add support for NMI source reporting
Hi Thomas and all,
Non-Maskable Interrupts (NMIs) are routed to the local Advanced Programmable
Interrupt Controller (APIC) using vector #2. Before the advent of the
Flexible Return and Event Delivery (FRED)[1], the vector information set by
the NMI initiator was disregarded or lost within the hardware, compelling
system software to poll every registered NMI handler to pinpoint the source
of the NMI[2]. This approach led to several issues:
1. Inefficiency due to the CPU's time spent polling all handlers.
2. Increased latency from the additional time taken to poll all handlers.
3. The occurrence of unnecessary NMIs if they are triggered shortly
after being processed by a different source.
To tackle these challenges, Intel introduced NMI source reporting as a part
of the FRED specification (detailed in Chapter 9). This CPU feature ensures
that while all NMI sources are still aggregated into NMI vector (#2) for
delivery, the source of the NMI is now conveyed through FRED event data
(a 16-bit bitmap on the stack). This allows for the selective dispatch
of the NMI source handler based on the bitmap, eliminating the need to
invoke all NMI source handlers indiscriminately.
In line with the hardware architecture, various interrupt sources can
generate NMIs by encoding an NMI delivery mode. However, this patchset
activates only the local NMI sources that are currently utilized by the
Linux kernel, which includes:
1. Performance monitoring.
2. Inter-Processor Interrupts (IPIs) for functions like CPU backtrace,
machine check, Kernel GNU Debugger (KGDB), reboot, panic stop, and
self-test.
Other NMI sources will continue to be handled as previously when the NMI
source is not utilized or remains unidentified.
Next steps:
1. KVM support
2. Optimization to reuse IDT NMI vector 2 as the NMI source for "known" sources.
Link: https://lore.kernel.org/lkml/746fecd5-4c79-42f9-919e-912ec415e73f@zytor.com/
[1] https://www.intel.com/content/www/us/en/content-details/779982/flexible-return-and-event-delivery-fred-specification.html
[2] https://lore.kernel.org/lkml/171011362209.2468526.15187874627966416701.tglx@xen13/
Thanks,
Jacob
---
Change logs are in individual patches.
Jacob Pan (6):
x86/irq: Add enumeration of NMI source reporting CPU feature
x86/irq: Extend NMI handler registration interface to include source
x86/irq: Factor out common NMI handling code
x86/irq: Process nmi sources in NMI handler
perf/x86: Enable NMI source reporting for perfmon
x86/irq: Enable NMI source on IPIs delivered as NMI
arch/x86/Kconfig | 9 +++
arch/x86/events/amd/ibs.c | 2 +-
arch/x86/events/core.c | 11 ++-
arch/x86/events/intel/core.c | 6 +-
arch/x86/include/asm/apic.h | 1 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/disabled-features.h | 8 +-
arch/x86/include/asm/irq_vectors.h | 38 +++++++++
arch/x86/include/asm/nmi.h | 4 +-
arch/x86/kernel/apic/hw_nmi.c | 5 +-
arch/x86/kernel/apic/ipi.c | 4 +-
arch/x86/kernel/apic/local.h | 18 +++--
arch/x86/kernel/cpu/mce/inject.c | 4 +-
arch/x86/kernel/cpu/mshyperv.c | 2 +-
arch/x86/kernel/kgdb.c | 6 +-
arch/x86/kernel/nmi.c | 99 +++++++++++++++++++++---
arch/x86/kernel/nmi_selftest.c | 7 +-
arch/x86/kernel/reboot.c | 4 +-
arch/x86/kernel/smp.c | 4 +-
arch/x86/kernel/traps.c | 4 +-
arch/x86/platform/uv/uv_nmi.c | 4 +-
drivers/acpi/apei/ghes.c | 2 +-
drivers/char/ipmi/ipmi_watchdog.c | 2 +-
drivers/edac/igen6_edac.c | 2 +-
drivers/watchdog/hpwdt.c | 6 +-
25 files changed, 200 insertions(+), 53 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-06-11 16:54 Jacob Pan
@ 2024-06-12 2:04 ` Sean Christopherson
2024-06-12 2:55 ` Re: Xin Li
0 siblings, 1 reply; 1651+ messages in thread
From: Sean Christopherson @ 2024-06-12 2:04 UTC (permalink / raw)
To: Jacob Pan
Cc: X86 Kernel, LKML, Thomas Gleixner, Dave Hansen, H. Peter Anvin,
Ingo Molnar, Borislav Petkov, linux-perf-users, Peter Zijlstra,
Andi Kleen, Xin Li
On Tue, Jun 11, 2024, Jacob Pan wrote:
> To tackle these challenges, Intel introduced NMI source reporting as a part
> of the FRED specification (detailed in Chapter 9).
Chapter 9 of the linked spec is "VMX Interactions with FRED Transitions". I
spent a minute or so poking around the spec and didn't find anything that describes
how "NMI source reporting" works.
> 1. Performance monitoring.
> 2. Inter-Processor Interrupts (IPIs) for functions like CPU backtrace,
> machine check, Kernel GNU Debugger (KGDB), reboot, panic stop, and
> self-test.
>
> Other NMI sources will continue to be handled as previously when the NMI
> source is not utilized or remains unidentified.
>
> Next steps:
> 1. KVM support
I can't tell for sure since I can't find the relevant spec info, but doesn't KVM
support need to land before this gets enabled? Otherwise the source would get
lost if the NMI arrived while the CPU was in non-root mode, no? E.g. I don't
see any changes to fred_entry_from_kvm() in this series.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-06-12 2:04 ` Sean Christopherson
@ 2024-06-12 2:55 ` Xin Li
0 siblings, 0 replies; 1651+ messages in thread
From: Xin Li @ 2024-06-12 2:55 UTC (permalink / raw)
To: Sean Christopherson, Jacob Pan
Cc: X86 Kernel, LKML, Thomas Gleixner, Dave Hansen, H. Peter Anvin,
Ingo Molnar, Borislav Petkov, linux-perf-users, Peter Zijlstra,
Andi Kleen, Xin Li
On 6/11/2024 7:04 PM, Sean Christopherson wrote:
> On Tue, Jun 11, 2024, Jacob Pan wrote:
>> To tackle these challenges, Intel introduced NMI source reporting as a part
>> of the FRED specification (detailed in Chapter 9).
>
> Chapter 9 of the linked spec is "VMX Interactions with FRED Transitions". I
> spent a minute or so poking around the spec and didn't find anything that describes
> how "NMI source reporting" works.
I did the same thing when I saw NMI source was added to the spec :)
>
>> 1. Performance monitoring.
>> 2. Inter-Processor Interrupts (IPIs) for functions like CPU backtrace,
>> machine check, Kernel GNU Debugger (KGDB), reboot, panic stop, and
>> self-test.
>>
>> Other NMI sources will continue to be handled as previously when the NMI
>> source is not utilized or remains unidentified.
>>
>> Next steps:
>> 1. KVM support
>
> I can't tell for sure since I can't find the relevant spec info, but doesn't KVM
> support need to land before this gets enabled? Otherwise the source would get
> lost if the NMI arrived while the CPU was in non-root mode, no? E.g. I don't
> see any changes to fred_entry_from_kvm() in this series.
You're absolutely right!
There is a patch in the NMI source KVM series for this, but as you
mentioned it needs to be in this native NMI source series instead.
Thanks!
Xin
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-06-26 6:11 Totoro W
2024-06-26 7:09 ` Eduard Zingerman
0 siblings, 1 reply; 1651+ messages in thread
From: Totoro W @ 2024-06-26 6:11 UTC (permalink / raw)
To: bpf
Hi folks,
This is my first time asking questions on this mailing list. I'm the
author of https://github.com/tw4452852/zbpf, which is a framework for
writing BPF programs with the Zig toolchain.
During development, as the BTF is entirely generated by the Zig
toolchain, some of its naming conventions make the BTF verifier refuse
to load the program.
Right now I have to patch the libbpf to do some fixup before loading
into the kernel
(https://github.com/tw4452852/libbpf_zig/blob/main/0001-temporary-WA-for-invalid-BTF-info-generated-by-Zig.patch).
Even though this just works around the issue, I'm still curious about
the current naming sanitization and would like to know some background
about it.
If possible, could we relax this to accept more languages (like Zig)
for writing BPF programs? Thanks in advance.
Regards.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-06-26 6:11 Totoro W
@ 2024-06-26 7:09 ` Eduard Zingerman
0 siblings, 0 replies; 1651+ messages in thread
From: Eduard Zingerman @ 2024-06-26 7:09 UTC (permalink / raw)
To: Totoro W, bpf
On Wed, 2024-06-26 at 14:11 +0800, Totoro W wrote:
> Hi folks,
>
> This is my first time to ask questions in this mailing list. I'm the
> author of https://github.com/tw4452852/zbpf which is a framework to
> write BPF programs with Zig toolchain.
> During the development, as the BTF is totally generated by the Zig
> toolchain, some naming conventions will make the BTF verifier refuse
> to load.
> Right now I have to patch the libbpf to do some fixup before loading
> into the kernel
> (https://github.com/tw4452852/libbpf_zig/blob/main/0001-temporary-WA-for-invalid-BTF-info-generated-by-Zig.patch).
> + // https://github.com/tw4452852/zbpf/issues/3
> + else if (btf_is_ptr(t)) {
> + t->name_off = 0;
As far as I understand, you control BTF generation, so why generate
names for pointers in the first place?
> Even though this just work-around the issue, I'm still curious about
> the current naming sanitation, I want to know some background about
> it.
Doing some git digging shows that name check was first introduced by
the following commit:
2667a2626f4d ("bpf: btf: Add BTF_KIND_FUNC and BTF_KIND_FUNC_PROTO")
And lived like that afterwards.
My guess is that kernel BTF is used to work with kernel functions and
data structures, all of which follow the C naming convention.
> If possible, could we relax this to accept more languages (like Zig)
> to write BPF programs? Thanks in advance.
Could you please elaborate a bit?
Citation from [1]:
Identifiers must start with an alphabetic character or underscore
and may be followed by any number of alphanumeric characters or
underscores. They must not overlap with any keywords.
If a name that does not fit these requirements is needed, such as
for linking with external libraries, the @"" syntax may be used.
Paragraph 1 matches C naming convention and should be accepted by
kernel/bpf/btf.c:btf_name_valid_identifier().
Paragraph 2 is basically any string.
Which one do you want?
[1] https://ziglang.org/documentation/master/#Identifiers
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v2 09/60] i2c: cp2615: reword according to newest specification
@ 2024-07-06 11:20 Wolfram Sang
2024-07-10 6:41 ` [PATCH v3] " Wolfram Sang
0 siblings, 1 reply; 1651+ messages in thread
From: Wolfram Sang @ 2024-07-06 11:20 UTC (permalink / raw)
To: linux-i2c; +Cc: Wolfram Sang, Bence Csókás, Andi Shyti, linux-kernel
Change the wording of this driver wrt. the newest I2C v7 and SMBus 3.2
specifications and replace "master/slave" with more appropriate terms.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---
drivers/i2c/busses/i2c-cp2615.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/i2c/busses/i2c-cp2615.c b/drivers/i2c/busses/i2c-cp2615.c
index cf3747d87034..315a37155401 100644
--- a/drivers/i2c/busses/i2c-cp2615.c
+++ b/drivers/i2c/busses/i2c-cp2615.c
@@ -60,7 +60,7 @@ enum cp2615_i2c_status {
CP2615_CFG_LOCKED = -6,
/* read_len or write_len out of range */
CP2615_INVALID_PARAM = -4,
- /* I2C slave did not ACK in time */
+ /* I2C target did not ACK in time */
CP2615_TIMEOUT,
/* I2C bus busy */
CP2615_BUS_BUSY,
@@ -211,7 +211,7 @@ static int cp2615_check_iop(struct usb_interface *usbif)
}
static int
-cp2615_i2c_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
+cp2615_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
{
struct usb_interface *usbif = adap->algo_data;
int i = 0, ret = 0;
@@ -250,8 +250,8 @@ cp2615_i2c_func(struct i2c_adapter *adap)
}
static const struct i2c_algorithm cp2615_i2c_algo = {
- .master_xfer = cp2615_i2c_master_xfer,
- .functionality = cp2615_i2c_func,
+ .xfer = cp2615_i2c_xfer,
+ .functionality = cp2615_i2c_func,
};
/*
--
2.43.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* [PATCH v3] i2c: cp2615: reword according to newest specification
2024-07-06 11:20 [PATCH v2 09/60] i2c: cp2615: reword according to newest specification Wolfram Sang
@ 2024-07-10 6:41 ` Wolfram Sang
2024-07-10 17:51 ` Bence Csókás
0 siblings, 1 reply; 1651+ messages in thread
From: Wolfram Sang @ 2024-07-10 6:41 UTC (permalink / raw)
To: linux-i2c; +Cc: Wolfram Sang, Bence Csókás, Andi Shyti, linux-kernel
Change the wording of this driver wrt. the newest I2C v7 and SMBus 3.2
specifications and replace "master/slave" with more appropriate terms.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---
Change since v2:
* reworded comment about NAK for consistency as well (Thanks, Bence!)
drivers/i2c/busses/i2c-cp2615.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/i2c/busses/i2c-cp2615.c b/drivers/i2c/busses/i2c-cp2615.c
index cf3747d87034..e7720ea4045e 100644
--- a/drivers/i2c/busses/i2c-cp2615.c
+++ b/drivers/i2c/busses/i2c-cp2615.c
@@ -60,11 +60,11 @@ enum cp2615_i2c_status {
CP2615_CFG_LOCKED = -6,
/* read_len or write_len out of range */
CP2615_INVALID_PARAM = -4,
- /* I2C slave did not ACK in time */
+ /* I2C target did not ACK in time */
CP2615_TIMEOUT,
/* I2C bus busy */
CP2615_BUS_BUSY,
- /* I2C bus error (ie. device NAK'd the request) */
+ /* I2C bus error (ie. target NAK'd the request) */
CP2615_BUS_ERROR,
CP2615_SUCCESS
};
@@ -211,7 +211,7 @@ static int cp2615_check_iop(struct usb_interface *usbif)
}
static int
-cp2615_i2c_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
+cp2615_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
{
struct usb_interface *usbif = adap->algo_data;
int i = 0, ret = 0;
@@ -250,8 +250,8 @@ cp2615_i2c_func(struct i2c_adapter *adap)
}
static const struct i2c_algorithm cp2615_i2c_algo = {
- .master_xfer = cp2615_i2c_master_xfer,
- .functionality = cp2615_i2c_func,
+ .xfer = cp2615_i2c_xfer,
+ .functionality = cp2615_i2c_func,
};
/*
--
2.43.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2024-07-10 6:41 ` [PATCH v3] " Wolfram Sang
@ 2024-07-10 17:51 ` Bence Csókás
0 siblings, 0 replies; 1651+ messages in thread
From: Bence Csókás @ 2024-07-10 17:51 UTC (permalink / raw)
To: Wolfram Sang, linux-i2c; +Cc: Bence Csókás, Andi Shyti, linux-kernel
Hi!
On 1970. 01. 01. 1:00, Wolfram Sang wrote:
> Change the wording of this driver wrt. the newest I2C v7 and SMBus 3.2
> specifications and replace "master/slave" with more appropriate terms.
>
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> ---
>
> Change since v2:
> * reworded comment about NAK for consistency as well (Thanks, Bence!)
>
> drivers/i2c/busses/i2c-cp2615.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
Acked-by: Bence Csókás <bence98@sch.bme.hu>
Bence
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-07-14 19:59 raschupkin.ri
2024-07-15 20:20 ` Joe Lawrence
2024-07-16 17:33 ` Re: Song Liu
0 siblings, 2 replies; 1651+ messages in thread
From: raschupkin.ri @ 2024-07-14 19:59 UTC (permalink / raw)
To: live-patching, joe.lawrence, pmladek, mbenes, jikos, jpoimboe
[PATCH] livepatch: support of modifying refcount_t without underflow after unpatch
CVE fixes sometimes add refcount_inc()/dec() pairs to code with an existing refcount_t.
Two problems arise when applying a live patch in this case:
1) If the refcount_t is inc()'d while the system is live-patched, the counter value will not be valid after unpatch, as the corresponding dec() would never be called.
2) Underflows are possible at runtime if dec() is called before the corresponding inc() in the live-patched code.
The proposed kprefcount_t functions use the following approach to solve these two problems:
1) In addition to the original refcount_t, a temporary refcount_t is allocated, and after unpatch it is simply removed. This way the system is safe with correct refcounting while the patch is applied, and no underflow would happen after unpatch.
2) For inc()/dec() calls added by live-patch code, one bit in the reference-holder structure is used (unsigned char *ref_holder, kprefholder_flag). If dec() is called first, it is simply ignored, as the ref_holder bit would not yet be set.
API is defined include/linux/livepatch_refcount.h:
typedef struct kprefcount_struct {
refcount_t *refcount;
refcount_t kprefcount;
spinlock_t lock;
} kprefcount_t;
kprefcount_t *kprefcount_alloc(refcount_t *refcount, gfp_t flags);
void kprefcount_free(kprefcount_t *kp_ref);
int kprefcount_read(kprefcount_t *kp_ref);
void kprefcount_inc(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
void kprefcount_dec(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
bool kprefcount_dec_and_test(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-07-14 19:59 raschupkin.ri
@ 2024-07-15 20:20 ` Joe Lawrence
2024-07-15 22:45 ` Re: Roman Rashchupkin
` (2 more replies)
2024-07-16 17:33 ` Re: Song Liu
1 sibling, 3 replies; 1651+ messages in thread
From: Joe Lawrence @ 2024-07-15 20:20 UTC (permalink / raw)
To: raschupkin.ri; +Cc: live-patching, pmladek, mbenes, jikos, jpoimboe
On Sun, Jul 14, 2024 at 09:59:32PM +0200, raschupkin.ri@gmail.com wrote:
>
> [PATCH] livepatch: support of modifying refcount_t without underflow after unpatch
>
> CVE fixes sometimes add refcount_inc/dec() pairs to the code with existing refcount_t.
> Two problems arise when applying live-patch in this case:
> 1) After refcount_t is being inc() during system is live-patched, after unpatch the counter value will not be valid, as corresponing dec() would never be called.
> 2) Underflows are possible in runtime in case dec() is called before corresponding inc() in the live-patched code.
>
> Proposed kprefcount_t functions are using following approach to solve these two problems:
> 1) In addition to original refcount_t, temporary refcount_t is allocated, and after unpatch it is just removed. This way system is safe with correct refcounting while patch is applied, and no underflow would happend after unpatch.
> 2) For inc/dec() added by live-patch code, one bit in reference-holder structure is used (unsigned char *ref_holder, kprefholder_flag). In case dec() is called first, it is just ignored as ref_holder bit would still not be initialized.
>
>
> API is defined include/linux/livepatch_refcount.h:
>
> typedef struct kprefcount_struct {
> refcount_t *refcount;
> refcount_t kprefcount;
> spinlock_t lock;
> } kprefcount_t;
>
> kprefcount_t *kprefcount_alloc(refcount_t *refcount, gfp_t flags);
> void kprefcount_free(kprefcount_t *kp_ref);
> int kprefcount_read(kprefcount_t *kp_ref);
> void kprefcount_inc(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> void kprefcount_dec(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> bool kprefcount_dec_and_test(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
>
Hi Roman,
Can you point to a specific upstream commit that this API facilitated a
livepatch conversion? That would make a good addition to the
Documentation/livepatch/ side of a potential v2.
But first, let me see if I understand the problem correctly. Let's say
points A and A' below represent the original kernel code reference
get/put pairing in task execution flow. A livepatch adds a new get/put
pair, B and B' in the middle like so:
--- execution flow --->
-- A B B' A' -->
There are potential issues if the livepatch is (de)activated
mid-sequence, between the new pairings:
problem 1:
-- A . B' A' --> B', but no B = extra put!
^ livepatch is activated here
problem 2:
-- A B . A' --> B, but no B' = extra get!
^ livepatch is deactivated here
The first thing that comes to mind is that this might be solved using
the existing shadow variable API. When the livepatch takes the new
reference (B), it could create a new <struct, NEW_REF> shadow variable
instance. The livepatch code to return the reference (B') would then
check on the shadow variable existence before doing so. This would
solve problem 1.
The second problem is a little trickier. Perhaps the shadow variable
approach still works as long as a pre-unpatch hook* were to iterate
through all the <*, NEW_REF> shadow variable instances and returned
their reference before freeing the shadow variable and declaring the
livepatch inactive. I believe that would align the reference counts
with original kernel code expectations.
* note this approach probably requires atomic-replace livepatches, so
only a single pre-unpatch hook is ever executed.
Also, the proposed patchset looks like it creates a parallel reference
counting structure... does this mean that the livepatch will need to
update *all* reference counting calls for the API to work (so points A,
B, B', and A' in my ascii-art above)? This question loops back to my
first point about a real-world example that can be added to
Documentation/livepatch/, much like the ones found in the
shadow-vars.rst file.
Thanks,
--
Joe
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-07-15 20:20 ` Joe Lawrence
@ 2024-07-15 22:45 ` Roman Rashchupkin
2024-07-16 9:28 ` Re: Nicolai Stange
[not found] ` <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>
2 siblings, 0 replies; 1651+ messages in thread
From: Roman Rashchupkin @ 2024-07-15 22:45 UTC (permalink / raw)
To: Joe Lawrence
Cc: live-patching, pmladek, mbenes, jikos, jpoimboe, quic_jjohnson
Hello.
Regarding upstream commits for which this API would facilitate a
livepatch conversion, I can reference several CVEs:
- CVE-2023-5633
"drm/vmwgfx: Keep a gem reference to user bos in surfaces"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=91398b413d03660fd5828f7b4abc64e884b98069
drm_gem_object_get(&vbo->tbo.base);/drm_gem_object_put(&tmp_buf->tbo.base);
- CVE-2023-6932
"ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e2b706c69190
refcount_inc_not_zero(&im->refcnt)/ip_ma_put(im);
- CVE-2022-20566
"Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put"
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0be8347c623e0ac4202a1d4e0373882821f56b0
kref_get_unless_zero(&c->kref)/l2cap_chan_put(chan)
In all three of these cases, though, refcount_t is mostly used inside
wrapper functions, and off the top of my head I don't remember CVEs
that plainly add refcount_inc()/dec().
If the proposed patch is merged, CVE-2023-5633 might be best suited
for documentation, or the git history could be searched for a better
example.
The two types of problems you classify are exactly what I'm attempting
to solve for the refcount_inc()/dec() calls in code added by a live
patch. Let's continue with your numbering, (1) and (2), for simplicity
of discussion.
Concerning problem (1), shadow variables could certainly be used
instead of my refholder bit in the reference-holder structures. Using
one bit in existing structures instead of hash-table-based shadow
variables is only my preference for the simplicity that live-patch
code so often lacks. But shadow variables are certainly also a good
approach, and could be used instead of my (unsigned char *ref_holder,
int kprefholder_flag) in the kprefcount_t API.
Regarding problem (2), iterating through all shadow-variable/refholder
instances would also work, but it is unnecessary processing during
unpatch.
In my approach, a second kprefcount variable whose lifetime matches
that of the applied live patch provides correct refcounting while the
patch is active, and the main idea is that this variable can simply be
removed safely at unpatch.
The only complication is the values of the refholder bits, which must
be reset when the live patch is applied; it is probably simpler to
implement this at unpatch, since all kprefcount_t structs are
allocated by patch code.
---
Roman Rashchupkin
On 7/15/24 22:20, Joe Lawrence wrote:
> On Sun, Jul 14, 2024 at 09:59:32PM +0200, raschupkin.ri@gmail.com wrote:
>> [PATCH] livepatch: support of modifying refcount_t without underflow after unpatch
>>
>> CVE fixes sometimes add refcount_inc/dec() pairs to the code with existing refcount_t.
>> Two problems arise when applying live-patch in this case:
>> 1) After refcount_t is being inc() during system is live-patched, after unpatch the counter value will not be valid, as corresponing dec() would never be called.
>> 2) Underflows are possible in runtime in case dec() is called before corresponding inc() in the live-patched code.
>>
>> Proposed kprefcount_t functions are using following approach to solve these two problems:
>> 1) In addition to original refcount_t, temporary refcount_t is allocated, and after unpatch it is just removed. This way system is safe with correct refcounting while patch is applied, and no underflow would happend after unpatch.
>> 2) For inc/dec() added by live-patch code, one bit in reference-holder structure is used (unsigned char *ref_holder, kprefholder_flag). In case dec() is called first, it is just ignored as ref_holder bit would still not be initialized.
>>
>>
>> API is defined include/linux/livepatch_refcount.h:
>>
>> typedef struct kprefcount_struct {
>> refcount_t *refcount;
>> refcount_t kprefcount;
>> spinlock_t lock;
>> } kprefcount_t;
>>
>> kprefcount_t *kprefcount_alloc(refcount_t *refcount, gfp_t flags);
>> void kprefcount_free(kprefcount_t *kp_ref);
>> int kprefcount_read(kprefcount_t *kp_ref);
>> void kprefcount_inc(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
>> void kprefcount_dec(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
>> bool kprefcount_dec_and_test(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
>>
> Hi Roman,
>
> Can you point to a specific upstream commit that this API facilitated a
> livepatch conversion? That would make a good addition to the
> Documentation/livepatch/ side of a potential v2.
>
> But first, let me see if I understand the problem correctly. Let's say
> points A and A' below represent the original kernel code reference
> get/put pairing in task execution flow. A livepatch adds a new get/put
> pair, B and B' in the middle like so:
>
> --- execution flow --->
> -- A B B' A' -->
>
> There are potential issues if the livepatch is (de)activated
> mid-sequence, between the new pairings:
>
> problem 1:
> -- A . B' A' --> 'B, but no B = extra put!
> ^ livepatch is activated here
>
> problem 2:
> -- A B . A' --> B, but no B' = extra get!
> ^ livepatch is deactivated here
>
>
> The first thing that comes to mind is that this might be solved using
> the existing shadow variable API. When the livepatch takes the new
> reference (B), it could create a new <struct, NEW_REF> shadow variable
> instance. The livepatch code to return the reference (B') would then
> check on the shadow variable existence before doing so. This would
> solve problem 1.
>
> The second problem is a little trickier. Perhaps the shadow variable
> approach still works as long as a pre-unpatch hook* were to iterate
> through all the <*, NEW_REF> shadow variable instances and returned
> their reference before freeing the shadow variable and declaring the
> livepatch inactive. I believe that would align the reference counts
> with original kernel code expectations.
>
> * note this approach probably requires atomic-replace livepatches, so
> only a single pre-unpatch hook is ever executed.
>
>
> Also, the proposed patchset looks like it creates a parallel reference
> counting structure... does this mean that the livepatch will need to
> update *all* reference counting calls for the API to work (so points A,
> B, B', and A' in my ascii-art above)? This question loops back to my
> first point about a real-world example that can be added to
> Documentation/livepatch/, much like the ones found in the
> shadow-vars.rst file.
>
> Thanks,
>
> --
> Joe
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-07-15 20:20 ` Joe Lawrence
2024-07-15 22:45 ` Re: Roman Rashchupkin
@ 2024-07-16 9:28 ` Nicolai Stange
[not found] ` <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>
2 siblings, 0 replies; 1651+ messages in thread
From: Nicolai Stange @ 2024-07-16 9:28 UTC (permalink / raw)
To: Joe Lawrence
Cc: raschupkin.ri, live-patching, pmladek, mbenes, jikos, jpoimboe
Hi all,
Joe Lawrence <joe.lawrence@redhat.com> writes:
> On Sun, Jul 14, 2024 at 09:59:32PM +0200, raschupkin.ri@gmail.com wrote:
>>
> But first, let me see if I understand the problem correctly. Let's say
> points A and A' below represent the original kernel code reference
> get/put pairing in task execution flow. A livepatch adds a new get/put
> pair, B and B' in the middle like so:
>
> --- execution flow --->
> -- A B B' A' -->
>
> There are potential issues if the livepatch is (de)activated
> mid-sequence, between the new pairings:
>
> problem 1:
> -- A . B' A' --> 'B, but no B = extra put!
> ^ livepatch is activated here
>
> problem 2:
> -- A B . A' --> B, but no B' = extra get!
> ^ livepatch is deactivated here
I can confirm that this scenario happens quite often with real world CVE
fixes and there's currently no way to implement such changes safely from
a livepatch. But I also believe this is an instance of a broader problem
class we attempted to solve with that "enhanced" states API proposed and
discussed at LPC ([1], there's a link to a recording at the bottom). For
reference, see Petr's POC from [2].
> The first thing that comes to mind is that this might be solved using
> the existing shadow variable API.
Same.
> When the livepatch takes the new
> reference (B), it could create a new <struct, NEW_REF> shadow variable
> instance. The livepatch code to return the reference (B') would then
> check on the shadow variable existence before doing so. This would
> solve problem 1.
>
> The second problem is a little trickier. Perhaps the shadow variable
> approach still works as long as a pre-unpatch hook* were to iterate
> through all the <*, NEW_REF> shadow variable instances and returned
> their reference before freeing the shadow variable and declaring the
> livepatch inactive.
I think the problem of consistently maintaining shadowed reference
counts (or anything shadowed for that matter) could be solved with the
help of aforementioned states API enhancements, so I would propose to
revive Petr's IMO more generic patchset as an alternative.
Thoughts?
Thanks,
Nicolai
[1] https://lpc.events/event/17/contributions/1541/
[2] https://lore.kernel.org/r/20231110170428.6664-1-pmladek@suse.com
--
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nürnberg, Germany
GF: Ivo Totev, Andrew McDonald, Werner Knoblich
(HRB 36809, AG Nürnberg)
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>]
* Re:
[not found] ` <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2024-07-16 9:53 ` Roman Rashchupkin
2024-07-25 14:52 ` Re: Joe Lawrence
0 siblings, 1 reply; 1651+ messages in thread
From: Roman Rashchupkin @ 2024-07-16 9:53 UTC (permalink / raw)
To: Nicolai Stange, Joe Lawrence
Cc: live-patching, pmladek, mbenes, jikos, jpoimboe
>> The first thing that comes to mind is that this might be solved using
>> the existing shadow variable API.
> Same.
I just don't have much experience using live-patch shadow variables,
so I agree that is probably a better general solution for problem (1),
refcount underflow, than my refholder flags.
> I can confirm that this scenario happens quite often with real world CVE
> fixes and there's currently no way to implement such changes safely from
> a livepatch. But I also believe this is an instance of a broader problem
> class we attempted to solve with that "enhanced" states API proposed and
> discussed at LPC ([1], there's a link to a recording at the bottom). For
> reference, see Petr's POC from [2].
Regarding (2), the incorrect refcount_t value left after unpatch, it
seems like a somewhat different and more specialized counter problem
compared to general live-patch state support, and one that can be
solved for refcount_t in a simpler way.
IMHO, adding a temporary kprefcount_t variable for the time the live
patch is applied, and having all inc()/dec() calls added by the
livepatch modify only this variable, provides correct refcounting
while the live patch is applied. Then at unpatch this temporary
variable can simply be safely discarded. This way the only additional
code in the live patch would be the functions containing the original
refcount_dec_and_test() calls.
---
Roman Rashchupkin
On 7/16/24 11:28, Nicolai Stange wrote:
> Hi all,
>
> Joe Lawrence <joe.lawrence@redhat.com> writes:
>
>> On Sun, Jul 14, 2024 at 09:59:32PM +0200, raschupkin.ri@gmail.com wrote:
>> But first, let me see if I understand the problem correctly. Let's say
>> points A and A' below represent the original kernel code reference
>> get/put pairing in task execution flow. A livepatch adds a new get/put
>> pair, B and B' in the middle like so:
>>
>> --- execution flow --->
>> -- A B B' A' -->
>>
>> There are potential issues if the livepatch is (de)activated
>> mid-sequence, between the new pairings:
>>
>> problem 1:
>> -- A . B' A' --> 'B, but no B = extra put!
>> ^ livepatch is activated here
>>
>> problem 2:
>> -- A B . A' --> B, but no B' = extra get!
>> ^ livepatch is deactivated here
> I can confirm that this scenario happens quite often with real world CVE
> fixes and there's currently no way to implement such changes safely from
> a livepatch. But I also believe this is an instance of a broader problem
> class we attempted to solve with that "enhanced" states API proposed and
> discussed at LPC ([1], there's a link to a recording at the bottom). For
> reference, see Petr's POC from [2].
>
>
>> The first thing that comes to mind is that this might be solved using
>> the existing shadow variable API.
> Same.
>
>
>> When the livepatch takes the new
>> reference (B), it could create a new <struct, NEW_REF> shadow variable
>> instance. The livepatch code to return the reference (B') would then
>> check on the shadow variable existence before doing so. This would
>> solve problem 1.
>>
>> The second problem is a little trickier. Perhaps the shadow variable
>> approach still works as long as a pre-unpatch hook* were to iterate
>> through all the <*, NEW_REF> shadow variable instances and returned
>> their reference before freeing the shadow variable and declaring the
>> livepatch inactive.
> I think the problem of consistently maintaining shadowed reference
> counts (or anything shadowed for that matter) could be solved with the
> help of aforementioned states API enhancements, so I would propose to
> revive Petr's IMO more generic patchset as an alternative.
>
> Thoughts?
>
> Thanks,
>
> Nicolai
>
> [1] https://lpc.events/event/17/contributions/1541/
> [2] https://lore.kernel.org/r/20231110170428.6664-1-pmladek@suse.com
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-07-16 9:53 ` Re: Roman Rashchupkin
@ 2024-07-25 14:52 ` Joe Lawrence
0 siblings, 0 replies; 1651+ messages in thread
From: Joe Lawrence @ 2024-07-25 14:52 UTC (permalink / raw)
To: Roman Rashchupkin, Nicolai Stange
Cc: live-patching, pmladek, mbenes, jikos, jpoimboe
On 7/16/24 05:53, Roman Rashchupkin wrote:
>>> The first thing that comes to mind is that this might be solved using
>>> the existing shadow variable API.
>
>> Same.
>
> I just don't have enough experience using live-patch shadow variables,
> so I agree that is probably a better general solution for problem
> (1), the refcount underflow, than my refholder flags.
>
Yes, a general solution could cover the same problem but for different
data types, including locks, mutexes, etc.
>> I can confirm that this scenario happens quite often with real world CVE
>> fixes and there's currently no way to implement such changes safely from
>> a livepatch. But I also believe this is an instance of a broader problem
>> class we attempted to solve with that "enhanced" states API proposed and
>> discussed at LPC ([1], there's a link to a recording at the bottom). For
>> reference, see Petr's POC from [2].
Thanks for the link -- I thought of that grand-unified
shadow/callback/states patch but couldn't find the latest version. (I
see that Miroslav has just resurrected it with a fresh review, too.)
>> I think the problem of consistently maintaining shadowed reference
>> counts (or anything shadowed for that matter) could be solved with the
>> help of aforementioned states API enhancements, so I would propose to
>> revive Petr's IMO more generic patchset as an alternative.
>>
>> Thoughts?
>>
I definitely think the states API enhancement could be used to handle
the cases here via shadow variables.
In the meantime, are you using the kprefcount_t API currently via a
livepatch support module? i.e. we don't need this in the kernel asap to
solve these problems, right?
--
Joe
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-07-14 19:59 raschupkin.ri
2024-07-15 20:20 ` Joe Lawrence
@ 2024-07-16 17:33 ` Song Liu
1 sibling, 0 replies; 1651+ messages in thread
From: Song Liu @ 2024-07-16 17:33 UTC (permalink / raw)
To: raschupkin.ri
Cc: live-patching, joe.lawrence, pmladek, mbenes, jikos, jpoimboe
On Mon, Jul 15, 2024 at 4:00 AM <raschupkin.ri@gmail.com> wrote:
>
>
> [PATCH] livepatch: support of modifying refcount_t without underflow after unpatch
>
> CVE fixes sometimes add refcount_inc/dec() pairs to code with an existing refcount_t.
> Two problems arise when applying a livepatch in this case:
> 1) If refcount_inc() is called while the system is live-patched, the counter value will no longer be valid after unpatch, as the corresponding dec() would never be called.
> 2) Underflows are possible at runtime if dec() is called before the corresponding inc() in the live-patched code.
>
> The proposed kprefcount_t functions use the following approach to solve these two problems:
> 1) In addition to the original refcount_t, a temporary refcount_t is allocated, and after unpatch it is simply removed. This way the system is safe with correct refcounting while the patch is applied, and no underflow can happen after unpatch.
> 2) For inc/dec() calls added by live-patch code, one bit in a reference-holder structure is used (unsigned char *ref_holder, kprefholder_flag). If dec() is called first, it is simply ignored, as the ref_holder bit would not yet be set.
>
>
> The API is defined in include/linux/livepatch_refcount.h:
>
> typedef struct kprefcount_struct {
> refcount_t *refcount;
> refcount_t kprefcount;
> spinlock_t lock;
> } kprefcount_t;
>
> kprefcount_t *kprefcount_alloc(refcount_t *refcount, gfp_t flags);
> void kprefcount_free(kprefcount_t *kp_ref);
> int kprefcount_read(kprefcount_t *kp_ref);
> void kprefcount_inc(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> void kprefcount_dec(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
> bool kprefcount_dec_and_test(kprefcount_t *kp_ref, unsigned char *ref_holder, int kprefholder_flag);
IIUC, kprefcount alone is not enough to solve the two issues. We still
need some mechanism to manage the "ref_holder". Shadow variable
is probably the best option here.
The primary idea here is to enhance the refcount with a map. This
may be too expensive in terms of memory consumption in some use
cases.
Overall, I don't think this change adds much more value on top of
shadow variable.
Thanks,
Song
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-07-15 21:06 Phil Dennis-Jordan
2024-07-16 6:07 ` Akihiko Odaki
0 siblings, 1 reply; 1651+ messages in thread
From: Phil Dennis-Jordan @ 2024-07-15 21:06 UTC (permalink / raw)
To: qemu-devel, pbonzini, agraf, graf, marcandre.lureau, berrange,
thuth, philmd, peter.maydell, akihiko.odaki, phil, lists
Date: Mon, 15 Jul 2024 21:07:12 +0200
Subject: [PATCH 00/26] hw/display/apple-gfx: New macOS PV Graphics device
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This sequence of patches integrates the paravirtualised graphics device
implemented by macOS's ParavirtualizedGraphics.Framework into Qemu.
Combined with the guest drivers which ship with macOS versions 11 and up,
this allows the guest OS to use the host's GPU for hardware accelerated
3D graphics, GPGPU compute (both using the 'Metal' graphics API), and
window compositing.
Some background:
----------------
The device exposed by the ParavirtualizedGraphics.Framework's (henceforth
PVG) public API consists of a PCI device with a single memory-mapped BAR;
the VMM is expected to pass reads and writes through to the framework, and
to forward interrupts emanating from it to the guest VM.
The bulk of data exchange between host and guest occurs via shared memory,
however. For this purpose, PVG makes callbacks to VMM code for allocating,
mapping, unmapping, and deallocating "task" memory ranges. Each task
represents a contiguous host virtual address range, and PVG expects the
VMM to map specific guest system memory ranges to these host addresses via
subsequent map callbacks. Multiple tasks can exist at a time, each with
many mappings.
Data is exchanged via an undocumented, Apple-proprietary protocol. The
PVG API only acts as a facilitator for establishing the communication
mechanism. This is perhaps not ideal, and among other things means it
only works on macOS hosts, but it's the only serious option we've got for
good performance and quality graphics with macOS guests at this time.
The first iterations of this PVG integration into Qemu were developed
by Alexander Graf as part of his "vmapple" machine patch series for
supporting aarch64 macOS guests, and posted to qemu-devel in June and
August 2023:
https://lore.kernel.org/all/20230830161425.91946-1-graf@amazon.com/T/
This integration mimics the "vmapple"/"apple-gfx" variant of the PVG device
used by Apple's own VMM, Virtualization.framework. This variant does not use
PCI but acts as a direct MMIO system device; there are two MMIO ranges, one
behaving identically to the PCI BAR, while the other's functionality is
exposed by private APIs in the PVG framework. It is only available on aarch64
macOS hosts.
Prior to this, I had simultaneously and independently developed my own
PVG integration for Qemu using the public PCI device APIs, targeting
x86-64 macOS hosts and guests. After some months of
use in production, I was slowly reviewing the code and readying it for
upstreaming around the time Alexander posted his vmapple patches.
I ended up reviewing the vmapple PVG code in detail; I identified a number
of issues with it (mainly thanks to my prior trial-and-error working with
the framework) but overall I thought it a better basis for refinement
than my own version:
- It implemented the vmapple variant of the device. I thought it better to
port the part I understood well (PCI variant) to this than trying to port
the part I didn't understand well (MMIO vmapple variant) to my own code.
- The code was already tidier than my own.
It also became clear in out-of-band communication that Alexander would
probably not end up having the time to see the patch through to inclusion,
and was happy for me to start making changes and to integrate my PCI code.
It's taken a while, but I'm happy with the result and think it will be a
welcome addition for anyone running macOS VMs.
What doesn't work:
------------------
* State (de-)serialisation and thus migration. There is no fundamental
technical obstacle to this. PVG supports saving and loading device state.
I have simply not had the resources to implement (and crucially, test)
it yet.
* Setting the list of display modes via a property is currently only
implemented on the PCI version, which is the only one readily testable
without the out-of-tree vmapple patches. (See review notes for patch 24)
* End-to-end GPU-only rendering. After the host GPU has rendered the guest's
screen, the framebuffer is copied into a system memory buffer (surface).
When using the Qemu Cocoa UI, this buffer is drawn by the CPU into a GPU
texture used for hardware window compositing. It would be vastly more
efficient to retain the Metal texture and pass it directly through to the
Cocoa window. We currently have no mechanism for doing so; it would need
to be similar to the end-to-end OpenGL rendering support, with the added
complication that Metal textures are Objective-C types and would need to
traverse the plain C code of the Qemu display subsystem.
* Dirty region detection. Similarly, the whole framebuffer is marked modified
even if there has only been a small change. This hurts network data volume
when using VNC.
* Multi-head support. PVG allows "connecting" more than one virtual display.
This integration currently always uses exactly 1 display.
* The vmapple/aarch64 variant of the device is only testable with Alexander's
vmapple machine type patch set. I've been maintaining this out-of-tree and
have made a few improvements, but it doesn't yet run smoothly. (Graphics
work fine with this code, issues are with other devices.) I can push my
current draft to a git forge if anyone wants to test with them. I'm
definitely hoping to eventually resolve the remaining problems and submit
a revised version of that patch set as well.
I think we can live without these for the moment, and I'd prefer to work on
them only if and when the baseline functionality has been merged.
Patch review notes:
-------------------
I have aimed to submit the patches roughly in order of descending importance
and increasing debatability. From patch 18 it becomes usable and useful in
practice without further modification. Patches 19-23 fix issues that only occur
in certain host configurations or that can otherwise be worked around. Patch
24 is more of an RFC; patch 26 is needed if recently submitted Cocoa UI patches
end up being merged.
Brief meta-discussion of specific patches and groups of patches:
01: I have retained Alexander Graf's original patch intact as far as
possible as patch 01. My only modifications are those required for
fixing rebase conflicts.
02: Resolves build errors caused by upstream API changes since original
patch submission.
03: This moves the device code to hw/display from its original hw/vmapple.
With the PCI variant added later, it doesn't make much sense to put
this inside a machine type directory.
04-13: These patches address issues identified during code review in the
original e-mail threads as well as my own review.
14-15: These patches split the monolithic source file into a common core and
vmapple/mmio specific parts.
NOTE: checkpatch.pl complains about #ifdef __OBJC__ in the header file.
This is needed for allowing the file to be #included from pure C.
Ensuring that isn't currently critical, but I expect it to be useful
in future.
16-17: Preparation for x86-64. The x86-64 variant of PVG seems to behave
slightly differently than the aarch64 version in terms of threading
and control flow edge cases. We need to do some things asynchronously
to avoid deadlock and performance problems.
18: The PCI variant. This builds on the split from 14/15 and the async
changes to create the PVG PCI device which works with x86-64 macOS
guests.
19-23: Fixes for various additional problems and limitations. Without these
the PVG device will only work with the Cocoa UI, on Macs which
have an integrated (shared memory) GPU only, and mouse cursors will
be slightly off, for example.
24: QOM property for specifying the display mode list (resolutions) the
device will report to the guest. I checked other display devices and
found none supported this, though I personally find it very useful.
I'm wondering whether this should be a more generic feature optionally
usable by any display device in Qemu?
25: Adding myself as maintainer for the PVG code, and reviewer for HVF.
26: Required if/when Akihiko Odaki's pending patch for removing
dpy_cursor_define_supported is merged.
I don't know if it's useful/wise to keep these all as separate patches.
I'm happy to squash any number or groups of them. Personally, I find
smaller patches easier to review, and the git blame history acts as
a sort of documentation, so I've kept them for now.
Alexander Graf (1):
hw/vmapple/apple-gfx: Introduce ParavirtualizedGraphics.Framework
support
Phil Dennis-Jordan (25):
hw/vmapple/apple-gfx: BQL renaming update
hw/display/apple-gfx: Moved from hw/vmapple/
hw/display/apple-gfx: uses DEFINE_TYPES macro
hw/display/apple-gfx: native -> little endian memory ops
hw/display/apple-gfx: Removes dead/superfluous code
hw/display/apple-gfx: Makes set_mode thread & memory safe
hw/display/apple-gfx: Adds migration blocker
hw/display/apple-gfx: Wraps ObjC autorelease code in pool
hw/display/apple-gfx: Fixes ObjC new/init misuse, plugs leaks
hw/display/apple-gfx: Uses ObjC category extension for private
property
hw/display/apple-gfx: Task memory mapping cleanup
hw/display/apple-gfx: Defines PGTask_s struct instead of casting
hw/display/apple-gfx: Refactoring of realize function
hw/display/apple-gfx: Separates generic & vmapple-specific
functionality
hw/display/apple-gfx: Asynchronous MMIO writes on x86-64
hw/display/apple-gfx: Asynchronous rendering and graphics update
hw/display/apple-gfx: Adds PCI implementation
ui/cocoa: Adds non-app runloop on main thread mode
hw/display/apple-gfx: Fixes cursor hotspot handling
hw/display/apple-gfx: Implements texture syncing for non-UMA GPUs
hw/display/apple-gfx: Replaces magic number with queried MMIO length
hw/display/apple-gfx: Host GPU picking improvements
hw/display/apple-gfx: Adds configurable mode list
MAINTAINERS: Add myself as maintainer for apple-gfx, reviewer for HVF
hw/display/apple-gfx: Removes UI pointer support check
MAINTAINERS | 7 +
hw/display/Kconfig | 13 +
hw/display/apple-gfx-pci.m | 166 +++++++++
hw/display/apple-gfx-vmapple.m | 194 ++++++++++
hw/display/apple-gfx.h | 72 ++++
hw/display/apple-gfx.m | 659 +++++++++++++++++++++++++++++++++
hw/display/meson.build | 3 +
hw/display/trace-events | 26 ++
include/qemu-main.h | 2 +
meson.build | 4 +
ui/cocoa.m | 15 +-
11 files changed, 1159 insertions(+), 2 deletions(-)
create mode 100644 hw/display/apple-gfx-pci.m
create mode 100644 hw/display/apple-gfx-vmapple.m
create mode 100644 hw/display/apple-gfx.h
create mode 100644 hw/display/apple-gfx.m
--
2.39.3 (Apple Git-146)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-07-15 21:06 Phil Dennis-Jordan
@ 2024-07-16 6:07 ` Akihiko Odaki
2024-07-17 11:16 ` Re: Phil Dennis-Jordan
0 siblings, 1 reply; 1651+ messages in thread
From: Akihiko Odaki @ 2024-07-16 6:07 UTC (permalink / raw)
To: Phil Dennis-Jordan, qemu-devel, pbonzini, agraf, graf,
marcandre.lureau, berrange, thuth, philmd, peter.maydell, lists
On 2024/07/16 6:06, Phil Dennis-Jordan wrote:
> Date: Mon, 15 Jul 2024 21:07:12 +0200
> Subject: [PATCH 00/26] hw/display/apple-gfx: New macOS PV Graphics device
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> This sequence of patches integrates the paravirtualised graphics device
> implemented by macOS's ParavirtualizedGraphics.Framework into Qemu.
> Combined with the guest drivers which ship with macOS versions 11 and up,
> this allows the guest OS to use the host's GPU for hardware accelerated
> 3D graphics, GPGPU compute (both using the 'Metal' graphics API), and
> window compositing.
>
> Some background:
> ----------------
>
> The device exposed by the ParavirtualizedGraphics.Framework's (henceforth
> PVG) public API consists of a PCI device with a single memory-mapped BAR;
> the VMM is expected to pass reads and writes through to the framework, and
> to forward interrupts emanating from it to the guest VM.
>
> The bulk of data exchange between host and guest occurs via shared memory,
> however. For this purpose, PVG makes callbacks to VMM code for allocating,
> mapping, unmapping, and deallocating "task" memory ranges. Each task
> represents a contiguous host virtual address range, and PVG expects the
> VMM to map specific guest system memory ranges to these host addresses via
> subsequent map callbacks. Multiple tasks can exist at a time, each with
> many mappings.
>
> Data is exchanged via an undocumented, Apple-proprietary protocol. The
> PVG API only acts as a facilitator for establishing the communication
> mechanism. This is perhaps not ideal, and among other things means it
> only works on macOS hosts, but it's the only serious option we've got for
> good performance and quality graphics with macOS guests at this time.
>
> The first iterations of this PVG integration into Qemu were developed
> by Alexander Graf as part of his "vmapple" machine patch series for
> supporting aarch64 macOS guests, and posted to qemu-devel in June and
> August 2023:
>
> https://lore.kernel.org/all/20230830161425.91946-1-graf@amazon.com/T/
>
> This integration mimics the "vmapple"/"apple-gfx" variant of the PVG device
> used by Apple's own VMM, Virtualization.framework. This variant does not use
> PCI but acts as a direct MMIO system device; there are two MMIO ranges, one
> behaving identically to the PCI BAR, while the other's functionality is
> exposed by private APIs in the PVG framework. It is only available on aarch64
> macOS hosts.
>
> Prior to this, I had simultaneously and independently developed my own
> PVG integration for Qemu using the public PCI device APIs, targeting
> x86-64 macOS hosts and guests. After some months of
> use in production, I was slowly reviewing the code and readying it for
> upstreaming around the time Alexander posted his vmapple patches.
>
> I ended up reviewing the vmapple PVG code in detail; I identified a number
> of issues with it (mainly thanks to my prior trial-and-error working with
> the framework) but overall I thought it a better basis for refinement
> than my own version:
>
> - It implemented the vmapple variant of the device. I thought it better to
> port the part I understood well (PCI variant) to this than trying to port
> the part I didn't understand well (MMIO vmapple variant) to my own code.
> - The code was already tidier than my own.
>
> It also became clear in out-of-band communication that Alexander would
> probably not end up having the time to see the patch through to inclusion,
> and was happy for me to start making changes and to integrate my PCI code.
Hi,
Thanks for continuing his effort.
Please submit a patch series that includes his patches. Please also
merge fixes for his patches into them. This saves the effort to review
the obsolete code and keeps git bisect working.
Regards,
Akihiko Odaki
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-07-16 6:07 ` Akihiko Odaki
@ 2024-07-17 11:16 ` Phil Dennis-Jordan
0 siblings, 0 replies; 1651+ messages in thread
From: Phil Dennis-Jordan @ 2024-07-17 11:16 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, pbonzini, agraf, graf, marcandre.lureau, berrange,
thuth, philmd, peter.maydell, lists
[-- Attachment #1: Type: text/plain, Size: 717 bytes --]
On Tue, 16 Jul 2024 at 08:07, Akihiko Odaki <akihiko.odaki@daynix.com>
wrote:
> Hi,
>
> Thanks for continuing his effort.
>
> Please submit a patch series that includes his patches. Please also
> merge fixes for his patches into them. This saves the effort to review
> the obsolete code and keeps git bisect working.
>
>
Sorry about that - it looks like (a) my edits to the cover letter messed
something up and (b) patch 1 got email-filtered somewhere along the way for
having the "wrong" From: address. I've submitted v2 with most patches
squashed into patch 1, whose authorship I've also changed to myself (with
Co-authored-by tag for the original code) so hopefully this time around it
shows up OK.
Thanks,
Phil
[-- Attachment #2: Type: text/html, Size: 1122 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-08-14 8:03 howard_wang
2024-08-14 15:04 ` Stephen Hemminger
0 siblings, 1 reply; 1651+ messages in thread
From: howard_wang @ 2024-08-14 8:03 UTC (permalink / raw)
To: dev
[-- Attachment #1: 0001-net-r8169-add-PMD-driver-skeleton.patch --]
[-- Type: application/octet-stream, Size: 8056 bytes --]
From 7e61211f99445dab158fc955b363b5c622a6f0ef Mon Sep 17 00:00:00 2001
From: Howard Wang <howard_wang@realsil.com.cn>
Date: Mon, 5 Aug 2024 15:52:04 +0800
Subject: [PATCH] net/r8169: add PMD driver skeleton
Meson build infrastructure, r8169_ethdev minimal skeleton,
header with Realtek NIC device and vendor IDs.
Signed-off-by: Howard Wang <howard_wang@realsil.com.cn>
---
MAINTAINERS | 7 ++
drivers/net/meson.build | 1 +
drivers/net/r8169/meson.build | 6 ++
drivers/net/r8169/r8169_base.h | 15 +++
drivers/net/r8169/r8169_ethdev.c | 180 +++++++++++++++++++++++++++++++
drivers/net/r8169/r8169_ethdev.h | 40 +++++++
6 files changed, 249 insertions(+)
create mode 100644 drivers/net/r8169/meson.build
create mode 100644 drivers/net/r8169/r8169_base.h
create mode 100644 drivers/net/r8169/r8169_ethdev.c
create mode 100644 drivers/net/r8169/r8169_ethdev.h
diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..5f9eccc43f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1076,6 +1076,13 @@ F: drivers/net/memif/
F: doc/guides/nics/memif.rst
F: doc/guides/nics/features/memif.ini
+Realtek r8169
+M: Howard Wang <howard_wang@realsil.com.cn>
+M: ChunHao Lin <hau@realtek.com>
+M: Xing Wang <xing_wang@realsil.com.cn>
+M: Realtek NIC SW <pro_nic_dpdk@realtek.com>
+F: drivers/net/r8169
+
Crypto Drivers
--------------
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index fb6d34b782..fddcf39655 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -53,6 +53,7 @@ drivers = [
'pfe',
'qede',
'ring',
+ 'r8169',
'sfc',
'softnic',
'tap',
diff --git a/drivers/net/r8169/meson.build b/drivers/net/r8169/meson.build
new file mode 100644
index 0000000000..5fcd871787
--- /dev/null
+++ b/drivers/net/r8169/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Realtek Corporation. All rights reserved
+
+sources = files(
+ 'r8169_ethdev.c',
+)
\ No newline at end of file
diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
new file mode 100644
index 0000000000..5d219a7966
--- /dev/null
+++ b/drivers/net/r8169/r8169_base.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Realtek Corporation. All rights reserved
+ */
+
+#ifndef _R8169_BASE_H_
+#define _R8169_BASE_H_
+
+typedef uint8_t u8;
+typedef uint16_t u16;
+typedef uint32_t u32;
+typedef uint64_t u64;
+
+#define PCI_VENDOR_ID_REALTEK 0x10EC
+
+#endif
\ No newline at end of file
diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
new file mode 100644
index 0000000000..2288361d0c
--- /dev/null
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -0,0 +1,180 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Realtek Corporation. All rights reserved
+ */
+
+#include <sys/queue.h>
+#include <stdio.h>
+#include <errno.h>
+#include <stdint.h>
+#include <stdarg.h>
+
+#include <rte_eal.h>
+
+#include <rte_string_fns.h>
+#include <rte_common.h>
+#include <rte_interrupts.h>
+#include <rte_byteorder.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_pci.h>
+#include <bus_pci_driver.h>
+#include <rte_ether.h>
+#include <ethdev_driver.h>
+#include <ethdev_pci.h>
+#include <rte_memory.h>
+#include <rte_malloc.h>
+#include <dev_driver.h>
+
+#include "r8169_ethdev.h"
+#include "r8169_base.h"
+
+static int rtl_dev_configure(struct rte_eth_dev *dev __rte_unused);
+static int rtl_dev_start(struct rte_eth_dev *dev);
+static int rtl_dev_stop(struct rte_eth_dev *dev);
+static int rtl_dev_reset(struct rte_eth_dev *dev);
+static int rtl_dev_close(struct rte_eth_dev *dev);
+
+/*
+ * The set of PCI devices this driver supports
+ */
+static const struct rte_pci_id pci_id_r8169_map[] = {
+ { RTE_PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8125) },
+ { RTE_PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8162) },
+ { RTE_PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8126) },
+ { RTE_PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x5000) },
+ {.vendor_id = 0, /* sentinel */ },
+};
+
+static const struct eth_dev_ops rtl_eth_dev_ops = {
+ .dev_configure = rtl_dev_configure,
+ .dev_start = rtl_dev_start,
+ .dev_stop = rtl_dev_stop,
+ .dev_close = rtl_dev_close,
+ .dev_reset = rtl_dev_reset,
+};
+
+static int
+rtl_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+ return 0;
+}
+
+/*
+ * Configure device link speed and setup link.
+ * It returns 0 on success.
+ */
+static int
+rtl_dev_start(struct rte_eth_dev *dev)
+{
+ struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+ struct rtl_hw *hw = &adapter->hw;
+
+ hw->adapter_stopped = 0;
+
+ return 0;
+}
+
+/*
+ * Stop device: disable RX and TX functions to allow for reconfiguring.
+ */
+static int
+rtl_dev_stop(struct rte_eth_dev *dev)
+{
+ struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+ struct rtl_hw *hw = &adapter->hw;
+
+ if (hw->adapter_stopped)
+ return 0;
+
+ hw->adapter_stopped = 1;
+ dev->data->dev_started = 0;
+
+ return 0;
+}
+
+/*
+ * Reset and stop device.
+ */
+static int
+rtl_dev_close(struct rte_eth_dev *dev)
+{
+ int ret_stp;
+
+ if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+ return 0;
+
+ ret_stp = rtl_dev_stop(dev);
+
+ return ret_stp;
+}
+
+static int
+rtl_dev_init(struct rte_eth_dev *dev)
+{
+ struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+ struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
+ struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+ struct rtl_hw *hw = &adapter->hw;
+
+ dev->dev_ops = &rtl_eth_dev_ops;
+ //dev->tx_pkt_burst = &rtl_xmit_pkts;
+ //dev->rx_pkt_burst = &rtl_recv_pkts;
+
+ /* For secondary processes, the primary process has done all the work */
+ if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+ return 0;
+
+ rte_eth_copy_pci_info(dev, pci_dev);
+
+ return 0;
+}
+
+static int
+rtl_dev_uninit(struct rte_eth_dev *dev)
+{
+ if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+ return -EPERM;
+
+ rtl_dev_close(dev);
+
+ return 0;
+}
+
+static int
+rtl_dev_reset(struct rte_eth_dev *dev)
+{
+ int ret;
+
+ ret = rtl_dev_uninit(dev);
+ if (ret)
+ return ret;
+
+ ret = rtl_dev_init(dev);
+
+ return ret;
+}
+
+static int
+rtl_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+ struct rte_pci_device *pci_dev)
+{
+ return rte_eth_dev_pci_generic_probe(pci_dev, sizeof(struct rtl_adapter),
+ rtl_dev_init);
+}
+
+static int
+rtl_pci_remove(struct rte_pci_device *pci_dev)
+{
+ return rte_eth_dev_pci_generic_remove(pci_dev, rtl_dev_uninit);
+}
+
+static struct rte_pci_driver rte_r8169_pmd = {
+ .id_table = pci_id_r8169_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+ .probe = rtl_pci_probe,
+ .remove = rtl_pci_remove,
+};
+
+RTE_PMD_REGISTER_PCI(net_r8169, rte_r8169_pmd);
+RTE_PMD_REGISTER_PCI_TABLE(net_r8169, pci_id_r8169_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_r8169, "* igb_uio | uio_pci_generic | vfio-pci");
diff --git a/drivers/net/r8169/r8169_ethdev.h b/drivers/net/r8169/r8169_ethdev.h
new file mode 100644
index 0000000000..ab0ad28e28
--- /dev/null
+++ b/drivers/net/r8169/r8169_ethdev.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Realtek Corporation. All rights reserved
+ */
+
+#ifndef _R8169_ETHDEV_H_
+#define _R8169_ETHDEV_H_
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#include <rte_ethdev.h>
+#include <rte_ethdev_core.h>
+
+#include "r8169_base.h"
+
+struct rtl_hw {
+ u8 adapter_stopped;
+};
+
+struct rtl_sw_stats {
+ u64 tx_packets;
+ u64 tx_bytes;
+ u64 tx_errors;
+ u64 rx_packets;
+ u64 rx_bytes;
+ u64 rx_errors;
+};
+
+struct rtl_adapter {
+ struct rtl_hw hw;
+ struct rtl_sw_stats sw_stats;
+};
+
+#define RTL_DEV_PRIVATE(eth_dev) \
+ ((struct rtl_adapter *)((eth_dev)->data->dev_private))
+
+uint16_t rtl_xmit_pkts(void *txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
+uint16_t rtl_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
+
+#endif
\ No newline at end of file
--
2.34.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2024-08-14 8:03 howard_wang
@ 2024-08-14 15:04 ` Stephen Hemminger
0 siblings, 0 replies; 1651+ messages in thread
From: Stephen Hemminger @ 2024-08-14 15:04 UTC (permalink / raw)
To: howard_wang; +Cc: dev
On Wed, 14 Aug 2024 16:03:41 +0800
<howard_wang@realsil.com.cn> wrote:
> diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
> new file mode 100644
> index 0000000000..5d219a7966
> --- /dev/null
> +++ b/drivers/net/r8169/r8169_base.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Realtek Corporation. All rights reserved
> + */
> +
> +#ifndef _R8169_BASE_H_
> +#define _R8169_BASE_H_
> +
> +typedef uint8_t u8;
> +typedef uint16_t u16;
> +typedef uint32_t u32;
> +typedef uint64_t u64;
> +
> +#define PCI_VENDOR_ID_REALTEK 0x10EC
> +
> +#endif
> \ No newline at end of file
Fix your editor setup; all files should end with a newline.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-08-16 11:07 Xi Ruoyao
2024-08-19 12:40 ` Huacai Chen
2024-08-27 9:45 ` Re: Jason A. Donenfeld
0 siblings, 2 replies; 1651+ messages in thread
From: Xi Ruoyao @ 2024-08-16 11:07 UTC (permalink / raw)
To: Jason A . Donenfeld, Huacai Chen, WANG Xuerui
Cc: Xi Ruoyao, linux-crypto, loongarch, Jinyang He, Tiezhu Yang,
Arnd Bergmann
Subject: [PATCH v3 0/2] LoongArch: Implement getrandom() in vDSO
For the rationale to implement getrandom() in vDSO see [1].
The vDSO getrandom() needs a stack-less ChaCha20 implementation, so we
need to add architecture-specific code and wire it up with the generic
code. Both generic LoongArch implementation and Loongson SIMD eXtension
based implementation are added. To dispatch them at runtime without
invoking cpucfg on each call, the alternative runtime patching mechanism
is extended to cover the vDSO.
The implementation is tested with the kernel selftests added by the last
patch in [1]. I had to make some adjustments to make it work on
LoongArch (see [2]; I've not submitted the changes as of now because I'm
unsure about the KHDR_INCLUDES addition). The vdso_test_getrandom
bench-single result:
vdso: 25000000 times in 0.647855257 seconds (generic)
vdso: 25000000 times in 0.601068605 seconds (LSX)
libc: 25000000 times in 6.948168864 seconds
syscall: 25000000 times in 6.990265548 seconds
The vdso_test_getrandom bench-multi result:
vdso: 25000000 x 256 times in 35.322187834 seconds (generic)
vdso: 25000000 x 256 times in 29.183885426 seconds (LSX)
libc: 25000000 x 256 times in 356.628428409 seconds
syscall: 25000000 x 256 times in 334.764602866 seconds
[1]:https://lore.kernel.org/all/20240712014009.281406-1-Jason@zx2c4.com/
[2]:https://github.com/xry111/linux/commits/xry111/la-vdso-v3/
[v2]->v3:
- Add a generic LoongArch implementation for which LSX isn't needed.
v1->v2:
- Properly send the series to the list.
[v2]:https://lore.kernel.org/all/20240815133357.35829-1-xry111@xry111.site/
Xi Ruoyao (3):
LoongArch: vDSO: Wire up getrandom() vDSO implementation
LoongArch: Perform alternative runtime patching on vDSO
LoongArch: vDSO: Add LSX implementation of vDSO getrandom()
arch/loongarch/Kconfig | 1 +
arch/loongarch/include/asm/vdso/getrandom.h | 47 ++++
arch/loongarch/include/asm/vdso/vdso.h | 8 +
arch/loongarch/kernel/asm-offsets.c | 10 +
arch/loongarch/kernel/vdso.c | 14 +-
arch/loongarch/vdso/Makefile | 6 +
arch/loongarch/vdso/memset.S | 24 ++
arch/loongarch/vdso/vdso.lds.S | 7 +
arch/loongarch/vdso/vgetrandom-chacha-lsx.S | 162 +++++++++++++
arch/loongarch/vdso/vgetrandom-chacha.S | 252 ++++++++++++++++++++
arch/loongarch/vdso/vgetrandom.c | 19 ++
11 files changed, 549 insertions(+), 1 deletion(-)
create mode 100644 arch/loongarch/include/asm/vdso/getrandom.h
create mode 100644 arch/loongarch/vdso/memset.S
create mode 100644 arch/loongarch/vdso/vgetrandom-chacha-lsx.S
create mode 100644 arch/loongarch/vdso/vgetrandom-chacha.S
create mode 100644 arch/loongarch/vdso/vgetrandom.c
--
2.46.0
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-08-16 11:07 Xi Ruoyao
@ 2024-08-19 12:40 ` Huacai Chen
2024-08-19 13:01 ` Re: Jason A. Donenfeld
2024-08-19 15:22 ` Re: Xi Ruoyao
2024-08-27 9:45 ` Re: Jason A. Donenfeld
1 sibling, 2 replies; 1651+ messages in thread
From: Huacai Chen @ 2024-08-19 12:40 UTC (permalink / raw)
To: Xi Ruoyao
Cc: Jason A . Donenfeld, WANG Xuerui, linux-crypto, loongarch,
Jinyang He, Tiezhu Yang, Arnd Bergmann
Hi, Ruoyao,
Why no subject?
On Fri, Aug 16, 2024 at 7:07 PM Xi Ruoyao <xry111@xry111.site> wrote:
>
> Subject: [PATCH v3 0/2] LoongArch: Implement getrandom() in vDSO
>
> For the rationale to implement getrandom() in vDSO see [1].
>
> The vDSO getrandom() needs a stack-less ChaCha20 implementation, so we
> need to add architecture-specific code and wire it up with the generic
> code. Both generic LoongArch implementation and Loongson SIMD eXtension
> based implementation are added. To dispatch them at runtime without
> invoking cpucfg on each call, the alternative runtime patching mechanism
> is extended to cover the vDSO.
>
> The implementation is tested with the kernel selftests added by the last
> patch in [1]. I had to make some adjustments to make it work on
> LoongArch (see [2], I've not submitted the changes as at now because I'm
> unsure about the KHDR_INCLUDES addition). The vdso_test_getrandom
> bench-single result:
>
> vdso: 25000000 times in 0.647855257 seconds (generic)
> vdso: 25000000 times in 0.601068605 seconds (LSX)
> libc: 25000000 times in 6.948168864 seconds
> syscall: 25000000 times in 6.990265548 seconds
>
> The vdso_test_getrandom bench-multi result:
>
> vdso: 25000000 x 256 times in 35.322187834 seconds (generic)
> vdso: 25000000 x 256 times in 29.183885426 seconds (LSX)
> libc: 25000000 x 256 times in 356.628428409 seconds
> syscall: 25000000 x 256 times in 334.764602866 seconds
I don't see a significant improvement from LSX here, so I prefer to
just use the generic version to avoid complexity (I remember Linus
said the whole of __vdso_getrandom is not very useful).
Huacai
>
> [1]:https://lore.kernel.org/all/20240712014009.281406-1-Jason@zx2c4.com/
> [2]:https://github.com/xry111/linux/commits/xry111/la-vdso-v3/
>
> [v2]->v3:
> - Add a generic LoongArch implementation for which LSX isn't needed.
>
> v1->v2:
> - Properly send the series to the list.
>
> [v2]:https://lore.kernel.org/all/20240815133357.35829-1-xry111@xry111.site/
>
> Xi Ruoyao (3):
> LoongArch: vDSO: Wire up getrandom() vDSO implementation
> LoongArch: Perform alternative runtime patching on vDSO
> LoongArch: vDSO: Add LSX implementation of vDSO getrandom()
>
> arch/loongarch/Kconfig | 1 +
> arch/loongarch/include/asm/vdso/getrandom.h | 47 ++++
> arch/loongarch/include/asm/vdso/vdso.h | 8 +
> arch/loongarch/kernel/asm-offsets.c | 10 +
> arch/loongarch/kernel/vdso.c | 14 +-
> arch/loongarch/vdso/Makefile | 6 +
> arch/loongarch/vdso/memset.S | 24 ++
> arch/loongarch/vdso/vdso.lds.S | 7 +
> arch/loongarch/vdso/vgetrandom-chacha-lsx.S | 162 +++++++++++++
> arch/loongarch/vdso/vgetrandom-chacha.S | 252 ++++++++++++++++++++
> arch/loongarch/vdso/vgetrandom.c | 19 ++
> 11 files changed, 549 insertions(+), 1 deletion(-)
> create mode 100644 arch/loongarch/include/asm/vdso/getrandom.h
> create mode 100644 arch/loongarch/vdso/memset.S
> create mode 100644 arch/loongarch/vdso/vgetrandom-chacha-lsx.S
> create mode 100644 arch/loongarch/vdso/vgetrandom-chacha.S
> create mode 100644 arch/loongarch/vdso/vgetrandom.c
>
> --
> 2.46.0
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-08-19 12:40 ` Huacai Chen
@ 2024-08-19 13:01 ` Jason A. Donenfeld
2024-08-19 15:22 ` Re: Xi Ruoyao
2024-08-19 15:22 ` Re: Xi Ruoyao
1 sibling, 1 reply; 1651+ messages in thread
From: Jason A. Donenfeld @ 2024-08-19 13:01 UTC (permalink / raw)
To: Huacai Chen
Cc: Xi Ruoyao, WANG Xuerui, linux-crypto, loongarch, Jinyang He,
Tiezhu Yang, Arnd Bergmann
> I don't see significant improvements about LSX here, so I prefer to
> just use the generic version to avoid complexity (I remember Linus
> said the whole of __vdso_getrandom is not very useful).
I'm inclined to feel the same way, at least for now. Let's just go with
one implementation -- the generic one -- and then we can see if
optimization really makes sense later. I suspect the large speedup we're
already getting from being in the vDSO is sufficient for our
purposes.
Regards,
Jason
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-08-19 13:01 ` Re: Jason A. Donenfeld
@ 2024-08-19 15:22 ` Xi Ruoyao
2024-08-19 15:54 ` Re: Xi Ruoyao
0 siblings, 1 reply; 1651+ messages in thread
From: Xi Ruoyao @ 2024-08-19 15:22 UTC (permalink / raw)
To: Jason A. Donenfeld, Huacai Chen
Cc: WANG Xuerui, linux-crypto, loongarch, Jinyang He, Tiezhu Yang,
Arnd Bergmann
On Mon, 2024-08-19 at 13:01 +0000, Jason A. Donenfeld wrote:
> > I don't see significant improvements about LSX here, so I prefer to
> > just use the generic version to avoid complexity (I remember Linus
> > said the whole of __vdso_getrandom is not very useful).
>
> I'm inclined to feel the same way, at least for now. Let's just go with
> one implementation -- the generic one -- and then we can see if
> optimization really makes sense later. I suspect the large speedup we're
> already getting from being in the vDSO is already sufficient for
> purposes.
OK, I'll drop the 2nd and 3rd patches in the next version. But I'm
puzzled why the LSX implementation isn't much faster; maybe I made some
mistake in it?
--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-08-19 15:22 ` Re: Xi Ruoyao
@ 2024-08-19 15:54 ` Xi Ruoyao
0 siblings, 0 replies; 1651+ messages in thread
From: Xi Ruoyao @ 2024-08-19 15:54 UTC (permalink / raw)
To: Jason A. Donenfeld, Huacai Chen
Cc: WANG Xuerui, linux-crypto, loongarch, Jinyang He, Tiezhu Yang,
Arnd Bergmann
On Mon, 2024-08-19 at 23:22 +0800, Xi Ruoyao wrote:
> On Mon, 2024-08-19 at 13:01 +0000, Jason A. Donenfeld wrote:
> > > I don't see significant improvements about LSX here, so I prefer to
> > > just use the generic version to avoid complexity (I remember Linus
> > > said the whole of __vdso_getrandom is not very useful).
> >
> > I'm inclined to feel the same way, at least for now. Let's just go with
> > one implementation -- the generic one -- and then we can see if
> > optimization really makes sense later. I suspect the large speedup we're
> > already getting from being in the vDSO is already sufficient for
> > purposes.
>
> Ok I'll drop the 2nd and 3rd patches in the next version. But I'm
> puzzled why the LSX implementation isn't much faster, maybe I made some
> mistake in it?
After some thought this seems to make sense: the LoongArch desktop
processors have 4 ALUs able to perform the scalar add/rot/xor
operations, and throughput is already maximized for ChaCha20 due to
the data dependencies. The advantage of LSX seems to be just avoiding
reloading the key from memory (because the vector register file is large
enough to hold a copy of it).
Perhaps LSX will be much better on embedded processors with 2 ALUs
and 1 SIMD unit (if they don't downclock under heavy SIMD load), but I
don't have one for testing.
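For reference, the dependency chain in question is the ChaCha quarter round, where each operation consumes the result of the previous one. A plain C sketch (following RFC 8439; an illustration, not the kernel's actual assembly):

```c
#include <stdint.h>

static inline uint32_t rotl32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round: every operation depends on the one before
 * it, so a single quarter round cannot use more than one ALU at a time. */
static void quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] ^= x[a]; x[d] = rotl32(x[d], 16);
	x[c] += x[d]; x[b] ^= x[c]; x[b] = rotl32(x[b], 12);
	x[a] += x[b]; x[d] ^= x[a]; x[d] = rotl32(x[d], 8);
	x[c] += x[d]; x[b] ^= x[c]; x[b] = rotl32(x[b], 7);
}
```

Within a full block, four quarter rounds are independent of one another, so four scalar ALUs can already keep all the chains busy, which would explain why LSX gains so little here.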
--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-08-19 12:40 ` Huacai Chen
2024-08-19 13:01 ` Re: Jason A. Donenfeld
@ 2024-08-19 15:22 ` Xi Ruoyao
1 sibling, 0 replies; 1651+ messages in thread
From: Xi Ruoyao @ 2024-08-19 15:22 UTC (permalink / raw)
To: Huacai Chen
Cc: Jason A . Donenfeld, WANG Xuerui, linux-crypto, loongarch,
Jinyang He, Tiezhu Yang, Arnd Bergmann
On Mon, 2024-08-19 at 20:40 +0800, Huacai Chen wrote:
> Hi, Ruoyao,
>
> Why no subject?
Because I misused git send-email (again) :(.
--
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-08-16 11:07 Xi Ruoyao
2024-08-19 12:40 ` Huacai Chen
@ 2024-08-27 9:45 ` Jason A. Donenfeld
1 sibling, 0 replies; 1651+ messages in thread
From: Jason A. Donenfeld @ 2024-08-27 9:45 UTC (permalink / raw)
To: Xi Ruoyao
Cc: Huacai Chen, WANG Xuerui, linux-crypto, loongarch, Jinyang He,
Tiezhu Yang, Arnd Bergmann
Hey,
Per https://lore.kernel.org/all/Zs2c_9Z6sFMNJs1O@zx2c4.com/ , you may
want to rebase on random.git and send a v4 series. Hopefully now it's
just a single patch.
Jason
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
@ 2024-08-22 20:54 Bao D. Nguyen
2024-08-22 21:08 ` Bart Van Assche
0 siblings, 1 reply; 1651+ messages in thread
From: Bao D. Nguyen @ 2024-08-22 20:54 UTC (permalink / raw)
To: Bart Van Assche, Martin K . Petersen
Cc: linux-scsi, James E.J. Bottomley, Peter Wang,
Manivannan Sadhasivam, Avri Altman, Andrew Halaney, Bean Huo,
Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On 8/22/2024 11:13 AM, Bart Van Assche wrote:
> On 8/21/24 6:05 PM, Bao D. Nguyen wrote:
>> If I understand correctly, the link is hibernated because we had a
>> successful ufshcd_uic_hibern8_enter() earlier. Then the
>> ufshcd_uic_pwr_ctrl() is invoked probably from a power mode change
>> command? (a callstack would be helpful in this case).
>
> Hi Bao,
>
> ufshcd_uic_hibern8_enter() calls ufshcd_uic_pwr_ctrl() directly. The
> former function causes the latter to send the UIC_CMD_DME_HIBER_ENTER
> command. Although UIC_CMD_DME_HIBER_ENTER causes the link to enter the
> hibernation state, the code in ufshcd_uic_pwr_ctrl() for re-enabling
> interrupts causes the UFS host controller that I'm testing to exit the
> hibernation state.
>
>> Anyway, accessing the UFSHCI host controller registers space somehow
>> affected the M-PHY link state? If my understanding is correct, it
>> seems like a hardware issue that we are trying to work around?
>
> Yes, this is a workaround particular behavior of a particular UFS
> controller.
Thank you for the clarification, Bart.
I am just curious about providing a workaround for a hardware issue in
the UFS core driver. Sometimes I notice the community does not accept
such a change and pushes it to be implemented in the vendor/platform
drivers.
Now, about your workaround, I have the same concern as Bean.
For a ufshcd_uic_pwr_ctrl() command, i.e. the PMC and
hibern8_enter/exit() commands, you will get 2 ufshcd_uic_cmd_compl()
interrupts, or 1 UIC completion interrupt with 2 status bits set at once:
a. UIC_COMMAND_COMPL is set,
b. and one of the bits UIC_POWER_MODE || UIC_HIBERNATE_EXIT ||
UIC_HIBERNATE_ENTER is set, depending on the completed UIC command.
In your proposed change to ufshcd_uic_cmd_compl(), you are signalling
both complete(&cmd->done) and complete(hba->uic_async_done) for a single
ufshcd_uic_pwr_ctrl() command, which can cause issues.
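The concern above can be sketched as a toy decode of one interrupt-status snapshot (the bit positions and the pmc_in_progress flag are illustrative stand-ins, not the real UFSHCI register layout or driver state):

```c
#include <stdint.h>

/* Illustrative bit assignments, not the real UFSHCI interrupt-status layout. */
#define UIC_COMMAND_COMPL     (1u << 0)
#define UIC_POWER_MODE        (1u << 1)
#define UIC_HIBERNATE_ENTER   (1u << 2)
#define UIC_HIBERNATE_EXIT    (1u << 3)

struct uic_compl {
	int cmd_done;    /* would signal complete(&cmd->done) */
	int async_done;  /* would signal complete(hba->uic_async_done) */
};

/* A power-mode UIC command can raise UIC_COMMAND_COMPL and one of the
 * power-mode/hibernate bits in the same snapshot, so a handler that
 * acts on both bits signals two completions for a single command. */
static struct uic_compl decode_uic_irq(uint32_t status, int pmc_in_progress)
{
	struct uic_compl c = { 0, 0 };

	if (status & UIC_COMMAND_COMPL)
		c.cmd_done = 1;
	if (pmc_in_progress && (status & (UIC_POWER_MODE |
	    UIC_HIBERNATE_ENTER | UIC_HIBERNATE_EXIT)))
		c.async_done = 1;
	return c;
}
```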
Thanks, Bao
>
> Thanks,
>
> Bart.
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-22 20:54 [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation Bao D. Nguyen
@ 2024-08-22 21:08 ` Bart Van Assche
2024-08-23 12:01 ` Manivannan Sadhasivam
0 siblings, 1 reply; 1651+ messages in thread
From: Bart Van Assche @ 2024-08-22 21:08 UTC (permalink / raw)
To: Bao D. Nguyen, Martin K . Petersen
Cc: linux-scsi, James E.J. Bottomley, Peter Wang,
Manivannan Sadhasivam, Avri Altman, Andrew Halaney, Bean Huo,
Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On 8/22/24 1:54 PM, Bao D. Nguyen wrote:
> I am just curious about providing workaround for a hardware issue in the
> ufs core driver. Sometimes I notice the community is not accepting such
> a change and push the change to be implemented in the vendor/platform
> drivers.
There are two reasons why I propose to implement this workaround as a
change for the UFS driver core:
- I am not aware of any way to implement the behavior change in a vendor
driver without modifying the UFS driver core.
- The workaround results in a simplification of the UFS driver core
code. More lines are removed from the UFS driver than added.
Although it would be possible to select whether the old or the new
behavior is selected by introducing yet another host controller quirk, I
prefer not to do that because it would make the UFSHCI driver even more
complex.
> In your proposed change to ufshcd_uic_cmd_compl(), you are signalling
> both complete(&cmd->done) and complete(hba->uic_async_done) for a single
> ufshcd_uic_pwr_ctrl() command which can cause issues.
Please take another look at patch 2/2. With or without this patch
applied, only hba->uic_async_done is signaled when a power mode UIC
command completes.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-22 21:08 ` Bart Van Assche
@ 2024-08-23 12:01 ` Manivannan Sadhasivam
2024-08-23 14:23 ` Bart Van Assche
0 siblings, 1 reply; 1651+ messages in thread
From: Manivannan Sadhasivam @ 2024-08-23 12:01 UTC (permalink / raw)
To: Bart Van Assche
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On Thu, Aug 22, 2024 at 02:08:24PM -0700, Bart Van Assche wrote:
> On 8/22/24 1:54 PM, Bao D. Nguyen wrote:
> > I am just curious about providing workaround for a hardware issue in the
> > ufs core driver. Sometimes I notice the community is not accepting such
> > a change and push the change to be implemented in the vendor/platform
> > drivers.
>
> There are two reasons why I propose to implement this workaround as a
> change for the UFS driver core:
> - I am not aware of any way to implement the behavior change in a vendor
> driver without modifying the UFS driver core.
Unfortunately you never mentioned which UFS controller this behavior applies to.
> - The workaround results in a simplification of the UFS driver core
> code. More lines are removed from the UFS driver than added.
>
This doesn't justify modifying the UFS core driver for the erratic
behavior of one UFS controller.
> Although it would be possible to select whether the old or the new
> behavior is selected by introducing yet another host controller quirk, I
> prefer not to do that because it would make the UFSHCI driver even more
> complex.
>
I strongly believe that using a quirk is the way forward to address this
issue, because this is not documented behavior to be handled in the core
driver, and doing otherwise defeats the purpose of having quirks in the
first place.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-23 12:01 ` Manivannan Sadhasivam
@ 2024-08-23 14:23 ` Bart Van Assche
2024-08-23 14:58 ` Manivannan Sadhasivam
0 siblings, 1 reply; 1651+ messages in thread
From: Bart Van Assche @ 2024-08-23 14:23 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On 8/23/24 5:01 AM, Manivannan Sadhasivam wrote:
> Unfortunately you never mentioned which UFS controller this behavior applies to.
Does your employer allow you to publish detailed information about
unreleased SoCs?
>> - The workaround results in a simplification of the UFS driver core
>> code. More lines are removed from the UFS driver than added.
>
> This doesn't justify the modification of the UFS code driver for an errantic
> behavior of a UFS controller.
Patch 2/2 deletes more code than it adds and hence helps to make the UFS
driver core easier to maintain. Adding yet another quirk would make the
UFS driver core more complicated and hence harder to maintain.
>> Although it would be possible to select whether the old or the new
>> behavior is selected by introducing yet another host controller quirk, I
>> prefer not to do that because it would make the UFSHCI driver even more
>> complex.
>
> I strongly believe that using the quirk is the way forward to address this
> issue. Because this is not a documented behavior to be handled in the core
> driver and also defeats the purpose of having the quirks in first place.
Adding a quirk is not an option in this case because adding a new quirk
without code that uses the quirk is not allowed. It may take another
year before the code that uses that new quirk is sent upstream because
the team that sends Pixel SoC drivers upstream only does this after
devices with that SoC are publicly available.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-23 14:23 ` Bart Van Assche
@ 2024-08-23 14:58 ` Manivannan Sadhasivam
2024-08-23 16:07 ` Bart Van Assche
0 siblings, 1 reply; 1651+ messages in thread
From: Manivannan Sadhasivam @ 2024-08-23 14:58 UTC (permalink / raw)
To: Bart Van Assche
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On Fri, Aug 23, 2024 at 07:23:57AM -0700, Bart Van Assche wrote:
> On 8/23/24 5:01 AM, Manivannan Sadhasivam wrote:
> > Unfortunately you never mentioned which UFS controller this behavior applies to.
>
> Does your employer allow you to publish detailed information about
> unreleased SoCs?
>
Certainly not! But in that case I will explicitly mention that the controller is
used in an upcoming SoC or something like that. Otherwise, how can the community
know whether you are sending the patch for an existing controller or a secret
one?
> > > - The workaround results in a simplification of the UFS driver core
> > > code. More lines are removed from the UFS driver than added.
> >
> > This doesn't justify the modification of the UFS code driver for an errantic
> > behavior of a UFS controller.
>
> Patch 2/2 deletes more code than it adds and hence helps to make the UFS
> driver core easier to maintain. Adding yet another quirk would make the
> UFS driver core more complicated and hence harder to maintain.
>
IMO nobody can make the UFS driver more complicated. It is complicated at its
best :)
But you are just applying a plaster in the core code for the quirky behavior of
an unreleased SoC. IMO that is not something we want. Imagine if other
vendors also decide to do the same, with the argument of removing more lines of
code etc.: we will end up with a situation where the core code does something
out of spec for a buggy controller without any mention of the quirky
behavior.
So that will make the code more complex to understand.
> > > Although it would be possible to select whether the old or the new
> > > behavior is selected by introducing yet another host controller quirk, I
> > > prefer not to do that because it would make the UFSHCI driver even more
> > > complex.
> >
> > I strongly believe that using the quirk is the way forward to address this
> > issue. Because this is not a documented behavior to be handled in the core
> > driver and also defeats the purpose of having the quirks in first place.
>
> Adding a quirk is not an option in this case because adding a new quirk
> without code that uses the quirk is not allowed. It may take another
> year before the code that uses that new quirk is sent upstream because
> the team that sends Pixel SoC drivers upstream only does this after
> devices with that SoC are publicly available.
>
Then why can't you send the change at that time? We do the same for Qcom SoCs
all the time and I'm pretty sure that the Pixel SoCs are no different.
At that time, the SoC may get a new revision and we may end up not needing the
quirk at all. I'm not saying that it will happen, but that is a common practice
among the semiconductor companies.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-23 14:58 ` Manivannan Sadhasivam
@ 2024-08-23 16:07 ` Bart Van Assche
2024-08-23 16:48 ` Manivannan Sadhasivam
0 siblings, 1 reply; 1651+ messages in thread
From: Bart Van Assche @ 2024-08-23 16:07 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On 8/23/24 7:58 AM, Manivannan Sadhasivam wrote:
> Then why can't you send the change at that time?
We use the Android GKI kernel and any patches must be sent upstream
first before these can be considered for inclusion in the GKI kernel.
Thanks,
Bart.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-23 16:07 ` Bart Van Assche
@ 2024-08-23 16:48 ` Manivannan Sadhasivam
2024-08-23 18:05 ` Bart Van Assche
0 siblings, 1 reply; 1651+ messages in thread
From: Manivannan Sadhasivam @ 2024-08-23 16:48 UTC (permalink / raw)
To: Bart Van Assche
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On Fri, Aug 23, 2024 at 09:07:18AM -0700, Bart Van Assche wrote:
> On 8/23/24 7:58 AM, Manivannan Sadhasivam wrote:
> > Then why can't you send the change at that time?
>
> We use the Android GKI kernel and any patches must be sent upstream
> first before these can be considered for inclusion in the GKI kernel.
>
But that's the same requirement for other SoC vendors as well. Anyway, this
doesn't justify modifying the core code to work around a
controller defect. Please use quirks like the other vendors do.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-23 16:48 ` Manivannan Sadhasivam
@ 2024-08-23 18:05 ` Bart Van Assche
2024-08-24 2:29 ` Manivannan Sadhasivam
0 siblings, 1 reply; 1651+ messages in thread
From: Bart Van Assche @ 2024-08-23 18:05 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On 8/23/24 9:48 AM, Manivannan Sadhasivam wrote:
> On Fri, Aug 23, 2024 at 09:07:18AM -0700, Bart Van Assche wrote:
>> On 8/23/24 7:58 AM, Manivannan Sadhasivam wrote:
>>> Then why can't you send the change at that time?
>>
>> We use the Android GKI kernel and any patches must be sent upstream
>> first before these can be considered for inclusion in the GKI kernel.
>
> But that's the same requirement for other SoC vendors as well. Anyway, these
> don't justify the fact that the core code should be modified to workaround a
> controller defect. Please use quirks as like other vendors.
Let me repeat what I mentioned earlier:
* Introducing a new quirk without introducing a user for that quirk is
not acceptable because that would involve introducing code that is
dead code from the point of view of the upstream kernel.
* The UFS driver core is already complicated. If we don't need a new
quirk we shouldn't introduce a new quirk.
Bart.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-23 18:05 ` Bart Van Assche
@ 2024-08-24 2:29 ` Manivannan Sadhasivam
2024-08-24 2:48 ` Bart Van Assche
0 siblings, 1 reply; 1651+ messages in thread
From: Manivannan Sadhasivam @ 2024-08-24 2:29 UTC (permalink / raw)
To: Bart Van Assche
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On Fri, Aug 23, 2024 at 11:05:12AM -0700, Bart Van Assche wrote:
> On 8/23/24 9:48 AM, Manivannan Sadhasivam wrote:
> > On Fri, Aug 23, 2024 at 09:07:18AM -0700, Bart Van Assche wrote:
> > > On 8/23/24 7:58 AM, Manivannan Sadhasivam wrote:
> > > > Then why can't you send the change at that time?
> > >
> > > We use the Android GKI kernel and any patches must be sent upstream
> > > first before these can be considered for inclusion in the GKI kernel.
> >
> > But that's the same requirement for other SoC vendors as well. Anyway, these
> > don't justify the fact that the core code should be modified to workaround a
> > controller defect. Please use quirks as like other vendors.
>
> Let me repeat what I mentioned earlier:
> * Introducing a new quirk without introducing a user for that quirk is
> not acceptable because that would involve introducing code that is
> dead code from the point of view of the upstream kernel.
As I pointed out earlier, you should just submit the quirk change when you
submit your driver. But you said that, because of the GKI requirement, you are
making the change in the core driver. But again, that requirement applies to
other vendors as well.
What if other vendors start adding workarounds in the core driver citing GKI
requirements (provided they also remove some code, as you justified)? Will it be
acceptable? NO.
> * The UFS driver core is already complicated. If we don't need a new
> quirk we shouldn't introduce a new quirk.
>
Sorry, but a quirk is a quirk. All the existing quirks could be worked around
in the core driver in some way, but we have the quirk mechanism specifically so
that we don't do that, to avoid polluting the core code, which has to follow
the spec.
Moreover, the workaround you are adding is not at all common to other
controllers, so it definitely doesn't justify modifying the core code.
IMO adding more code alone will not make a driver complicated, but changing the
logic will.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-24 2:29 ` Manivannan Sadhasivam
@ 2024-08-24 2:48 ` Bart Van Assche
2024-08-24 3:03 ` Manivannan Sadhasivam
0 siblings, 1 reply; 1651+ messages in thread
From: Bart Van Assche @ 2024-08-24 2:48 UTC (permalink / raw)
To: Manivannan Sadhasivam
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On 8/23/24 7:29 PM, Manivannan Sadhasivam wrote:
> What if other vendors start adding the workaround in the core driver citing GKI
> requirement (provided it also removes some code as you justified)? Will it be
> acceptable? NO.
It's not up to you to define new rules for upstream kernel development.
Anyone is allowed to publish patches that rework kernel code, whether
or not the purpose of such a patch is to work around a SoC bug.
Additionally, it has already happened that one of your colleagues
submitted a workaround for a SoC bug to the UFS core driver.
From the description of commit 0f52fcb99ea2 ("scsi: ufs: Try to save
power mode change and UIC cmd completion timeout"): "This is to deal
with the scenario in which completion has been raised but the one
waiting for the completion cannot be awaken in time due to kernel
scheduling problem." That description makes zero sense to me. My
conclusion from commit 0f52fcb99ea2 is that it is a workaround for a
bug in a UFS host controller, namely that a particular UFS host
controller does not always generate a UIC completion interrupt when it
should.
Bart.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation
2024-08-24 2:48 ` Bart Van Assche
@ 2024-08-24 3:03 ` Manivannan Sadhasivam
2024-08-26 6:48 ` Can Guo
0 siblings, 1 reply; 1651+ messages in thread
From: Manivannan Sadhasivam @ 2024-08-24 3:03 UTC (permalink / raw)
To: Bart Van Assche
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On Fri, Aug 23, 2024 at 07:48:50PM -0700, Bart Van Assche wrote:
> On 8/23/24 7:29 PM, Manivannan Sadhasivam wrote:
> > What if other vendors start adding the workaround in the core driver citing GKI
> > requirement (provided it also removes some code as you justified)? Will it be
> > acceptable? NO.
>
> It's not up to you to define new rules for upstream kernel development.
I'm not framing new rules, but just pointing out the common practice.
> Anyone is allowed to publish patches that rework kernel code, whether
> or not the purpose of such a patch is to work around a SoC bug.
>
Yes, at the same time if that code deviates from the norm, then anyone can
complain. We are all working towards making the code better.
> Additionally, it has already happened that one of your colleagues
> submitted a workaround for a SoC bug to the UFS core driver.
> From the description of commit 0f52fcb99ea2 ("scsi: ufs: Try to save
> power mode change and UIC cmd completion timeout"): "This is to deal
> with the scenario in which completion has been raised but the one
> waiting for the completion cannot be awaken in time due to kernel
> scheduling problem." That description makes zero sense to me. My
> conclusion from commit 0f52fcb99ea2 is that it is a workaround for a
> bug in a UFS host controller, namely that a particular UFS host
> controller not always generates a UIC completion interrupt when it
> should.
>
0f52fcb99ea2 was submitted in 2020, before I started contributing to the UFS
driver seriously. But the description of that commit never mentioned any issue
with the controller. It vaguely mentions a 'kernel scheduling problem', which I
don't know how to interpret. If I were looking into the code at that time, I
would've definitely asked for clarity during the review phase.
But there is no need to take it as an example. I can only assert that
working around a controller defect in the core code, when we already have quirks
for the same purpose, defeats the purpose of quirks. And it will encourage other
people to start changing the core code in the future, thus bypassing the quirks.
But I'm not a maintainer of this part of the code. So I cannot definitely stop
you from getting this patch merged. I'll leave it up to Martin to decide.
- Mani
--
மணிவண்ணன் சதாசிவம்
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-08-24 3:03 ` Manivannan Sadhasivam
@ 2024-08-26 6:48 ` Can Guo
0 siblings, 0 replies; 1651+ messages in thread
From: Can Guo @ 2024-08-26 6:48 UTC (permalink / raw)
To: Manivannan Sadhasivam, Bart Van Assche
Cc: Bao D. Nguyen, Martin K . Petersen, linux-scsi,
James E.J. Bottomley, Peter Wang, Avri Altman, Andrew Halaney,
Bean Huo, Alim Akhtar, Eric Biggers, Minwoo Im, Maramaina Naresh
On 8/24/2024, Manivannan Sadhasivam wrote:
> On Fri, Aug 23, 2024 at 07:48:50PM -0700, Bart Van Assche wrote:
>> On 8/23/24 7:29 PM, Manivannan Sadhasivam wrote:
>>> What if other vendors start adding the workaround in the core driver citing GKI
>>> requirement (provided it also removes some code as you justified)? Will it be
>>> acceptable? NO.
>> It's not up to you to define new rules for upstream kernel development.
> I'm not framing new rules, but just pointing out the common practice.
>
>> Anyone is allowed to publish patches that rework kernel code, whether
>> or not the purpose of such a patch is to work around a SoC bug.
>>
> Yes, at the same time if that code deviates from the norm, then anyone can
> complain. We are all working towards making the code better.
>
>> Additionally, it has already happened that one of your colleagues
>> submitted a workaround for a SoC bug to the UFS core driver.
>> From the description of commit 0f52fcb99ea2 ("scsi: ufs: Try to save
>> power mode change and UIC cmd completion timeout"): "This is to deal
>> with the scenario in which completion has been raised but the one
>> waiting for the completion cannot be awaken in time due to kernel
>> scheduling problem." That description makes zero sense to me. My
>> conclusion from commit 0f52fcb99ea2 is that it is a workaround for a
>> bug in a UFS host controller, namely that a particular UFS host
>> controller not always generates a UIC completion interrupt when it
>> should.
>>
> 0f52fcb99ea2 was submitted in 2020 before I started contributing to UFS driver
> seriously. But the description of that commit never mentioned any issue with the
> controller. It vaguely mentions 'kernel scheduling problem' which I don't know
> how to interpret. If I were looking into the code at that time, I would've
> definitely asked for clarity during the review phase.
0f52fcb99ea2 is my commit; apologies for the confusion caused by the poor commit
message. What we were trying to fix was not a SoC bug. More background for this
change: on our customer's side, we used to hit corner cases where the UIC command
is sent, the UFS host controller generates the UIC command completion interrupt
fine, and the UIC completion IRQ handler fires and calls complete(), yet the
completion timeout error still happens. In this case UFS, the UFS host and the
UFS driver are all victims. Whatever could cause this scheduling problem should
be fixed properly by the right PoC, but we thought making the UFS driver robust
in this spot would be good for all users who may face a similar issue,
hence the change.
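The defensive pattern described amounts to: when the wait times out, re-check whether the completion was in fact raised before declaring failure. Below is a hedged userspace sketch of that idea only; the names (uic_cmd, wait_uic_cmd) are illustrative and this is not the code from commit 0f52fcb99ea2, which re-checks driver state after wait_for_completion_timeout():

```c
#include <assert.h>
#include <stdbool.h>

struct uic_cmd {
    bool done;   /* set by the (possibly delayed) completion handler */
};

/* Simulates the IRQ handler calling complete(). */
static void uic_irq_handler(struct uic_cmd *cmd)
{
    cmd->done = true;
}

/* Returns 0 on success, -1 on a genuine timeout. `timed_out` models
 * wait_for_completion_timeout() expiring because the waiter was not
 * scheduled in time, even though the IRQ already fired. */
static int wait_uic_cmd(struct uic_cmd *cmd, bool timed_out)
{
    if (!timed_out)
        return 0;
    /* Last-chance re-check: the completion may have been raised while
     * the waiter was stuck waiting to be scheduled. */
    return cmd->done ? 0 : -1;
}
```

The re-check turns a spurious timeout (IRQ fired, waiter starved) into a success while still reporting genuine timeouts.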
Thanks,
Can Guo.
>
> But there is no need to take it as an example. I can only assert the fact that
> working around the controller defect in core code when we already have quirks
> for the same purpose defeats the purpose of quirks. And it will encourage other
> people to start changing the core code in the future thus bypassing the quirks.
>
> But I'm not a maintainer of this part of the code. So I cannot definitely stop
> you from getting this patch merged. I'll leave it up to Martin to decide.
>
> - Mani
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-09-13 17:11 David Hunter
2024-09-13 20:39 ` Shuah Khan
0 siblings, 1 reply; 1651+ messages in thread
From: David Hunter @ 2024-09-13 17:11 UTC (permalink / raw)
To: Masahiro Yamada
Cc: David Hunter, linux-kbuild, linux-kernel, shuah,
javier.carrasco.cruz
Date: Fri, 13 Sep 2024 11:52:16 -0400
Subject: [PATCH 0/7] linux-kbuild: fix: process configs set to "y"
An assumption made in this script is that the config options do not need
to be processed because they will simply be in the new config file. This
assumption is incorrect.
Process the config entries set to "y" because those config entries might
have dependencies set to "m". If a config entry is set to "m" and is not
loaded directly into the machine, the script will currently turn off
that config entry; however, if that turned-off config entry is a
dependency for a "y" option, then the config entry set to "y"
will also be turned off later when the conf executable is called.
Here is a model of the problem (arrows show dependency):
Original config file
Config_1 (m) <-- Config_2 (y)
Config_1 is not loaded in this example, so it is turned off.
After scripts/kconfig/streamline_config.pl, but before scripts/kconfig/conf
Config_1 (n) <-- Config_2 (y)
After scripts/kconfig/conf
Config_1 (n) <-- Config_2 (n)
It should also be noted that any module in the dependency chain will
also be turned off, even if that module is loaded directly onto the
computer. Here is an example:
Original config file
Config_1 (m) <-- Config_2 (y) <-- Config_3 (m)
Config_3 will be loaded in this example.
After scripts/kconfig/streamline_config.pl, but before scripts/kconfig/conf
Config_1 (n) <-- Config_2 (y) <-- Config_3 (m)
After scripts/kconfig/conf
Config_1 (n) <-- Config_2 (n) <-- Config_3 (n)
I discovered this problem when I ran "make localmodconfig" on a generic
Ubuntu config file. Many hardware devices were not recognized once the
kernel was installed and booted. Another way to reproduce the error I
had is to run "make localmodconfig" twice. The standard error might display
warnings that certain modules should be selected but no config options are
turned on that select those modules.
With the changes in this patch series, all modules are loaded properly
and all of the hardware works when the kernel is installed and
booted.
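The dependency demotion modeled above can be shown in a few lines. This is a toy C model of the two passes, not the Perl script's actual logic; streamline() and conf() are illustrative stand-ins for streamline_config.pl and scripts/kconfig/conf:

```c
#include <assert.h>
#include <stdbool.h>

enum tristate { N, M, Y };

/* Dependency chain from the cover letter: cfg[i+1] depends on cfg[i]. */
#define NCFG 3

/* Pass 1: streamline_config.pl-like pruning. Only "m" entries are
 * checked against the loaded-module list; "y" entries are skipped,
 * which is the assumption this series fixes. */
static void streamline(enum tristate cfg[NCFG], const bool loaded[NCFG])
{
    for (int i = 0; i < NCFG; i++)
        if (cfg[i] == M && !loaded[i])
            cfg[i] = N;
}

/* Pass 2: conf-like dependency enforcement. Anything whose dependency
 * ended up "n" is demoted to "n" as well. */
static void conf(enum tristate cfg[NCFG])
{
    for (int i = 1; i < NCFG; i++)
        if (cfg[i - 1] == N)
            cfg[i] = N;
}
```

Running the two passes on Config_1 (m, not loaded) <-- Config_2 (y) <-- Config_3 (m, loaded) demotes the whole chain to "n", including the loaded Config_3, reproducing the failure described above.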
David Hunter (7):
linux-kbuild: fix: config option can be bool
linux-kbuild: fix: missing variable operator
linux-kbuild: fix: ensure all defaults are tracked
linux-kbuild: fix: ensure selected configs were turned on in original
linux-kbuild: fix: implement choice for kconfigs
linux-kbuild: fix: configs with defaults do not need a prompt
linux-kbuild: fix: process config options set to "y"
scripts/kconfig/streamline_config.pl | 77 ++++++++++++++++++++++++----
1 file changed, 66 insertions(+), 11 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-09-13 17:11 David Hunter
@ 2024-09-13 20:39 ` Shuah Khan
0 siblings, 0 replies; 1651+ messages in thread
From: Shuah Khan @ 2024-09-13 20:39 UTC (permalink / raw)
To: David Hunter, Masahiro Yamada
Cc: linux-kbuild, linux-kernel, shuah, javier.carrasco.cruz,
Shuah Khan
On 9/13/24 11:11, David Hunter wrote:
Missing subject line for the cover-letter?
>
> Date: Fri, 13 Sep 2024 11:52:16 -0400
> Subject: [PATCH 0/7] linux-kbuild: fix: process configs set to "y"
>
> An assumption made in this script is that the config options do not need
> to be processed because they will simply be in the new config file. This
> assumption is incorrect.
>
> Process the config entries set to "y" because those config entries might
> have dependencies set to "m". If a config entry is set to "m" and is not
> loaded directly into the machine, the script will currently turn off
> that config entry; however, if that turned off config entry is a
> dependency for a "y" option. that means the config entry set to "y"
> will also be turned off later when the conf executive file is called.
>
> Here is a model of the problem (arrows show dependency):
>
> Original config file
> Config_1 (m) <-- Config_2 (y)
>
> Config_1 is not loaded in this example, so it is turned off.
> After scripts/kconfig/streamline_config.pl, but before scripts/kconfig/conf
> Config_1 (n) <-- Config_2 (y)
>
> After scripts/kconfig/conf
> Config_1 (n) <-- Config_2 (n)
>
>
> It should also be noted that any module in the dependency chain will
> also be turned off, even if that module is loaded directly onto the
> computer. Here is an example:
>
> Original config file
> Config_1 (m) <-- Config_2 (y) <-- Config_3 (m)
>
> Config_3 will be loaded in this example.
> After scripts/kconfig/streamline_config.pl, but before scripts/kconfig/conf
> Config_1 (n) <-- Config_2 (y) <-- Config_3 (m)
>
> After scripts/kconfig/conf
> Config_1 (n) <-- Config_2 (n) <-- Config_3 (n)
>
>
> I discovered this problem when I ran "make localmodconfig" on a generic
> Ubuntu config file. Many hardware devices were not recognized once the
> kernel was installed and booted. Another way to reproduced the error I
> had is to run "make localmodconfig" twice. The standard error might display
> warnings that certain modules should be selected but no config files are
> turned on that select that module.
>
> With the changes in this series patch, all modules are loaded properly
> and all of the hardware is loaded when the kernel is installed and
> booted.
>
>
> David Hunter (7):
> linux-kbuild: fix: config option can be bool
> linux-kbuild: fix: missing variable operator
> linux-kbuild: fix: ensure all defaults are tracked
> linux-kbuild: fix: ensure selected configs were turned on in original
> linux-kbuild: fix: implement choice for kconfigs
> linux-kbuild: fix: configs with defaults do not need a prompt
> linux-kbuild: fix: process config options set to "y"
>
> scripts/kconfig/streamline_config.pl | 77 ++++++++++++++++++++++++----
> 1 file changed, 66 insertions(+), 11 deletions(-)
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-09-17 7:10 Akhil P Oommen
2024-09-17 7:24 ` Dmitry Baryshkov
0 siblings, 1 reply; 1651+ messages in thread
From: Akhil P Oommen @ 2024-09-17 7:10 UTC (permalink / raw)
To: GPUfirmwareforSA8775Pchipset, linux-firmware
Cc: dmitry.baryshkov, quic_rajeshk
The following changes since commit 6c88d9b8253b8ec6df701a551a56438ea2e5bacf:
Merge branch 'amd-staging' into 'main' (2024-09-13 20:28:50 +0000)
are available in the Git repository at:
https://git.codelinaro.org/clo/linux-kernel/linux-firmware.git gpu-fw-SA8775p
for you to fetch changes up to 43c971bcd74d9793140a8fbbcc805204cb797f96:
qcom: add gpu firmwares for sa8775p chipset (2024-09-17 11:57:51 +0530)
----------------------------------------------------------------
Akhil P Oommen (1):
qcom: add gpu firmwares for sa8775p chipset
WHENCE | 2 ++
qcom/a663_gmu.bin | Bin 0 -> 55892 bytes
qcom/sa8775p/a663_zap.mbn | Bin 0 -> 1054680 bytes
3 files changed, 2 insertions(+)
create mode 100644 qcom/a663_gmu.bin
create mode 100644 qcom/sa8775p/a663_zap.mbn
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-09-17 7:10 Akhil P Oommen
@ 2024-09-17 7:24 ` Dmitry Baryshkov
0 siblings, 0 replies; 1651+ messages in thread
From: Dmitry Baryshkov @ 2024-09-17 7:24 UTC (permalink / raw)
To: Akhil P Oommen; +Cc: GPUfirmwareforSA8775Pchipset, linux-firmware, quic_rajeshk
On Tue, 17 Sept 2024 at 09:10, Akhil P Oommen <quic_akhilpo@quicinc.com> wrote:
>
> The following changes since commit 6c88d9b8253b8ec6df701a551a56438ea2e5bacf:
>
> Merge branch 'amd-staging' into 'main' (2024-09-13 20:28:50 +0000)
>
> are available in the Git repository at:
>
> https://git.codelinaro.org/clo/linux-kernel/linux-firmware.git gpu-fw-SA8775p
>
> for you to fetch changes up to 43c971bcd74d9793140a8fbbcc805204cb797f96:
>
> qcom: add gpu firmwares for sa8775p chipset (2024-09-17 11:57:51 +0530)
>
> ----------------------------------------------------------------
> Akhil P Oommen (1):
> qcom: add gpu firmwares for sa8775p chipset
>
> WHENCE | 2 ++
> qcom/a663_gmu.bin | Bin 0 -> 55892 bytes
> qcom/sa8775p/a663_zap.mbn | Bin 0 -> 1054680 bytes
> 3 files changed, 2 insertions(+)
> create mode 100644 qcom/a663_gmu.bin
> create mode 100644 qcom/sa8775p/a663_zap.mbn
Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Thank you!
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2024-10-10 22:44 PRIVATE
0 siblings, 0 replies; 1651+ messages in thread
From: PRIVATE @ 2024-10-10 22:44 UTC (permalink / raw)
To: kexec
I have a viable proposal for you, If You want to know more, get back to me.
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-10-15 22:48 Daniel Yang
2024-10-16 1:27 ` Jakub Kicinski
0 siblings, 1 reply; 1651+ messages in thread
From: Daniel Yang @ 2024-10-15 22:48 UTC (permalink / raw)
To: Wenjia Zhang, Jan Karcher, D. Wythe, Tony Lu, Wen Gu,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
linux-s390, netdev, linux-kernel
Cc: danielyangkang
Date: Tue, 15 Oct 2024 15:31:12 -0700
Subject: [PATCH v3 0/2 RESEND] resolve gtp possible deadlock warning
Fixes deadlock described in this bug:
https://syzkaller.appspot.com/bug?extid=e953a8f3071f5c0a28fd.
Specific crash report here:
https://syzkaller.appspot.com/text?tag=CrashReport&x=14670e07980000.
This bug is a false positive lockdep warning since gtp and smc use
completely different socket protocols.
Lockdep thinks that lock_sock() in smc will deadlock with gtp's
lock_sock() acquisition.
Adding lockdep annotations on smc socket creation prevents these false
positives.
Daniel Yang (2):
Patch from D. Wythe <alibuda@linux.alibaba.com>
Move lockdep annotation to separate function for readability.
net/smc/smc_inet.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
--
2.39.2
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2024-10-15 22:48 Daniel Yang
@ 2024-10-16 1:27 ` Jakub Kicinski
0 siblings, 0 replies; 1651+ messages in thread
From: Jakub Kicinski @ 2024-10-16 1:27 UTC (permalink / raw)
To: Daniel Yang
Cc: Wenjia Zhang, Jan Karcher, D. Wythe, Tony Lu, Wen Gu,
David S. Miller, Eric Dumazet, Paolo Abeni, linux-s390, netdev,
linux-kernel
On Tue, 15 Oct 2024 15:48:03 -0700 Daniel Yang wrote:
> Subject:
> Date: Tue, 15 Oct 2024 15:48:03 -0700
> X-Mailer: git-send-email 2.39.2
>
> Date: Tue, 15 Oct 2024 15:31:12 -0700
> Subject: [PATCH v3 0/2 RESEND] resolve gtp possible deadlock warning
This is garbled as well.
Before you repost please make sure you take a look at:
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#tl-dr
--
pw-bot: cr
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-10-17 9:09 Paulo Miguel Almeida
2024-10-17 9:12 ` Paulo Miguel Almeida
0 siblings, 1 reply; 1651+ messages in thread
From: Paulo Miguel Almeida @ 2024-10-17 9:09 UTC (permalink / raw)
To: tsbogend, bvanassche, gregkh, ricardo, zhanggenjian, linux-mips,
linux-kernel
Cc: paulo.miguel.almeida.rodenas
linux-hardening@vger.kernel.org
Bcc:
Subject: [PATCH v2][next] mips: sgi-ip22: Replace "s[n]?printf" with
sysfs_emit in sysfs callbacks
Reply-To:
Replace open-coded pieces with sysfs_emit() helper in sysfs .show()
callbacks.
Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
---
Changelog:
- v2: amend commit message (Req: Maciej W. Rozycki)
- v1: https://lore.kernel.org/lkml/Zw2GRQkbx8Z8DlcS@mail.google.com/
---
arch/mips/sgi-ip22/ip22-gio.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/mips/sgi-ip22/ip22-gio.c b/arch/mips/sgi-ip22/ip22-gio.c
index d20eec742bfa..5893ea4e382c 100644
--- a/arch/mips/sgi-ip22/ip22-gio.c
+++ b/arch/mips/sgi-ip22/ip22-gio.c
@@ -165,9 +165,8 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *a,
char *buf)
{
struct gio_device *gio_dev = to_gio_device(dev);
- int len = snprintf(buf, PAGE_SIZE, "gio:%x\n", gio_dev->id.id);
- return (len >= PAGE_SIZE) ? (PAGE_SIZE - 1) : len;
+ return sysfs_emit(buf, "gio:%x\n", gio_dev->id.id);
}
static DEVICE_ATTR_RO(modalias);
@@ -177,7 +176,7 @@ static ssize_t name_show(struct device *dev,
struct gio_device *giodev;
giodev = to_gio_device(dev);
- return sprintf(buf, "%s", giodev->name);
+ return sysfs_emit(buf, "%s\n", giodev->name);
}
static DEVICE_ATTR_RO(name);
@@ -187,7 +186,7 @@ static ssize_t id_show(struct device *dev,
struct gio_device *giodev;
giodev = to_gio_device(dev);
- return sprintf(buf, "%x", giodev->id.id);
+ return sysfs_emit(buf, "%x\n", giodev->id.id);
}
static DEVICE_ATTR_RO(id);
--
2.47.0
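For reference, the helper being adopted behaves roughly like a page-bounded vsnprintf. A hedged userspace analogue follows; my_sysfs_emit is a stand-in for the kernel's sysfs_emit(), not its real implementation (the real helper additionally WARNs when buf is not page-aligned):

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define PAGE_SIZE 4096

/* Userspace stand-in for sysfs_emit(): formats into buf, never writing
 * more than PAGE_SIZE bytes, and returns the number of characters
 * emitted (0 if the output would not fit in one page). */
static int my_sysfs_emit(char *buf, const char *fmt, ...)
{
    va_list args;
    int len;

    va_start(args, fmt);
    len = vsnprintf(buf, PAGE_SIZE, fmt, args);
    va_end(args);

    /* sysfs_emit() treats truncation as misuse and returns 0. */
    return (len < 0 || len >= PAGE_SIZE) ? 0 : len;
}
```

This is why the open-coded `(len >= PAGE_SIZE) ? (PAGE_SIZE - 1) : len` dance in modalias_show() can be dropped: the helper already encapsulates the bound.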
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2024-10-17 9:09 Paulo Miguel Almeida
@ 2024-10-17 9:12 ` Paulo Miguel Almeida
0 siblings, 0 replies; 1651+ messages in thread
From: Paulo Miguel Almeida @ 2024-10-17 9:12 UTC (permalink / raw)
To: tsbogend, bvanassche, gregkh, ricardo, zhanggenjian, linux-mips,
linux-kernel
On Thu, Oct 17, 2024 at 10:09:26PM +1300, Paulo Miguel Almeida wrote:
> linux-hardening@vger.kernel.org
> Bcc:
> Subject: [PATCH v2][next] mips: sgi-ip22: Replace "s[n]?printf" with
> sysfs_emit in sysfs callbacks
> Reply-To:
>
> Replace open-coded pieces with sysfs_emit() helper in sysfs .show()
> callbacks.
>
> Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
> ---
> Changelog:
> - v2: amend commit message (Req: Maciej W. Rozycki)
> - v1: https://lore.kernel.org/lkml/Zw2GRQkbx8Z8DlcS@mail.google.com/
> ---
>
Apologies to you all. Fat finger from my part (and a little of mutt's fault too)
Will submit the patch shortly
- Paulo A.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2024-11-23 1:39 the Hide
2024-11-23 7:32 ` Christoph Biedl
0 siblings, 1 reply; 1651+ messages in thread
From: the Hide @ 2024-11-23 1:39 UTC (permalink / raw)
To: stable
Who should I contact regarding the following error?
E: Malformed entry 5 in list file
/etc/apt/sources.list.d/additional-repositories.list (Component)
E: The list of sources could not be read.
E: _cache->open() failed, please report.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2024-11-25 19:23 Robert Harewood
0 siblings, 0 replies; 1651+ messages in thread
From: Robert Harewood @ 2024-11-25 19:23 UTC (permalink / raw)
To: v9fs
Dear CEO,
I hope this message finds you safe and well.
I have a team of investors who are interested in providing capital to your
business operations and projects without any upfront costs from you. I'd
love to have the chance to discuss the details with you further.
Please let me know if this is something that you would be interested in, and
we can schedule a call to further evaluate the details.
Thank you for your time and consideration.
Warm regards,
Robert Harewood,
Robert Harewood Advisory
The Broadgate Tower
20 Primrose St,
London, United Kingdom, EC2A 2EW.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2024-11-25 20:13 Robert Harewood
0 siblings, 0 replies; 1651+ messages in thread
From: Robert Harewood @ 2024-11-25 20:13 UTC (permalink / raw)
To: llvm
Dear CEO,
I hope this message finds you safe and well.
I have a team of investors who are interested in providing capital to your
business operations and projects without any upfront costs from you. I'd
love to have the chance to discuss the details with you further.
Please let me know if this is something that you would be interested in, and
we can schedule a call to further evaluate the details.
Thank you for your time and consideration.
Warm regards,
Robert Harewood,
Robert Harewood Advisory
The Broadgate Tower
20 Primrose St,
London, United Kingdom, EC2A 2EW.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-01-08 13:59 Jiang Liu
2025-01-08 14:10 ` Christian König
2025-01-08 16:33 ` Re: Mario Limonciello
0 siblings, 2 replies; 1651+ messages in thread
From: Jiang Liu @ 2025-01-08 13:59 UTC (permalink / raw)
To: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
sunil.khatri, lijo.lazar, Hawking.Zhang, mario.limonciello,
Jun.Ma2, xiaogang.chen, Kent.Russell, shuox.liu, amd-gfx
Cc: Jiang Liu
Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
Recently, while testing suspend/resume functionality with AMD GPUs,
we encountered several resource-tracking bugs, such as
double buffer free, use after free and unbalanced irq reference counts.
We have tried to solve these issues case by case, but found that this may
not be the right way. Especially with the unbalanced irq reference
count, new issues kept appearing once we fixed the currently known
ones. After analyzing the related source code, we found that there may be
some fundamental implementation flaws behind these resource tracking
issues.
The amdgpu driver has two major state machines to drive the device
management flow: one is for ip blocks, the other for ras blocks.
The hook points defined in struct amd_ip_funcs for device setup/teardown
are symmetric, but the implementation is asymmetric, sometimes even
ambiguous. The two most obvious issues we noticed are:
1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
are called from .hw_fini() instead of .early_fini().
2) the way the ip_block.status.valid/sw/hw/late_initialized flags are
reset doesn't match the way those flags are set.
When taking device suspend/resume into account, in addition to device
probe/remove, things get much more complex. Some issues arise because
many suspend/resume implementations directly reuse .hw_init/.hw_fini/
.late_init hook points.
So we try to fix those issues by two enhancements/refinements to current
device management state machines.
The first change is to make the ip block state machine and associated
status flags work in a stack-like way, as below:
Callback        Status Flags
early_init:     valid = true
sw_init:        sw = true
hw_init:        hw = true
late_init:      late_initialized = true
early_fini:     late_initialized = false
hw_fini:        hw = false
sw_fini:        sw = false
late_fini:      valid = false
Also do the same thing for the ras block state machine, though it's much
simpler.
The second change is to fine-tune the overall device management work
flow as below:
1. amdgpu_driver_load_kms()
     amdgpu_device_init()
       amdgpu_device_ip_early_init()
         ip_blocks[i].early_init()
         ip_blocks[i].status.valid = true
       amdgpu_device_ip_init()
         amdgpu_ras_init()
         ip_blocks[i].sw_init()
         ip_blocks[i].status.sw = true
         ip_blocks[i].hw_init()
         ip_blocks[i].status.hw = true
       amdgpu_device_ip_late_init()
         ip_blocks[i].late_init()
         ip_blocks[i].status.late_initialized = true
         amdgpu_ras_late_init()
           ras_blocks[i].ras_late_init()
           amdgpu_ras_feature_enable_on_boot()

2. amdgpu_pmops_suspend()/amdgpu_pmops_freeze()/amdgpu_pmops_poweroff()
     amdgpu_device_suspend()
       amdgpu_ras_early_fini()
         ras_blocks[i].ras_early_fini()
         amdgpu_ras_feature_disable()
       amdgpu_ras_suspend()
         amdgpu_ras_disable_all_features()
+++    ip_blocks[i].early_fini()
+++    ip_blocks[i].status.late_initialized = false
       ip_blocks[i].suspend()

3. amdgpu_pmops_resume()/amdgpu_pmops_thaw()/amdgpu_pmops_restore()
     amdgpu_device_resume()
       amdgpu_device_ip_resume()
         ip_blocks[i].resume()
       amdgpu_device_ip_late_init()
         ip_blocks[i].late_init()
         ip_blocks[i].status.late_initialized = true
         amdgpu_ras_late_init()
           ras_blocks[i].ras_late_init()
           amdgpu_ras_feature_enable_on_boot()
       amdgpu_ras_resume()
         amdgpu_ras_enable_all_features()

4. amdgpu_driver_unload_kms()
     amdgpu_device_fini_hw()
       amdgpu_ras_early_fini()
         ras_blocks[i].ras_early_fini()
+++    ip_blocks[i].early_fini()
+++    ip_blocks[i].status.late_initialized = false
       ip_blocks[i].hw_fini()
       ip_blocks[i].status.hw = false

5. amdgpu_driver_release_kms()
     amdgpu_device_fini_sw()
       amdgpu_device_ip_fini()
         ip_blocks[i].sw_fini()
         ip_blocks[i].status.sw = false
---      ip_blocks[i].status.valid = false
+++      amdgpu_ras_fini()
         ip_blocks[i].late_fini()
+++      ip_blocks[i].status.valid = false
---      ip_blocks[i].status.late_initialized = false
---      amdgpu_ras_fini()
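The stack-like flag discipline the first change proposes, with setup setting the flags in order and teardown clearing them in exact reverse order, can be sketched in a few lines. This is an illustrative toy only: the field names mirror the ip block status flags listed above, but the functions are not the real driver code.

```c
#include <assert.h>
#include <stdbool.h>

struct ip_status {
    bool valid, sw, hw, late_initialized;
};

/* Setup path: early_init -> sw_init -> hw_init -> late_init. */
static void ip_block_setup(struct ip_status *s)
{
    s->valid = true;             /* early_init */
    s->sw = true;                /* sw_init */
    s->hw = true;                /* hw_init */
    s->late_initialized = true;  /* late_init */
}

/* Teardown pops the same flags in reverse:
 * early_fini -> hw_fini -> sw_fini -> late_fini. */
static void ip_block_teardown(struct ip_status *s)
{
    s->late_initialized = false; /* early_fini */
    s->hw = false;               /* hw_fini */
    s->sw = false;               /* sw_fini */
    s->valid = false;            /* late_fini */
}
```

The point of the pairing is that every flag set by a setup hook is cleared by its mirror-image fini hook, so a partially torn-down device is always in a state some earlier setup stage could have produced.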
The main changes include:
1) invoke ip_blocks[i].early_fini in amdgpu_pmops_suspend().
Currently there's only one ip block which provides an `early_fini`
callback. We have added a check of `in_s3` to keep the current behavior in
amdgpu_dm_early_fini(), so there should be no functional
changes.
2) set ip_blocks[i].status.late_initialized to false after calling
the `early_fini` callback. We have audited all usages of the
late_initialized flag and found no functional changes.
3) only set ip_blocks[i].status.valid = false after calling the
`late_fini` callback.
4) call amdgpu_ras_fini() before invoking ip_blocks[i].late_fini.
Then we try to refine each subsystem, such as nbio, asic, gfx, gmc,
ras etc., to follow the new design. Currently we have only taken the
nbio and asic as examples to show the proposed changes. Once we have
confirmed that this is the right way to go, we will handle the remaining
subsystems.
This is at an early stage and we are requesting comments; any comments and
suggestions are welcome!
Jiang Liu (13):
amdgpu: wrong array index to get ip block for PSP
drm/admgpu: add helper functions to track status for ras manager
drm/amdgpu: add a flag to track ras debugfs creation status
drm/amdgpu: free all resources on error recovery path of
amdgpu_ras_init()
drm/amdgpu: introduce a flag to track refcount held for features
drm/amdgpu: enhance amdgpu_ras_block_late_fini()
drm/amdgpu: enhance amdgpu_ras_pre_fini() to better support SR
drm/admgpu: rename amdgpu_ras_pre_fini() to amdgpu_ras_early_fini()
drm/amdgpu: make IP block state machine works in stack like way
drm/admgpu: make device state machine work in stack like way
drm/amdgpu/sdma: improve the way to manage irq reference count
drm/amdgpu/nbio: improve the way to manage irq reference count
drm/amdgpu/asic: make ip block operations symmetric by .early_fini()
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 40 +++++
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 ++++-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c | 16 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 144 +++++++++++++-----
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 16 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 26 +++-
drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 2 +-
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 +-
drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 2 +-
drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 1 +
drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 1 +
drivers/gpu/drm/amd/amdgpu/nv.c | 14 +-
drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 -
drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 23 +--
drivers/gpu/drm/amd/amdgpu/soc15.c | 38 ++---
drivers/gpu/drm/amd/amdgpu/soc21.c | 35 +++--
drivers/gpu/drm/amd/amdgpu/soc24.c | 17 ++-
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +
25 files changed, 326 insertions(+), 118 deletions(-)
--
2.43.5
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-01-08 13:59 Jiang Liu
@ 2025-01-08 14:10 ` Christian König
2025-01-08 16:33 ` Re: Mario Limonciello
1 sibling, 0 replies; 1651+ messages in thread
From: Christian König @ 2025-01-08 14:10 UTC (permalink / raw)
To: Jiang Liu, alexander.deucher, Xinhui.Pan, airlied, simona,
sunil.khatri, lijo.lazar, Hawking.Zhang, mario.limonciello,
Jun.Ma2, xiaogang.chen, Kent.Russell, shuox.liu, amd-gfx
On 08.01.25 at 14:59, Jiang Liu wrote:
> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
>
> Recently we were testing suspend/resume functionality with AMD GPUs,
> we have encountered several resource tracking related bugs, such as
> double buffer free, use after free and unbalanced irq reference count.
>
> We have tried to solve these issues case by case, but found that may
> not be the right way. Especially about the unbalanced irq reference
> count, there will be new issues appear once we fixed the current known
> issues. After analyzing related source code, we found that there may be
> some fundamental implementaion flaws behind these resource tracking
> issues.
In general please run your patches through checkpatch.pl. There are
quite a number of style issues with those code changes.
>
> The amdgpu driver has two major state machines to driver the device
> management flow, one is for ip blocks, the other is for ras blocks.
> The hook points defined in struct amd_ip_funcs for device setup/teardown
> are symmetric, but the implementation is asymmetric, sometime even
> ambiguous. The most obvious two issues we noticed are:
> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
> are called from .hw_fini() instead of .early_fini().
Yes and if I remember correctly that is absolutely intentional.
IRQs can't be enabled unless all IP blocks are up and running because
otherwise the IRQ handler sometimes doesn't have the necessary
functionality at hand.
But for HW fini we only disable IRQs right before we actually tear down the
HW state, because we need them for operational feedback, e.g. ring buffer
completion interrupts for tear-down commands.
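In refcount terms, the intentional asymmetry is that the get happens at the end of bring-up and the put at the start of HW teardown, and the count still balances. A toy sketch of just that invariant (irq_get/irq_put are assumed stand-ins, not amdgpu's real amdgpu_irq_get()/amdgpu_irq_put()):

```c
#include <assert.h>

static int irq_refcount;

/* amdgpu_irq_get()-like: called from .late_init(), once every block is
 * functional and the IRQ handler has everything it needs. */
static void irq_get(void)
{
    irq_refcount++;
}

/* amdgpu_irq_put()-like: called from .hw_fini(), so interrupts stay
 * live until just before the HW state is torn down. */
static void irq_put(void)
{
    /* An unbalanced put is exactly the bug class under discussion. */
    assert(irq_refcount > 0);
    irq_refcount--;
}
```

The hooks are not mirror images (.late_init vs .hw_fini), yet each get is still matched by exactly one put across a full init/fini cycle.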
Regards,
Christian.
> 2) the way to reset ip_bloc.status.valid/sw/hw/late_initialized doesn't
> match the way to set those flags.
>
> When taking device suspend/resume into account, in addition to device
> probe/remove, things get much more complex. Some issues arise because
> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
> .late_init hook points.
>
> So we try to fix those issues by two enhancements/refinements to current
> device management state machines.
>
> The first change is to make the ip block state machine and associated
> status flags work in stack-like way as below:
> Callback Status Flags
> early_init: valid = true
> sw_init: sw = true
> hw_init: hw = true
> late_init: late_initialized = true
> early_fini: late_initialized = false
> hw_fini: hw = false
> sw_fini: sw = false
> late_fini: valid = false
>
> Also do the same thing for ras block state machine, though it's much
> more simpler.
>
> The second change is fine tune the overall device management work
> flow as below:
> 1. amdgpu_driver_load_kms()
> amdgpu_device_init()
> amdgpu_device_ip_early_init()
> ip_blocks[i].early_init()
> ip_blocks[i].status.valid = true
> amdgpu_device_ip_init()
> amdgpu_ras_init()
> ip_blocks[i].sw_init()
> ip_blocks[i].status.sw = true
> ip_blocks[i].hw_init()
> ip_blocks[i].status.hw = true
> amdgpu_device_ip_late_init()
> ip_blocks[i].late_init()
> ip_blocks[i].status.late_initialized = true
> amdgpu_ras_late_init()
> ras_blocks[i].ras_late_init()
> amdgpu_ras_feature_enable_on_boot()
>
> 2. amdgpu_pmops_suspend()/amdgpu_pmops_freeze()/amdgpu_pmops_poweroff()
> amdgpu_device_suspend()
> amdgpu_ras_early_fini()
> ras_blocks[i].ras_early_fini()
> amdgpu_ras_feature_disable()
> amdgpu_ras_suspend()
> amdgpu_ras_disable_all_features()
> +++ ip_blocks[i].early_fini()
> +++ ip_blocks[i].status.late_initialized = false
> ip_blocks[i].suspend()
>
> 3. amdgpu_pmops_resume()/amdgpu_pmops_thaw()/amdgpu_pmops_restore()
> amdgpu_device_resume()
> amdgpu_device_ip_resume()
> ip_blocks[i].resume()
> amdgpu_device_ip_late_init()
> ip_blocks[i].late_init()
> ip_blocks[i].status.late_initialized = true
> amdgpu_ras_late_init()
> ras_blocks[i].ras_late_init()
> amdgpu_ras_feature_enable_on_boot()
> amdgpu_ras_resume()
> amdgpu_ras_enable_all_features()
>
> 4. amdgpu_driver_unload_kms()
> amdgpu_device_fini_hw()
> amdgpu_ras_early_fini()
> ras_blocks[i].ras_early_fini()
> +++ ip_blocks[i].early_fini()
> +++ ip_blocks[i].status.late_initialized = false
> ip_blocks[i].hw_fini()
> ip_blocks[i].status.hw = false
>
> 5. amdgpu_driver_release_kms()
> amdgpu_device_fini_sw()
> amdgpu_device_ip_fini()
> ip_blocks[i].sw_fini()
> ip_blocks[i].status.sw = false
> --- ip_blocks[i].status.valid = false
> +++ amdgpu_ras_fini()
> ip_blocks[i].late_fini()
> +++ ip_blocks[i].status.valid = false
> --- ip_blocks[i].status.late_initialized = false
> --- amdgpu_ras_fini()
>
> The main changes include:
> 1) invoke ip_blocks[i].early_fini in amdgpu_pmops_suspend().
> Currently there's only one ip block which provides `early_fini`
> callback. We have added a check of `in_s3` to keep the current behavior
> in function amdgpu_dm_early_fini(). So there should be no functional
> changes.
> 2) set ip_blocks[i].status.late_initialized to false after calling
> callback `early_fini`. We have audited all usages of the
> late_initialized flag and found no functional changes.
> 3) only set ip_blocks[i].status.valid = false after calling the
> `late_fini` callback.
> 4) call amdgpu_ras_fini() before invoking ip_blocks[i].late_fini.
>
> Then we try to refine each subsystem, such as nbio, asic, gfx, gmc,
> ras etc, to follow the new design. Currently we have only taken the
> nbio and asic as examples to show the proposed changes. Once we have
> confirmed that's the right way to go, we will handle the remaining
> subsystems.
>
> This is at an early stage and we are requesting comments; any comments
> and suggestions are welcome!
> Jiang Liu (13):
> amdgpu: wrong array index to get ip block for PSP
> drm/admgpu: add helper functions to track status for ras manager
> drm/amdgpu: add a flag to track ras debugfs creation status
> drm/amdgpu: free all resources on error recovery path of
> amdgpu_ras_init()
> drm/amdgpu: introduce a flag to track refcount held for features
> drm/amdgpu: enhance amdgpu_ras_block_late_fini()
> drm/amdgpu: enhance amdgpu_ras_pre_fini() to better support SR
> drm/admgpu: rename amdgpu_ras_pre_fini() to amdgpu_ras_early_fini()
> drm/amdgpu: make IP block state machine works in stack like way
> drm/admgpu: make device state machine work in stack like way
> drm/amdgpu/sdma: improve the way to manage irq reference count
> drm/amdgpu/nbio: improve the way to manage irq reference count
> drm/amdgpu/asic: make ip block operations symmetric by .early_fini()
>
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 40 +++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 ++++-
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c | 16 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 144 +++++++++++++-----
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 16 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 26 +++-
> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 2 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 1 +
> drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 1 +
> drivers/gpu/drm/amd/amdgpu/nv.c | 14 +-
> drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 -
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 23 +--
> drivers/gpu/drm/amd/amdgpu/soc15.c | 38 ++---
> drivers/gpu/drm/amd/amdgpu/soc21.c | 35 +++--
> drivers/gpu/drm/amd/amdgpu/soc24.c | 17 ++-
> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +
> 25 files changed, 326 insertions(+), 118 deletions(-)
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
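
The unbalanced irq reference count described in the cover letter above can be illustrated with a minimal sketch. Every name below is a simplified stand-in rather than the real amdgpu code; the only point modeled is the asymmetric pairing the cover letter complains about, where the get runs in .late_init() but the matching put runs in .hw_fini().

```c
#include <assert.h>

/* Stand-in for the irq source refcount -- not the real amdgpu_irq_src. */
struct irq_src {
	int refcount;
};

static void irq_get(struct irq_src *s) { s->refcount++; }
static void irq_put(struct irq_src *s) { s->refcount--; }

/* Asymmetric pairing, mirroring the current driver flow:
 * the get lives in late_init(), the put lives in hw_fini(). */
static void late_init_hook(struct irq_src *s) { irq_get(s); }
static void hw_fini_hook(struct irq_src *s)  { irq_put(s); }

/* Suspend skips hw_fini(); resume replays late_init().  One
 * suspend/resume cycle therefore takes an extra reference that
 * nothing ever drops. */
static int probe_resume_unload(void)
{
	struct irq_src s = { 0 };

	late_init_hook(&s);	/* driver probe */
	/* ...suspend: no put happens, it lives in hw_fini()... */
	late_init_hook(&s);	/* resume replays late_init() */
	hw_fini_hook(&s);	/* driver unload */
	return s.refcount;	/* leaks one reference: 1, not 0 */
}
```

Compiled standalone, the probe -> suspend/resume -> unload sequence leaves the refcount at 1 instead of 0, which is the class of leak the patch set tries to make structurally impossible by pairing get/put at the same init/fini level.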
* Re:
2025-01-08 13:59 Jiang Liu
2025-01-08 14:10 ` Christian König
@ 2025-01-08 16:33 ` Mario Limonciello
2025-01-09 5:34 ` Re: Gerry Liu
1 sibling, 1 reply; 1651+ messages in thread
From: Mario Limonciello @ 2025-01-08 16:33 UTC (permalink / raw)
To: Jiang Liu, alexander.deucher, christian.koenig, Xinhui.Pan,
airlied, simona, sunil.khatri, lijo.lazar, Hawking.Zhang, Jun.Ma2,
xiaogang.chen, Kent.Russell, shuox.liu, amd-gfx
On 1/8/2025 07:59, Jiang Liu wrote:
> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
I'm not sure how this happened, but your subject didn't end up in the
subject of the thread on patch 0, so the thread just looks like an
unsubjected thread.
>
> Recently we were testing suspend/resume functionality with AMD GPUs,
> and we have encountered several resource tracking related bugs, such as
> double buffer free, use after free and unbalanced irq reference count.
Can you share more about how you were hitting these issues? Are they
specific to S3 or to s2idle flows? dGPU or APU?
Are they only with SRIOV?
Is there anything to do with the host influencing the failures to
happen, or are you contriving the failures to find the bugs?
I know we've had some reports about resource tracking warnings on the
reset flows, but I haven't heard much about suspend/resume.
>
> We have tried to solve these issues case by case, but found that may
> not be the right way. Especially with the unbalanced irq reference
> count, new issues would appear once we fixed the current known
> issues. After analyzing related source code, we found that there may be
> some fundamental implementaion flaws behind these resource tracking
implementation
> issues.
>
> The amdgpu driver has two major state machines to drive the device
> management flow, one is for ip blocks, the other is for ras blocks.
> The hook points defined in struct amd_ip_funcs for device setup/teardown
> are symmetric, but the implementation is asymmetric, sometimes even
> ambiguous. The two most obvious issues we noticed are:
> 1) amdgpu_irq_get() is called from .late_init() but amdgpu_irq_put()
> is called from .hw_fini() instead of .early_fini().
> 2) the way to reset ip_block.status.valid/sw/hw/late_initialized doesn't
> match the way to set those flags.
>
> When taking device suspend/resume into account, in addition to device
> probe/remove, things get much more complex. Some issues arise because
> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
> .late_init hook points.
>
> So we try to fix those issues by two enhancements/refinements to current
> device management state machines.
>
> The first change is to make the ip block state machine and associated
> status flags work in stack-like way as below:
> Callback Status Flags
> early_init: valid = true
> sw_init: sw = true
> hw_init: hw = true
> late_init: late_initialized = true
> early_fini: late_initialized = false
> hw_fini: hw = false
> sw_fini: sw = false
> late_fini: valid = false
At a high level this makes sense to me, but I'd just call 'late' or
'late_init'.
Another idea, if you make it stack-like, is to use a true enum for the
state machine and store it all in one variable.
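
Mario's single-variable enum idea could look roughly like the following sketch. All names here are illustrative, not actual amdgpu code: each init callback may only advance the state by one level and each fini callback may only pop one level, so out-of-order teardown is rejected instead of silently corrupting independent bool flags.

```c
#include <assert.h>

/* Hypothetical single-enum replacement for the four bool flags
 * (valid/sw/hw/late_initialized); names are illustrative only. */
enum ip_block_state {
	IP_STATE_NONE,          /* nothing initialized yet */
	IP_STATE_EARLY_INITED,  /* early_init() done -> was "valid = true" */
	IP_STATE_SW_INITED,     /* sw_init() done    -> was "sw = true" */
	IP_STATE_HW_INITED,     /* hw_init() done    -> was "hw = true" */
	IP_STATE_LATE_INITED,   /* late_init() done  -> was "late_initialized = true" */
};

struct ip_block {
	enum ip_block_state state;
};

/* An init step is only legal from the immediately preceding state. */
static int ip_block_advance(struct ip_block *b, enum ip_block_state next)
{
	if (next != b->state + 1)
		return -1;	/* out-of-order init */
	b->state = next;
	return 0;
}

/* A fini step pops exactly one level, mirroring the init stack. */
static int ip_block_unwind(struct ip_block *b, enum ip_block_state prev)
{
	if (prev != b->state - 1)
		return -1;	/* out-of-order fini */
	b->state = prev;
	return 0;
}
```

With this shape, the fini path in the cover letter (early_fini -> hw_fini -> sw_fini -> late_fini) is forced to be the exact reverse of the init path, and a skipped or repeated teardown step returns an error instead of leaving flags inconsistent.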
>
> Also do the same thing for the ras block state machine, though it's
> much simpler.
>
> The second change is to fine-tune the overall device management
> workflow as below:
> 1. amdgpu_driver_load_kms()
> amdgpu_device_init()
> amdgpu_device_ip_early_init()
> ip_blocks[i].early_init()
> ip_blocks[i].status.valid = true
> amdgpu_device_ip_init()
> amdgpu_ras_init()
> ip_blocks[i].sw_init()
> ip_blocks[i].status.sw = true
> ip_blocks[i].hw_init()
> ip_blocks[i].status.hw = true
> amdgpu_device_ip_late_init()
> ip_blocks[i].late_init()
> ip_blocks[i].status.late_initialized = true
> amdgpu_ras_late_init()
> ras_blocks[i].ras_late_init()
> amdgpu_ras_feature_enable_on_boot()
>
> 2. amdgpu_pmops_suspend()/amdgpu_pmops_freeze()/amdgpu_pmops_poweroff()
> amdgpu_device_suspend()
> amdgpu_ras_early_fini()
> ras_blocks[i].ras_early_fini()
> amdgpu_ras_feature_disable()
> amdgpu_ras_suspend()
> amdgpu_ras_disable_all_features()
> +++ ip_blocks[i].early_fini()
> +++ ip_blocks[i].status.late_initialized = false
> ip_blocks[i].suspend()
>
> 3. amdgpu_pmops_resume()/amdgpu_pmops_thaw()/amdgpu_pmops_restore()
> amdgpu_device_resume()
> amdgpu_device_ip_resume()
> ip_blocks[i].resume()
> amdgpu_device_ip_late_init()
> ip_blocks[i].late_init()
> ip_blocks[i].status.late_initialized = true
> amdgpu_ras_late_init()
> ras_blocks[i].ras_late_init()
> amdgpu_ras_feature_enable_on_boot()
> amdgpu_ras_resume()
> amdgpu_ras_enable_all_features()
>
> 4. amdgpu_driver_unload_kms()
> amdgpu_device_fini_hw()
> amdgpu_ras_early_fini()
> ras_blocks[i].ras_early_fini()
> +++ ip_blocks[i].early_fini()
> +++ ip_blocks[i].status.late_initialized = false
> ip_blocks[i].hw_fini()
> ip_blocks[i].status.hw = false
>
> 5. amdgpu_driver_release_kms()
> amdgpu_device_fini_sw()
> amdgpu_device_ip_fini()
> ip_blocks[i].sw_fini()
> ip_blocks[i].status.sw = false
> --- ip_blocks[i].status.valid = false
> +++ amdgpu_ras_fini()
> ip_blocks[i].late_fini()
> +++ ip_blocks[i].status.valid = false
> --- ip_blocks[i].status.late_initialized = false
> --- amdgpu_ras_fini()
>
> The main changes include:
> 1) invoke ip_blocks[i].early_fini in amdgpu_pmops_suspend().
> Currently there's only one ip block which provides `early_fini`
> callback. We have added a check of `in_s3` to keep the current behavior
> in function amdgpu_dm_early_fini(). So there should be no functional
> changes.
> 2) set ip_blocks[i].status.late_initialized to false after calling
> callback `early_fini`. We have audited all usages of the
> late_initialized flag and found no functional changes.
> 3) only set ip_blocks[i].status.valid = false after calling the
> `late_fini` callback.
> 4) call amdgpu_ras_fini() before invoking ip_blocks[i].late_fini.
>
> Then we try to refine each subsystem, such as nbio, asic, gfx, gmc,
> ras etc, to follow the new design. Currently we have only taken the
> nbio and asic as examples to show the proposed changes. Once we have
> confirmed that's the right way to go, we will handle the remaining
> subsystems.
>
> This is at an early stage and we are requesting comments; any comments
> and suggestions are welcome!
> Jiang Liu (13):
> amdgpu: wrong array index to get ip block for PSP
> drm/admgpu: add helper functions to track status for ras manager
> drm/amdgpu: add a flag to track ras debugfs creation status
> drm/amdgpu: free all resources on error recovery path of
> amdgpu_ras_init()
> drm/amdgpu: introduce a flag to track refcount held for features
> drm/amdgpu: enhance amdgpu_ras_block_late_fini()
> drm/amdgpu: enhance amdgpu_ras_pre_fini() to better support SR
> drm/admgpu: rename amdgpu_ras_pre_fini() to amdgpu_ras_early_fini()
> drm/amdgpu: make IP block state machine works in stack like way
> drm/admgpu: make device state machine work in stack like way
> drm/amdgpu/sdma: improve the way to manage irq reference count
> drm/amdgpu/nbio: improve the way to manage irq reference count
> drm/amdgpu/asic: make ip block operations symmetric by .early_fini()
>
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 40 +++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 ++++-
> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c | 16 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 144 +++++++++++++-----
> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 16 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 26 +++-
> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 2 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 1 +
> drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 1 +
> drivers/gpu/drm/amd/amdgpu/nv.c | 14 +-
> drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 -
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 23 +--
> drivers/gpu/drm/amd/amdgpu/soc15.c | 38 ++---
> drivers/gpu/drm/amd/amdgpu/soc21.c | 35 +++--
> drivers/gpu/drm/amd/amdgpu/soc24.c | 17 ++-
> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +
> 25 files changed, 326 insertions(+), 118 deletions(-)
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-01-08 16:33 ` Re: Mario Limonciello
@ 2025-01-09 5:34 ` Gerry Liu
2025-01-09 17:10 ` Re: Mario Limonciello
0 siblings, 1 reply; 1651+ messages in thread
From: Gerry Liu @ 2025-01-09 5:34 UTC (permalink / raw)
To: Mario Limonciello
Cc: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
sunil.khatri, Lazar, Lijo, Hawking.Zhang, Chen, Xiaogang,
Kent.Russell, Shuo Liu, amd-gfx
[-- Attachment #1: Type: text/plain, Size: 10505 bytes --]
> On Jan 9, 2025, at 00:33, Mario Limonciello <mario.limonciello@amd.com> wrote:
>
> On 1/8/2025 07:59, Jiang Liu wrote:
>> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
>
> I'm not sure how this happened, but your subject didn't end up in the subject of the thread on patch 0 so the thread just looks like an unsubjected thread.
Maybe it's caused by one extra blank line in the header.
>
>> Recently we were testing suspend/resume functionality with AMD GPUs,
>> we have encountered several resource tracking related bugs, such as
>> double buffer free, use after free and unbalanced irq reference count.
>
> Can you share more about how you were hitting these issues? Are they specific to S3 or to s2idle flows? dGPU or APU?
> Are they only with SRIOV?
>
> Is there anything to do with the host influencing the failures to happen, or are you contriving the failures to find the bugs?
>
> I know we've had some reports about resource tracking warnings on the reset flows, but I haven't heard much about suspend/resume.
We are investigating developing some advanced product features based on amdgpu suspend/resume.
So we started by testing the suspend/resume functionality of AMD MI308X GPUs with the following simple script:
```
echo platform > /sys/power/pm_test
i=0
while true; do
echo mem > /sys/power/state
let i=i+1
echo $i
sleep 1
done
```
It succeeds on the first and second iterations but always fails on subsequent iterations on a bare-metal server with eight MI308X GPUs.
With some investigation we found that the gpu asic should be reset during the test, so we submitted a patch to fix the failure (https://github.com/ROCm/ROCK-Kernel-Driver/pull/181)
While analyzing and root-causing the failure, we encountered several crashes, resource leaks and false alarms.
So I have worked out patch sets to solve the issues we encountered. The other patch set is https://lists.freedesktop.org/archives/amd-gfx/2025-January/118484.html
With sriov in single VF mode, resume always fails. It seems some contexts/vram buffers get lost during suspend and aren't restored on resume, causing the failure.
We haven't tested sriov in multiple-VF mode yet. We need more help from the AMD side to make SR work for SRIOV :)
>
>> We have tried to solve these issues case by case, but found that may
>> not be the right way. Especially with the unbalanced irq reference
>> count, new issues would appear once we fixed the current known
>> issues. After analyzing related source code, we found that there may be
>> some fundamental implementaion flaws behind these resource tracking
>
> implementation
>
>> issues.
>> The amdgpu driver has two major state machines to drive the device
>> management flow, one is for ip blocks, the other is for ras blocks.
>> The hook points defined in struct amd_ip_funcs for device setup/teardown
>> are symmetric, but the implementation is asymmetric, sometimes even
>> ambiguous. The two most obvious issues we noticed are:
>> 1) amdgpu_irq_get() is called from .late_init() but amdgpu_irq_put()
>> is called from .hw_fini() instead of .early_fini().
>> 2) the way to reset ip_block.status.valid/sw/hw/late_initialized doesn't
>> match the way to set those flags.
>> When taking device suspend/resume into account, in addition to device
>> probe/remove, things get much more complex. Some issues arise because
>> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
>> .late_init hook points.
>>
>> So we try to fix those issues by two enhancements/refinements to current
>> device management state machines.
>> The first change is to make the ip block state machine and associated
>> status flags work in stack-like way as below:
>> Callback Status Flags
>> early_init: valid = true
>> sw_init: sw = true
>> hw_init: hw = true
>> late_init: late_initialized = true
>> early_fini: late_initialized = false
>> hw_fini: hw = false
>> sw_fini: sw = false
>> late_fini: valid = false
>
> At a high level this makes sense to me, but I'd just call 'late' or 'late_init'.
>
> Another idea if you make it stack like is to do it as a true enum for the state machine and store it all in one variable.
I will add a patch to convert those bool flags into an enum.
Thanks,
Gerry
>
>> Also do the same thing for the ras block state machine, though it's
>> much simpler.
>> The second change is to fine-tune the overall device management
>> workflow as below:
>> 1. amdgpu_driver_load_kms()
>> amdgpu_device_init()
>> amdgpu_device_ip_early_init()
>> ip_blocks[i].early_init()
>> ip_blocks[i].status.valid = true
>> amdgpu_device_ip_init()
>> amdgpu_ras_init()
>> ip_blocks[i].sw_init()
>> ip_blocks[i].status.sw = true
>> ip_blocks[i].hw_init()
>> ip_blocks[i].status.hw = true
>> amdgpu_device_ip_late_init()
>> ip_blocks[i].late_init()
>> ip_blocks[i].status.late_initialized = true
>> amdgpu_ras_late_init()
>> ras_blocks[i].ras_late_init()
>> amdgpu_ras_feature_enable_on_boot()
>> 2. amdgpu_pmops_suspend()/amdgpu_pmops_freeze()/amdgpu_pmops_poweroff()
>> amdgpu_device_suspend()
>> amdgpu_ras_early_fini()
>> ras_blocks[i].ras_early_fini()
>> amdgpu_ras_feature_disable()
>> amdgpu_ras_suspend()
>> amdgpu_ras_disable_all_features()
>> +++ ip_blocks[i].early_fini()
>> +++ ip_blocks[i].status.late_initialized = false
>> ip_blocks[i].suspend()
>> 3. amdgpu_pmops_resume()/amdgpu_pmops_thaw()/amdgpu_pmops_restore()
>> amdgpu_device_resume()
>> amdgpu_device_ip_resume()
>> ip_blocks[i].resume()
>> amdgpu_device_ip_late_init()
>> ip_blocks[i].late_init()
>> ip_blocks[i].status.late_initialized = true
>> amdgpu_ras_late_init()
>> ras_blocks[i].ras_late_init()
>> amdgpu_ras_feature_enable_on_boot()
>> amdgpu_ras_resume()
>> amdgpu_ras_enable_all_features()
>> 4. amdgpu_driver_unload_kms()
>> amdgpu_device_fini_hw()
>> amdgpu_ras_early_fini()
>> ras_blocks[i].ras_early_fini()
>> +++ ip_blocks[i].early_fini()
>> +++ ip_blocks[i].status.late_initialized = false
>> ip_blocks[i].hw_fini()
>> ip_blocks[i].status.hw = false
>> 5. amdgpu_driver_release_kms()
>> amdgpu_device_fini_sw()
>> amdgpu_device_ip_fini()
>> ip_blocks[i].sw_fini()
>> ip_blocks[i].status.sw = false
>> --- ip_blocks[i].status.valid = false
>> +++ amdgpu_ras_fini()
>> ip_blocks[i].late_fini()
>> +++ ip_blocks[i].status.valid = false
>> --- ip_blocks[i].status.late_initialized = false
>> --- amdgpu_ras_fini()
>> The main changes include:
>> 1) invoke ip_blocks[i].early_fini in amdgpu_pmops_suspend().
>> Currently there's only one ip block which provides `early_fini`
>> callback. We have added a check of `in_s3` to keep the current behavior
>> in function amdgpu_dm_early_fini(). So there should be no functional
>> changes.
>> 2) set ip_blocks[i].status.late_initialized to false after calling
>> callback `early_fini`. We have audited all usages of the
>> late_initialized flag and found no functional changes.
>> 3) only set ip_blocks[i].status.valid = false after calling the
>> `late_fini` callback.
>> 4) call amdgpu_ras_fini() before invoking ip_blocks[i].late_fini.
>> Then we try to refine each subsystem, such as nbio, asic, gfx, gmc,
>> ras etc, to follow the new design. Currently we have only taken the
>> nbio and asic as examples to show the proposed changes. Once we have
>> confirmed that's the right way to go, we will handle the remaining
>> subsystems.
>> This is at an early stage and we are requesting comments; any comments
>> and suggestions are welcome!
>> Jiang Liu (13):
>> amdgpu: wrong array index to get ip block for PSP
>> drm/admgpu: add helper functions to track status for ras manager
>> drm/amdgpu: add a flag to track ras debugfs creation status
>> drm/amdgpu: free all resources on error recovery path of
>> amdgpu_ras_init()
>> drm/amdgpu: introduce a flag to track refcount held for features
>> drm/amdgpu: enhance amdgpu_ras_block_late_fini()
>> drm/amdgpu: enhance amdgpu_ras_pre_fini() to better support SR
>> drm/admgpu: rename amdgpu_ras_pre_fini() to amdgpu_ras_early_fini()
>> drm/amdgpu: make IP block state machine works in stack like way
>> drm/admgpu: make device state machine work in stack like way
>> drm/amdgpu/sdma: improve the way to manage irq reference count
>> drm/amdgpu/nbio: improve the way to manage irq reference count
>> drm/amdgpu/asic: make ip block operations symmetric by .early_fini()
>> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 40 +++++
>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 37 ++++-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 2 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_jpeg.c | 2 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.c | 16 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 +
>> drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 8 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 144 +++++++++++++-----
>> drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 16 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 26 +++-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 2 +
>> drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 2 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 2 +-
>> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 2 +-
>> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 +-
>> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 2 +-
>> drivers/gpu/drm/amd/amdgpu/nbio_v7_4.c | 1 +
>> drivers/gpu/drm/amd/amdgpu/nbio_v7_9.c | 1 +
>> drivers/gpu/drm/amd/amdgpu/nv.c | 14 +-
>> drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 8 -
>> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 23 +--
>> drivers/gpu/drm/amd/amdgpu/soc15.c | 38 ++---
>> drivers/gpu/drm/amd/amdgpu/soc21.c | 35 +++--
>> drivers/gpu/drm/amd/amdgpu/soc24.c | 17 ++-
>> .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +
>> 25 files changed, 326 insertions(+), 118 deletions(-)
[-- Attachment #2: Type: text/html, Size: 38668 bytes --]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-01-09 5:34 ` Re: Gerry Liu
@ 2025-01-09 17:10 ` Mario Limonciello
2025-01-13 1:19 ` Re: Gerry Liu
0 siblings, 1 reply; 1651+ messages in thread
From: Mario Limonciello @ 2025-01-09 17:10 UTC (permalink / raw)
To: Gerry Liu
Cc: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
sunil.khatri, Lazar, Lijo, Hawking.Zhang, Chen, Xiaogang,
Kent.Russell, Shuo Liu, amd-gfx
General note - don't use HTML for mailing list communication.
I'm not sure if Apple Mail lets you switch this around.
If not, you might try using Thunderbird instead. You can pick to reply
in plain text or HTML by holding shift when you hit "reply all"
For my reply I'll convert my reply to plain text, please see inline below.
On 1/8/2025 23:34, Gerry Liu wrote:
>
>
>> On Jan 9, 2025, at 00:33, Mario Limonciello <mario.limonciello@amd.com> wrote:
>>
>> On 1/8/2025 07:59, Jiang Liu wrote:
>>> Subject: [RFC PATCH 00/13] Enhance device state machine to better
>>> support suspend/resume
>>
>> I'm not sure how this happened, but your subject didn't end up in the
>> subject of the thread on patch 0 so the thread just looks like an
>> unsubjected thread.
> Maybe it’s caused by one extra blank line at the header.
Yeah that might be it. Hopefully it doesn't happen on v2.
>
>>
>>> Recently we were testing suspend/resume functionality with AMD GPUs,
>>> we have encountered several resource tracking related bugs, such as
>>> double buffer free, use after free and unbalanced irq reference count.
>>
>> Can you share more about how you were hitting these issues? Are they
>> specific to S3 or to s2idle flows? dGPU or APU?
>> Are they only with SRIOV?
>>
>> Is there anything to do with the host influencing the failures to
>> happen, or are you contriving the failures to find the bugs?
>>
>> I know we've had some reports about resource tracking warnings on the
>> reset flows, but I haven't heard much about suspend/resume.
> We are investigating developing some advanced product features based
> on amdgpu suspend/resume.
> So we started by testing the suspend/resume functionality of AMD MI308X
> GPUs with the following simple script:
> ```
> echo platform > /sys/power/pm_test
> i=0
> while true; do
>     echo mem > /sys/power/state
>     let i=i+1
>     echo $i
>     sleep 1
> done
> ```
>
> It succeeds on the first and second iterations but always fails on
> subsequent iterations on a bare-metal server with eight MI308X GPUs.
Can you share more about this server? Does it support suspend to ram or
a hardware backed suspend to idle? If you don't know, you can check
like this:
❯ cat /sys/power/mem_sleep
s2idle [deep]
If it's suspend to idle, what does the FACP indicate? You can do this
check to find out if you don't know.
❯ sudo cp /sys/firmware/acpi/tables/FACP /tmp
❯ sudo iasl -d /tmp/FACP
❯ grep "idle" -i /tmp/FACP.dsl
Low Power S0 Idle (V5) : 0
> With some investigation we found that the gpu asic should be reset
> during the test,
Yeah; but this comes back to my above questions. Typically there is an
assumption that the power rails are going to be cut in system suspend.
If that doesn't hold true, then you're doing a pure software suspend and
have found a series of issues in the driver with how that's handled.
> so we submitted a patch to fix the failure
> (https://github.com/ROCm/ROCK-Kernel-Driver/pull/181)
Typically kernel patches don't go through that repo, they're discussed
on the mailing lists. Can you bring this patch for discussion on amd-gfx?
>
> While analyzing and root-causing the failure, we encountered several
> crashes, resource leaks and false alarms.
Yeah; I think you found some real issues.
> So I have worked out patch sets to solve the issues we encountered. The
> other patch set is
> https://lists.freedesktop.org/archives/amd-gfx/2025-January/118484.html
Thanks!
>
> With sriov in single VF mode, resume always fails. It seems some
> contexts/vram buffers get lost during suspend and aren't restored on
> resume, causing the failure.
> We haven’t tested sriov in multiple VFs mode yet. We need more help from
> AMD side to make SR work for SRIOV:)
>
>>
>>> We have tried to solve these issues case by case, but found that may
>>> not be the right way. Especially with the unbalanced irq reference
>>> count, new issues would appear once we fixed the current known
>>> issues. After analyzing related source code, we found that there may be
>>> some fundamental implementaion flaws behind these resource tracking
>>
>> implementation
>>
>>> issues.
>>> The amdgpu driver has two major state machines to drive the device
>>> management flow, one is for ip blocks, the other is for ras blocks.
>>> The hook points defined in struct amd_ip_funcs for device setup/teardown
>>> are symmetric, but the implementation is asymmetric, sometimes even
>>> ambiguous. The two most obvious issues we noticed are:
>>> 1) amdgpu_irq_get() is called from .late_init() but amdgpu_irq_put()
>>> is called from .hw_fini() instead of .early_fini().
>>> 2) the way to reset ip_block.status.valid/sw/hw/late_initialized doesn't
>>> match the way to set those flags.
>>> When taking device suspend/resume into account, in addition to device
>>> probe/remove, things get much more complex. Some issues arise because
>>> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
>>> .late_init hook points.
>>>
>>> So we try to fix those issues by two enhancements/refinements to current
>>> device management state machines.
>>> The first change is to make the ip block state machine and associated
>>> status flags work in stack-like way as below:
>>> Callback Status Flags
>>> early_init: valid = true
>>> sw_init: sw = true
>>> hw_init: hw = true
>>> late_init: late_initialized = true
>>> early_fini: late_initialized = false
>>> hw_fini: hw = false
>>> sw_fini: sw = false
>>> late_fini: valid = false
>>
>> At a high level this makes sense to me, but I'd just call 'late' or
>> 'late_init'.
>>
>> Another idea if you make it stack like is to do it as a true enum for
>> the state machine and store it all in one variable.
> I will add a patch to convert those bool flags into an enum.
Thanks!
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-01-09 17:10 ` Re: Mario Limonciello
@ 2025-01-13 1:19 ` Gerry Liu
2025-01-13 21:59 ` Re: Mario Limonciello
0 siblings, 1 reply; 1651+ messages in thread
From: Gerry Liu @ 2025-01-13 1:19 UTC (permalink / raw)
To: Mario Limonciello
Cc: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
sunil.khatri, Lazar, Lijo, Hawking.Zhang, Chen, Xiaogang,
Kent.Russell, Shuo Liu, amd-gfx
> On Jan 10, 2025, at 01:10, Mario Limonciello <mario.limonciello@amd.com> wrote:
>
> General note - don't use HTML for mailing list communication.
>
> I'm not sure if Apple Mail lets you switch this around.
>
> If not, you might try using Thunderbird instead. You can pick to reply in plain text or HTML by holding shift when you hit "reply all"
>
> For my reply I'll convert my reply to plain text, please see inline below.
>
> On 1/8/2025 23:34, Gerry Liu wrote:
>>> On Jan 9, 2025, at 00:33, Mario Limonciello <mario.limonciello@amd.com> wrote:
>>>
>>> On 1/8/2025 07:59, Jiang Liu wrote:
>>>> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
>>>
>>> I'm not sure how this happened, but your subject didn't end up in the subject of the thread on patch 0 so the thread just looks like an unsubjected thread.
>> Maybe it’s caused by one extra blank line at the header.
>
> Yeah that might be it. Hopefully it doesn't happen on v2.
>
>>>
>>>> Recently we were testing suspend/resume functionality with AMD GPUs,
>>>> we have encountered several resource tracking related bugs, such as
>>>> double buffer free, use after free and unbalanced irq reference count.
>>>
>>> Can you share more about how you were hitting these issues? Are they specific to S3 or to s2idle flows? dGPU or APU?
>>> Are they only with SRIOV?
>>>
>>> Is there anything to do with the host influencing the failures to happen, or are you contriving the failures to find the bugs?
>>>
>>> I know we've had some reports about resource tracking warnings on the reset flows, but I haven't heard much about suspend/resume.
>> We are investigating developing some advanced product features based on amdgpu suspend/resume.
>> So we started by testing the suspend/resume functionality of AMD MI308X GPUs with the following simple script:
>> ```
>> echo platform > /sys/power/pm_test
>> i=0
>> while true; do
>>     echo mem > /sys/power/state
>>     let i=i+1
>>     echo $i
>>     sleep 1
>> done
>> ```
>> It succeeds on the first and second iterations but always fails on subsequent iterations on a bare-metal server with eight MI308X GPUs.
>
> Can you share more about this server? Does it support suspend to ram or a hardware backed suspend to idle? If you don't know, you can check like this:
>
> ❯ cat /sys/power/mem_sleep
> s2idle [deep]
# cat /sys/power/mem_sleep
[s2idle]
>
> If it's suspend to idle, what does the FACP indicate? You can do this check to find out if you don't know.
>
> ❯ sudo cp /sys/firmware/acpi/tables/FACP /tmp
> ❯ sudo iasl -d /tmp/FACP
> ❯ grep "idle" -i /tmp/FACP.dsl
> Low Power S0 Idle (V5) : 0
>
With acpidump and `iasl -d facp.data`, we got:
[070h 0112 4] Flags (decoded below) : 000084A5
WBINVD instruction is operational (V1) : 1
WBINVD flushes all caches (V1) : 0
All CPUs support C1 (V1) : 1
C2 works on MP system (V1) : 0
Control Method Power Button (V1) : 0
Control Method Sleep Button (V1) : 1
RTC wake not in fixed reg space (V1) : 0
RTC can wake system from S4 (V1) : 1
32-bit PM Timer (V1) : 0
Docking Supported (V1) : 0
Reset Register Supported (V2) : 1
Sealed Case (V3) : 0
Headless - No Video (V3) : 0
Use native instr after SLP_TYPx (V3) : 0
PCIEXP_WAK Bits Supported (V4) : 0
Use Platform Timer (V4) : 1
RTC_STS valid on S4 wake (V4) : 0
Remote Power-on capable (V4) : 0
Use APIC Cluster Model (V4) : 0
Use APIC Physical Destination Mode (V4) : 0
Hardware Reduced (V5) : 0
Low Power S0 Idle (V5) : 0
>> With some investigation we found that the gpu asic should be reset during the test,
>
> Yeah; but this comes back to my above questions. Typically there is an assumption that the power rails are going to be cut in system suspend.
>
> If that doesn't hold true, then you're doing a pure software suspend and have found a series of issues in the driver with how that's handled.
Yeah, we are trying to do a `pure software suspend`, letting the hypervisor save/restore system images instead of the guest OS.
And during the suspend process, we hope we can cancel the suspend request at any later stage.
When we cancel suspend at a late stage, it does behave like a pure software suspend.
>
>> so we submitted a patch to fix the failure (https://github.com/ROCm/ROCK-Kernel-Driver/pull/181)
>
> Typically kernel patches don't go through that repo, they're discussed on the mailing lists. Can you bring this patch for discussion on amd-gfx?
Will post to amd-gfx after solving the conflicts.
Regards,
Gerry
>
>> While analyzing and root-causing the failure, we encountered several crashes, resource leaks and false alarms.
>
> Yeah; I think you found some real issues.
>
>> So I have worked out patch sets to solve the issues we encountered. The other patch set is https://lists.freedesktop.org/archives/amd-gfx/2025-January/118484.html
>
> Thanks!
>
>> With sriov in single VF mode, resume always fails. It seems some contexts/vram buffers get lost during suspend and haven't been restored on resume, which causes the failure.
>> We haven't tested sriov in multiple VFs mode yet. We need more help from the AMD side to make SR work for SRIOV :)
>>>
>>>> We have tried to solve these issues case by case, but found that may
>>>> not be the right way. Especially about the unbalanced irq reference
>>>> count, there will be new issues appear once we fixed the current known
>>>> issues. After analyzing related source code, we found that there may be
>>>> some fundamental implementaion flaws behind these resource tracking
>>>
>>> implementation
>>>
>>>> issues.
>>>> The amdgpu driver has two major state machines to drive the device
>>>> management flow, one is for ip blocks, the other is for ras blocks.
>>>> The hook points defined in struct amd_ip_funcs for device setup/teardown
>>>> are symmetric, but the implementation is asymmetric, sometimes even
>>>> ambiguous. The most obvious two issues we noticed are:
>>>> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
>>>> are called from .hw_fini() instead of .early_fini().
>>>> 2) the way to reset ip_block.status.valid/sw/hw/late_initialized doesn't
>>>> match the way to set those flags.
>>>> When taking device suspend/resume into account, in addition to device
>>>> probe/remove, things get much more complex. Some issues arise because
>>>> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
>>>> .late_init hook points.
>>>>
>>>> So we try to fix those issues by two enhancements/refinements to current
>>>> device management state machines.
>>>> The first change is to make the ip block state machine and associated
>>>> status flags work in stack-like way as below:
>>>> Callback Status Flags
>>>> early_init: valid = true
>>>> sw_init: sw = true
>>>> hw_init: hw = true
>>>> late_init: late_initialized = true
>>>> early_fini: late_initialized = false
>>>> hw_fini: hw = false
>>>> sw_fini: sw = false
>>>> late_fini: valid = false
>>>
>>> At a high level this makes sense to me, but I'd just call 'late' or 'late_init'.
>>>
>>> Another idea if you make it stack like is to do it as a true enum for the state machine and store it all in one variable.
>> I will add a patch to convert those bool flags into an enum.
>
> Thanks!
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-01-13 1:19 ` Re: Gerry Liu
@ 2025-01-13 21:59 ` Mario Limonciello
0 siblings, 0 replies; 1651+ messages in thread
From: Mario Limonciello @ 2025-01-13 21:59 UTC (permalink / raw)
To: Gerry Liu
Cc: alexander.deucher, christian.koenig, Xinhui.Pan, airlied, simona,
sunil.khatri, Lazar, Lijo, Hawking.Zhang, Chen, Xiaogang,
Kent.Russell, Shuo Liu, amd-gfx
On 1/12/2025 19:19, Gerry Liu wrote:
>
>
>> 2025年1月10日 01:10,Mario Limonciello <mario.limonciello@amd.com> 写道:
>>
>> General note - don't use HTML for mailing list communication.
>>
>> I'm not sure if Apple Mail lets you switch this around.
>>
>> If not, you might try using Thunderbird instead. You can pick to reply in plain text or HTML by holding shift when you hit "reply all"
>>
>> For my reply I'll convert my reply to plain text, please see inline below.
>>
>> On 1/8/2025 23:34, Gerry Liu wrote:
>>>> 2025年1月9日 00:33,Mario Limonciello <mario.limonciello@amd.com <mailto:mario.limonciello@amd.com>> 写道:
>>>>
>>>> On 1/8/2025 07:59, Jiang Liu wrote:
>>>>> Subject: [RFC PATCH 00/13] Enhance device state machine to better support suspend/resume
>>>>
>>>> I'm not sure how this happened, but your subject didn't end up in the subject of the thread on patch 0 so the thread just looks like an unsubjected thread.
>>> Maybe it’s caused by one extra blank line at the header.
>>
>> Yeah that might be it. Hopefully it doesn't happen on v2.
>>
>>>>
>>>>> Recently we were testing suspend/resume functionality with AMD GPUs,
>>>>> we have encountered several resource tracking related bugs, such as
>>>>> double buffer free, use after free and unbalanced irq reference count.
>>>>
>>>> Can you share more about how you were hitting these issues? Are they specific to S3 or to s2idle flows? dGPU or APU?
>>>> Are they only with SRIOV?
>>>>
>>>> Is there anything to do with the host influencing the failures to happen, or are you contriving the failures to find the bugs?
>>>>
>>>> I know we've had some reports about resource tracking warnings on the reset flows, but I haven't heard much about suspend/resume.
>>> We are investigating how to develop some advanced product features based on amdgpu suspend/resume.
>>> So we started by testing the suspend/resume functionality of AMD 308x GPUs with the following simple script:
>>> ```
>>> echo platform >/sys/power/pm_test
>>> i=0
>>> while true; do
>>>     echo mem >/sys/power/state
>>>     let i=i+1
>>>     echo $i
>>>     sleep 1
>>> done
>>> ```
>>> It succeeds on the first and second iterations but always fails on subsequent iterations on a bare-metal server with eight MI308X GPUs.
>>
>> Can you share more about this server? Does it support suspend to ram or a hardware backed suspend to idle? If you don't know, you can check like this:
>>
>> ❯ cat /sys/power/mem_sleep
>> s2idle [deep]
> # cat /sys/power/mem_sleep
> [s2idle]
>
>>
>> If it's suspend to idle, what does the FACP indicate? You can do this check to find out if you don't know.
>>
>> ❯ sudo cp /sys/firmware/acpi/tables/FACP /tmp
>> ❯ sudo iasl -d /tmp/FACP
>> ❯ grep "idle" -i /tmp/FACP.dsl
>> Low Power S0 Idle (V5) : 0
>>
> With acpidump and `iasl -d facp.data`, we got:
> [070h 0112 4] Flags (decoded below) : 000084A5
> WBINVD instruction is operational (V1) : 1
> WBINVD flushes all caches (V1) : 0
> All CPUs support C1 (V1) : 1
> C2 works on MP system (V1) : 0
> Control Method Power Button (V1) : 0
> Control Method Sleep Button (V1) : 1
> RTC wake not in fixed reg space (V1) : 0
> RTC can wake system from S4 (V1) : 1
> 32-bit PM Timer (V1) : 0
> Docking Supported (V1) : 0
> Reset Register Supported (V2) : 1
> Sealed Case (V3) : 0
> Headless - No Video (V3) : 0
> Use native instr after SLP_TYPx (V3) : 0
> PCIEXP_WAK Bits Supported (V4) : 0
> Use Platform Timer (V4) : 1
> RTC_STS valid on S4 wake (V4) : 0
> Remote Power-on capable (V4) : 0
> Use APIC Cluster Model (V4) : 0
> Use APIC Physical Destination Mode (V4) : 0
> Hardware Reduced (V5) : 0
> Low Power S0 Idle (V5) : 0
>
>>> With some investigation we found that the gpu asic should be reset during the test,
>>
>> Yeah; but this comes back to my above questions. Typically there is an assumption that the power rails are going to be cut in system suspend.
>>
>> If that doesn't hold true, then you're doing a pure software suspend and have found a series of issues in the driver with how that's handled.
> Yeah, we are trying to do a `pure software suspend`, letting the hypervisor save/restore system images instead of the guest OS.
> And during the suspend process, we hope we can cancel the suspend request at any later stage.
> When we cancel suspend at a late stage, it does behave like a pure software suspend.
>
Thanks; this all makes a lot more sense now. This isn't an area that
has a lot of coverage right now. Most suspend testing happens with the
power being cut and coming back fresh.
Will keep this in mind when reviewing future iterations of your patches.
>>
>>> so we submitted a patch to fix the failure (https://github.com/ROCm/ROCK-Kernel-Driver/pull/181)
>>
>> Typically kernel patches don't go through that repo, they're discussed on the mailing lists. Can you bring this patch for discussion on amd-gfx?
> Will post to amd-gfx after solving the conflicts.
Thx!
>
> Regards,
> Gerry
>
>>
>>> While analyzing and root-causing the failure, we encountered several crashes, resource leaks and false alarms.
>>
>> Yeah; I think you found some real issues.
>>
>>> So I have worked out patch sets to solve the issues we encountered. The other patch set is https://lists.freedesktop.org/archives/amd-gfx/2025-January/118484.html
>>
>> Thanks!
>>
>>> With sriov in single VF mode, resume always fails. It seems some contexts/vram buffers get lost during suspend and haven't been restored on resume, which causes the failure.
>>> We haven't tested sriov in multiple VFs mode yet. We need more help from the AMD side to make SR work for SRIOV :)
>>>>
>>>>> We have tried to solve these issues case by case, but found that may
>>>>> not be the right way. Especially about the unbalanced irq reference
>>>>> count, there will be new issues appear once we fixed the current known
>>>>> issues. After analyzing related source code, we found that there may be
>>>>> some fundamental implementaion flaws behind these resource tracking
>>>>
>>>> implementation
>>>>
>>>>> issues.
>>>>> The amdgpu driver has two major state machines to drive the device
>>>>> management flow, one is for ip blocks, the other is for ras blocks.
>>>>> The hook points defined in struct amd_ip_funcs for device setup/teardown
>>>>> are symmetric, but the implementation is asymmetric, sometimes even
>>>>> ambiguous. The most obvious two issues we noticed are:
>>>>> 1) amdgpu_irq_get() are called from .late_init() but amdgpu_irq_put()
>>>>> are called from .hw_fini() instead of .early_fini().
>>>>> 2) the way to reset ip_block.status.valid/sw/hw/late_initialized doesn't
>>>>> match the way to set those flags.
>>>>> When taking device suspend/resume into account, in addition to device
>>>>> probe/remove, things get much more complex. Some issues arise because
>>>>> many suspend/resume implementations directly reuse .hw_init/.hw_fini/
>>>>> .late_init hook points.
>>>>>
>>>>> So we try to fix those issues by two enhancements/refinements to current
>>>>> device management state machines.
>>>>> The first change is to make the ip block state machine and associated
>>>>> status flags work in stack-like way as below:
>>>>> Callback Status Flags
>>>>> early_init: valid = true
>>>>> sw_init: sw = true
>>>>> hw_init: hw = true
>>>>> late_init: late_initialized = true
>>>>> early_fini: late_initialized = false
>>>>> hw_fini: hw = false
>>>>> sw_fini: sw = false
>>>>> late_fini: valid = false
>>>>
>>>> At a high level this makes sense to me, but I'd just call 'late' or 'late_init'.
>>>>
>>>> Another idea if you make it stack like is to do it as a true enum for the state machine and store it all in one variable.
>>> I will add a patch to convert those bool flags into an enum.
>>
>> Thanks!
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-04-18 7:46 Shung-Hsi Yu
2025-04-18 7:49 ` Shung-Hsi Yu
2025-04-23 17:30 ` Re: patchwork-bot+netdevbpf
0 siblings, 2 replies; 1651+ messages in thread
From: Shung-Hsi Yu @ 2025-04-18 7:46 UTC (permalink / raw)
To: bpf
Cc: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Kumar Kartikeya Dwivedi, Dan Carpenter, Shung-Hsi Yu
From bda8bb8011d865cebf066350c8625e8be1625656 Mon Sep 17 00:00:00 2001
From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Date: Fri, 18 Apr 2025 15:22:00 +0800
Subject: [PATCH bpf-next 1/1] bpf: use proper type to calculate
bpf_raw_tp_null_args.mask index
The calculation of the index used to access the mask field in 'struct
bpf_raw_tp_null_args' is done with 'int' type, which could overflow when
the tracepoint being attached has more than 8 arguments.
While none of the tracepoints mentioned in raw_tp_null_args[] currently
have more than 8 arguments, there do exist tracepoints that had more
than 8 arguments (e.g. iocost_iocg_forgive_debt), so use the correct
type for calculation and avoid Smatch static checker warning.
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/843a3b94-d53d-42db-93d4-be10a4090146@stanley.mountain/
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
---
kernel/bpf/btf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 16ba36f34dfa..656ee11aff67 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6829,10 +6829,10 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
/* Is this a func with potential NULL args? */
if (strcmp(tname, raw_tp_null_args[i].func))
continue;
- if (raw_tp_null_args[i].mask & (0x1 << (arg * 4)))
+ if (raw_tp_null_args[i].mask & (0x1ULL << (arg * 4)))
info->reg_type |= PTR_MAYBE_NULL;
/* Is the current arg IS_ERR? */
- if (raw_tp_null_args[i].mask & (0x2 << (arg * 4)))
+ if (raw_tp_null_args[i].mask & (0x2ULL << (arg * 4)))
ptr_err_raw_tp = true;
break;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2025-04-18 7:46 Shung-Hsi Yu
@ 2025-04-18 7:49 ` Shung-Hsi Yu
2025-04-23 17:30 ` Re: patchwork-bot+netdevbpf
1 sibling, 0 replies; 1651+ messages in thread
From: Shung-Hsi Yu @ 2025-04-18 7:49 UTC (permalink / raw)
To: bpf
Cc: Martin KaFai Lau, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Kumar Kartikeya Dwivedi, Dan Carpenter
On Fri, Apr 18, 2025 at 3:46 PM Shung-Hsi Yu <shung-hsi.yu@suse.com> wrote:
> From bda8bb8011d865cebf066350c8625e8be1625656 Mon Sep 17 00:00:00 2001
> From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Date: Fri, 18 Apr 2025 15:22:00 +0800
> Subject: [PATCH bpf-next 1/1] bpf: use proper type to calculate
> bpf_raw_tp_null_args.mask index
...
Email headers are off, hence no subject. Will resend.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-04-18 7:46 Shung-Hsi Yu
2025-04-18 7:49 ` Shung-Hsi Yu
@ 2025-04-23 17:30 ` patchwork-bot+netdevbpf
1 sibling, 0 replies; 1651+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-04-23 17:30 UTC (permalink / raw)
To: Shung-Hsi Yu
Cc: bpf, martin.lau, ast, daniel, andrii, eddyz87, song,
yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
memxor, dan.carpenter
Hello:
This patch was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:
On Fri, 18 Apr 2025 15:46:31 +0800 you wrote:
> >From bda8bb8011d865cebf066350c8625e8be1625656 Mon Sep 17 00:00:00 2001
> From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
> Date: Fri, 18 Apr 2025 15:22:00 +0800
> Subject: [PATCH bpf-next 1/1] bpf: use proper type to calculate
> bpf_raw_tp_null_args.mask index
>
> The calculation of the index used to access the mask field in 'struct
> bpf_raw_tp_null_args' is done with 'int' type, which could overflow when
> the tracepoint being attached has more than 8 arguments.
>
> [...]
Here is the summary with links:
-
https://git.kernel.org/bpf/bpf-next/c/53ebef53a657
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto
@ 2025-04-22 1:53 Alexei Starovoitov
2025-04-22 8:04 ` Feng Yang
0 siblings, 1 reply; 1651+ messages in thread
From: Alexei Starovoitov @ 2025-04-22 1:53 UTC (permalink / raw)
To: Feng Yang
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard, Song Liu, Yonghong Song, bpf, LKML,
linux-trace-kernel, Network Development, Feng Yang
On Thu, Apr 17, 2025 at 8:41 PM Feng Yang <yangfeng59949@163.com> wrote:
>
> From: Feng Yang <yangfeng@kylinos.cn>
>
> All BPF programs either disable CPU preemption or CPU migration,
> so the bpf_get_smp_processor_id_proto can be safely removed,
> and the bpf_get_raw_smp_processor_id_proto in bpf_base_func_proto works perfectly.
>
> Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
> ---
> include/linux/bpf.h | 1 -
> kernel/bpf/core.c | 1 -
> kernel/bpf/helpers.c | 12 ------------
> kernel/trace/bpf_trace.c | 2 --
> net/core/filter.c | 6 ------
> 5 files changed, 22 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 3f0cc89c0622..36e525141556 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -3316,7 +3316,6 @@ extern const struct bpf_func_proto bpf_map_peek_elem_proto;
> extern const struct bpf_func_proto bpf_map_lookup_percpu_elem_proto;
>
> extern const struct bpf_func_proto bpf_get_prandom_u32_proto;
> -extern const struct bpf_func_proto bpf_get_smp_processor_id_proto;
> extern const struct bpf_func_proto bpf_get_numa_node_id_proto;
> extern const struct bpf_func_proto bpf_tail_call_proto;
> extern const struct bpf_func_proto bpf_ktime_get_ns_proto;
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index ba6b6118cf50..1ad41a16b86e 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -2943,7 +2943,6 @@ const struct bpf_func_proto bpf_spin_unlock_proto __weak;
> const struct bpf_func_proto bpf_jiffies64_proto __weak;
>
> const struct bpf_func_proto bpf_get_prandom_u32_proto __weak;
> -const struct bpf_func_proto bpf_get_smp_processor_id_proto __weak;
> const struct bpf_func_proto bpf_get_numa_node_id_proto __weak;
> const struct bpf_func_proto bpf_ktime_get_ns_proto __weak;
> const struct bpf_func_proto bpf_ktime_get_boot_ns_proto __weak;
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index e3a2662f4e33..2d2bfb2911f8 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -149,18 +149,6 @@ const struct bpf_func_proto bpf_get_prandom_u32_proto = {
> .ret_type = RET_INTEGER,
> };
>
> -BPF_CALL_0(bpf_get_smp_processor_id)
> -{
> - return smp_processor_id();
> -}
> -
> -const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> - .func = bpf_get_smp_processor_id,
> - .gpl_only = false,
> - .ret_type = RET_INTEGER,
> - .allow_fastcall = true,
> -};
> -
bpf_get_raw_smp_processor_id_proto doesn't have
allow_fastcall = true
so this breaks tests.
Instead of removing BPF_CALL_0(bpf_get_smp_processor_id)
we should probably remove BPF_CALL_0(bpf_get_raw_cpu_id)
and adjust SKF_AD_OFF + SKF_AD_CPU case.
I don't recall why raw_ version was used back in 2014.
pw-bot: cr
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2025-04-22 1:53 [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto Alexei Starovoitov
@ 2025-04-22 8:04 ` Feng Yang
2025-04-22 14:37 ` Alexei Starovoitov
0 siblings, 1 reply; 1651+ messages in thread
From: Feng Yang @ 2025-04-22 8:04 UTC (permalink / raw)
To: alexei.starovoitov
Cc: andrii, ast, bpf, daniel, eddyz87, linux-kernel,
linux-trace-kernel, martin.lau, netdev, song, yangfeng59949,
yangfeng, yonghong.song
Subject: Re: [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto
On Mon, 21 Apr 2025 18:53:07 -0700 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> On Thu, Apr 17, 2025 at 8:41 PM Feng Yang <yangfeng59949@163.com> wrote:
> >
> > From: Feng Yang <yangfeng@kylinos.cn>
> >
> > All BPF programs either disable CPU preemption or CPU migration,
> > so the bpf_get_smp_processor_id_proto can be safely removed,
> > and the bpf_get_raw_smp_processor_id_proto in bpf_base_func_proto works perfectly.
> >
> > Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
> > ---
> > include/linux/bpf.h | 1 -
> > kernel/bpf/core.c | 1 -
> > kernel/bpf/helpers.c | 12 ------------
> > kernel/trace/bpf_trace.c | 2 --
> > net/core/filter.c | 6 ------
> > 5 files changed, 22 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 3f0cc89c0622..36e525141556 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -3316,7 +3316,6 @@ extern const struct bpf_func_proto bpf_map_peek_elem_proto;
> > extern const struct bpf_func_proto bpf_map_lookup_percpu_elem_proto;
> >
> > extern const struct bpf_func_proto bpf_get_prandom_u32_proto;
> > -extern const struct bpf_func_proto bpf_get_smp_processor_id_proto;
> > extern const struct bpf_func_proto bpf_get_numa_node_id_proto;
> > extern const struct bpf_func_proto bpf_tail_call_proto;
> > extern const struct bpf_func_proto bpf_ktime_get_ns_proto;
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index ba6b6118cf50..1ad41a16b86e 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -2943,7 +2943,6 @@ const struct bpf_func_proto bpf_spin_unlock_proto __weak;
> > const struct bpf_func_proto bpf_jiffies64_proto __weak;
> >
> > const struct bpf_func_proto bpf_get_prandom_u32_proto __weak;
> > -const struct bpf_func_proto bpf_get_smp_processor_id_proto __weak;
> > const struct bpf_func_proto bpf_get_numa_node_id_proto __weak;
> > const struct bpf_func_proto bpf_ktime_get_ns_proto __weak;
> > const struct bpf_func_proto bpf_ktime_get_boot_ns_proto __weak;
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index e3a2662f4e33..2d2bfb2911f8 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -149,18 +149,6 @@ const struct bpf_func_proto bpf_get_prandom_u32_proto = {
> > .ret_type = RET_INTEGER,
> > };
> >
> > -BPF_CALL_0(bpf_get_smp_processor_id)
> > -{
> > - return smp_processor_id();
> > -}
> > -
> > -const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> > - .func = bpf_get_smp_processor_id,
> > - .gpl_only = false,
> > - .ret_type = RET_INTEGER,
> > - .allow_fastcall = true,
> > -};
> > -
>
> bpf_get_raw_smp_processor_id_proto doesn't have
> allow_fastcall = true
>
> so this breaks tests.
>
> Instead of removing BPF_CALL_0(bpf_get_smp_processor_id)
> we should probably remove BPF_CALL_0(bpf_get_raw_cpu_id)
> and adjust SKF_AD_OFF + SKF_AD_CPU case.
> I don't recall why raw_ version was used back in 2014.
>
The following two seem to explain the reason:
https://lore.kernel.org/all/7103e2085afa29c006cd5b94a6e4a2ac83efc30d.1467106475.git.daniel@iogearbox.net/
https://lore.kernel.org/all/02fa71ebe1c560cad489967aa29c653b48932596.1474586162.git.daniel@iogearbox.net/
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-04-22 8:04 ` Feng Yang
@ 2025-04-22 14:37 ` Alexei Starovoitov
0 siblings, 0 replies; 1651+ messages in thread
From: Alexei Starovoitov @ 2025-04-22 14:37 UTC (permalink / raw)
To: Feng Yang
Cc: Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Eduard,
LKML, linux-trace-kernel, Martin KaFai Lau, Network Development,
Song Liu, Feng Yang, Yonghong Song
On Tue, Apr 22, 2025 at 1:04 AM Feng Yang <yangfeng59949@163.com> wrote:
>
> Subject: Re: [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto
>
> On Mon, 21 Apr 2025 18:53:07 -0700 Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > On Thu, Apr 17, 2025 at 8:41 PM Feng Yang <yangfeng59949@163.com> wrote:
> > >
> > > From: Feng Yang <yangfeng@kylinos.cn>
> > >
> > > All BPF programs either disable CPU preemption or CPU migration,
> > > so the bpf_get_smp_processor_id_proto can be safely removed,
> > > and the bpf_get_raw_smp_processor_id_proto in bpf_base_func_proto works perfectly.
> > >
> > > Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
> > > ---
> > > include/linux/bpf.h | 1 -
> > > kernel/bpf/core.c | 1 -
> > > kernel/bpf/helpers.c | 12 ------------
> > > kernel/trace/bpf_trace.c | 2 --
> > > net/core/filter.c | 6 ------
> > > 5 files changed, 22 deletions(-)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 3f0cc89c0622..36e525141556 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -3316,7 +3316,6 @@ extern const struct bpf_func_proto bpf_map_peek_elem_proto;
> > > extern const struct bpf_func_proto bpf_map_lookup_percpu_elem_proto;
> > >
> > > extern const struct bpf_func_proto bpf_get_prandom_u32_proto;
> > > -extern const struct bpf_func_proto bpf_get_smp_processor_id_proto;
> > > extern const struct bpf_func_proto bpf_get_numa_node_id_proto;
> > > extern const struct bpf_func_proto bpf_tail_call_proto;
> > > extern const struct bpf_func_proto bpf_ktime_get_ns_proto;
> > > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > > index ba6b6118cf50..1ad41a16b86e 100644
> > > --- a/kernel/bpf/core.c
> > > +++ b/kernel/bpf/core.c
> > > @@ -2943,7 +2943,6 @@ const struct bpf_func_proto bpf_spin_unlock_proto __weak;
> > > const struct bpf_func_proto bpf_jiffies64_proto __weak;
> > >
> > > const struct bpf_func_proto bpf_get_prandom_u32_proto __weak;
> > > -const struct bpf_func_proto bpf_get_smp_processor_id_proto __weak;
> > > const struct bpf_func_proto bpf_get_numa_node_id_proto __weak;
> > > const struct bpf_func_proto bpf_ktime_get_ns_proto __weak;
> > > const struct bpf_func_proto bpf_ktime_get_boot_ns_proto __weak;
> > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > > index e3a2662f4e33..2d2bfb2911f8 100644
> > > --- a/kernel/bpf/helpers.c
> > > +++ b/kernel/bpf/helpers.c
> > > @@ -149,18 +149,6 @@ const struct bpf_func_proto bpf_get_prandom_u32_proto = {
> > > .ret_type = RET_INTEGER,
> > > };
> > >
> > > -BPF_CALL_0(bpf_get_smp_processor_id)
> > > -{
> > > - return smp_processor_id();
> > > -}
> > > -
> > > -const struct bpf_func_proto bpf_get_smp_processor_id_proto = {
> > > - .func = bpf_get_smp_processor_id,
> > > - .gpl_only = false,
> > > - .ret_type = RET_INTEGER,
> > > - .allow_fastcall = true,
> > > -};
> > > -
> >
> > bpf_get_raw_smp_processor_id_proto doesn't have
> > allow_fastcall = true
> >
> > so this breaks tests.
> >
> > Instead of removing BPF_CALL_0(bpf_get_smp_processor_id)
> > we should probably remove BPF_CALL_0(bpf_get_raw_cpu_id)
> > and adjust SKF_AD_OFF + SKF_AD_CPU case.
> > I don't recall why raw_ version was used back in 2014.
> >
>
> The following two seem to explain the reason:
> https://lore.kernel.org/all/7103e2085afa29c006cd5b94a6e4a2ac83efc30d.1467106475.git.daniel@iogearbox.net/
> https://lore.kernel.org/all/02fa71ebe1c560cad489967aa29c653b48932596.1474586162.git.daniel@iogearbox.net/
>
Ahh. socket filters run in RCU CS. They don't disable preemption or migration.
Then let's keep things as-is.
We still want debugging provided by smp_processor_id().
If we switch everything to raw_ we may miss things, like this example with
socket filters.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-04-24 0:40 Cong Wang
2025-04-24 0:59 ` Jiayuan Chen
0 siblings, 1 reply; 1651+ messages in thread
From: Cong Wang @ 2025-04-24 0:40 UTC (permalink / raw)
To: jiayuan.chen; +Cc: john.fastabend, jakub, netdev, bpf
netdev@vger.kernel.org, bpf@vger.kernel.org
Bcc:
Subject: test_sockmap failures on the latest bpf-next
Reply-To:
Hi all,
The latest bpf-next failed on test_sockmap tests, I got the following
failures (including 1 kernel warning). It is 100% reproducible here.
I don't have time to look into them, a quick glance at the changelog
shows quite some changes from Jiayuan. So please take a look, Jiayuan.
Meanwhile, please let me know if you need more information from me.
Thanks!
--------------->
[root@localhost bpf]# ./test_sockmap
# 1/ 6 sockmap::txmsg test passthrough:OK
# 2/ 6 sockmap::txmsg test redirect:OK
# 3/ 2 sockmap::txmsg test redirect wait send mem:OK
# 4/ 6 sockmap::txmsg test drop:OK
[ 182.498017] perf: interrupt took too long (3406 > 3238), lowering kernel.perf_event_max_sample_rate to 58500
# 5/ 6 sockmap::txmsg test ingress redirect:OK
# 6/ 7 sockmap::txmsg test skb:OK
# 7/12 sockmap::txmsg test apply:OK
# 8/12 sockmap::txmsg test cork:OK
# 9/ 3 sockmap::txmsg test hanging corks:OK
#10/11 sockmap::txmsg test push_data:OK
#11/17 sockmap::txmsg test pull-data:OK
#12/ 9 sockmap::txmsg test pop-data:OK
#13/ 6 sockmap::txmsg test push/pop data:OK
#14/ 1 sockmap::txmsg test ingress parser:OK
#15/ 1 sockmap::txmsg test ingress parser2:OK
#16/ 6 sockhash::txmsg test passthrough:OK
#17/ 6 sockhash::txmsg test redirect:OK
#18/ 2 sockhash::txmsg test redirect wait send mem:OK
#19/ 6 sockhash::txmsg test drop:OK
#20/ 6 sockhash::txmsg test ingress redirect:OK
#21/ 7 sockhash::txmsg test skb:OK
#22/12 sockhash::txmsg test apply:OK
#23/12 sockhash::txmsg test cork:OK
#24/ 3 sockhash::txmsg test hanging corks:OK
#25/11 sockhash::txmsg test push_data:OK
#26/17 sockhash::txmsg test pull-data:OK
#27/ 9 sockhash::txmsg test pop-data:OK
#28/ 6 sockhash::txmsg test push/pop data:OK
#29/ 1 sockhash::txmsg test ingress parser:OK
#30/ 1 sockhash::txmsg test ingress parser2:OK
#31/ 6 sockhash:ktls:txmsg test passthrough:OK
#32/ 6 sockhash:ktls:txmsg test redirect:OK
#33/ 2 sockhash:ktls:txmsg test redirect wait send mem:OK
[ 263.509707] ------------[ cut here ]------------
[ 263.510439] WARNING: CPU: 1 PID: 40 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x173/0x1d5
[ 263.511450] CPU: 1 UID: 0 PID: 40 Comm: kworker/1:1 Tainted: G W 6.15.0-rc3+ #238 PREEMPT(voluntary)
[ 263.512683] Tainted: [W]=WARN
[ 263.513062] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[ 263.514763] Workqueue: events sk_psock_destroy
[ 263.515332] RIP: 0010:inet_sock_destruct+0x173/0x1d5
[ 263.515916] Code: e8 dc dc 3f ff 41 83 bc 24 c0 02 00 00 00 74 02 0f 0b 49 8d bc 24 ac 02 00 00 e8 c2 dc 3f ff 41 83 bc 24 ac 02 00 00 00 74 02 <0f> 0b e8 c7 95 3d 00 49 8d bc 24 b0 05 00 00 e8 c0 dd 3f ff 49 8b
[ 263.518899] RSP: 0018:ffff8880085cfc18 EFLAGS: 00010202
[ 263.519596] RAX: 1ffff11003dbfc00 RBX: ffff88801edfe3e8 RCX: ffffffff822f5af4
[ 263.520502] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88801edfe16c
[ 263.522128] RBP: ffff88801edfe184 R08: ffffed1003dbfc31 R09: 0000000000000000
[ 263.523008] R10: ffffffff822f5ab7 R11: ffff88801edfe187 R12: ffff88801edfdec0
[ 263.523822] R13: ffff888020376ac0 R14: ffff888020376ac0 R15: ffff888020376a60
[ 263.524682] FS: 0000000000000000(0000) GS:ffff8880b0e88000(0000) knlGS:0000000000000000
[ 263.525999] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 263.526765] CR2: 0000556365155830 CR3: 000000001d6aa000 CR4: 0000000000350ef0
[ 263.527700] Call Trace:
[ 263.528037] <TASK>
[ 263.528339] __sk_destruct+0x46/0x222
[ 263.528856] sk_psock_destroy+0x22f/0x242
[ 263.529471] process_one_work+0x504/0x8a8
[ 263.530029] ? process_one_work+0x39d/0x8a8
[ 263.530587] ? __pfx_process_one_work+0x10/0x10
[ 263.531195] ? worker_thread+0x44/0x2ae
[ 263.531721] ? __list_add_valid_or_report+0x83/0xea
[ 263.532395] ? srso_return_thunk+0x5/0x5f
[ 263.532929] ? __list_add+0x45/0x52
[ 263.533482] process_scheduled_works+0x73/0x82
[ 263.534079] worker_thread+0x1ce/0x2ae
[ 263.534582] ? _raw_spin_unlock_irqrestore+0x2e/0x44
[ 263.535243] ? __pfx_worker_thread+0x10/0x10
[ 263.535822] kthread+0x32a/0x33c
[ 263.536278] ? kthread+0x13c/0x33c
[ 263.536724] ? __pfx_kthread+0x10/0x10
[ 263.537225] ? srso_return_thunk+0x5/0x5f
[ 263.537869] ? find_held_lock+0x2b/0x75
[ 263.538388] ? __pfx_kthread+0x10/0x10
[ 263.538866] ? srso_return_thunk+0x5/0x5f
[ 263.539523] ? local_clock_noinstr+0x32/0x9c
[ 263.540128] ? srso_return_thunk+0x5/0x5f
[ 263.540677] ? srso_return_thunk+0x5/0x5f
[ 263.541228] ? __lock_release+0xd3/0x1ad
[ 263.541890] ? srso_return_thunk+0x5/0x5f
[ 263.542442] ? tracer_hardirqs_on+0x17/0x149
[ 263.543047] ? _raw_spin_unlock_irq+0x24/0x39
[ 263.543589] ? __pfx_kthread+0x10/0x10
[ 263.544069] ? __pfx_kthread+0x10/0x10
[ 263.544543] ret_from_fork+0x21/0x41
[ 263.545000] ? __pfx_kthread+0x10/0x10
[ 263.545557] ret_from_fork_asm+0x1a/0x30
[ 263.546095] </TASK>
[ 263.546374] irq event stamp: 1094079
[ 263.546798] hardirqs last enabled at (1094089): [<ffffffff813be0f6>] __up_console_sem+0x47/0x4e
[ 263.547762] hardirqs last disabled at (1094098): [<ffffffff813be0d6>] __up_console_sem+0x27/0x4e
[ 263.548817] softirqs last enabled at (1093692): [<ffffffff812f2906>] handle_softirqs+0x48c/0x4de
[ 263.550127] softirqs last disabled at (1094117): [<ffffffff812f29b3>] __irq_exit_rcu+0x4b/0xc3
[ 263.551104] ---[ end trace 0000000000000000 ]---
#34/ 6 sockhash:ktls:txmsg test drop:OK
#35/ 6 sockhash:ktls:txmsg test ingress redirect:OK
#36/ 7 sockhash:ktls:txmsg test skb:OK
#37/12 sockhash:ktls:txmsg test apply:OK
[ 278.915147] perf: interrupt took too long (4331 > 4257), lowering kernel.perf_event_max_sample_rate to 46000
[ 282.974989] test_sockmap (1077) used greatest stack depth: 25072 bytes left
#38/12 sockhash:ktls:txmsg test cork:OK
#39/ 3 sockhash:ktls:txmsg test hanging corks:OK
#40/11 sockhash:ktls:txmsg test push_data:OK
#41/17 sockhash:ktls:txmsg test pull-data:OK
recv failed(): Invalid argument
rx thread exited with err 1.
recv failed(): Invalid argument
rx thread exited with err 1.
recv failed(): Bad message
rx thread exited with err 1.
#42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
recv failed(): Bad message
rx thread exited with err 1.
recv failed(): Message too long
rx thread exited with err 1.
#43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
#44/ 1 sockhash:ktls:txmsg test ingress parser:OK
#45/ 0 sockhash:ktls:txmsg test ingress parser2:OK
Pass: 43 Fail: 5
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-04-24 0:40 Cong Wang
@ 2025-04-24 0:59 ` Jiayuan Chen
2025-04-24 9:19 ` Re: Jiayuan Chen
0 siblings, 1 reply; 1651+ messages in thread
From: Jiayuan Chen @ 2025-04-24 0:59 UTC (permalink / raw)
To: Cong Wang; +Cc: john.fastabend, jakub, netdev, bpf
April 24, 2025 at 08:40, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>
> netdev@vger.kernel.org, bpf@vger.kernel.org
>
> Bcc:
>
> Subject: test_sockmap failures on the latest bpf-next
>
> Reply-To:
>
> Hi all,
>
> The latest bpf-next failed on test_sockmap tests, I got the following
>
> failures (including 1 kernel warning). It is 100% reproducible here.
>
> I don't have time to look into them, a quick glance at the changelog
>
> shows quite some changes from Jiayuan. So please take a look, Jiayuan.
>
> Meanwhile, please let me know if you need more information from me.
>
> Thanks!
>
> --------------->
Thanks, I'm working on it.
>
> [root@localhost bpf]# ./test_sockmap
>
> # 1/ 6 sockmap::txmsg test passthrough:OK
>
> # 2/ 6 sockmap::txmsg test redirect:OK
>
> # 3/ 2 sockmap::txmsg test redirect wait send mem:OK
>
> # 4/ 6 sockmap::txmsg test drop:OK
>
> [ 182.498017] perf: interrupt took too long (3406 > 3238), lowering kernel.perf_event_max_sample_rate to 58500
>
> # 5/ 6 sockmap::txmsg test ingress redirect:OK
>
> # 6/ 7 sockmap::txmsg test skb:OK
>
> # 7/12 sockmap::txmsg test apply:OK
>
> # 8/12 sockmap::txmsg test cork:OK
>
> # 9/ 3 sockmap::txmsg test hanging corks:OK
>
> #10/11 sockmap::txmsg test push_data:OK
>
> #11/17 sockmap::txmsg test pull-data:OK
>
> #12/ 9 sockmap::txmsg test pop-data:OK
>
> #13/ 6 sockmap::txmsg test push/pop data:OK
>
> #14/ 1 sockmap::txmsg test ingress parser:OK
>
> #15/ 1 sockmap::txmsg test ingress parser2:OK
>
> #16/ 6 sockhash::txmsg test passthrough:OK
>
> #17/ 6 sockhash::txmsg test redirect:OK
>
> #18/ 2 sockhash::txmsg test redirect wait send mem:OK
>
> #19/ 6 sockhash::txmsg test drop:OK
>
> #20/ 6 sockhash::txmsg test ingress redirect:OK
>
> #21/ 7 sockhash::txmsg test skb:OK
>
> #22/12 sockhash::txmsg test apply:OK
>
> #23/12 sockhash::txmsg test cork:OK
>
> #24/ 3 sockhash::txmsg test hanging corks:OK
>
> #25/11 sockhash::txmsg test push_data:OK
>
> #26/17 sockhash::txmsg test pull-data:OK
>
> #27/ 9 sockhash::txmsg test pop-data:OK
>
> #28/ 6 sockhash::txmsg test push/pop data:OK
>
> #29/ 1 sockhash::txmsg test ingress parser:OK
>
> #30/ 1 sockhash::txmsg test ingress parser2:OK
>
> #31/ 6 sockhash:ktls:txmsg test passthrough:OK
>
> #32/ 6 sockhash:ktls:txmsg test redirect:OK
>
> #33/ 2 sockhash:ktls:txmsg test redirect wait send mem:OK
>
> [ 263.509707] ------------[ cut here ]------------
>
> [ 263.510439] WARNING: CPU: 1 PID: 40 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x173/0x1d5
>
> [ 263.511450] CPU: 1 UID: 0 PID: 40 Comm: kworker/1:1 Tainted: G W 6.15.0-rc3+ #238 PREEMPT(voluntary)
>
> [ 263.512683] Tainted: [W]=WARN
>
> [ 263.513062] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>
> [ 263.514763] Workqueue: events sk_psock_destroy
>
> [ 263.515332] RIP: 0010:inet_sock_destruct+0x173/0x1d5
>
> [ 263.515916] Code: e8 dc dc 3f ff 41 83 bc 24 c0 02 00 00 00 74 02 0f 0b 49 8d bc 24 ac 02 00 00 e8 c2 dc 3f ff 41 83 bc 24 ac 02 00 00 00 74 02 <0f> 0b e8 c7 95 3d 00 49 8d bc 24 b0 05 00 00 e8 c0 dd 3f ff 49 8b
>
> [ 263.518899] RSP: 0018:ffff8880085cfc18 EFLAGS: 00010202
>
> [ 263.519596] RAX: 1ffff11003dbfc00 RBX: ffff88801edfe3e8 RCX: ffffffff822f5af4
>
> [ 263.520502] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88801edfe16c
>
> [ 263.522128] RBP: ffff88801edfe184 R08: ffffed1003dbfc31 R09: 0000000000000000
>
> [ 263.523008] R10: ffffffff822f5ab7 R11: ffff88801edfe187 R12: ffff88801edfdec0
>
> [ 263.523822] R13: ffff888020376ac0 R14: ffff888020376ac0 R15: ffff888020376a60
>
> [ 263.524682] FS: 0000000000000000(0000) GS:ffff8880b0e88000(0000) knlGS:0000000000000000
>
> [ 263.525999] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>
> [ 263.526765] CR2: 0000556365155830 CR3: 000000001d6aa000 CR4: 0000000000350ef0
>
> [ 263.527700] Call Trace:
>
> [ 263.528037] <TASK>
>
> [ 263.528339] __sk_destruct+0x46/0x222
>
> [ 263.528856] sk_psock_destroy+0x22f/0x242
>
> [ 263.529471] process_one_work+0x504/0x8a8
>
> [ 263.530029] ? process_one_work+0x39d/0x8a8
>
> [ 263.530587] ? __pfx_process_one_work+0x10/0x10
>
> [ 263.531195] ? worker_thread+0x44/0x2ae
>
> [ 263.531721] ? __list_add_valid_or_report+0x83/0xea
>
> [ 263.532395] ? srso_return_thunk+0x5/0x5f
>
> [ 263.532929] ? __list_add+0x45/0x52
>
> [ 263.533482] process_scheduled_works+0x73/0x82
>
> [ 263.534079] worker_thread+0x1ce/0x2ae
>
> [ 263.534582] ? _raw_spin_unlock_irqrestore+0x2e/0x44
>
> [ 263.535243] ? __pfx_worker_thread+0x10/0x10
>
> [ 263.535822] kthread+0x32a/0x33c
>
> [ 263.536278] ? kthread+0x13c/0x33c
>
> [ 263.536724] ? __pfx_kthread+0x10/0x10
>
> [ 263.537225] ? srso_return_thunk+0x5/0x5f
>
> [ 263.537869] ? find_held_lock+0x2b/0x75
>
> [ 263.538388] ? __pfx_kthread+0x10/0x10
>
> [ 263.538866] ? srso_return_thunk+0x5/0x5f
>
> [ 263.539523] ? local_clock_noinstr+0x32/0x9c
>
> [ 263.540128] ? srso_return_thunk+0x5/0x5f
>
> [ 263.540677] ? srso_return_thunk+0x5/0x5f
>
> [ 263.541228] ? __lock_release+0xd3/0x1ad
>
> [ 263.541890] ? srso_return_thunk+0x5/0x5f
>
> [ 263.542442] ? tracer_hardirqs_on+0x17/0x149
>
> [ 263.543047] ? _raw_spin_unlock_irq+0x24/0x39
>
> [ 263.543589] ? __pfx_kthread+0x10/0x10
>
> [ 263.544069] ? __pfx_kthread+0x10/0x10
>
> [ 263.544543] ret_from_fork+0x21/0x41
>
> [ 263.545000] ? __pfx_kthread+0x10/0x10
>
> [ 263.545557] ret_from_fork_asm+0x1a/0x30
>
> [ 263.546095] </TASK>
>
> [ 263.546374] irq event stamp: 1094079
>
> [ 263.546798] hardirqs last enabled at (1094089): [<ffffffff813be0f6>] __up_console_sem+0x47/0x4e
>
> [ 263.547762] hardirqs last disabled at (1094098): [<ffffffff813be0d6>] __up_console_sem+0x27/0x4e
>
> [ 263.548817] softirqs last enabled at (1093692): [<ffffffff812f2906>] handle_softirqs+0x48c/0x4de
>
> [ 263.550127] softirqs last disabled at (1094117): [<ffffffff812f29b3>] __irq_exit_rcu+0x4b/0xc3
>
> [ 263.551104] ---[ end trace 0000000000000000 ]---
>
> #34/ 6 sockhash:ktls:txmsg test drop:OK
>
> #35/ 6 sockhash:ktls:txmsg test ingress redirect:OK
>
> #36/ 7 sockhash:ktls:txmsg test skb:OK
>
> #37/12 sockhash:ktls:txmsg test apply:OK
>
> [ 278.915147] perf: interrupt took too long (4331 > 4257), lowering kernel.perf_event_max_sample_rate to 46000
>
> [ 282.974989] test_sockmap (1077) used greatest stack depth: 25072 bytes left
>
> #38/12 sockhash:ktls:txmsg test cork:OK
>
> #39/ 3 sockhash:ktls:txmsg test hanging corks:OK
>
> #40/11 sockhash:ktls:txmsg test push_data:OK
>
> #41/17 sockhash:ktls:txmsg test pull-data:OK
>
> recv failed(): Invalid argument
>
> rx thread exited with err 1.
>
> recv failed(): Invalid argument
>
> rx thread exited with err 1.
>
> recv failed(): Bad message
>
> rx thread exited with err 1.
>
> #42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
>
> recv failed(): Bad message
>
> rx thread exited with err 1.
>
> recv failed(): Message too long
>
> rx thread exited with err 1.
>
> #43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
>
> #44/ 1 sockhash:ktls:txmsg test ingress parser:OK
>
> #45/ 0 sockhash:ktls:txmsg test ingress parser2:OK
>
> Pass: 43 Fail: 5
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-04-24 0:59 ` Jiayuan Chen
@ 2025-04-24 9:19 ` Jiayuan Chen
0 siblings, 0 replies; 1651+ messages in thread
From: Jiayuan Chen @ 2025-04-24 9:19 UTC (permalink / raw)
To: Cong Wang; +Cc: john.fastabend, jakub, netdev, bpf
April 24, 2025 at 08:59, "Jiayuan Chen" <jiayuan.chen@linux.dev> wrote:
>
> April 24, 2025 at 08:40, "Cong Wang" <xiyou.wangcong@gmail.com> wrote:
>
> >
> > netdev@vger.kernel.org, bpf@vger.kernel.org
> >
> > Bcc:
> >
> >
> > Subject: test_sockmap failures on the latest bpf-next
> >
> > Reply-To:
> >
> >
> >
> > Hi all,
> >
> >
> >
> > The latest bpf-next failed on test_sockmap tests, I got the following
> >
> > failures (including 1 kernel warning). It is 100% reproducible here.
> >
> > I don't have time to look into them, a quick glance at the changelog
> >
> > shows quite some changes from Jiayuan. So please take a look, Jiayuan.
> >
> > Meanwhile, please let me know if you need more information from me.
> >
> > Thanks!
> >
> >
> >
> > --------------->
> >
>
> Thanks, I'm working on it.
>
After resetting my commit to 0bb2f7a1ad1f, which is before my changes, the warning still exists.
The warning originates from test_txmsg_redir_wait_sndmem(), which performs
'KTLS + sockmap with redir EGRESS and limited receive buffer'.
The memory charge/uncharge logic is problematic; I need some time to investigate and fix it.
Thanks.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-05-09 17:38 Shawn Anastasio
2025-05-10 19:50 ` Trilok Soni
0 siblings, 1 reply; 1651+ messages in thread
From: Shawn Anastasio @ 2025-05-09 17:38 UTC (permalink / raw)
To: linux-pci, Lukas Wunner, Krishna Chaitanya Chundru
Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, chaitanya chundru, Bjorn Andersson, Konrad Dybcio,
cros-qcom-dts-watchers, Jingoo Han, Bartosz Golaszewski,
quic_vbadigan, amitk, devicetree, linux-kernel, linux-arm-msm,
jorge.ramirez, Dmitry Baryshkov, Timothy Pearson, Shawn Anastasio
From: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Date: Sat, 12 Apr 2025 07:19:56 +0530
Subject: [PATCH v6] PCI: Add pcie_link_is_active() to determine if the
PCIe link is active
Introduce a common API to check if the PCIe link is active, replacing
duplicate code in multiple locations.
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
---
This is an updated patch pulled from Krishna's v5 series:
https://patchwork.kernel.org/project/linux-pci/list/?series=952665
The following changes to Krishna's v5 were made by me:
- Revert pcie_link_is_active return type back to int per Lukas' review
comments
- Handle non-zero error returns at call site of the new function in
pci.c/pci_bridge_wait_for_secondary_bus
drivers/pci/hotplug/pciehp.h | 1 -
drivers/pci/hotplug/pciehp_ctrl.c | 2 +-
drivers/pci/hotplug/pciehp_hpc.c | 33 +++----------------------------
drivers/pci/pci.c | 26 +++++++++++++++++++++---
include/linux/pci.h | 4 ++++
5 files changed, 31 insertions(+), 35 deletions(-)
diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 273dd8c66f4e..acef728530e3 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -186,7 +186,6 @@ int pciehp_query_power_fault(struct controller *ctrl);
int pciehp_card_present(struct controller *ctrl);
int pciehp_card_present_or_link_active(struct controller *ctrl);
int pciehp_check_link_status(struct controller *ctrl);
-int pciehp_check_link_active(struct controller *ctrl);
void pciehp_release_ctrl(struct controller *ctrl);
int pciehp_sysfs_enable_slot(struct hotplug_slot *hotplug_slot);
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index d603a7aa7483..4bb58ba1c766 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -260,7 +260,7 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
/* Turn the slot on if it's occupied or link is up */
mutex_lock(&ctrl->state_lock);
present = pciehp_card_present(ctrl);
- link_active = pciehp_check_link_active(ctrl);
+ link_active = pcie_link_is_active(ctrl->pcie->port);
if (present <= 0 && link_active <= 0) {
if (ctrl->state == BLINKINGON_STATE) {
ctrl->state = OFF_STATE;
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 8a09fb6083e2..278bc21d531d 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -221,33 +221,6 @@ static void pcie_write_cmd_nowait(struct controller *ctrl, u16 cmd, u16 mask)
pcie_do_write_cmd(ctrl, cmd, mask, false);
}
-/**
- * pciehp_check_link_active() - Is the link active
- * @ctrl: PCIe hotplug controller
- *
- * Check whether the downstream link is currently active. Note it is
- * possible that the card is removed immediately after this so the
- * caller may need to take it into account.
- *
- * If the hotplug controller itself is not available anymore returns
- * %-ENODEV.
- */
-int pciehp_check_link_active(struct controller *ctrl)
-{
- struct pci_dev *pdev = ctrl_dev(ctrl);
- u16 lnk_status;
- int ret;
-
- ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
- if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
- return -ENODEV;
-
- ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
- ctrl_dbg(ctrl, "%s: lnk_status = %x\n", __func__, lnk_status);
-
- return ret;
-}
-
static bool pci_bus_check_dev(struct pci_bus *bus, int devfn)
{
u32 l;
@@ -467,7 +440,7 @@ int pciehp_card_present_or_link_active(struct controller *ctrl)
if (ret)
return ret;
- return pciehp_check_link_active(ctrl);
+ return pcie_link_is_active(ctrl_dev(ctrl));
}
int pciehp_query_power_fault(struct controller *ctrl)
@@ -584,7 +557,7 @@ static void pciehp_ignore_dpc_link_change(struct controller *ctrl,
* Synthesize it to ensure that it is acted on.
*/
down_read_nested(&ctrl->reset_lock, ctrl->depth);
- if (!pciehp_check_link_active(ctrl))
+ if (!pcie_link_is_active(ctrl_dev(ctrl)))
pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
up_read(&ctrl->reset_lock);
}
@@ -884,7 +857,7 @@ int pciehp_slot_reset(struct pcie_device *dev)
pcie_capability_write_word(dev->port, PCI_EXP_SLTSTA,
PCI_EXP_SLTSTA_DLLSC);
- if (!pciehp_check_link_active(ctrl))
+ if (!pcie_link_is_active(ctrl_dev(ctrl)))
pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
return 0;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e77d5b53c0ce..3bb8354b14bf 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4926,7 +4926,6 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
return 0;
if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
- u16 status;
pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
msleep(delay);
@@ -4942,8 +4941,7 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
if (!dev->link_active_reporting)
return -ENOTTY;
- pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
- if (!(status & PCI_EXP_LNKSTA_DLLLA))
+ if (pcie_link_is_active(dev) <= 0)
return -ENOTTY;
return pci_dev_wait(child, reset_type,
@@ -6247,6 +6245,28 @@ void pcie_print_link_status(struct pci_dev *dev)
}
EXPORT_SYMBOL(pcie_print_link_status);
+/**
+ * pcie_link_is_active() - Checks if the link is active or not
+ * @pdev: PCI device to query
+ *
+ * Check whether the link is active or not.
+ *
+ * Return: link state, or -ENODEV if the config read fails.
+ */
+int pcie_link_is_active(struct pci_dev *pdev)
+{
+ u16 lnk_status;
+ int ret;
+
+ ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
+ if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
+ return -ENODEV;
+
+ pci_dbg(pdev, "lnk_status = %x\n", lnk_status);
+ return !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
+}
+EXPORT_SYMBOL(pcie_link_is_active);
+
/**
* pci_select_bars - Make BAR mask from the type of resource
* @dev: the PCI device for which BAR mask is made
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 51e2bd6405cd..a79a9919320c 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1945,6 +1945,7 @@ pci_release_mem_regions(struct pci_dev *pdev)
pci_select_bars(pdev, IORESOURCE_MEM));
}
+int pcie_link_is_active(struct pci_dev *dev);
#else /* CONFIG_PCI is not enabled */
static inline void pci_set_flags(int flags) { }
@@ -2093,6 +2094,9 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
{
return -ENOSPC;
}
+
+static inline bool pcie_link_is_active(struct pci_dev *dev)
+{ return false; }
#endif /* CONFIG_PCI */
/* Include architecture-dependent settings and functions */
--
2.30.2
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2025-05-09 17:38 Shawn Anastasio
@ 2025-05-10 19:50 ` Trilok Soni
0 siblings, 0 replies; 1651+ messages in thread
From: Trilok Soni @ 2025-05-10 19:50 UTC (permalink / raw)
To: Shawn Anastasio, linux-pci, Lukas Wunner,
Krishna Chaitanya Chundru
Cc: Bjorn Helgaas, Lorenzo Pieralisi, Krzysztof Wilczyński,
Manivannan Sadhasivam, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, chaitanya chundru, Bjorn Andersson, Konrad Dybcio,
cros-qcom-dts-watchers, Jingoo Han, Bartosz Golaszewski,
quic_vbadigan, amitk, devicetree, linux-kernel, linux-arm-msm,
jorge.ramirez, Dmitry Baryshkov, Timothy Pearson
On 5/9/2025 10:38 AM, Shawn Anastasio wrote:
> From: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
>
> Date: Sat, 12 Apr 2025 07:19:56 +0530
> Subject: [PATCH v6] PCI: Add pcie_link_is_active() to determine if the
> PCIe link is active
I don't understand this patch, and it doesn't have any subject in the email. Please fix.
>
> Introduce a common API to check if the PCIe link is active, replacing
> duplicate code in multiple locations.
>
> Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
> Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
> ---
> This is an updated patch pulled from Krishna's v5 series:
> https://patchwork.kernel.org/project/linux-pci/list/?series=952665
>
> The following changes to Krishna's v5 were made by me:
> - Revert pcie_link_is_active return type back to int per Lukas' review
> comments
> - Handle non-zero error returns at call site of the new function in
> pci.c/pci_bridge_wait_for_secondary_bus
>
> drivers/pci/hotplug/pciehp.h | 1 -
> drivers/pci/hotplug/pciehp_ctrl.c | 2 +-
> drivers/pci/hotplug/pciehp_hpc.c | 33 +++----------------------------
> drivers/pci/pci.c | 26 +++++++++++++++++++++---
> include/linux/pci.h | 4 ++++
> 5 files changed, 31 insertions(+), 35 deletions(-)
>
> diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
> index 273dd8c66f4e..acef728530e3 100644
> --- a/drivers/pci/hotplug/pciehp.h
> +++ b/drivers/pci/hotplug/pciehp.h
> @@ -186,7 +186,6 @@ int pciehp_query_power_fault(struct controller *ctrl);
> int pciehp_card_present(struct controller *ctrl);
> int pciehp_card_present_or_link_active(struct controller *ctrl);
> int pciehp_check_link_status(struct controller *ctrl);
> -int pciehp_check_link_active(struct controller *ctrl);
> void pciehp_release_ctrl(struct controller *ctrl);
>
> int pciehp_sysfs_enable_slot(struct hotplug_slot *hotplug_slot);
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index d603a7aa7483..4bb58ba1c766 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -260,7 +260,7 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> /* Turn the slot on if it's occupied or link is up */
> mutex_lock(&ctrl->state_lock);
> present = pciehp_card_present(ctrl);
> - link_active = pciehp_check_link_active(ctrl);
> + link_active = pcie_link_is_active(ctrl->pcie->port);
> if (present <= 0 && link_active <= 0) {
> if (ctrl->state == BLINKINGON_STATE) {
> ctrl->state = OFF_STATE;
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index 8a09fb6083e2..278bc21d531d 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -221,33 +221,6 @@ static void pcie_write_cmd_nowait(struct controller *ctrl, u16 cmd, u16 mask)
> pcie_do_write_cmd(ctrl, cmd, mask, false);
> }
>
> -/**
> - * pciehp_check_link_active() - Is the link active
> - * @ctrl: PCIe hotplug controller
> - *
> - * Check whether the downstream link is currently active. Note it is
> - * possible that the card is removed immediately after this so the
> - * caller may need to take it into account.
> - *
> - * If the hotplug controller itself is not available anymore returns
> - * %-ENODEV.
> - */
> -int pciehp_check_link_active(struct controller *ctrl)
> -{
> - struct pci_dev *pdev = ctrl_dev(ctrl);
> - u16 lnk_status;
> - int ret;
> -
> - ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
> - if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
> - return -ENODEV;
> -
> - ret = !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
> - ctrl_dbg(ctrl, "%s: lnk_status = %x\n", __func__, lnk_status);
> -
> - return ret;
> -}
> -
> static bool pci_bus_check_dev(struct pci_bus *bus, int devfn)
> {
> u32 l;
> @@ -467,7 +440,7 @@ int pciehp_card_present_or_link_active(struct controller *ctrl)
> if (ret)
> return ret;
>
> - return pciehp_check_link_active(ctrl);
> + return pcie_link_is_active(ctrl_dev(ctrl));
> }
>
> int pciehp_query_power_fault(struct controller *ctrl)
> @@ -584,7 +557,7 @@ static void pciehp_ignore_dpc_link_change(struct controller *ctrl,
> * Synthesize it to ensure that it is acted on.
> */
> down_read_nested(&ctrl->reset_lock, ctrl->depth);
> - if (!pciehp_check_link_active(ctrl))
> + if (!pcie_link_is_active(ctrl_dev(ctrl)))
> pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
> up_read(&ctrl->reset_lock);
> }
> @@ -884,7 +857,7 @@ int pciehp_slot_reset(struct pcie_device *dev)
> pcie_capability_write_word(dev->port, PCI_EXP_SLTSTA,
> PCI_EXP_SLTSTA_DLLSC);
>
> - if (!pciehp_check_link_active(ctrl))
> + if (!pcie_link_is_active(ctrl_dev(ctrl)))
> pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
>
> return 0;
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index e77d5b53c0ce..3bb8354b14bf 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4926,7 +4926,6 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
> return 0;
>
> if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
> - u16 status;
>
> pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
> msleep(delay);
> @@ -4942,8 +4941,7 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type)
> if (!dev->link_active_reporting)
> return -ENOTTY;
>
> - pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &status);
> - if (!(status & PCI_EXP_LNKSTA_DLLLA))
> + if (pcie_link_is_active(dev) <= 0)
> return -ENOTTY;
>
> return pci_dev_wait(child, reset_type,
> @@ -6247,6 +6245,28 @@ void pcie_print_link_status(struct pci_dev *dev)
> }
> EXPORT_SYMBOL(pcie_print_link_status);
>
> +/**
> + * pcie_link_is_active() - Checks if the link is active or not
> + * @pdev: PCI device to query
> + *
> + * Check whether the link is active or not.
> + *
> + * Return: link state, or -ENODEV if the config read fails.
> + */
> +int pcie_link_is_active(struct pci_dev *pdev)
> +{
> + u16 lnk_status;
> + int ret;
> +
> + ret = pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status);
> + if (ret == PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status))
> + return -ENODEV;
> +
> + pci_dbg(pdev, "lnk_status = %x\n", lnk_status);
> + return !!(lnk_status & PCI_EXP_LNKSTA_DLLLA);
> +}
> +EXPORT_SYMBOL(pcie_link_is_active);
> +
> /**
> * pci_select_bars - Make BAR mask from the type of resource
> * @dev: the PCI device for which BAR mask is made
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 51e2bd6405cd..a79a9919320c 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1945,6 +1945,7 @@ pci_release_mem_regions(struct pci_dev *pdev)
> pci_select_bars(pdev, IORESOURCE_MEM));
> }
>
> +int pcie_link_is_active(struct pci_dev *dev);
> #else /* CONFIG_PCI is not enabled */
>
> static inline void pci_set_flags(int flags) { }
> @@ -2093,6 +2094,9 @@ pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
> {
> return -ENOSPC;
> }
> +
> +static inline bool pcie_link_is_active(struct pci_dev *dev)
> +{ return false; }
> #endif /* CONFIG_PCI */
>
> /* Include architecture-dependent settings and functions */
> --
> 2.30.2
>
>
--
---Trilok Soni
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-05-14 20:21 Nicolas Pitre
2025-05-15 8:33 ` Jiri Slaby
0 siblings, 1 reply; 1651+ messages in thread
From: Nicolas Pitre @ 2025-05-14 20:21 UTC (permalink / raw)
To: Greg Kroah-Hartman, Jiri Slaby; +Cc: npitre, linux-serial, linux-kernel
From 28043dec8352fd857c6878c2ee568620a124b855 Mon Sep 17 00:00:00 2001
From: Nicolas Pitre <nico@fluxnic.net>
Date: Wed, 14 May 2025 15:58:22 -0400
Subject: [PATCH] vt: remove VT_RESIZE and VT_RESIZEX from vt_compat_ioctl()
From: Nicolas Pitre <npitre@baylibre.com>
They are listed among those cmd values that "treat 'arg' as an integer",
which is wrong. They should instead fall into the default case. Probably
nobody has ever exercised that code since 2009, but still.
Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Fixes: e92166517e3c ("tty: handle VT specific compat ioctls in vt driver")
diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
index 83a3d49535e5..61342e06970a 100644
--- a/drivers/tty/vt/vt_ioctl.c
+++ b/drivers/tty/vt/vt_ioctl.c
@@ -1119,8 +1119,6 @@ long vt_compat_ioctl(struct tty_struct *tty,
case VT_WAITACTIVE:
case VT_RELDISP:
case VT_DISALLOCATE:
- case VT_RESIZE:
- case VT_RESIZEX:
return vt_ioctl(tty, cmd, arg);
/*
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2025-05-14 20:21 Nicolas Pitre
@ 2025-05-15 8:33 ` Jiri Slaby
0 siblings, 0 replies; 1651+ messages in thread
From: Jiri Slaby @ 2025-05-15 8:33 UTC (permalink / raw)
To: Nicolas Pitre, Greg Kroah-Hartman; +Cc: npitre, linux-serial, linux-kernel
On 14. 05. 25, 22:21, Nicolas Pitre wrote:
> From 28043dec8352fd857c6878c2ee568620a124b855 Mon Sep 17 00:00:00 2001
> From: Nicolas Pitre <nico@fluxnic.net>
> Date: Wed, 14 May 2025 15:58:22 -0400
> Subject: [PATCH] vt: remove VT_RESIZE and VT_RESIZEX from vt_compat_ioctl()
> From: Nicolas Pitre <npitre@baylibre.com>
>
> They are listed among those cmd values that "treat 'arg' as an integer",
> which is wrong. They should instead fall into the default case. Probably
> nobody has ever exercised that code since 2009, but still.
AFAICS in the debian code search, exactly no one (except sanitizers,
strace, fuzzers, valgrind, ...) uses VT_RESIZEX.
VT_RESIZE is used by kbd's resizecons -- and there calling this ioctl is
its sole purpose. I wonder how come no one using 32-bit resizecons
on 64-bit noticed?
Thinking...
Actually, on x86, it doesn't matter whether it takes the arg (case VT_RESIZE)
or the compat_ptr() (default label) path, as both are given the same user pointer...
It matters on s390x, but no one cares about the 32/64-bit mix there,
apparently.
> Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
> Fixes: e92166517e3c ("tty: handle VT specific compat ioctls in vt driver")
FWIW, the e-mail's Subject is empty.
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
> diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
> index 83a3d49535e5..61342e06970a 100644
> --- a/drivers/tty/vt/vt_ioctl.c
> +++ b/drivers/tty/vt/vt_ioctl.c
> @@ -1119,8 +1119,6 @@ long vt_compat_ioctl(struct tty_struct *tty,
> case VT_WAITACTIVE:
> case VT_RELDISP:
> case VT_DISALLOCATE:
> - case VT_RESIZE:
> - case VT_RESIZEX:
> return vt_ioctl(tty, cmd, arg);
>
> /*
--
js
suse labs
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-07-01 13:44 Emanuele Ghidoli
2025-07-11 2:21 ` Fabio Estevam
0 siblings, 1 reply; 1651+ messages in thread
From: Emanuele Ghidoli @ 2025-07-01 13:44 UTC (permalink / raw)
To: Francesco Dolcini, Tom Rini, Fabio Estevam; +Cc: Emanuele Ghidoli, u-boot
From: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Subject: [PATCH v1 0/5] Enable RNG support for KASLR on Toradex arm64 i.MX SoMs
This patch series enables RNG support to automatically populate /chosen/kaslr-seed on the following Toradex arm64 i.MX System on Modules (SoMs):
- Verdin iMX8MM
- Verdin iMX8MP
- Toradex SMARC iMX8MP
- Apalis iMX8
- Colibri iMX8X
This improves kernel security by supporting Kernel Address Space Layout Randomization (KASLR) using a runtime-provided seed from the hardware RNG.
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
---
Emanuele Ghidoli (5):
configs: verdin-imx8mm: enable RNG support for KASLR
configs: verdin-imx8mp: enable RNG support for KASLR
configs: toradex-smarc-imx8mp: enable RNG support for KASLR
configs: apalis-imx8: enable RNG support for KASLR
configs: colibri-imx8x: enable RNG support for KASLR
configs/apalis-imx8_defconfig | 5 ++++-
configs/colibri-imx8x_defconfig | 5 ++++-
configs/toradex-smarc-imx8mp_defconfig | 1 +
configs/verdin-imx8mm_defconfig | 5 ++++-
configs/verdin-imx8mp_defconfig | 1 +
5 files changed, 14 insertions(+), 3 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-07-01 13:44 Emanuele Ghidoli
@ 2025-07-11 2:21 ` Fabio Estevam
0 siblings, 0 replies; 1651+ messages in thread
From: Fabio Estevam @ 2025-07-11 2:21 UTC (permalink / raw)
To: Emanuele Ghidoli; +Cc: Francesco Dolcini, Tom Rini, Emanuele Ghidoli, u-boot
On Tue, Jul 1, 2025 at 10:45 AM Emanuele Ghidoli
<ghidoliemanuele@gmail.com> wrote:
>
> From: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
>
> Subject: [PATCH v1 0/5] Enable RNG support for KASLR on Toradex arm64 i.MX SoMs
>
> This patch series enables RNG support to automatically populate /chosen/kaslr-seed on the following Toradex arm64 i.MX System on Modules (SoMs):
> - Verdin iMX8MM
> - Verdin iMX8MP
> - Toradex SMARC iMX8MP
> - Apalis iMX8
> - Colibri iMX8X
>
> This improves kernel security by supporting Kernel Address Space Layout Randomization (KASLR) using a runtime-provided seed from the hardware RNG.
>
> Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Applied the series, thanks.
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CADU64hCr7mshqfBRE2Wp8zf4BHBdJoLLH=VJt2MrHeR+zHOV4w@mail.gmail.com>]
* (no subject)
[not found] <CADU64hCr7mshqfBRE2Wp8zf4BHBdJoLLH=VJt2MrHeR+zHOV4w@mail.gmail.com>
@ 2025-07-20 18:26 ` >
2025-07-20 19:30 ` Re: David Lechner
2025-07-21 7:52 ` Re: Andy Shevchenko
0 siblings, 2 replies; 1651+ messages in thread
From: > @ 2025-07-20 18:26 UTC (permalink / raw)
To: linux-kernel, devicetree, linux-iio, netdev, linux-arm-kernel,
linux-amlogic
Cc: ribalda, jic23, dlechner, nuno.sa, andy, robh, krzk+dt, conor+dt,
andrew+netdev, davem, edumazet, kuba, pabeni, neil.armstrong,
khilman, jbrunet, martin.blumenstingl, sanjay suthar
Changes in v2:
- Fixed commit message grammar
- Fixed subject line style as per DT convention
- Added missing reviewers/maintainers in CC
From 5c00524cbb47e30ee04223fe9502af2eb003ddf1 Mon Sep 17 00:00:00 2001
From: sanjay suthar <sanjaysuthar661996@gmail.com>
Date: Sun, 20 Jul 2025 01:11:00 +0530
Subject: [PATCH v2] dt-bindings: cleanup: fix duplicated 'is is' in YAML docs
Fix minor grammatical issues by removing duplicated "is" in two devicetree
binding documents:
- net/amlogic,meson-dwmac.yaml
- iio/dac/ti,dac7612.yaml
Signed-off-by: sanjay suthar <sanjaysuthar661996@gmail.com>
---
Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml | 2 +-
Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml b/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
index 20dd1370660d..624c640be4c8 100644
--- a/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
+++ b/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
@@ -9,7 +9,7 @@ title: Texas Instruments DAC7612 family of DACs
description:
The DAC7612 is a dual, 12-bit digital-to-analog converter (DAC) with
guaranteed 12-bit monotonicity performance over the industrial temperature
- range. Is is programmable through an SPI interface.
+ range. It is programmable through an SPI interface.
maintainers:
- Ricardo Ribalda Delgado <ricardo@ribalda.com>
diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
index 0cd78d71768c..5c91716d1f21 100644
--- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
+++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
@@ -149,7 +149,7 @@ properties:
- description:
The first register range should be the one of the DWMAC controller
- description:
- The second range is is for the Amlogic specific configuration
+ The second range is for the Amlogic specific configuration
(for example the PRG_ETHERNET register range on Meson8b and newer)
interrupts:
--
2.34.1
_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic
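Doubled-word typos like the "is is" fixed above are easy to catch mechanically with a backreference regex; a small illustrative Python sketch (not part of any actual kernel checkpatch tooling):

```python
import re

# \b(\w+)\s+\1\b matches a word immediately repeated, e.g. "is is".
DOUBLED = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)

def find_doubled_words(text: str) -> list[str]:
    """Return each word that appears twice in a row in the given text."""
    return [m.group(1) for m in DOUBLED.finditer(text)]

print(find_doubled_words("The second range is is for the Amlogic configuration"))
# → ['is']
```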
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2025-07-20 18:26 ` >
@ 2025-07-20 19:30 ` David Lechner
2025-07-21 7:52 ` Re: Andy Shevchenko
1 sibling, 0 replies; 1651+ messages in thread
From: David Lechner @ 2025-07-20 19:30 UTC (permalink / raw)
To: >, linux-kernel, devicetree, linux-iio, netdev,
linux-arm-kernel, linux-amlogic
Cc: ribalda, jic23, nuno.sa, andy, robh, krzk+dt, conor+dt,
andrew+netdev, davem, edumazet, kuba, pabeni, neil.armstrong,
khilman, jbrunet, martin.blumenstingl
On 7/20/25 1:26 PM, > wrote:
> Changes in v2:
> - Fixed commit message grammar
> - Fixed subject line style as per DT convention
> - Added missing reviewers/maintainers in CC
>
By placing this before the headers, our email clients think this
message doesn't have a subject. It should go after the ---.
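For reference, the layout being suggested keeps the per-version changelog between the `---` marker and the diffstat, where `git am` discards it (sketch; contents illustrative):

```
Subject: [PATCH v2] dt-bindings: fix duplicated 'is is' in YAML docs

<commit message body>

Signed-off-by: Author Name <author@example.com>
---
Changes in v2:
- Fixed commit message grammar
- ...

 Documentation/devicetree/bindings/... | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/... b/...
```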
> From 5c00524cbb47e30ee04223fe9502af2eb003ddf1 Mon Sep 17 00:00:00 2001
> From: sanjay suthar <sanjaysuthar661996@gmail.com>
> Date: Sun, 20 Jul 2025 01:11:00 +0530
> Subject: [PATCH v2] dt-bindings: cleanup: fix duplicated 'is is' in YAML docs
>
> Fix minor grammatical issues by removing duplicated "is" in two devicetree
> binding documents:
>
> - net/amlogic,meson-dwmac.yaml
> - iio/dac/ti,dac7612.yaml
>
> Signed-off-by: sanjay suthar <sanjaysuthar661996@gmail.com>
> ---
This is where the changelog belongs.
> Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml | 2 +-
> Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml b/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> index 20dd1370660d..624c640be4c8 100644
> --- a/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> +++ b/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> @@ -9,7 +9,7 @@ title: Texas Instruments DAC7612 family of DACs
> description:
> The DAC7612 is a dual, 12-bit digital-to-analog converter (DAC) with
> guaranteed 12-bit monotonicity performance over the industrial temperature
> - range. Is is programmable through an SPI interface.
> + range. It is programmable through an SPI interface.
>
> maintainers:
> - Ricardo Ribalda Delgado <ricardo@ribalda.com>
> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> index 0cd78d71768c..5c91716d1f21 100644
> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> @@ -149,7 +149,7 @@ properties:
> - description:
> The first register range should be the one of the DWMAC controller
> - description:
> - The second range is is for the Amlogic specific configuration
> + The second range is for the Amlogic specific configuration
> (for example the PRG_ETHERNET register range on Meson8b and newer)
>
> interrupts:
I would be tempted to split this into two patches. It's a bit odd to have
a single patch touching two unrelated bindings.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2025-07-20 19:30 ` David Lechner
0 siblings, 0 replies; 1651+ messages in thread
From: David Lechner @ 2025-07-20 19:30 UTC (permalink / raw)
To: >, linux-kernel, devicetree, linux-iio, netdev,
linux-arm-kernel, linux-amlogic
Cc: ribalda, jic23, nuno.sa, andy, robh, krzk+dt, conor+dt,
andrew+netdev, davem, edumazet, kuba, pabeni, neil.armstrong,
khilman, jbrunet, martin.blumenstingl
On 7/20/25 1:26 PM, > wrote:
> Changes in v2:
> - Fixed commit message grammar
> - Fixed subject line style as per DT convention
> - Added missing reviewers/maintainers in CC
>
By placing this before the headers, our email clients think this
message doesn't have a subject. It should go after the ---.
> From 5c00524cbb47e30ee04223fe9502af2eb003ddf1 Mon Sep 17 00:00:00 2001
> From: sanjay suthar <sanjaysuthar661996@gmail.com>
> Date: Sun, 20 Jul 2025 01:11:00 +0530
> Subject: [PATCH v2] dt-bindings: cleanup: fix duplicated 'is is' in YAML docs
>
> Fix minor grammatical issues by removing duplicated "is" in two devicetree
> binding documents:
>
> - net/amlogic,meson-dwmac.yaml
> - iio/dac/ti,dac7612.yaml
>
> Signed-off-by: sanjay suthar <sanjaysuthar661996@gmail.com>
> ---
This is where the changelog belongs.
> Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml | 2 +-
> Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml b/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> index 20dd1370660d..624c640be4c8 100644
> --- a/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> +++ b/Documentation/devicetree/bindings/iio/dac/ti,dac7612.yaml
> @@ -9,7 +9,7 @@ title: Texas Instruments DAC7612 family of DACs
> description:
> The DAC7612 is a dual, 12-bit digital-to-analog converter (DAC) with
> guaranteed 12-bit monotonicity performance over the industrial temperature
> - range. Is is programmable through an SPI interface.
> + range. It is programmable through an SPI interface.
>
> maintainers:
> - Ricardo Ribalda Delgado <ricardo@ribalda.com>
> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> index 0cd78d71768c..5c91716d1f21 100644
> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
> @@ -149,7 +149,7 @@ properties:
> - description:
> The first register range should be the one of the DWMAC controller
> - description:
> - The second range is is for the Amlogic specific configuration
> + The second range is for the Amlogic specific configuration
> (for example the PRG_ETHERNET register range on Meson8b and newer)
>
> interrupts:
I would be tempted to split this into two patches. It's a bit odd to have
a single patch touching two unrelated bindings.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-07-20 19:30 ` Re: David Lechner
@ 2025-07-21 6:52 ` Krzysztof Kozlowski
-1 siblings, 0 replies; 1651+ messages in thread
From: Krzysztof Kozlowski @ 2025-07-21 6:52 UTC (permalink / raw)
To: David Lechner, >, linux-kernel, devicetree, linux-iio, netdev,
linux-arm-kernel, linux-amlogic
Cc: ribalda, jic23, nuno.sa, andy, robh, krzk+dt, conor+dt,
andrew+netdev, davem, edumazet, kuba, pabeni, neil.armstrong,
khilman, jbrunet, martin.blumenstingl
On 20/07/2025 21:30, David Lechner wrote:
>> - Ricardo Ribalda Delgado <ricardo@ribalda.com>
>> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> index 0cd78d71768c..5c91716d1f21 100644
>> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> @@ -149,7 +149,7 @@ properties:
>> - description:
>> The first register range should be the one of the DWMAC controller
>> - description:
>> - The second range is is for the Amlogic specific configuration
>> + The second range is for the Amlogic specific configuration
>> (for example the PRG_ETHERNET register range on Meson8b and newer)
>>
>> interrupts:
>
> I would be tempted to split this into two patches. It's a bit odd to have
No, it's a churn to split this into more than one patch.
> a single patch touching two unrelated bindings.
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2025-07-21 6:52 ` Krzysztof Kozlowski
0 siblings, 0 replies; 1651+ messages in thread
From: Krzysztof Kozlowski @ 2025-07-21 6:52 UTC (permalink / raw)
To: David Lechner, >, linux-kernel, devicetree, linux-iio, netdev,
linux-arm-kernel, linux-amlogic
Cc: ribalda, jic23, nuno.sa, andy, robh, krzk+dt, conor+dt,
andrew+netdev, davem, edumazet, kuba, pabeni, neil.armstrong,
khilman, jbrunet, martin.blumenstingl
On 20/07/2025 21:30, David Lechner wrote:
>> - Ricardo Ribalda Delgado <ricardo@ribalda.com>
>> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> index 0cd78d71768c..5c91716d1f21 100644
>> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> @@ -149,7 +149,7 @@ properties:
>> - description:
>> The first register range should be the one of the DWMAC controller
>> - description:
>> - The second range is is for the Amlogic specific configuration
>> + The second range is for the Amlogic specific configuration
>> (for example the PRG_ETHERNET register range on Meson8b and newer)
>>
>> interrupts:
>
> I would be tempted to split this into two patches. It's a bit odd to have
No, it's a churn to split this into more than one patch.
> a single patch touching two unrelated bindings.
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 1651+ messages in thread
[parent not found: <CADU64hDZeyaCpHXBmSG1rtHjpxmjejT7asK9oGBUMF55eYeh4w@mail.gmail.com>]
* Re:
[not found] ` <CADU64hDZeyaCpHXBmSG1rtHjpxmjejT7asK9oGBUMF55eYeh4w@mail.gmail.com>
@ 2025-07-21 14:09 ` David Lechner
0 siblings, 0 replies; 1651+ messages in thread
From: David Lechner @ 2025-07-21 14:09 UTC (permalink / raw)
To: Sanjay Suthar, Krzysztof Kozlowski
Cc: linux-kernel, devicetree, linux-iio, netdev, linux-arm-kernel,
linux-amlogic, ribalda, jic23, nuno.sa, andy, robh, krzk+dt,
conor+dt, andrew+netdev, davem, edumazet, kuba, pabeni,
neil.armstrong, khilman, jbrunet, martin.blumenstingl
On 7/21/25 5:15 AM, Sanjay Suthar wrote:
> On Mon, Jul 21, 2025 at 12:22 PM Krzysztof Kozlowski <krzk@kernel.org <mailto:krzk@kernel.org>> wrote:
>>
>> On 20/07/2025 21:30, David Lechner wrote:
>> >> - Ricardo Ribalda Delgado <ricardo@ribalda.com <mailto:ricardo@ribalda.com>>
>> >> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> index 0cd78d71768c..5c91716d1f21 100644
>> >> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> @@ -149,7 +149,7 @@ properties:
>> >> - description:
>> >> The first register range should be the one of the DWMAC controller
>> >> - description:
>> >> - The second range is is for the Amlogic specific configuration
>> >> + The second range is for the Amlogic specific configuration
>> >> (for example the PRG_ETHERNET register range on Meson8b and newer)
>> >>
>> >> interrupts:
>> >
>> > I would be tempted to split this into two patches. It's a bit odd to have
>>
>>
>> No, it's a churn to split this into more than one patch.
>>
>
> Thanks for the reply. Since there are differing suggestions on splitting the
> patch, as it touches different subsystems, it is still not clear to me whether
> I should split it or keep a single patch. I would appreciate guidance on the
> next steps.
>
> Best Regards,
> Sanjay Suthar
Krzysztof is one of the devicetree maintainers and I am not, so you
should do what Krzysztof says - leave it as one patch. :-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2025-07-21 14:09 ` David Lechner
0 siblings, 0 replies; 1651+ messages in thread
From: David Lechner @ 2025-07-21 14:09 UTC (permalink / raw)
To: Sanjay Suthar, Krzysztof Kozlowski
Cc: linux-kernel, devicetree, linux-iio, netdev, linux-arm-kernel,
linux-amlogic, ribalda, jic23, nuno.sa, andy, robh, krzk+dt,
conor+dt, andrew+netdev, davem, edumazet, kuba, pabeni,
neil.armstrong, khilman, jbrunet, martin.blumenstingl
On 7/21/25 5:15 AM, Sanjay Suthar wrote:
> On Mon, Jul 21, 2025 at 12:22 PM Krzysztof Kozlowski <krzk@kernel.org <mailto:krzk@kernel.org>> wrote:
>>
>> On 20/07/2025 21:30, David Lechner wrote:
>> >> - Ricardo Ribalda Delgado <ricardo@ribalda.com <mailto:ricardo@ribalda.com>>
>> >> diff --git a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> index 0cd78d71768c..5c91716d1f21 100644
>> >> --- a/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> +++ b/Documentation/devicetree/bindings/net/amlogic,meson-dwmac.yaml
>> >> @@ -149,7 +149,7 @@ properties:
>> >> - description:
>> >> The first register range should be the one of the DWMAC controller
>> >> - description:
>> >> - The second range is is for the Amlogic specific configuration
>> >> + The second range is for the Amlogic specific configuration
>> >> (for example the PRG_ETHERNET register range on Meson8b and newer)
>> >>
>> >> interrupts:
>> >
>> > I would be tempted to split this into two patches. It's a bit odd to have
>>
>>
>> No, it's a churn to split this into more than one patch.
>>
>
> Thanks for the reply. Since there are differing suggestions on splitting the
> patch, as it touches different subsystems, it is still not clear to me whether
> I should split it or keep a single patch. I would appreciate guidance on the
> next steps.
>
> Best Regards,
> Sanjay Suthar
Krzysztof is one of the devicetree maintainers and I am not, so you
should do what Krzysztof says - leave it as one patch. :-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-07-20 18:26 ` >
@ 2025-07-21 7:52 ` Andy Shevchenko
2025-07-21 7:52 ` Re: Andy Shevchenko
1 sibling, 0 replies; 1651+ messages in thread
From: Andy Shevchenko @ 2025-07-21 7:52 UTC (permalink / raw)
To: >
Cc: linux-kernel, devicetree, linux-iio, netdev, linux-arm-kernel,
linux-amlogic, ribalda, jic23, dlechner, nuno.sa, andy, robh,
krzk+dt, conor+dt, andrew+netdev, davem, edumazet, kuba, pabeni,
neil.armstrong, khilman, jbrunet, martin.blumenstingl
On Sun, Jul 20, 2025 at 11:56:27PM +0530, > wrote:
> Changes in v2:
> - Fixed commit message grammar
> - Fixed subject line style as per DT convention
> - Added missing reviewers/maintainers in CC
>
> From 5c00524cbb47e30ee04223fe9502af2eb003ddf1 Mon Sep 17 00:00:00 2001
> From: sanjay suthar <sanjaysuthar661996@gmail.com>
> Date: Sun, 20 Jul 2025 01:11:00 +0530
> Subject: [PATCH v2] dt-bindings: cleanup: fix duplicated 'is is' in YAML docs
>
> Fix minor grammatical issues by removing duplicated "is" in two devicetree
> binding documents:
>
> - net/amlogic,meson-dwmac.yaml
> - iio/dac/ti,dac7612.yaml
This mail is b0rken.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
@ 2025-07-21 7:52 ` Andy Shevchenko
0 siblings, 0 replies; 1651+ messages in thread
From: Andy Shevchenko @ 2025-07-21 7:52 UTC (permalink / raw)
To: >
Cc: linux-kernel, devicetree, linux-iio, netdev, linux-arm-kernel,
linux-amlogic, ribalda, jic23, dlechner, nuno.sa, andy, robh,
krzk+dt, conor+dt, andrew+netdev, davem, edumazet, kuba, pabeni,
neil.armstrong, khilman, jbrunet, martin.blumenstingl
On Sun, Jul 20, 2025 at 11:56:27PM +0530, > wrote:
> Changes in v2:
> - Fixed commit message grammar
> - Fixed subject line style as per DT convention
> - Added missing reviewers/maintainers in CC
>
> From 5c00524cbb47e30ee04223fe9502af2eb003ddf1 Mon Sep 17 00:00:00 2001
> From: sanjay suthar <sanjaysuthar661996@gmail.com>
> Date: Sun, 20 Jul 2025 01:11:00 +0530
> Subject: [PATCH v2] dt-bindings: cleanup: fix duplicated 'is is' in YAML docs
>
> Fix minor grammatical issues by removing duplicated "is" in two devicetree
> binding documents:
>
> - net/amlogic,meson-dwmac.yaml
> - iio/dac/ti,dac7612.yaml
This mail is b0rken.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-08-06 3:34 Sang-Heon Jeon
2025-08-06 3:44 ` Sang-Heon Jeon
0 siblings, 1 reply; 1651+ messages in thread
From: Sang-Heon Jeon @ 2025-08-06 3:34 UTC (permalink / raw)
To: damon
subscribe damon mailing list
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-08-12 13:34 Baoquan He
2025-08-12 13:49 ` Baoquan He
0 siblings, 1 reply; 1651+ messages in thread
From: Baoquan He @ 2025-08-12 13:34 UTC (permalink / raw)
To: linux-mm, christophe.leroy
alexghiti@rivosinc.com, agordeev@linux.ibm.com, linux@armlinux.org.uk,
linux-arm-kernel@lists.infradead.org, loongarch@lists.linux.dev,
linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
x86@kernel.org, chris@zankel.net, jcmvbkbc@gmail.com, linux-um@lists.infradead.org
Cc: ryabinin.a.a@gmail.com, glider@google.com, andreyknvl@gmail.com,
dvyukov@google.com, vincenzo.frascino@arm.com,
akpm@linux-foundation.org, kasan-dev@googlegroups.com,
linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
sj@kernel.org, lorenzo.stoakes@oracle.com, elver@google.com,
snovitoll@gmail.com
Bcc: bhe@redhat.com
Subject: Re: [PATCH v2 00/12] mm/kasan: make kasan=on|off work for all three
modes
Reply-To:
In-Reply-To: <20250812124941.69508-1-bhe@redhat.com>
Forgot to add the related ARCH mailing lists and people to CC; adding them.
On 08/12/25 at 08:49pm, Baoquan He wrote:
> Currently only the hw_tags mode of kasan can be enabled or disabled with
> the kernel parameter kasan=on|off in a built kernel. For the kasan generic and
> sw_tags modes, there's no way to disable them once the kernel is built.
> This is not always convenient, e.g. in systems where kdump is configured.
> When the 1st kernel has KASAN enabled and a crash triggers the switch to
> the kdump kernel, the generic or sw_tags mode will cost much extra memory
> for the kasan shadow, while in fact it's meaningless to have kasan in the
> kdump kernel.
>
> So this patchset moves the kasan=on|off handling out of hw_tags scope and into
> common code to make it visible in generic and sw_tags mode too. Then we
> can add kasan=off in the kdump kernel to reduce the unneeded memory cost for
> kasan.
>
> Changelog:
> ====
> v1->v2:
> - Add the __ro_after_init annotation, and remove redundant blank
> lines in mm/kasan/common.c. Thanks to Marco.
> - Fix a code bug in <linux/kasan-enabled.h> when CONFIG_KASAN is unset,
> this was found by SeongJae and Lorenzo, and also flagged by an LKP
> report; thanks to them.
> - Add a missing kasan_enabled() check in kasan_report(). Without it, the
> KASAN report below is printed even though kasan=off is set:
> ==================================================================
> BUG: KASAN: stack-out-of-bounds in tick_program_event+0x130/0x150
> Read of size 4 at addr ffff00005f747778 by task swapper/0/1
>
> CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0+ #8 PREEMPT(voluntary)
> Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F31n (SCP: 2.10.20220810) 09/30/2022
> Call trace:
> show_stack+0x30/0x90 (C)
> dump_stack_lvl+0x7c/0xa0
> print_address_description.constprop.0+0x90/0x310
> print_report+0x104/0x1f0
> kasan_report+0xc8/0x110
> __asan_report_load4_noabort+0x20/0x30
> tick_program_event+0x130/0x150
> ......snip...
> ==================================================================
>
> - Add a jump_label_init() call before kasan_init() in setup_arch() on these
> architectures: xtensa, arm. They currently rely on the
> jump_label_init() in main(), which is a little late, so the early static
> key kasan_flag_enabled in kasan_init() wouldn't work.
>
> - In the UML architecture, change to enable kasan_flag_enabled in arch_mm_preinit(),
> because kasan_init() runs before main(), so there's no chance to operate
> on the static key in kasan_init().
>
> Test:
> =====
> In v1, I tested on x86_64 for generic mode, and on arm64 for
> generic, sw_tags and hw_tags modes. All of them work well.
>
> In v2, I only tested on arm64 for generic, sw_tags and hw_tags modes; it
> works. For powerpc, I got a BOOK3S/64 machine, but it says
> 'KASAN not enabled as it requires radix' and KASAN is disabled. Will
> look for another POWER machine to test this.
> ====
>
> Baoquan He (12):
> mm/kasan: add conditional checks in functions to return directly if
> kasan is disabled
> mm/kasan: move kasan= code to common place
> mm/kasan/sw_tags: don't initialize kasan if it's disabled
> arch/arm: don't initialize kasan if it's disabled
> arch/arm64: don't initialize kasan if it's disabled
> arch/loongarch: don't initialize kasan if it's disabled
> arch/powerpc: don't initialize kasan if it's disabled
> arch/riscv: don't initialize kasan if it's disabled
> arch/x86: don't initialize kasan if it's disabled
> arch/xtensa: don't initialize kasan if it's disabled
> arch/um: don't initialize kasan if it's disabled
> mm/kasan: make kasan=on|off take effect for all three modes
>
> arch/arm/kernel/setup.c | 6 +++++
> arch/arm/mm/kasan_init.c | 6 +++++
> arch/arm64/mm/kasan_init.c | 7 ++++++
> arch/loongarch/mm/kasan_init.c | 5 ++++
> arch/powerpc/mm/kasan/init_32.c | 8 +++++-
> arch/powerpc/mm/kasan/init_book3e_64.c | 6 +++++
> arch/powerpc/mm/kasan/init_book3s_64.c | 6 +++++
> arch/riscv/mm/kasan_init.c | 6 +++++
> arch/um/kernel/mem.c | 6 +++++
> arch/x86/mm/kasan_init_64.c | 6 +++++
> arch/xtensa/kernel/setup.c | 1 +
> arch/xtensa/mm/kasan_init.c | 6 +++++
> include/linux/kasan-enabled.h | 18 ++++++-------
> mm/kasan/common.c | 25 ++++++++++++++++++
> mm/kasan/generic.c | 20 +++++++++++++--
> mm/kasan/hw_tags.c | 35 ++------------------------
> mm/kasan/init.c | 6 +++++
> mm/kasan/quarantine.c | 3 +++
> mm/kasan/report.c | 4 ++-
> mm/kasan/shadow.c | 23 ++++++++++++++++-
> mm/kasan/sw_tags.c | 9 +++++++
> 21 files changed, 165 insertions(+), 47 deletions(-)
>
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-12 13:34 Baoquan He
@ 2025-08-12 13:49 ` Baoquan He
0 siblings, 0 replies; 1651+ messages in thread
From: Baoquan He @ 2025-08-12 13:49 UTC (permalink / raw)
To: linux-mm, christophe.leroy
On 08/12/25 at 09:34pm, Baoquan He wrote:
> alexghiti@rivosinc.com, agordeev@linux.ibm.com, linux@armlinux.org.uk,
> linux-arm-kernel@lists.infradead.org, loongarch@lists.linux.dev,
> linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
> x86@kernel.org, chris@zankel.net, jcmvbkbc@gmail.com, linux-um@lists.infradead.org
> Cc: ryabinin.a.a@gmail.com, glider@google.com, andreyknvl@gmail.com,
> dvyukov@google.com, vincenzo.frascino@arm.com,
> akpm@linux-foundation.org, kasan-dev@googlegroups.com,
> linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
> sj@kernel.org, lorenzo.stoakes@oracle.com, elver@google.com,
> snovitoll@gmail.com
> Bcc: bhe@redhat.com
> Subject: Re: [PATCH v2 00/12] mm/kasan: make kasan=on|off work for all three
> modes
> Reply-To:
> In-Reply-To: <20250812124941.69508-1-bhe@redhat.com>
>
> Forgot to add the related ARCH mailing lists and people to CC; adding them.
Sorry for the noise, I made a mistake in the mail format when adding people to
CC.
>
> On 08/12/25 at 08:49pm, Baoquan He wrote:
> > Currently only the hw_tags mode of kasan can be enabled or disabled with
> > the kernel parameter kasan=on|off in a built kernel. For the kasan generic and
> > sw_tags modes, there's no way to disable them once the kernel is built.
> > This is not always convenient, e.g. in systems where kdump is configured.
> > When the 1st kernel has KASAN enabled and a crash triggers the switch to
> > the kdump kernel, the generic or sw_tags mode will cost much extra memory
> > for the kasan shadow, while in fact it's meaningless to have kasan in the
> > kdump kernel.
> >
> > So this patchset moves the kasan=on|off handling out of hw_tags scope and into
> > common code to make it visible in generic and sw_tags mode too. Then we
> > can add kasan=off in the kdump kernel to reduce the unneeded memory cost for
> > kasan.
> >
> > Changelog:
> > ====
> > v1->v2:
> > - Add the __ro_after_init annotation, and remove redundant blank
> > lines in mm/kasan/common.c. Thanks to Marco.
> > - Fix a code bug in <linux/kasan-enabled.h> when CONFIG_KASAN is unset,
> > this was found by SeongJae and Lorenzo, and also flagged by an LKP
> > report; thanks to them.
> > - Add a missing kasan_enabled() check in kasan_report(). Without it, the
> > KASAN report below is printed even though kasan=off is set:
> > ==================================================================
> > BUG: KASAN: stack-out-of-bounds in tick_program_event+0x130/0x150
> > Read of size 4 at addr ffff00005f747778 by task swapper/0/1
> >
> > CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0+ #8 PREEMPT(voluntary)
> > Hardware name: GIGABYTE R272-P30-JG/MP32-AR0-JG, BIOS F31n (SCP: 2.10.20220810) 09/30/2022
> > Call trace:
> > show_stack+0x30/0x90 (C)
> > dump_stack_lvl+0x7c/0xa0
> > print_address_description.constprop.0+0x90/0x310
> > print_report+0x104/0x1f0
> > kasan_report+0xc8/0x110
> > __asan_report_load4_noabort+0x20/0x30
> > tick_program_event+0x130/0x150
> > ......snip...
> > ==================================================================
> >
> > - Add a jump_label_init() call before kasan_init() in setup_arch() on these
> > architectures: xtensa, arm. They currently rely on the
> > jump_label_init() in main(), which is a little late, so the early static
> > key kasan_flag_enabled in kasan_init() wouldn't work.
> >
> > - In the UML architecture, change to enable kasan_flag_enabled in arch_mm_preinit(),
> > because kasan_init() runs before main(), so there's no chance to operate
> > on the static key in kasan_init().
> >
> > Test:
> > =====
> > In v1, I tested on x86_64 for generic mode, and on arm64 for
> > generic, sw_tags and hw_tags modes. All of them work well.
> >
> > In v2, I only tested on arm64 for generic, sw_tags and hw_tags modes; it
> > works. For powerpc, I got a BOOK3S/64 machine, but it says
> > 'KASAN not enabled as it requires radix' and KASAN is disabled. Will
> > look for another POWER machine to test this.
> > ====
> >
> > Baoquan He (12):
> > mm/kasan: add conditional checks in functions to return directly if
> > kasan is disabled
> > mm/kasan: move kasan= code to common place
> > mm/kasan/sw_tags: don't initialize kasan if it's disabled
> > arch/arm: don't initialize kasan if it's disabled
> > arch/arm64: don't initialize kasan if it's disabled
> > arch/loongarch: don't initialize kasan if it's disabled
> > arch/powerpc: don't initialize kasan if it's disabled
> > arch/riscv: don't initialize kasan if it's disabled
> > arch/x86: don't initialize kasan if it's disabled
> > arch/xtensa: don't initialize kasan if it's disabled
> > arch/um: don't initialize kasan if it's disabled
> > mm/kasan: make kasan=on|off take effect for all three modes
> >
> > arch/arm/kernel/setup.c | 6 +++++
> > arch/arm/mm/kasan_init.c | 6 +++++
> > arch/arm64/mm/kasan_init.c | 7 ++++++
> > arch/loongarch/mm/kasan_init.c | 5 ++++
> > arch/powerpc/mm/kasan/init_32.c | 8 +++++-
> > arch/powerpc/mm/kasan/init_book3e_64.c | 6 +++++
> > arch/powerpc/mm/kasan/init_book3s_64.c | 6 +++++
> > arch/riscv/mm/kasan_init.c | 6 +++++
> > arch/um/kernel/mem.c | 6 +++++
> > arch/x86/mm/kasan_init_64.c | 6 +++++
> > arch/xtensa/kernel/setup.c | 1 +
> > arch/xtensa/mm/kasan_init.c | 6 +++++
> > include/linux/kasan-enabled.h | 18 ++++++-------
> > mm/kasan/common.c | 25 ++++++++++++++++++
> > mm/kasan/generic.c | 20 +++++++++++++--
> > mm/kasan/hw_tags.c | 35 ++------------------------
> > mm/kasan/init.c | 6 +++++
> > mm/kasan/quarantine.c | 3 +++
> > mm/kasan/report.c | 4 ++-
> > mm/kasan/shadow.c | 23 ++++++++++++++++-
> > mm/kasan/sw_tags.c | 9 +++++++
> > 21 files changed, 165 insertions(+), 47 deletions(-)
> >
> > --
> > 2.41.0
> >
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH 6.15 000/480] 6.15.10-rc1 review
@ 2025-08-13 15:48 Jon Hunter
2025-08-13 17:25 ` Jon Hunter
0 siblings, 1 reply; 1651+ messages in thread
From: Jon Hunter @ 2025-08-13 15:48 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Greg Kroah-Hartman, patches, linux-kernel, torvalds, akpm, linux,
shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, hargar, broonie, achill,
linux-tegra, stable
On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.15.10 release.
> There are 480 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Failures detected for Tegra ...
Test results for stable-v6.15:
10 builds: 10 pass, 0 fail
28 boots: 28 pass, 0 fail
120 tests: 119 pass, 1 fail
Linux version: 6.15.10-rc1-g2510f67e2e34
Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
tegra194-p3509-0000+p3668-0000, tegra20-ventana,
tegra210-p2371-2180, tegra210-p3450-0000,
tegra30-cardhu-a04
Test failures: tegra194-p2972-0000: boot.py
Jon
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2025-08-13 15:48 [PATCH 6.15 000/480] 6.15.10-rc1 review Jon Hunter
@ 2025-08-13 17:25 ` Jon Hunter
2025-08-14 15:36 ` Greg KH
0 siblings, 1 reply; 1651+ messages in thread
From: Jon Hunter @ 2025-08-13 17:25 UTC (permalink / raw)
To: jonathanh
Cc: achill, akpm, broonie, conor, f.fainelli, gregkh, hargar,
linux-kernel, linux-tegra, linux, lkft-triage, patches, patches,
pavel, rwarsow, shuah, srw, stable, sudipm.mukherjee, torvalds
From: Jon Hunter <jonathanh@nvidia.com>
Date: Wed, 13 Aug 2025 18:18:01 +0100
Subject: Re: [PATCH 6.15 000/480] 6.15.10-rc1 review
X-NVConfidentiality: public
On Wed, Aug 13, 2025 at 08:48:28AM -0700, Jon Hunter wrote:
> On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 6.15.10 release.
> > There are 480 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> Failures detected for Tegra ...
>
> Test results for stable-v6.15:
> 10 builds: 10 pass, 0 fail
> 28 boots: 28 pass, 0 fail
> 120 tests: 119 pass, 1 fail
>
> Linux version: 6.15.10-rc1-g2510f67e2e34
> Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
> tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
> tegra194-p3509-0000+p3668-0000, tegra20-ventana,
> tegra210-p2371-2180, tegra210-p3450-0000,
> tegra30-cardhu-a04
>
> Test failures: tegra194-p2972-0000: boot.py
I am seeing the following kernel warning for both linux-6.15.y and linux-6.16.y …
WARNING KERN sched: DL replenish lagged too much
I believe that this is introduced by …
Peter Zijlstra <peterz@infradead.org>
sched/deadline: Less agressive dl_server handling
This has been reported here: https://lore.kernel.org/all/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com/
Jon
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-13 17:25 ` Jon Hunter
@ 2025-08-14 15:36 ` Greg KH
2025-08-15 16:20 ` Re: Jon Hunter
0 siblings, 1 reply; 1651+ messages in thread
From: Greg KH @ 2025-08-14 15:36 UTC (permalink / raw)
To: Jon Hunter
Cc: achill, akpm, broonie, conor, f.fainelli, hargar, linux-kernel,
linux-tegra, linux, lkft-triage, patches, patches, pavel, rwarsow,
shuah, srw, stable, sudipm.mukherjee, torvalds
On Wed, Aug 13, 2025 at 06:25:32PM +0100, Jon Hunter wrote:
> On Wed, Aug 13, 2025 at 08:48:28AM -0700, Jon Hunter wrote:
> > On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 6.15.10 release.
> > > There are 480 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > > https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc1.gz
> > > or in the git tree and branch at:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> > > and the diffstat can be found below.
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > Failures detected for Tegra ...
> >
> > Test results for stable-v6.15:
> > 10 builds: 10 pass, 0 fail
> > 28 boots: 28 pass, 0 fail
> > 120 tests: 119 pass, 1 fail
> >
> > Linux version: 6.15.10-rc1-g2510f67e2e34
> > Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
> > tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
> > tegra194-p3509-0000+p3668-0000, tegra20-ventana,
> > tegra210-p2371-2180, tegra210-p3450-0000,
> > tegra30-cardhu-a04
> >
> > Test failures: tegra194-p2972-0000: boot.py
>
> I am seeing the following kernel warning for both linux-6.15.y and linux-6.16.y …
>
> WARNING KERN sched: DL replenish lagged too much
>
> I believe that this is introduced by …
>
> Peter Zijlstra <peterz@infradead.org>
> sched/deadline: Less agressive dl_server handling
>
> This has been reported here: https://lore.kernel.org/all/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com/
I've now dropped this.
Is that causing the test failure for you?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-14 15:36 ` Greg KH
@ 2025-08-15 16:20 ` Jon Hunter
2025-08-15 16:53 ` Re: Greg KH
0 siblings, 1 reply; 1651+ messages in thread
From: Jon Hunter @ 2025-08-15 16:20 UTC (permalink / raw)
To: Greg KH
Cc: achill, akpm, broonie, conor, f.fainelli, hargar, linux-kernel,
linux-tegra, linux, lkft-triage, patches, patches, pavel, rwarsow,
shuah, srw, stable, sudipm.mukherjee, torvalds
On 14/08/2025 16:36, Greg KH wrote:
> On Wed, Aug 13, 2025 at 06:25:32PM +0100, Jon Hunter wrote:
>> On Wed, Aug 13, 2025 at 08:48:28AM -0700, Jon Hunter wrote:
>>> On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
>>>> This is the start of the stable review cycle for the 6.15.10 release.
>>>> There are 480 patches in this series, all will be posted as a response
>>>> to this one. If anyone has any issues with these being applied, please
>>>> let me know.
>>>>
>>>> Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
>>>> Anything received after that time might be too late.
>>>>
>>>> The whole patch series can be found in one patch at:
>>>> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc1.gz
>>>> or in the git tree and branch at:
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
>>>> and the diffstat can be found below.
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>>
>>> Failures detected for Tegra ...
>>>
>>> Test results for stable-v6.15:
>>> 10 builds: 10 pass, 0 fail
>>> 28 boots: 28 pass, 0 fail
>>> 120 tests: 119 pass, 1 fail
>>>
>>> Linux version: 6.15.10-rc1-g2510f67e2e34
>>> Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
>>> tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
>>> tegra194-p3509-0000+p3668-0000, tegra20-ventana,
>>> tegra210-p2371-2180, tegra210-p3450-0000,
>>> tegra30-cardhu-a04
>>>
>>> Test failures: tegra194-p2972-0000: boot.py
>>
>> I am seeing the following kernel warning for both linux-6.15.y and linux-6.16.y …
>>
>> WARNING KERN sched: DL replenish lagged too much
>>
>> I believe that this is introduced by …
>>
>> Peter Zijlstra <peterz@infradead.org>
>> sched/deadline: Less agressive dl_server handling
>>
>> This has been reported here: https://lore.kernel.org/all/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com/
>
> I've now dropped this.
>
> Is that causing the test failure for you?
Yes that is causing the test failure. Thanks!
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-15 16:20 ` Re: Jon Hunter
@ 2025-08-15 16:53 ` Greg KH
0 siblings, 0 replies; 1651+ messages in thread
From: Greg KH @ 2025-08-15 16:53 UTC (permalink / raw)
To: Jon Hunter
Cc: achill, akpm, broonie, conor, f.fainelli, hargar, linux-kernel,
linux-tegra, linux, lkft-triage, patches, patches, pavel, rwarsow,
shuah, srw, stable, sudipm.mukherjee, torvalds
On Fri, Aug 15, 2025 at 05:20:34PM +0100, Jon Hunter wrote:
> On 14/08/2025 16:36, Greg KH wrote:
> > On Wed, Aug 13, 2025 at 06:25:32PM +0100, Jon Hunter wrote:
> > > On Wed, Aug 13, 2025 at 08:48:28AM -0700, Jon Hunter wrote:
> > > > On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
> > > > > This is the start of the stable review cycle for the 6.15.10 release.
> > > > > There are 480 patches in this series, all will be posted as a response
> > > > > to this one. If anyone has any issues with these being applied, please
> > > > > let me know.
> > > > >
> > > > > Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
> > > > > Anything received after that time might be too late.
> > > > >
> > > > > The whole patch series can be found in one patch at:
> > > > > https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc1.gz
> > > > > or in the git tree and branch at:
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> > > > > and the diffstat can be found below.
> > > > >
> > > > > thanks,
> > > > >
> > > > > greg k-h
> > > >
> > > > Failures detected for Tegra ...
> > > >
> > > > Test results for stable-v6.15:
> > > > 10 builds: 10 pass, 0 fail
> > > > 28 boots: 28 pass, 0 fail
> > > > 120 tests: 119 pass, 1 fail
> > > >
> > > > Linux version: 6.15.10-rc1-g2510f67e2e34
> > > > Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
> > > > tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
> > > > tegra194-p3509-0000+p3668-0000, tegra20-ventana,
> > > > tegra210-p2371-2180, tegra210-p3450-0000,
> > > > tegra30-cardhu-a04
> > > >
> > > > Test failures: tegra194-p2972-0000: boot.py
> > >
> > > I am seeing the following kernel warning for both linux-6.15.y and linux-6.16.y …
> > >
> > > WARNING KERN sched: DL replenish lagged too much
> > >
> > > I believe that this is introduced by …
> > >
> > > Peter Zijlstra <peterz@infradead.org>
> > > sched/deadline: Less agressive dl_server handling
> > >
> > > This has been reported here: https://lore.kernel.org/all/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoGa82w@mail.gmail.com/
> >
> > I've now dropped this.
> >
> > Is that causing the test failure for you?
>
> Yes that is causing the test failure. Thanks!
Is the test just noticing the warning message? Or is it a functional
failure? Does it also fail on Linus's tree?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] documentation/arm64 : kdump fixed typo errors
@ 2025-08-18 16:08 Jonathan Corbet
2025-09-08 9:54 ` hariconscious
0 siblings, 1 reply; 1651+ messages in thread
From: Jonathan Corbet @ 2025-08-18 16:08 UTC (permalink / raw)
To: hariconscious, shuah, catalin.marinas, will, linux-arm-kernel,
linux-doc, linux-kernel
Cc: HariKrishna
hariconscious@gmail.com writes:
> From: HariKrishna <hariconscious@gmail.com>
>
> kdump.rst documentation typos corrected
>
> Signed-off-by: HariKrishna <hariconscious@gmail.com>
> ---
> Documentation/arch/arm64/kdump.rst | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/arch/arm64/kdump.rst b/Documentation/arch/arm64/kdump.rst
> index 56a89f45df28..d3195a93a066 100644
> --- a/Documentation/arch/arm64/kdump.rst
> +++ b/Documentation/arch/arm64/kdump.rst
> @@ -5,7 +5,7 @@ crashkernel memory reservation on arm64
> Author: Baoquan He <bhe@redhat.com>
>
> Kdump mechanism is used to capture a corrupted kernel vmcore so that
> -it can be subsequently analyzed. In order to do this, a preliminarily
> +it can be subsequently analyzed. In order to do this, a preliminary
> reserved memory is needed to pre-load the kdump kernel and boot such
> kernel if corruption happens.
I don't think this is right. While reserving judgment on
"preliminarily" as a word, the intended use is adverbial, so this change
does not make things better. The better fix, perhaps, is to say
"previously" instead.
Should you choose to resubmit this, we'll need your real name in the
Signed-off-by tag, please.
Thanks,
jon
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2025-08-18 16:08 [PATCH] documentation/arm64 : kdump fixed typo errors Jonathan Corbet
@ 2025-09-08 9:54 ` hariconscious
2025-09-08 13:23 ` Jonathan Corbet
0 siblings, 1 reply; 1651+ messages in thread
From: hariconscious @ 2025-09-08 9:54 UTC (permalink / raw)
To: corbet
Cc: catalin.marinas, hariconscious, linux-arm-kernel, linux-doc,
linux-kernel, shuah, will
From: Jonathan Corbet <corbet@lwn.net>
To: hariconscious@gmail.com, shuah@kernel.org, catalin.marinas@arm.com,
will@kernel.org, linux-arm-kernel@lists.infradead.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: HariKrishna <hariconscious@gmail.com>
Subject: Re: [PATCH] documentation/arm64 : kdump fixed typo errors
In-Reply-To: <20250816120731.24508-1-hariconscious@gmail.com>
References: <20250816120731.24508-1-hariconscious@gmail.com>
Date: Mon, 18 Aug 2025 10:08:21 -0600
Message-ID: <871pp8a9ze.fsf@trenco.lwn.net>
hariconscious@gmail.com writes:
> From: HariKrishna <hariconscious@gmail.com>
>
> kdump.rst documentation typos corrected
>
> Signed-off-by: HariKrishna <hariconscious@gmail.com>
> ---
> Documentation/arch/arm64/kdump.rst | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/arch/arm64/kdump.rst b/Documentation/arch/arm64/kdump.rst
> index 56a89f45df28..d3195a93a066 100644
> --- a/Documentation/arch/arm64/kdump.rst
> +++ b/Documentation/arch/arm64/kdump.rst
> @@ -5,7 +5,7 @@ crashkernel memory reservation on arm64
> Author: Baoquan He <bhe@redhat.com>
>
> Kdump mechanism is used to capture a corrupted kernel vmcore so that
> -it can be subsequently analyzed. In order to do this, a preliminarily
> +it can be subsequently analyzed. In order to do this, a preliminary
> reserved memory is needed to pre-load the kdump kernel and boot such
> kernel if corruption happens.
I don't think this is right. While reserving judgment on
"preliminarily" as a word, the intended use is adverbial, so this change
does not make things better. The better fix, perhaps, is to say
"previously" instead.
Should you choose to resubmit this, we'll need your real name in the
Signed-off-by tag, please.
Thanks,
jon
Hi Jon,
Good day.
Thanks for the suggestion, will correct and send the patch again.
And my real name is "HariKrishna", and I see that it is mentioned in the Signed-off-by tag.
Do I need to add a surname as well? Please let me know.
Thank you.
HariKrishna.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-09-08 9:54 ` hariconscious
@ 2025-09-08 13:23 ` Jonathan Corbet
0 siblings, 0 replies; 1651+ messages in thread
From: Jonathan Corbet @ 2025-09-08 13:23 UTC (permalink / raw)
To: hariconscious
Cc: catalin.marinas, hariconscious, linux-arm-kernel, linux-doc,
linux-kernel, shuah, will
hariconscious@gmail.com writes:
> Thanks for the suggestion, will correct and send the patch again.
> And my real name is "HariKrishna" and see that it is mentioned in Signed-off-by tag.
> Do I need to add surname as well ? Please let me know.
Yes, signoffs should give your full name.
Thanks,
jon
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-08-20 14:33 Christian König
2025-08-20 15:23 ` David Hildenbrand
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2025-08-20 14:33 UTC (permalink / raw)
To: intel-xe, intel-gfx, dri-devel, amd-gfx, x86
Cc: airlied, thomas.hellstrom, matthew.brost, david, dave.hansen,
luto, peterz
Hi everyone,
sorry for CCing so many people, but that rabbit hole turned out to be
deeper than originally thought.
TTM always had problems with UC/WC mappings on 32bit systems and drivers
often had to revert to hacks like using GFP_DMA32 to get things working
while having no rational explanation why that helped (see the TTM AGP,
radeon and nouveau driver code for that).
It turned out that the PAT implementation we use on x86 not only enforces
the same caching attributes for pages in the linear kernel mapping, but
also for highmem pages through a separate R/B tree.
That was unexpected and TTM never updated that R/B tree for highmem pages,
so the function pgprot_set_cachemode() just overwrote the caching
attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
caused all kind of random trouble.
An R/B tree is potentially not a good data structure to hold thousands if
not millions of different attributes for each page, so updating that is
probably not the way to solve this issue.
Thomas pointed out that the i915 driver is using apply_page_range()
instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
just fill in the page tables with what the driver thinks is the right
caching attribute.
This patch set here implements this, and it turns out to be much *faster* than
the old implementation. Together with another change, mapping 1GiB of memory
through TTM on my test system improved by nearly a factor of 10
(197ms -> 20ms)!
Please review the general idea and/or comment on the patches.
Thanks,
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-20 14:33 Christian König
@ 2025-08-20 15:23 ` David Hildenbrand
2025-08-21 8:10 ` Re: Christian König
0 siblings, 1 reply; 1651+ messages in thread
From: David Hildenbrand @ 2025-08-20 15:23 UTC (permalink / raw)
To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
CCing Lorenzo
On 20.08.25 16:33, Christian König wrote:
> Hi everyone,
>
> sorry for CCing so many people, but that rabbit hole turned out to be
> deeper than originally thought.
>
> TTM always had problems with UC/WC mappings on 32bit systems and drivers
> often had to revert to hacks like using GFP_DMA32 to get things working
> while having no rational explanation why that helped (see the TTM AGP,
> radeon and nouveau driver code for that).
>
> It turned out that the PAT implementation we use on x86 not only enforces
> the same caching attributes for pages in the linear kernel mapping, but
> also for highmem pages through a separate R/B tree.
>
> That was unexpected and TTM never updated that R/B tree for highmem pages,
> so the function pgprot_set_cachemode() just overwrote the caching
> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
> caused all kind of random trouble.
>
> An R/B tree is potentially not a good data structure to hold thousands if
> not millions of different attributes for each page, so updating that is
> probably not the way to solve this issue.
>
> Thomas pointed out that the i915 driver is using apply_page_range()
> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
> just fill in the page tables with what the driver thinks is the right
> caching attribute.
I assume you mean apply_to_page_range() -- same issue in patch subjects.
Oh this sounds horrible. Why oh why do we have these hacks in core-mm
and have drivers abuse them :(
Honestly, apply_to_pte_range() is just the entry in doing all kinds of
weird crap to page tables because "you know better".
All the sanity checks from vmf_insert_pfn(), gone.
Can we please fix the underlying issue properly?
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-20 15:23 ` David Hildenbrand
@ 2025-08-21 8:10 ` Christian König
2025-08-25 19:10 ` Re: David Hildenbrand
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2025-08-21 8:10 UTC (permalink / raw)
To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 20.08.25 17:23, David Hildenbrand wrote:
> CCing Lorenzo
>
> On 20.08.25 16:33, Christian König wrote:
>> Hi everyone,
>>
>> sorry for CCing so many people, but that rabbit hole turned out to be
>> deeper than originally thought.
>>
>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>> often had to revert to hacks like using GFP_DMA32 to get things working
>> while having no rational explanation why that helped (see the TTM AGP,
>> radeon and nouveau driver code for that).
>>
>> It turned out that the PAT implementation we use on x86 not only enforces
>> the same caching attributes for pages in the linear kernel mapping, but
>> also for highmem pages through a separate R/B tree.
>>
>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>> so the function pgprot_set_cachemode() just overwrote the caching
>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>> caused all kind of random trouble.
>>
>> An R/B tree is potentially not a good data structure to hold thousands if
>> not millions of different attributes for each page, so updating that is
>> probably not the way to solve this issue.
>>
>> Thomas pointed out that the i915 driver is using apply_page_range()
>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>> just fill in the page tables with what the driver thinks is the right
>> caching attribute.
>
> I assume you mean apply_to_page_range() -- same issue in patch subjects.
Oh yes, of course. Sorry.
> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
Yeah I was also a bit hesitant to use that, but the performance advantage is so high that we probably can't avoid the general approach.
> Honestly, apply_to_pte_range() is just the entry in doing all kinds of weird crap to page tables because "you know better".
Exactly, that's the problem I'm pointing out: drivers *do* know it better. The core memory management has applied incorrect values which caused all kinds of trouble.
The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
What I don't understand is why do we have the PAT in the first place? No other architecture does it this way.
Is that because of x86 CPUs which have problems when different page tables contain different caching attributes for the same physical memory?
> All the sanity checks from vmf_insert_pfn(), gone.
>
> Can we please fix the underlying issue properly?
I'm happy to implement anything advised, my question is: how should we solve this issue?
Regards,
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-21 8:10 ` Re: Christian König
@ 2025-08-25 19:10 ` David Hildenbrand
2025-08-26 8:38 ` Re: Christian König
0 siblings, 1 reply; 1651+ messages in thread
From: David Hildenbrand @ 2025-08-25 19:10 UTC (permalink / raw)
To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 21.08.25 10:10, Christian König wrote:
> On 20.08.25 17:23, David Hildenbrand wrote:
>> CCing Lorenzo
>>
>> On 20.08.25 16:33, Christian König wrote:
>>> Hi everyone,
>>>
>>> sorry for CCing so many people, but that rabbit hole turned out to be
>>> deeper than originally thought.
>>>
>>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>>> often had to revert to hacks like using GFP_DMA32 to get things working
>>> while having no rational explanation why that helped (see the TTM AGP,
>>> radeon and nouveau driver code for that).
>>>
>>> It turned out that the PAT implementation we use on x86 not only enforces
>>> the same caching attributes for pages in the linear kernel mapping, but
>>> also for highmem pages through a separate R/B tree.
>>>
>>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>>> so the function pgprot_set_cachemode() just overwrote the caching
>>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>>> caused all kind of random trouble.
>>>
>>> An R/B tree is potentially not a good data structure to hold thousands if
>>> not millions of different attributes for each page, so updating that is
>>> probably not the way to solve this issue.
>>>
>>> Thomas pointed out that the i915 driver is using apply_page_range()
>>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>>> just fill in the page tables with what the driver thinks is the right
>>> caching attribute.
>>
>> I assume you mean apply_to_page_range() -- same issue in patch subjects.
>
> Oh yes, of course. Sorry.
>
>> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
>
>> Yeah I was also a bit hesitant to use that, but the performance advantage is so high that we probably can't avoid the general approach.
>
>> Honestly, apply_to_pte_range() is just the entry in doing all kinds of weird crap to page tables because "you know better".
>
> Exactly, that's the problem I'm pointing out: drivers *do* know it better. The core memory management has applied incorrect values which caused all kinds of trouble.
>
> The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
>
> What I don't understand is why do we have the PAT in the first place? No other architecture does it this way.
Probably because no other architecture has these weird glitches I assume
... skimming over memtype_reserve() and friends there are quite some
corner cases the code is handling (BIOS, ACPI, low ISA, system RAM, ...)
I did a lot of work on the higher PAT level functions, but I am no
expert on the lower level management functions, and in particular all
the special cases with different memory types.
IIRC, the goal of the PAT subsystem is to make sure that no two page
tables map the same PFN with different caching attributes.
It treats ordinary system RAM (IORESOURCE_SYSTEM_RAM) usually in a
special way: no special caching mode.
For everything else, it expects that someone first reserves a memory
range for a specific caching mode.
For example, remap_pfn_range()...->pfnmap_track()->memtype_reserve()
will make sure that there are no conflicts, and then call
memtype_kernel_map_sync() to make sure the identity mapping is updated
to the new type.
In case someone ends up calling pfnmap_setup_cachemode(), the
expectation is that there was a previous call to memtype_reserve_io() or
similar, such that pfnmap_setup_cachemode() will find that caching mode.
So my assumption would be that that is missing for the drivers here?
Last time I asked where this reservation is done, Peter Xu explained [1]
it at least for VFIO:
vfio_pci_core_mmap
pci_iomap
pci_iomap_range
...
__ioremap_caller
memtype_reserve
Now, could it be that something like that is missing in these drivers
(ioremap etc)?
[1] https://lkml.kernel.org/r/aBDXr-Qp4z0tS50P@x1.local
>
> Is that because of x86 CPUs which have problems when different page tables contain different caching attributes for the same physical memory?
Yes, but I don't think x86 is special here.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-25 19:10 ` Re: David Hildenbrand
@ 2025-08-26 8:38 ` Christian König
2025-08-26 8:46 ` Re: David Hildenbrand
2025-08-26 12:37 ` Re: David Hildenbrand
0 siblings, 2 replies; 1651+ messages in thread
From: Christian König @ 2025-08-26 8:38 UTC (permalink / raw)
To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 25.08.25 21:10, David Hildenbrand wrote:
> On 21.08.25 10:10, Christian König wrote:
>> On 20.08.25 17:23, David Hildenbrand wrote:
>>> CCing Lorenzo
>>>
>>> On 20.08.25 16:33, Christian König wrote:
>>>> Hi everyone,
>>>>
>>>> sorry for CCing so many people, but that rabbit hole turned out to be
>>>> deeper than originally thought.
>>>>
>>>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>>>> often had to revert to hacks like using GFP_DMA32 to get things working
>>>> while having no rational explanation why that helped (see the TTM AGP,
>>>> radeon and nouveau driver code for that).
>>>>
>>>> It turned out that the PAT implementation we use on x86 not only enforces
>>>> the same caching attributes for pages in the linear kernel mapping, but
>>>> also for highmem pages through a separate R/B tree.
>>>>
>>>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>>>> so the function pgprot_set_cachemode() just overwrote the caching
>>>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>>>> caused all kind of random trouble.
>>>>
>>>> An R/B tree is potentially not a good data structure to hold thousands if
>>>> not millions of different attributes for each page, so updating that is
>>>> probably not the way to solve this issue.
>>>>
>>>> Thomas pointed out that the i915 driver is using apply_page_range()
>>>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>>>> just fill in the page tables with what the driver thinks is the right
>>>> caching attribute.
>>>
>>> I assume you mean apply_to_page_range() -- same issue in patch subjects.
>>
>> Oh yes, of course. Sorry.
>>
>>> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
>>
>> Yeah, I was also a bit hesitant to use that, but the performance advantage is so high that we probably can't avoid the general approach.
>>
>>> Honestly, apply_to_pte_range() is just the entry in doing all kinds of weird crap to page tables because "you know better".
>>
>> Exactly, that's the problem I'm pointing out, drivers *do* know it better. The core memory management has applied incorrect values which caused all kinds of trouble.
>>
>> The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
>>
>> What I don't understand is why we have the PAT in the first place. No other architecture does it this way.
>
> Probably because no other architecture has these weird glitches I assume ... skimming over memtype_reserve() and friends there are quite a few corner cases the code is handling (BIOS, ACPI, low ISA, system RAM, ...)
>
>
> I did a lot of work on the higher PAT level functions, but I am no expert on the lower level management functions, and in particular all the special cases with different memory types.
>
> IIRC, the goal of the PAT subsystem is to make sure that no two page tables map the same PFN with different caching attributes.
Yeah, that actually makes sense. Thomas from Intel recently explained the technical background to me:
Some x86 CPUs write back cache lines even if they aren't dirty, and because of the linear mapping the CPU can speculatively load a cache line that is mapped uncached elsewhere.
So the end result is that the writeback of non-dirty cache lines potentially corrupts the data in the otherwise uncached system memory.
But that a) only applies to memory in the linear mapping and b) only to a handful of x86 CPU types (e.g. Intel's recent Lunar Lake, AMD Athlons produced before 2004, maybe others).
> It treats ordinary system RAM (IORESOURCE_SYSTEM_RAM) usually in a special way: no special caching mode.
>
> For everything else, it expects that someone first reserves a memory range for a specific caching mode.
>
> For example, remap_pfn_range()...->pfnmap_track()->memtype_reserve() will make sure that there are no conflicts, then call memtype_kernel_map_sync() to make sure the identity mapping is updated to the new type.
>
> In case someone ends up calling pfnmap_setup_cachemode(), the expectation is that there was a previous call to memtype_reserve_io() or similar, such that pfnmap_setup_cachemode() will find that caching mode.
>
>
> So my assumption would be that that is missing for the drivers here?
Well yes and no.
See, the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually from get_free_page() or similar) and want to apply a certain caching attribute to them.
So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>
> Last time I asked where this reservation is done, Peter Xu explained [1] it at least for VFIO:
>
> vfio_pci_core_mmap
> pci_iomap
> pci_iomap_range
> ...
> __ioremap_caller
> memtype_reserve
>
>
> Now, could it be that something like that is missing in these drivers (ioremap etc)?
Well that would solve the issue temporarily, but I'm pretty sure that will just go boom at a different place then :(
One possibility would be to say that the PAT only overrides the attributes if they aren't normal cached and leaves everything else alone.
What do you think?
Thanks,
Christian.
>
>
>
> [1] https://lkml.kernel.org/r/aBDXr-Qp4z0tS50P@x1.local
>
>
>>
>> Is that because of the x86 CPUs which have problems when different page tables contain different caching attributes for the same physical memory?
>
> Yes, but I don't think x86 is special here.
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-26 8:38 ` Re: Christian König
@ 2025-08-26 8:46 ` David Hildenbrand
2025-08-26 9:00 ` Re: Christian König
2025-08-26 12:37 ` Re: David Hildenbrand
1 sibling, 1 reply; 1651+ messages in thread
From: David Hildenbrand @ 2025-08-26 8:46 UTC (permalink / raw)
To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 26.08.25 10:38, Christian König wrote:
> On 25.08.25 21:10, David Hildenbrand wrote:
>> On 21.08.25 10:10, Christian König wrote:
>>> On 20.08.25 17:23, David Hildenbrand wrote:
>>>> CCing Lorenzo
>>>>
>>>> On 20.08.25 16:33, Christian König wrote:
>>>>> Hi everyone,
>>>>>
>>>>> sorry for CCing so many people, but that rabbit hole turned out to be
>>>>> deeper than originally thought.
>>>>>
>>>>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>>>>> often had to revert to hacks like using GFP_DMA32 to get things working
>>>>> while having no rational explanation why that helped (see the TTM AGP,
>>>>> radeon and nouveau driver code for that).
>>>>>
>>>>> It turned out that the PAT implementation we use on x86 not only enforces
>>>>> the same caching attributes for pages in the linear kernel mapping, but
>>>>> also for highmem pages through a separate R/B tree.
>>>>>
>>>>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>>>>> so the function pgprot_set_cachemode() just overwrote the caching
>>>>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>>>>> caused all kind of random trouble.
>>>>>
>>>>> An R/B tree is potentially not a good data structure to hold thousands if
>>>>> not millions of different attributes for each page, so updating that is
>>>>> probably not the way to solve this issue.
>>>>>
>>>>> Thomas pointed out that the i915 driver is using apply_page_range()
>>>>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>>>>> just fill in the page tables with what the driver thinks is the right
>>>>> caching attribute.
>>>>
>>>> I assume you mean apply_to_page_range() -- same issue in patch subjects.
>>>
>>> Oh yes, of course. Sorry.
>>>
>>>> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
>>>
>>> Yeah, I was also a bit hesitant to use that, but the performance advantage is so high that we probably can't avoid the general approach.
>>>
>>>> Honestly, apply_to_pte_range() is just the entry in doing all kinds of weird crap to page tables because "you know better".
>>>
>>> Exactly, that's the problem I'm pointing out, drivers *do* know it better. The core memory management has applied incorrect values which caused all kinds of trouble.
>>>
>>> The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
>>>
>>> What I don't understand is why we have the PAT in the first place. No other architecture does it this way.
>>
>> Probably because no other architecture has these weird glitches I assume ... skimming over memtype_reserve() and friends there are quite a few corner cases the code is handling (BIOS, ACPI, low ISA, system RAM, ...)
>>
>>
>> I did a lot of work on the higher PAT level functions, but I am no expert on the lower level management functions, and in particular all the special cases with different memory types.
>>
>> IIRC, the goal of the PAT subsystem is to make sure that no two page tables map the same PFN with different caching attributes.
>
> Yeah, that actually makes sense. Thomas from Intel recently explained the technical background to me:
>
> Some x86 CPUs write back cache lines even if they aren't dirty, and because of the linear mapping the CPU can speculatively load a cache line that is mapped uncached elsewhere.
>
> So the end result is that the writeback of non-dirty cache lines potentially corrupts the data in the otherwise uncached system memory.
>
> But that a) only applies to memory in the linear mapping and b) only to a handful of x86 CPU types (e.g. Intel's recent Lunar Lake, AMD Athlons produced before 2004, maybe others).
>
>> It treats ordinary system RAM (IORESOURCE_SYSTEM_RAM) usually in a special way: no special caching mode.
>>
>> For everything else, it expects that someone first reserves a memory range for a specific caching mode.
>>
>> For example, remap_pfn_range()...->pfnmap_track()->memtype_reserve() will make sure that there are no conflicts, then call memtype_kernel_map_sync() to make sure the identity mapping is updated to the new type.
>>
>> In case someone ends up calling pfnmap_setup_cachemode(), the expectation is that there was a previous call to memtype_reserve_io() or similar, such that pfnmap_setup_cachemode() will find that caching mode.
>>
>>
>> So my assumption would be that that is missing for the drivers here?
>
> Well yes and no.
>
> See, the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually from get_free_page() or similar) and want to apply a certain caching attribute to them.
>
> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>
Hm, above you're saying that there is no direct map, but now you are
saying that the pages were obtained through get_free_page()?
I agree that what you describe here sounds suboptimal. But if the pages
were obtained from the buddy, there surely is a direct map -- unless we
explicitly remove it :(
If we're talking about individual pages without a directmap, I would
wonder if they are actually part of a bigger memory region that can just
be reserved in one go (similar to how remap_pfn_range()) would handle it.
Can you briefly describe how your use case obtains these PFNs, and how
scattered they + their caching attributes might be?
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-26 8:46 ` Re: David Hildenbrand
@ 2025-08-26 9:00 ` Christian König
2025-08-26 9:17 ` Re: David Hildenbrand
0 siblings, 1 reply; 1651+ messages in thread
From: Christian König @ 2025-08-26 9:00 UTC (permalink / raw)
To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 26.08.25 10:46, David Hildenbrand wrote:
>>> So my assumption would be that that is missing for the drivers here?
>>
>> Well yes and no.
>>
>> See, the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually from get_free_page() or similar) and want to apply a certain caching attribute to them.
>>
>> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>>
>
> Hm, above you're saying that there is no direct map, but now you are saying that the pages were obtained through get_free_page()?
The problem only happens with highmem pages on 32bit kernels. Those pages are not in the linear mapping.
> I agree that what you describe here sounds suboptimal. But if the pages were obtained from the buddy, there surely is a direct map -- unless we explicitly remove it :(
>
> If we're talking about individual pages without a directmap, I would wonder if they are actually part of a bigger memory region that can just be reserved in one go (similar to how remap_pfn_range()) would handle it.
>
> Can you briefly describe how your use case obtains these PFNs, and how scattered they + their caching attributes might be?
What drivers do is to call get_free_page() or alloc_pages_node() with the GFP_HIGHUSER flag set.
For non-highmem pages drivers then call set_pages_wc/uc(), which changes the caching of the linear mapping; but for highmem pages there is no linear mapping, so set_pages_wc() or set_pages_uc() doesn't work and drivers avoid calling it.
Those are basically just random system memory pages. So they are potentially scattered over the whole memory address space.
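To make that concrete, the allocation pattern described above looks roughly like this (an illustrative, non-buildable sketch; ttm_alloc_uc_page() is a made-up name and error handling is elided, but the kernel calls shown are the ones being discussed):

```c
/* Illustrative sketch only: ttm_alloc_uc_page() is an invented name. */
static struct page *ttm_alloc_uc_page(void)
{
	struct page *page = alloc_pages(GFP_HIGHUSER, 0);

	if (!page)
		return NULL;

	/*
	 * Lowmem pages: switch the linear (direct) mapping to uncached
	 * so the same PFN is never mapped with conflicting attributes.
	 * Highmem pages have no linear mapping, so drivers skip this --
	 * which is exactly where PAT's R/B tree and the drivers disagree.
	 */
	if (!PageHighMem(page))
		set_pages_uc(page, 1);

	return page;
}
```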
Regards,
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-26 9:00 ` Re: Christian König
@ 2025-08-26 9:17 ` David Hildenbrand
2025-08-26 9:56 ` Re: Christian König
0 siblings, 1 reply; 1651+ messages in thread
From: David Hildenbrand @ 2025-08-26 9:17 UTC (permalink / raw)
To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 26.08.25 11:00, Christian König wrote:
> On 26.08.25 10:46, David Hildenbrand wrote:
>>>> So my assumption would be that that is missing for the drivers here?
>>>
>>> Well yes and no.
>>>
>>> See, the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually from get_free_page() or similar) and want to apply a certain caching attribute to them.
>>>
>>> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>>>
>>
>> Hm, above you're saying that there is no direct map, but now you are saying that the pages were obtained through get_free_page()?
>
> The problem only happens with highmem pages on 32bit kernels. Those pages are not in the linear mapping.
Right, in the common case there is a direct map.
>
>> I agree that what you describe here sounds suboptimal. But if the pages were obtained from the buddy, there surely is a direct map -- unless we explicitly remove it :(
>>
>> If we're talking about individual pages without a directmap, I would wonder if they are actually part of a bigger memory region that can just be reserved in one go (similar to how remap_pfn_range()) would handle it.
>>
>> Can you briefly describe how your use case obtains these PFNs, and how scattered they + their caching attributes might be?
>
> What drivers do is to call get_free_page() or alloc_pages_node() with the GFP_HIGHUSER flag set.
>
> For non-highmem pages drivers then call set_pages_wc/uc(), which changes the caching of the linear mapping; but for highmem pages there is no linear mapping, so set_pages_wc() or set_pages_uc() doesn't work and drivers avoid calling it.
>
> Those are basically just random system memory pages. So they are potentially scattered over the whole memory address space.
Thanks, that's valuable information.
So essentially these drivers maintain their own consistency and PAT is
not aware of that.
And the real problem is ordinary system RAM.
There are various ways forward.
1) We use another interface that consumes pages instead of PFNs, like a
vm_insert_pages_pgprot() we would be adding.
Is there any strong requirement for inserting non-refcounted PFNs?
2) We add another interface that consumes PFNs, but explicitly states
that it is only for ordinary system RAM, and that the user is
required to update the direct map.
We could sanity-check the direct map in debug kernels.
3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
system RAM differently.
There is also the option for a mixture between 1 and 2, where we get
pages, but we map them non-refcounted in a VM_PFNMAP.
In general, having pages makes it easier to assert that they are likely
ordinary system ram pages, and that the interface is not getting abused
for something else.
We could also perform the set_pages_wc/uc() from inside that function,
but maybe it depends on the use case whether we want to do that whenever
we map them into a process?
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-26 9:17 ` Re: David Hildenbrand
@ 2025-08-26 9:56 ` Christian König
2025-08-26 12:07 ` Re: David Hildenbrand
2025-08-26 14:27 ` Re: Thomas Hellström
0 siblings, 2 replies; 1651+ messages in thread
From: Christian König @ 2025-08-26 9:56 UTC (permalink / raw)
To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 26.08.25 11:17, David Hildenbrand wrote:
> On 26.08.25 11:00, Christian König wrote:
>> On 26.08.25 10:46, David Hildenbrand wrote:
>>>>> So my assumption would be that that is missing for the drivers here?
>>>>
>>>> Well yes and no.
>>>>
>>>> See, the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually from get_free_page() or similar) and want to apply a certain caching attribute to them.
>>>>
>>>> So what would happen is that we completely clutter the R/B tree used by the PAT with thousands if not millions of entries.
>>>>
>>>
>>> Hm, above you're saying that there is no direct map, but now you are saying that the pages were obtained through get_free_page()?
>>
>> The problem only happens with highmem pages on 32bit kernels. Those pages are not in the linear mapping.
>
> Right, in the common case there is a direct map.
>
>>
>>> I agree that what you describe here sounds suboptimal. But if the pages were obtained from the buddy, there surely is a direct map -- unless we explicitly remove it :(
>>>
>>> If we're talking about individual pages without a directmap, I would wonder if they are actually part of a bigger memory region that can just be reserved in one go (similar to how remap_pfn_range()) would handle it.
>>>
>>> Can you briefly describe how your use case obtains these PFNs, and how scattered they + their caching attributes might be?
>>
>> What drivers do is to call get_free_page() or alloc_pages_node() with the GFP_HIGHUSER flag set.
>>
>> For non-highmem pages drivers then call set_pages_wc/uc(), which changes the caching of the linear mapping; but for highmem pages there is no linear mapping, so set_pages_wc() or set_pages_uc() doesn't work and drivers avoid calling it.
>>
>> Those are basically just random system memory pages. So they are potentially scattered over the whole memory address space.
>
> Thanks, that's valuable information.
>
> So essentially these drivers maintain their own consistency and PAT is not aware of that.
>
> And the real problem is ordinary system RAM.
>
> There are various ways forward.
>
> 1) We use another interface that consumes pages instead of PFNs, like a
> vm_insert_pages_pgprot() we would be adding.
>
> Is there any strong requirement for inserting non-refcounted PFNs?
Yes, there is a strong requirement to insert non-refcounted PFNs.
We had a lot of trouble with KVM people trying to grab a reference to those pages even if the VMA had the VM_PFNMAP flag set.
> 2) We add another interface that consumes PFNs, but explicitly states
> that it is only for ordinary system RAM, and that the user is
>> required to update the direct map.
>
> We could sanity-check the direct map in debug kernels.
I would rather like to see vmf_insert_pfn_prot() fixed instead.
That function was explicitly added to insert the PFN with the given attributes and as far as I can see all users of that function expect exactly that.
>
> 3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
> system RAM differently.
>
>
> There is also the option for a mixture between 1 and 2, where we get pages, but we map them non-refcounted in a VM_PFNMAP.
>
> In general, having pages makes it easier to assert that they are likely ordinary system ram pages, and that the interface is not getting abused for something else.
Well, exactly that's the use case here and that is not abusive at all as far as I can see.
What drivers want is to insert a PFN with a certain set of caching attributes regardless of whether it's system memory or iomem. That's why vmf_insert_pfn_prot() was created in the first place.
That drivers need to call set_pages_wc/uc() for the linear mapping on x86 manually is correct and checking that is clearly a good idea for debug kernels.
> We could also perform the set_pages_wc/uc() from inside that function, but maybe it depends on the use case whether we want to do that whenever we map them into a process?
It sounds like a good idea in theory, but I think it is potentially too much overhead to be practical.
Thanks,
Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-26 9:56 ` Re: Christian König
@ 2025-08-26 12:07 ` David Hildenbrand
2025-08-26 16:09 ` Re: Christian König
2025-08-26 14:27 ` Re: Thomas Hellström
1 sibling, 1 reply; 1651+ messages in thread
From: David Hildenbrand @ 2025-08-26 12:07 UTC (permalink / raw)
To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
>>
>> 1) We use another interface that consumes pages instead of PFNs, like a
>> vm_insert_pages_pgprot() we would be adding.
>>
>> Is there any strong requirement for inserting non-refcounted PFNs?
>
> Yes, there is a strong requirement to insert non-refcounted PFNs.
>
> We had a lot of trouble with KVM people trying to grab a reference to those pages even if the VMA had the VM_PFNMAP flag set.
Yes, KVM ignored (and maybe still does) VM_PFNMAP to some degree, which
is rather nasty.
>
>> 2) We add another interface that consumes PFNs, but explicitly states
>> that it is only for ordinary system RAM, and that the user is
>> required to update the direct map.
>>
>> We could sanity-check the direct map in debug kernels.
>
> I would rather like to see vmf_insert_pfn_prot() fixed instead.
>
> That function was explicitly added to insert the PFN with the given attributes and as far as I can see all users of that function expect exactly that.
It's all a bit tricky :(
>
>>
>> 3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
>> system RAM differently.
>>
>>
>> There is also the option for a mixture between 1 and 2, where we get pages, but we map them non-refcounted in a VM_PFNMAP.
>>
>> In general, having pages makes it easier to assert that they are likely ordinary system ram pages, and that the interface is not getting abused for something else.
>
> Well, exactly that's the use case here and that is not abusive at all as far as I can see.
>
> What drivers want is to insert a PFN with a certain set of caching attributes regardless if it's system memory or iomem. That's why vmf_insert_pfn_prot() was created in the first place.
I mean, the use case of "allocate pages from the buddy and fixup the
linear map" sounds perfectly reasonable to me. Absolutely no reason to
get PAT involved. Nobody else should be messing with that memory after all.
As soon as we are talking about other memory ranges (iomem) that are not
from the buddy, it gets weird to bypass PAT, and the question I am
asking myself is, when is it okay, and when not.
>
> That drivers need to call set_pages_wc/uc() for the linear mapping on x86 manually is correct and checking that is clearly a good idea for debug kernels.
I'll have to think about this a bit: assuming only vmf_insert_pfn()
calls pfnmap_setup_cachemode_pfn() but vmf_insert_pfn_prot() doesn't,
how could we sanity check that somebody is doing something against the
will of PAT.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-26 12:07 ` Re: David Hildenbrand
@ 2025-08-26 16:09 ` Christian König
0 siblings, 0 replies; 1651+ messages in thread
From: Christian König @ 2025-08-26 16:09 UTC (permalink / raw)
To: David Hildenbrand, intel-xe, intel-gfx, dri-devel, amd-gfx, x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 26.08.25 14:07, David Hildenbrand wrote:
>>
>>> 2) We add another interface that consumes PFNs, but explicitly states
>>> that it is only for ordinary system RAM, and that the user is
>>> required to update the direct map.
>>>
>>> We could sanity-check the direct map in debug kernels.
>>
>> I would rather like to see vmf_insert_pfn_prot() fixed instead.
>>
>> That function was explicitly added to insert the PFN with the given attributes and as far as I can see all users of that function expect exactly that.
>
> It's all a bit tricky :(
I would rather say horribly complicated :(
>>>
>>> 3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating this
>>> system RAM differently.
>>>
>>>
>>> There is also the option for a mixture between 1 and 2, where we get pages, but we map them non-refcounted in a VM_PFNMAP.
>>>
>>> In general, having pages makes it easier to assert that they are likely ordinary system ram pages, and that the interface is not getting abused for something else.
>>
>> Well, exactly that's the use case here and that is not abusive at all as far as I can see.
>>
>> What drivers want is to insert a PFN with a certain set of caching attributes regardless if it's system memory or iomem. That's why vmf_insert_pfn_prot() was created in the first place.
>
> I mean, the use case of "allocate pages from the buddy and fixup the linear map" sounds perfectly reasonable to me. Absolutely no reason to get PAT involved. Nobody else should be messing with that memory after all.
>
> As soon as we are talking about other memory ranges (iomem) that are not from the buddy, it gets weird to bypass PAT, and the question I am asking myself is, when is it okay, and when not.
Ok let me try to explain parts of the history and the big picture for at least the graphics use case on x86.
In 1996/97 Intel came up with the idea of AGP: https://en.wikipedia.org/wiki/Accelerated_Graphics_Port
At that time the CPUs, PCI bus and system memory were all connected together through the north bridge: https://en.wikipedia.org/wiki/Northbridge_(computing)
The problem was that AGP also introduced the concept of putting large amounts of data for the video controller (PCI device) into system memory when you don't have enough local device memory (VRAM).
But that meant when that memory is cached that the north bridge always had to snoop the CPU cache over the front side bus for every access the video controller made. This meant a huge performance bottleneck, so the idea was born to access that data uncached.
Well, that was nearly 30 years ago; PCI, AGP and the front side bus are long gone, but the concept of putting video controller (GPU) data into uncached system memory has prevailed.
So for example even modern AMD CPU based laptops need uncached system memory if their local memory is not large enough to contain the picture to display on the monitor. And with modern 8k monitors that can actually happen quite fast...
What drivers do today is to call vmf_insert_pfn_prot() either with the PFN of their local memory (iomem) or uncached/wc system memory.
To summarize: that we have an interface to fill in the page tables with either iomem or system memory is actually part of the design. That's how the HW driver is expected to work.
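As a rough sketch, such a driver fault handler looks something like this (illustrative only; my_bo and its fields are invented for the example, while vmf_insert_pfn_prot() is the real interface under discussion):

```c
/* Illustrative sketch; struct my_bo and its fields are made up. */
static vm_fault_t my_bo_fault(struct vm_fault *vmf)
{
	struct my_bo *bo = vmf->vma->vm_private_data;
	pgprot_t prot = pgprot_writecombine(vmf->vma->vm_page_prot);
	unsigned long pfn;

	if (bo->in_vram)
		/* Device-local memory: iomem behind a PCI BAR. */
		pfn = (bo->vram_base >> PAGE_SHIFT) + vmf->pgoff;
	else
		/* Scattered system pages, already switched to WC/UC. */
		pfn = page_to_pfn(bo->pages[vmf->pgoff]);

	/*
	 * The driver expects the attributes in 'prot' to be used as-is;
	 * pgprot_set_cachemode() silently overriding them is the bug
	 * discussed in this thread.
	 */
	return vmf_insert_pfn_prot(vmf->vma, vmf->address, pfn, prot);
}
```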
>> That drivers need to call set_pages_wc/uc() for the linear mapping on x86 manually is correct and checking that is clearly a good idea for debug kernels.
>
> I'll have to think about this a bit: assuming only vmf_insert_pfn() calls pfnmap_setup_cachemode_pfn() but vmf_insert_pfn_prot() doesn't, how could we sanity check that somebody is doing something against the will of PAT.
I think the most defensive approach for a quick fix is this change here:
 static inline void pgprot_set_cachemode(pgprot_t *prot, enum page_cache_mode pcm)
 {
-	*prot = __pgprot((pgprot_val(*prot) & ~_PAGE_CACHE_MASK) |
-			 cachemode2protval(pcm));
+	if (pcm != _PAGE_CACHE_MODE_WB)
+		*prot = __pgprot((pgprot_val(*prot) & ~_PAGE_CACHE_MASK) |
+				 cachemode2protval(pcm));
 }
This applies the PAT value if it's anything other than _PAGE_CACHE_MODE_WB but still allows callers to use something different on normal WB system memory.
What do you think?
Regards,
Christian
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-26 9:56 ` Re: Christian König
2025-08-26 12:07 ` Re: David Hildenbrand
@ 2025-08-26 14:27 ` Thomas Hellström
1 sibling, 0 replies; 1651+ messages in thread
From: Thomas Hellström @ 2025-08-26 14:27 UTC (permalink / raw)
To: Christian König, David Hildenbrand, intel-xe, intel-gfx,
dri-devel, amd-gfx, x86
Cc: airlied, matthew.brost, dave.hansen, luto, peterz,
Lorenzo Stoakes
Hi, Christian,
On Tue, 2025-08-26 at 11:56 +0200, Christian König wrote:
> On 26.08.25 11:17, David Hildenbrand wrote:
> > On 26.08.25 11:00, Christian König wrote:
> > > On 26.08.25 10:46, David Hildenbrand wrote:
> > > > > > So my assumption would be that that is missing for the
> > > > > > drivers here?
> > > > >
> > > > > Well yes and no.
> > > > >
> > > > > See the PAT is optimized for applying specific caching
> > > > > attributes to ranges [A..B] (e.g. it uses an R/B tree). But
> > > > > what drivers do here is that they have single pages (usually
> > > > > for get_free_page or similar) and want to apply a certain
> > > > > caching attribute to it.
> > > > >
> > > > > So what would happen is that we completely clutter the R/B
> > > > > tree used by the PAT with thousands if not millions of
> > > > > entries.
> > > > >
> > > >
> > > > Hm, above you're saying that there is no direct map, but now
> > > > you are saying that the pages were obtained through
> > > > get_free_page()?
> > >
> > > The problem only happens with highmem pages on 32bit kernels.
> > > Those pages are not in the linear mapping.
> >
> > Right, in the common case there is a direct map.
> >
> > >
> > > > I agree that what you describe here sounds suboptimal. But if
> > > > the pages where obtained from the buddy, there surely is a
> > > > direct map -- unless we explicitly remove it :(
> > > >
> > > > If we're talking about individual pages without a directmap, I
> > > > would wonder if they are actually part of a bigger memory
> > > > region that can just be reserved in one go (similar to how
> > > > remap_pfn_range()) would handle it.
> > > >
> > > > Can you briefly describe how your use case obtains these PFNs,
> > > > and how scattered they + their caching attributes might be?
> > >
> > > What drivers do is to call get_free_page() or alloc_pages_node()
> > > with the GFP_HIGHUSER flag set.
> > >
> > > For non-highmem pages drivers then call set_pages_wc/uc() which
> > > changes the caching of the linear mapping, but for highmem pages
> > > there is no linear mapping so set_pages_wc() or set_pages_uc()
> > > doesn't work and drivers avoid calling it.
> > >
> > > Those are basically just random system memory pages. So they are
> > > potentially scattered over the whole memory address space.
> >
> > Thanks, that's valuable information.
> >
> > So essentially these drivers maintain their own consistency and PAT
> > is not aware of that.
> >
> > And the real problem is ordinary system RAM.
> >
> > There are various ways forward.
> >
> > 1) We use another interface that consumes pages instead of PFNs,
> > like a
> > vm_insert_pages_pgprot() we would be adding.
> >
> > Is there any strong requirement for inserting non-refcounted
> > PFNs?
>
> Yes, there is a strong requirement to insert non-refcounted PFNs.
>
> We had a lot of trouble with KVM people trying to grab a reference to
> those pages even if the VMA had the VM_PFNMAP flag set.
>
> > 2) We add another interface that consumes PFNs, but explicitly
> > states
> > that it is only for ordinary system RAM, and that the user is
> > required to update the direct map.
> >
> > We could sanity-check the direct map in debug kernels.
>
> I would rather like to see vmf_insert_pfn_prot() fixed instead.
>
> That function was explicitly added to insert the PFN with the given
> attributes and as far as I can see all users of that function expect
> exactly that.
>
> >
> > 3) We teach PAT code in pfnmap_setup_cachemode_pfn() about treating
> > this
> > system RAM differently.
> >
> >
> > There is also the option for a mixture between 1 and 2, where we
> > get pages, but we map them non-refcounted in a VM_PFNMAP.
> >
> > In general, having pages makes it easier to assert that they are
> > likely ordinary system ram pages, and that the interface is not
> > getting abused for something else.
>
> Well, exactly that's the use case here and that is not abusive at all
> as far as I can see.
>
> What drivers want is to insert a PFN with a certain set of caching
> attributes regardless if it's system memory or iomem. That's why
> vmf_insert_pfn_prot() was created in the first place.
>
> That drivers need to call set_pages_wc/uc() for the linear mapping on
> x86 manually is correct and checking that is clearly a good idea for
> debug kernels.
So where is this trending? Is the current suggestion to continue
disallowing aliased mappings with conflicting caching modes and enforce
checks in debug kernels?
/Thomas
>
> > We could also perform the set_pages_wc/uc() from inside that
> > function, but maybe it depends on the use case whether we want to
> > do that whenever we map them into a process?
>
> It sounds like a good idea in theory, but I think it is potentially
> too much overhead to be practical.
>
> Thanks,
> Christian.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-08-26 8:38 ` Re: Christian König
2025-08-26 8:46 ` Re: David Hildenbrand
@ 2025-08-26 12:37 ` David Hildenbrand
1 sibling, 0 replies; 1651+ messages in thread
From: David Hildenbrand @ 2025-08-26 12:37 UTC (permalink / raw)
To: Christian König, intel-xe, intel-gfx, dri-devel, amd-gfx,
x86
Cc: airlied, thomas.hellstrom, matthew.brost, dave.hansen, luto,
peterz, Lorenzo Stoakes
On 26.08.25 10:38, Christian König wrote:
> On 25.08.25 21:10, David Hildenbrand wrote:
>> On 21.08.25 10:10, Christian König wrote:
>>> On 20.08.25 17:23, David Hildenbrand wrote:
>>>> CCing Lorenzo
>>>>
>>>> On 20.08.25 16:33, Christian König wrote:
>>>>> Hi everyone,
>>>>>
>>>>> sorry for CCing so many people, but that rabbit hole turned out to be
>>>>> deeper than originally thought.
>>>>>
>>>>> TTM always had problems with UC/WC mappings on 32bit systems and drivers
>>>>> often had to revert to hacks like using GFP_DMA32 to get things working
>>>>> while having no rational explanation why that helped (see the TTM AGP,
>>>>> radeon and nouveau driver code for that).
>>>>>
>>>>> It turned out that the PAT implementation we use on x86 not only enforces
>>>>> the same caching attributes for pages in the linear kernel mapping, but
>>>>> also for highmem pages through a separate R/B tree.
>>>>>
>>>>> That was unexpected and TTM never updated that R/B tree for highmem pages,
>>>>> so the function pgprot_set_cachemode() just overwrote the caching
>>>>> attributes drivers passed in to vmf_insert_pfn_prot() and that essentially
>>>>> caused all kind of random trouble.
>>>>>
>>>>> An R/B tree is potentially not a good data structure to hold thousands if
>>>>> not millions of different attributes for each page, so updating that is
>>>>> probably not the way to solve this issue.
>>>>>
>>>>> Thomas pointed out that the i915 driver is using apply_page_range()
>>>>> instead of vmf_insert_pfn_prot() to circumvent the PAT implementation and
>>>>> just fill in the page tables with what the driver things is the right
>>>>> caching attribute.
>>>>
>>>> I assume you mean apply_to_page_range() -- same issue in patch subjects.
>>>
>>> Oh yes, of course. Sorry.
>>>
>>>> Oh this sounds horrible. Why oh why do we have these hacks in core-mm and have drivers abuse them :(
>>>
>>> Yeah I was also a bit hesitant to use that, but the performance advantage is so high that we probably can't avoid the general approach.
>>>
>>>> Honestly, apply_to_pte_range() is just the entry in doing all kinds of weird crap to page tables because "you know better".
>>>
>>> Exactly that's the problem I'm pointing out, drivers *do* know it better. The core memory management has applied incorrect values which caused all kind of the trouble.
>>>
>>> The problem is not a bug in PAT nor TTM/drivers but rather how they interact with each other.
>>>
>>> What I don't understand is why do we have the PAT in the first place? No other architecture does it this way.
>>
>> Probably because no other architecture has these weird glitches I assume ... skimming over memtype_reserve() and friends there are quite some corner cases the code is handling (BIOS, ACPI, low ISA, system RAM, ...)
>>
>>
>> I did a lot of work on the higher PAT level functions, but I am no expert on the lower level management functions, and in particular all the special cases with different memory types.
>>
>> IIRC, the goal of the PAT subsystem is to make sure that no two page tables map the same PFN with different caching attributes.
>
> Yeah, that actually makes sense. Thomas from Intel recently explained the technical background to me:
>
> Some x86 CPUs write back cache lines even if they aren't dirty and what can happen is that because of the linear mapping the CPU speculatively loads a cache line which is elsewhere mapped uncached.
>
> So the end result is that the writeback of not dirty cache lines potentially corrupts the data in the otherwise uncached system memory.
>
> But that a) only applies to memory in the linear mapping and b) only to a handful of x86 CPU types (e.g. recently Intel's Lunar Lake, AMD Athlons produced before 2004, maybe others).
>
>> It treats ordinary system RAM (IORESOURCE_SYSTEM_RAM) usually in a special way: no special caching mode.
>>
>> For everything else, it expects that someone first reserves a memory range for a specific caching mode.
>>
>> For example, remap_pfn_range()...->pfnmap_track()->memtype_reserve() will make sure that there are no conflicts, and then call memtype_kernel_map_sync() to make sure the identity mapping is updated to the new type.
>>
>> In case someone ends up calling pfnmap_setup_cachemode(), the expectation is that there was a previous call to memtype_reserve_io() or similar, such that pfnmap_setup_cachemode() will find that caching mode.
>>
>>
>> So my assumption would be that that is missing for the drivers here?
>
> Well yes and no.
>
> See the PAT is optimized for applying specific caching attributes to ranges [A..B] (e.g. it uses an R/B tree). But what drivers do here is that they have single pages (usually from get_free_page() or similar) and want to apply a certain caching attribute to it.
One clarification after staring at PAT code once again: for pages (RAM),
the caching attribute is stored in the page flags, not in the R/B tree.
If nothing was set, it defaults to _PAGE_CACHE_MODE_WB AFAICS.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [RFC PATCH] time: introduce BOOT_TIME_TRACKER and minimal boot timestamp
@ 2025-08-23 22:53 Thomas Gleixner
2025-09-01 4:05 ` Kaiwan N Billimoria
0 siblings, 1 reply; 1651+ messages in thread
From: Thomas Gleixner @ 2025-08-23 22:53 UTC (permalink / raw)
To: vishnu singh, mark.rutland, maz, catalin.marinas, will, jstultz,
sboyd, akpm, chenhuacai, pmladek, agordeev, bigeasy, urezki,
Llillian, francesco, guoweikang.kernel, alexander.shishkin,
rrangel, kpsingh, anna-maria, mingo, frederic, linux-kernel,
linux-arm-kernel
On Sat, Aug 23 2025 at 9:49, vishnu singh wrote:
On Sat, Aug 23 2025 at 10:10, vishnu singh wrote:
Why are you sending the same thing twice within 20 minutes?
> From: Vishnu Singh <v-singh1@ti.com>
>
> Part of BOOT SIG Initiative,
What the heck is BOOT SIG Initiative? Are you seriously expecting me to
look this up?
Please don't provide me a random link to it because it's not relevant to
the rest of what I have to say about this.
> , This patch adds a tiny,opt-in facility to help measure kernel boot
> duration without full tracing:
1) Any spell checker would have pointed out to you that 'This' after a
comma is not a proper sentence and neither is 'tiny,opt-in facility'
2) You failed to read documentation:
# git grep "This patch" Documentation/process/
3) This change log starts with the WHAT and fails completely to explain
the WHY. See:
https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#changelog
> New CONFIG_BOOT_TIME_TRACKER in kernel/time/Kconfig.
> When enabled, the kernel logs two boot markers:
> 1. kernel entry in start_kernel()
> 2. first userspace start in kernel_init() before run_init_process()
>
> These markers are intended for post-boot parsers to compute coarse
> kernel boot time and to merge with bootloader/MCU/DSP records into
> a unified timeline.
>
> Core helper u64 boot_time_now() in kernel/time/boot_time_now.c,
> exporting a counter‑derived timestamp via small per-arch primitives.
> This series includes an initial arm64 primitive that uses CNTVCT_EL0
> as the source, other architectures can wire up equivalents.
>
> Files touched:
> kernel/time/Kconfig, kernel/time/Makefile
> kernel/time/boot_time_now.c (new core helper)
> arch/arm64/include/asm/boot_time_primitives.h (arm64 primitive)
> include/linux/boot_time_now.h (public API + IDs)
> init/main.c (print two markers)
Seriously? This can be seen from the diffstat and the patch itself.
You still fail to explain the problem you are trying to solve and
instead babble about WHAT you are doing, which means you never read the
documentation of the project which you are trying to contribute to.
Do you really think that the people who spent time on writing it, did
so just to be ignored?
> This complements U-Boot’s stashed bootstage records so a userspace tool
> can build a system-wide boot timeline across SPL, U-Boot, kernel and other
> subsystems.
>
> Reference boot-time parser utility:
> https://github.com/v-singh1/boot-time-parser
>
> Sample boot time report:
> +--------------------------------------------------------------------+
> am62xx-evm Boot Time Report
> +--------------------------------------------------------------------+
> Device Power On : 0 ms
<SNIP>
> IPC_SYNC_ALL = 6787 ms (+151 ms)
> +--------------------------------------------------------------------+
How are these 30 lines of useless information helpful to understand the
underlying problem?
That's what cover letters are for.
> MAINTAINERS | 3 +++
> arch/arm64/include/asm/boot_time_primitives.h | 14 ++++++++++++++
> include/linux/boot_time_now.h | 16 ++++++++++++++++
> init/main.c | 13 +++++++++++++
> kernel/time/Kconfig | 10 ++++++++++
> kernel/time/Makefile | 1 +
> kernel/time/boot_time_now.c | 13 +++++++++++++
This does too many things at once. See Documentation.
One patch for creating the infrastructure with a proper rationale and
then one which hooks it up.
Again:
Documentation has not been written to be ignored. RFC patches are
not exempt from that.
> +static inline u64 arch_boot_counter_now(void)
> +{
> + return ((arch_timer_read_cntvct_el0() * 1000000) / arch_timer_get_cntfrq());
> +}
Q: What guarantees that this timer is available and functional at this
point?
A: Nothing
> +++ b/include/linux/boot_time_now.h
What means boot_time_now?
You couldn't come up with a more non-descriptive name, right?
> +enum kernel_bootstage_id {
> + BOOTSTAGE_ID_KERNEL_START = 300,
> + BOOTSTAGE_ID_KERNEL_END = 301,
Aside of the formatting (See Documentation), these are random numbers
pulled out of thin air without any explanation why they need to start at
300.
> +};
> +
> +/* Return boot time in nanoseconds using hardware counter */
> +u64 boot_time_now(void);
That's a function name which is as bad as it can be. This is about
getting an early time stamp and that needs to be properly named _AND_
encapsulated so it works universally without some magic hardware
dependency. If at all, see below.
> #include <kunit/test.h>
>
> +#ifdef CONFIG_BOOT_TIME_TRACKER
> +#include <linux/boot_time_now.h>
> +#endif
What's this ifdeffery for? Headers have to be written in a way that they
can be unconditionally included. IOW, put the ifdeffery into the header.
> @@ -929,6 +933,11 @@ void start_kernel(void)
> page_address_init();
> pr_notice("%s", linux_banner);
> setup_arch(&command_line);
> +
> +#ifdef CONFIG_BOOT_TIME_TRACKER
> + pr_info("[BOOT TRACKER] - ID:%d, %s = %llu\n",
> + BOOTSTAGE_ID_KERNEL_START, __func__, boot_time_now());
> +#endif
Seriously? Have you looked at all the functions invoked in this file?
Those which depend on a config have:
#ifdef CONFIG_FOO
void foo_init(void);
#else
static inline void foo_init(void) { }
#endif
in the headers to avoid this horrible #ifdef maze. No?
> diff --git a/kernel/time/boot_time_now.c b/kernel/time/boot_time_now.c
> new file mode 100644
> index 000000000000..6dc12d454be0
> --- /dev/null
> +++ b/kernel/time/boot_time_now.c
> @@ -0,0 +1,13 @@
> +// SPDX-License-Identifier: LGPL-2.0
> +
> +#include <linux/boot_time_now.h>
> +#include <asm/boot_time_primitives.h>
> +
> +u64 boot_time_now(void)
> +{
> + return arch_boot_counter_now();
> +}
> +EXPORT_SYMBOL_GPL(boot_time_now);
Why does this need to exported for modules when the only users are
always built in?
> +
> +MODULE_DESCRIPTION("boot time tracker");
> +MODULE_LICENSE("GPL");
Why needs this a module description? This has always to be built in, no?
Copy and pasta from some boilerplate template is fine, but using brain
on what to paste is not optional.
But that's all irrelevant, because none of this is actually required in
the first place as there is existing infrastructure, which allows you to
gather most of that information already today.
Extending it to gain what you really want to achieve is trivial enough
when you actually start to look at the existing infrastructure instead
of blindly hacking some ill-defined mechanism into the kernel, which
relies on the assumption that a particular piece of hardware is
available and functional.
That assumption is not even true for ARM64 under all circumstances.
dmesg already exposes time stamps and while they might be coarse grained
until a sched clock is registered, you still can utilize that
registration:
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -236,16 +236,14 @@ sched_clock_register(u64 (*read)(void),
/* Calculate the ns resolution of this counter */
res = cyc_to_ns(1ULL, new_mult, new_shift);
- pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
- bits, r, r_unit, res, wrap);
+ pr_info("sched_clock: %pS: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns hwcnt: %llu\n",
+ read, bits, r, r_unit, res, wrap, read());
/* Enable IRQ time accounting if we have a fast enough sched_clock() */
if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
enable_sched_clock_irqtime();
local_irq_restore(flags);
-
- pr_debug("Registered %pS as sched_clock source\n", read);
}
void __init generic_sched_clock_init(void)
That message provides you all the information you need for your pretty
printed postprocessing results by re-calculating all the other coarse
grained dmesg timestamps from there, no?
That obviously does not work on architectures which do no use the
sched_clock infrastructure. Some of them do not for a good reason, but
emitting the same information for them if anyone is interested is
trivial enough. And that's none of your problems.
If you really need some not yet existing dedicated time stamp in the
maze of dmesg, then add it unconditionally and without introducing an
artificial subsystem which is of no value at all.
But I tell you that's not necessary at all. The points in dmesg are well
defined. Here is the relevant output on a arm64 machine:
[ 0.000000] Linux version 6.17.0-rc1 ...
...
[ 0.000008] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps every 3579139424256ns
which is missing the actual hardware value, but see above...
So let's assume this gives you
[ 0.000008] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps
every 3579139424256ns hwcnt: 19000000
Which means that the counter accumulated 19000000 increments since the
hardware was powered up, no?
So the [0.000008] timestamp happens exactly 1.0 seconds after power on.
At least to my understanding of basic math, but your favourite AI bot
might disagree with that.
So anything you need for your pretty printing boot record can be
retrieved from there without any magic 300 and 301 numbers.
Because there is another printk() which has been there forever:
[ 11.651192] Run /init as init process
No?
Thanks,
tglx
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2025-08-23 22:53 [RFC PATCH] time: introduce BOOT_TIME_TRACKER and minimal boot timestamp Thomas Gleixner
@ 2025-09-01 4:05 ` Kaiwan N Billimoria
2025-09-01 5:57 ` Kaiwan N Billimoria
0 siblings, 1 reply; 1651+ messages in thread
From: Kaiwan N Billimoria @ 2025-09-01 4:05 UTC (permalink / raw)
To: tglx
Cc: Llillian, agordeev, akpm, alexander.shishkin, anna-maria, bigeasy,
catalin.marinas, chenhuacai, francesco, frederic,
guoweikang.kernel, jstultz, kpsingh, linux-arm-kernel,
linux-kernel, mark.rutland, maz, mingo, pmladek, rrangel, sboyd,
urezki, v-singh1, will, Kaiwan N Billimoria
> What the heck is BOOT SIG Initiative?
Very, very briefly: it's an initiative that plans to measure the complete or
unified boot time, i.e., the time it takes to boot the system completely. This
includes (or plans to) track the time taken for:
- Boot from CPU power-on, ROM code execution
- 1st, 2nd, (and possibly) 3rd stage bootloader(s)
- Linux kernel upto running the PID 1 process
- Include time taken for onboard MCUs (and their apps to come up).
The plan is to be able to show the cumulative and individual times taken across
all of these. Then report it via ASCII text and a GUI system (as of now, a HTML
file).
For anyone interested, here's the PDF of a super presentation on this topic by
Vishnu P Singh (OP) this August at the OSS EU:
https://static.sched.com/hosted_files/osseu2025/a2/EOSS_2025_Unified%20Boot%20Time%20log%20based%20measurement.pdf
As mentioned by Vishnu, the work is in the very early dev stages.
> - pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
> - bits, r, r_unit, res, wrap);
> + pr_info("sched_clock: %pS: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns hwcnt: %llu\n",
> + read, bits, r, r_unit, res, wrap, read());
--snip--
> So let's assume this gives you
>
> [ 0.000008] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps
> every 3579139424256ns hwcnt: 19000000
>
> Which means that the counter accumulated 19000000 increments since the
> hardware was powered up, no?
I agree with your approach Thomas (tglx)! (eye-roll)... So, following this
approach, here's the resulting tiny patch:
From 1e687ab12269f5f129b17eb7e9c3c5c2cec441b7 Mon Sep 17 00:00:00 2001
From: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
Date: Mon, 1 Sep 2025 09:17:57 +0530
Subject: [PATCH] [sched-clock] Extend printk to show h/w counter in a
parseable manner
Signed-off-by: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
---
kernel/time/sched_clock.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index cc15fe293719..e4fe900d6b60 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -236,16 +236,14 @@ sched_clock_register(u64 (*read)(void), int bits, unsigned long rate)
/* Calculate the ns resolution of this counter */
res = cyc_to_ns(1ULL, new_mult, new_shift);
- pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
- bits, r, r_unit, res, wrap);
+ pr_info("sched_clock: %pS: bits=%u,freq=%lu %cHz,resolution=%llu ns,wraps every=%llu ns,hwcnt=%llu\n",
+ read, bits, r, r_unit, res, wrap, read());
/* Enable IRQ time accounting if we have a fast enough sched_clock() */
if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
enable_sched_clock_irqtime();
local_irq_restore(flags);
-
- pr_debug("Registered %pS as sched_clock source\n", read);
}
void __init generic_sched_clock_init(void)
--
2.43.0
Of course, this is almost identical to what Thomas has already shown. I've
added some formatting to make for easier parsing. A sample output obtained with
this code on a patched kernel for the BeaglePlay k3 am625 board:
[ 0.000001] sched_clock: arch_counter_get_cntpct+0x0/0x18: bits=58,freq=200 MHz,resolution=5 ns,wraps every=4398046511102 ns,hwcnt=1409443611
This printk format allows us to easily parse it; e.g. to obtain the hwcnt value:
debian@BeagleBone:~$ dmesg |grep sched_clock |awk -F, '{print $5}'
hwcnt=1409443611
So, just confirming: here 1409443611 divided by 200 MHz gives us 7.047218055s
since boot, and thus the actual timestamp here is that plus 0.000001s yes?
(Over 7s here? yes, it's just that I haven't yet setup U-Boot properly for uSD
card boot, thus am manually loading commands in U-Boot to boot up, that's all).
A question (perhaps very stupid): will the hwcnt - the output of the read() -
be guaranteed to be (close to) the number of increments since processor
power-up (or reset)? Meaning, it's simply a hardware feature and agnostic to
what code the core was executing (ROM/BL/kernel), yes?
If so, I guess we can move forward with this approach... Else, or otherwise,
suggestions are welcome,
Regards,
Kaiwan.
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2025-09-01 4:05 ` Kaiwan N Billimoria
@ 2025-09-01 5:57 ` Kaiwan N Billimoria
0 siblings, 0 replies; 1651+ messages in thread
From: Kaiwan N Billimoria @ 2025-09-01 5:57 UTC (permalink / raw)
To: tglx
Cc: Llillian, agordeev, akpm, alexander.shishkin, anna-maria, bigeasy,
catalin.marinas, chenhuacai, francesco, frederic,
guoweikang.kernel, jstultz, kpsingh, linux-arm-kernel,
linux-kernel, mark.rutland, maz, mingo, pmladek, rrangel, sboyd,
urezki, v-singh1, will
Apologies, subject missing; it's:
time: introduce BOOT_TIME_TRACKER and minimal boot timestamp
On Mon, Sep 1, 2025 at 9:36 AM Kaiwan N Billimoria
<kaiwan.billimoria@gmail.com> wrote:
>
> > What the heck is BOOT SIG Initiative?
> Very, very briefly: it's an initiative that plans to measure the complete or
> unified boot time, i.e., the time it takes to boot the system completely. This
> includes (or plans to) track the time taken for:
> - Boot from CPU power-on, ROM code execution
> - 1st, 2nd, (and possibly) 3rd stage bootloader(s)
> - Linux kernel upto running the PID 1 process
> - Include time taken for onboard MCUs (and their apps to come up).
>
> The plan is to be able to show the cumulative and individual times taken across
> all of these. Then report it via ASCII text and a GUI system (as of now, a HTML
> file).
> For anyone interested, here's the PDF of a super presentation on this topic by
> Vishnu P Singh (OP) this August at the OSS EU:
> https://static.sched.com/hosted_files/osseu2025/a2/EOSS_2025_Unified%20Boot%20Time%20log%20based%20measurement.pdf
> As mentioned by Vishnu, the work is in the very early dev stages.
>
> > - pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
> > - bits, r, r_unit, res, wrap);
> > + pr_info("sched_clock: %pS: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns hwcnt: %llu\n",
> > + read, bits, r, r_unit, res, wrap, read());
> --snip--
> > So let's assume this gives you
> >
> > [ 0.000008] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps
> > every 3579139424256ns hwcnt: 19000000
> >
> > Which means that the counter accumulated 19000000 increments since the
> > hardware was powered up, no?
> I agree with your approach Thomas (tglx)! (eye-roll)... So, following this
> approach, here's the resulting tiny patch:
>
> From 1e687ab12269f5f129b17eb7e9c3c5c2cec441b7 Mon Sep 17 00:00:00 2001
> From: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
> Date: Mon, 1 Sep 2025 09:17:57 +0530
> Subject: [PATCH] [sched-clock] Extend printk to show h/w counter in a
> parseable manner
>
> Signed-off-by: Kaiwan N Billimoria <kaiwan.billimoria@gmail.com>
> ---
> kernel/time/sched_clock.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
> index cc15fe293719..e4fe900d6b60 100644
> --- a/kernel/time/sched_clock.c
> +++ b/kernel/time/sched_clock.c
> @@ -236,16 +236,14 @@ sched_clock_register(u64 (*read)(void), int bits, unsigned long rate)
> /* Calculate the ns resolution of this counter */
> res = cyc_to_ns(1ULL, new_mult, new_shift);
>
> - pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
> - bits, r, r_unit, res, wrap);
> + pr_info("sched_clock: %pS: bits=%u,freq=%lu %cHz,resolution=%llu ns,wraps every=%llu ns,hwcnt=%llu\n",
> + read, bits, r, r_unit, res, wrap, read());
>
> /* Enable IRQ time accounting if we have a fast enough sched_clock() */
> if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
> enable_sched_clock_irqtime();
>
> local_irq_restore(flags);
> -
> - pr_debug("Registered %pS as sched_clock source\n", read);
> }
>
> void __init generic_sched_clock_init(void)
> --
> 2.43.0
>
> Of course, this is almost identical to what Thomas has already shown. I've
> added some formatting to make for easier parsing. A sample output obtained with
> this code on a patched kernel for the BeaglePlay k3 am625 board:
> [ 0.000001] sched_clock: arch_counter_get_cntpct+0x0/0x18: bits=58,freq=200 MHz,resolution=5 ns,wraps every=4398046511102 ns,hwcnt=1409443611
>
> This printk format allows us to easily parse it; f.e. to obtain the hwcnt value:
> debian@BeagleBone:~$ dmesg |grep sched_clock |awk -F, '{print $5}'
> hwcnt=1409443611
>
> So, just confirming: here 1409443611 divided by 200 MHz gives us 7.047218055s
> since boot, and thus the actual timestamp here is that plus 0.000001s yes?
> (Over 7s here? yes, it's just that I haven't yet setup U-Boot properly for uSD
> card boot, thus am manually loading commands in U-Boot to boot up, that's all).
>
> A question (perhaps very stupid): will the hwcnt - the output of the read() -
> be guaranteed to be (close to) the number of increments since processor
> power-up (or reset)? Meaning, it's simply a hardware feature and agnostic to
> what code the core was executing (ROM/BL/kernel), yes?
> If so, I guess we can move forward with this approach... Else, or otherwise,
> suggestions are welcome,
>
> Regards,
> Kaiwan.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] net/netfilter/ipvs: Fix data-race in ip_vs_add_service / ip_vs_out_hook
@ 2025-08-27 6:48 Julian Anastasov
2025-08-27 14:43 ` Zhang Tengfei
0 siblings, 1 reply; 1651+ messages in thread
From: Julian Anastasov @ 2025-08-27 6:48 UTC (permalink / raw)
To: Zhang Tengfei
Cc: Simon Horman, lvs-devel, netfilter-devel, Pablo Neira Ayuso,
Jozsef Kadlecsik, Florian Westphal, David S . Miller, David Ahern,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, coreteam,
syzbot+1651b5234028c294c339
Hello,
On Tue, 26 Aug 2025, Zhang Tengfei wrote:
> A data-race was detected by KCSAN between ip_vs_add_service() which
> acts as a writer, and ip_vs_out_hook() which acts as a reader. This
> can lead to unpredictable behavior and crashes. One observed symptom
> is the "no destination available" error when processing packets.
>
> The race occurs on the `enable` flag within the `netns_ipvs`
> struct. This flag was being written in the configuration path without
> any protection, while concurrently being read in the packet processing
> path. This lack of synchronization means a reader on one CPU could see a
> partially initialized service, leading to incorrect behavior.
>
> To fix this, convert the `enable` flag from a plain integer to an
> atomic_t. This ensures that all reads and writes to the flag are atomic.
> More importantly, using atomic_set() and atomic_read() provides the
> necessary memory barriers to guarantee that changes to other fields of
> the service are visible to the reader CPU before the service is marked
> as enabled.
>
> Reported-by: syzbot+1651b5234028c294c339@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=1651b5234028c294c339
> Signed-off-by: Zhang Tengfei <zhtfdev@gmail.com>
> ---
> include/net/ip_vs.h | 2 +-
> net/netfilter/ipvs/ip_vs_conn.c | 4 ++--
> net/netfilter/ipvs/ip_vs_core.c | 10 +++++-----
> net/netfilter/ipvs/ip_vs_ctl.c | 6 +++---
> net/netfilter/ipvs/ip_vs_est.c | 16 ++++++++--------
> 5 files changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_est.c b/net/netfilter/ipvs/ip_vs_est.c
> index 15049b826732..c5aa2660de92 100644
> --- a/net/netfilter/ipvs/ip_vs_est.c
> +++ b/net/netfilter/ipvs/ip_vs_est.c
...
> @@ -757,7 +757,7 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs)
> mutex_lock(&ipvs->est_mutex);
> for (id = 1; id < ipvs->est_kt_count; id++) {
> /* netns clean up started, abort */
> - if (!ipvs->enable)
> + if (!atomic_read(&ipvs->enable))
> goto unlock2;
It is a simple flag but as it is checked in loops
in a few places in ip_vs_est.c, let's use READ_ONCE/WRITE_ONCE as
suggested by Florian and Eric. The 3 checks in hooks in ip_vs_core.c
can be simply removed: in ip_vs_out_hook, ip_vs_in_hook and
ip_vs_forward_icmp. We can see enable=0 in rare cases which is
not fatal. It is a flying packet in two possible cases:
1. after hooks are registered but before the flag is set
2. after the hooks are unregistered on cleanup_net
Regards
--
Julian Anastasov <ja@ssi.bg>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2025-08-27 6:48 [PATCH] net/netfilter/ipvs: Fix data-race in ip_vs_add_service / ip_vs_out_hook Julian Anastasov
@ 2025-08-27 14:43 ` Zhang Tengfei
2025-08-27 21:37 ` Pablo Neira Ayuso
0 siblings, 1 reply; 1651+ messages in thread
From: Zhang Tengfei @ 2025-08-27 14:43 UTC (permalink / raw)
To: ja
Cc: coreteam, davem, dsahern, edumazet, fw, horms, kadlec, kuba,
lvs-devel, netfilter-devel, pabeni, pablo,
syzbot+1651b5234028c294c339, zhtfdev
Hi everyone,
Here is the v2 patch that incorporates the feedback.
Many thanks to Julian for his thorough review and for providing
the detailed plan for this new version, and thanks to Florian
and Eric for suggestions.
Subject: [PATCH v2] net/netfilter/ipvs: Use READ_ONCE/WRITE_ONCE for
ipvs->enable
KCSAN reported a data-race on the `ipvs->enable` flag, which is
written in the control path and read concurrently from many other
contexts.
Following a suggestion by Julian, this patch fixes the race by
converting all accesses to use `WRITE_ONCE()/READ_ONCE()`.
This lightweight approach ensures atomic access and acts as a
compiler barrier, preventing unsafe optimizations where the flag
is checked in loops (e.g., in ip_vs_est.c).
Additionally, the now-obsolete `enable` checks in the fast path
hooks (`ip_vs_in_hook`, `ip_vs_out_hook`, `ip_vs_forward_icmp`)
are removed. These are unnecessary since commit 857ca89711de
("ipvs: register hooks only with services").
Reported-by: syzbot+1651b5234028c294c339@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1651b5234028c294c339
Suggested-by: Julian Anastasov <ja@ssi.bg>
Link: https://lore.kernel.org/lvs-devel/2189fc62-e51e-78c9-d1de-d35b8e3657e3@ssi.bg/
Signed-off-by: Zhang Tengfei <zhtfdev@gmail.com>
---
v2:
- Switched from atomic_t to the suggested READ_ONCE()/WRITE_ONCE().
- Removed obsolete checks from the packet processing hooks.
- Polished commit message based on feedback.
---
net/netfilter/ipvs/ip_vs_conn.c | 4 ++--
net/netfilter/ipvs/ip_vs_core.c | 11 ++++-------
net/netfilter/ipvs/ip_vs_ctl.c | 6 +++---
net/netfilter/ipvs/ip_vs_est.c | 16 ++++++++--------
4 files changed, 17 insertions(+), 20 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 965f3c8e5..37ebb0cb6 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -885,7 +885,7 @@ static void ip_vs_conn_expire(struct timer_list *t)
* conntrack cleanup for the net.
*/
smp_rmb();
- if (ipvs->enable)
+ if (READ_ONCE(ipvs->enable))
ip_vs_conn_drop_conntrack(cp);
}
@@ -1439,7 +1439,7 @@ void ip_vs_expire_nodest_conn_flush(struct netns_ipvs *ipvs)
cond_resched_rcu();
/* netns clean up started, abort delayed work */
- if (!ipvs->enable)
+ if (!READ_ONCE(ipvs->enable))
break;
}
rcu_read_unlock();
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index c7a8a08b7..5ea7ab8bf 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1353,9 +1353,6 @@ ip_vs_out_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *stat
if (unlikely(!skb_dst(skb)))
return NF_ACCEPT;
- if (!ipvs->enable)
- return NF_ACCEPT;
-
ip_vs_fill_iph_skb(af, skb, false, &iph);
#ifdef CONFIG_IP_VS_IPV6
if (af == AF_INET6) {
@@ -1940,7 +1937,7 @@ ip_vs_in_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *state
return NF_ACCEPT;
}
/* ipvs enabled in this netns ? */
- if (unlikely(sysctl_backup_only(ipvs) || !ipvs->enable))
+ if (unlikely(sysctl_backup_only(ipvs)))
return NF_ACCEPT;
ip_vs_fill_iph_skb(af, skb, false, &iph);
@@ -2108,7 +2105,7 @@ ip_vs_forward_icmp(void *priv, struct sk_buff *skb,
int r;
/* ipvs enabled in this netns ? */
- if (unlikely(sysctl_backup_only(ipvs) || !ipvs->enable))
+ if (unlikely(sysctl_backup_only(ipvs)))
return NF_ACCEPT;
if (state->pf == NFPROTO_IPV4) {
@@ -2295,7 +2292,7 @@ static int __net_init __ip_vs_init(struct net *net)
return -ENOMEM;
/* Hold the beast until a service is registered */
- ipvs->enable = 0;
+ WRITE_ONCE(ipvs->enable, 0);
ipvs->net = net;
/* Counters used for creating unique names */
ipvs->gen = atomic_read(&ipvs_netns_cnt);
@@ -2367,7 +2364,7 @@ static void __net_exit __ip_vs_dev_cleanup_batch(struct list_head *net_list)
ipvs = net_ipvs(net);
ip_vs_unregister_hooks(ipvs, AF_INET);
ip_vs_unregister_hooks(ipvs, AF_INET6);
- ipvs->enable = 0; /* Disable packet reception */
+ WRITE_ONCE(ipvs->enable, 0); /* Disable packet reception */
smp_wmb();
ip_vs_sync_net_cleanup(ipvs);
}
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 6a6fc4478..4c8fa22be 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -256,7 +256,7 @@ static void est_reload_work_handler(struct work_struct *work)
struct ip_vs_est_kt_data *kd = ipvs->est_kt_arr[id];
/* netns clean up started, abort delayed work */
- if (!ipvs->enable)
+ if (!READ_ONCE(ipvs->enable))
goto unlock;
if (!kd)
continue;
@@ -1483,9 +1483,9 @@ ip_vs_add_service(struct netns_ipvs *ipvs, struct ip_vs_service_user_kern *u,
*svc_p = svc;
- if (!ipvs->enable) {
+ if (!READ_ONCE(ipvs->enable)) {
/* Now there is a service - full throttle */
- ipvs->enable = 1;
+ WRITE_ONCE(ipvs->enable, 1);
/* Start estimation for first time */
ip_vs_est_reload_start(ipvs);
diff --git a/net/netfilter/ipvs/ip_vs_est.c b/net/netfilter/ipvs/ip_vs_est.c
index 15049b826..93a925f1e 100644
--- a/net/netfilter/ipvs/ip_vs_est.c
+++ b/net/netfilter/ipvs/ip_vs_est.c
@@ -231,7 +231,7 @@ static int ip_vs_estimation_kthread(void *data)
void ip_vs_est_reload_start(struct netns_ipvs *ipvs)
{
/* Ignore reloads before first service is added */
- if (!ipvs->enable)
+ if (!READ_ONCE(ipvs->enable))
return;
ip_vs_est_stopped_recalc(ipvs);
/* Bump the kthread configuration genid */
@@ -306,7 +306,7 @@ static int ip_vs_est_add_kthread(struct netns_ipvs *ipvs)
int i;
if ((unsigned long)ipvs->est_kt_count >= ipvs->est_max_threads &&
- ipvs->enable && ipvs->est_max_threads)
+ READ_ONCE(ipvs->enable) && ipvs->est_max_threads)
return -EINVAL;
mutex_lock(&ipvs->est_mutex);
@@ -343,7 +343,7 @@ static int ip_vs_est_add_kthread(struct netns_ipvs *ipvs)
}
/* Start kthread tasks only when services are present */
- if (ipvs->enable && !ip_vs_est_stopped(ipvs)) {
+ if (READ_ONCE(ipvs->enable) && !ip_vs_est_stopped(ipvs)) {
ret = ip_vs_est_kthread_start(ipvs, kd);
if (ret < 0)
goto out;
@@ -486,7 +486,7 @@ int ip_vs_start_estimator(struct netns_ipvs *ipvs, struct ip_vs_stats *stats)
struct ip_vs_estimator *est = &stats->est;
int ret;
- if (!ipvs->est_max_threads && ipvs->enable)
+ if (!ipvs->est_max_threads && READ_ONCE(ipvs->enable))
ipvs->est_max_threads = ip_vs_est_max_threads(ipvs);
est->ktid = -1;
@@ -663,7 +663,7 @@ static int ip_vs_est_calc_limits(struct netns_ipvs *ipvs, int *chain_max)
/* Wait for cpufreq frequency transition */
wait_event_idle_timeout(wq, kthread_should_stop(),
HZ / 50);
- if (!ipvs->enable || kthread_should_stop())
+ if (!READ_ONCE(ipvs->enable) || kthread_should_stop())
goto stop;
}
@@ -681,7 +681,7 @@ static int ip_vs_est_calc_limits(struct netns_ipvs *ipvs, int *chain_max)
rcu_read_unlock();
local_bh_enable();
- if (!ipvs->enable || kthread_should_stop())
+ if (!READ_ONCE(ipvs->enable) || kthread_should_stop())
goto stop;
cond_resched();
@@ -757,7 +757,7 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs)
mutex_lock(&ipvs->est_mutex);
for (id = 1; id < ipvs->est_kt_count; id++) {
/* netns clean up started, abort */
- if (!ipvs->enable)
+ if (!READ_ONCE(ipvs->enable))
goto unlock2;
kd = ipvs->est_kt_arr[id];
if (!kd)
@@ -787,7 +787,7 @@ static void ip_vs_est_calc_phase(struct netns_ipvs *ipvs)
id = ipvs->est_kt_count;
next_kt:
- if (!ipvs->enable || kthread_should_stop())
+ if (!READ_ONCE(ipvs->enable) || kthread_should_stop())
goto unlock;
id--;
if (id < 0)
--
2.34.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread* Re:
2025-08-27 14:43 ` Zhang Tengfei
@ 2025-08-27 21:37 ` Pablo Neira Ayuso
0 siblings, 0 replies; 1651+ messages in thread
From: Pablo Neira Ayuso @ 2025-08-27 21:37 UTC (permalink / raw)
To: Zhang Tengfei
Cc: ja, coreteam, davem, dsahern, edumazet, fw, horms, kadlec, kuba,
lvs-devel, netfilter-devel, pabeni, syzbot+1651b5234028c294c339
On Wed, Aug 27, 2025 at 10:43:42PM +0800, Zhang Tengfei wrote:
> Hi everyone,
>
> Here is the v2 patch that incorporates the feedback.
Patch without subject will not fly too far, I'm afraid you will have
to resubmit. One more comment below.
> Many thanks to Julian for his thorough review and for providing
> the detailed plan for this new version, and thanks to Florian
> and Eric for suggestions.
>
> Subject: [PATCH v2] net/netfilter/ipvs: Use READ_ONCE/WRITE_ONCE for
> ipvs->enable
>
> KCSAN reported a data-race on the `ipvs->enable` flag, which is
> written in the control path and read concurrently from many other
> contexts.
>
> Following a suggestion by Julian, this patch fixes the race by
> converting all accesses to use `WRITE_ONCE()/READ_ONCE()`.
> This lightweight approach ensures atomic access and acts as a
> compiler barrier, preventing unsafe optimizations where the flag
> is checked in loops (e.g., in ip_vs_est.c).
>
> Additionally, the now-obsolete `enable` checks in the fast path
> hooks (`ip_vs_in_hook`, `ip_vs_out_hook`, `ip_vs_forward_icmp`)
> are removed. These are unnecessary since commit 857ca89711de
> ("ipvs: register hooks only with services").
>
> Reported-by: syzbot+1651b5234028c294c339@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=1651b5234028c294c339
> Suggested-by: Julian Anastasov <ja@ssi.bg>
> Link: https://lore.kernel.org/lvs-devel/2189fc62-e51e-78c9-d1de-d35b8e3657e3@ssi.bg/
> Signed-off-by: Zhang Tengfei <zhtfdev@gmail.com>
>
> ---
> v2:
> - Switched from atomic_t to the suggested READ_ONCE()/WRITE_ONCE().
> - Removed obsolete checks from the packet processing hooks.
> - Polished commit message based on feedback.
> ---
> net/netfilter/ipvs/ip_vs_conn.c | 4 ++--
> net/netfilter/ipvs/ip_vs_core.c | 11 ++++-------
> net/netfilter/ipvs/ip_vs_ctl.c | 6 +++---
> net/netfilter/ipvs/ip_vs_est.c | 16 ++++++++--------
> 4 files changed, 17 insertions(+), 20 deletions(-)
[...]
> diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
> index c7a8a08b7..5ea7ab8bf 100644
> --- a/net/netfilter/ipvs/ip_vs_core.c
> +++ b/net/netfilter/ipvs/ip_vs_core.c
> @@ -1353,9 +1353,6 @@ ip_vs_out_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *stat
> if (unlikely(!skb_dst(skb)))
> return NF_ACCEPT;
>
> - if (!ipvs->enable)
> - return NF_ACCEPT;
The patch does not say why this is going away. If you think this is not
necessary, then make a separate patch and explain why it is needed.
Thanks
> ip_vs_fill_iph_skb(af, skb, false, &iph);
> #ifdef CONFIG_IP_VS_IPV6
> if (af == AF_INET6) {
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-08-29 2:01 xinpeng.wang
2025-08-29 2:42 ` bluez.test.bot
0 siblings, 1 reply; 1651+ messages in thread
From: xinpeng.wang @ 2025-08-29 2:01 UTC (permalink / raw)
To: linux-bluetooth; +Cc: xinpeng . wang
From d364976e93e23f5defbbf711dcda4787bdad3beb Mon Sep 17 00:00:00 2001
From: xinpeng wang <wangxinpeng@uniontech.com>
Date: Thu, 28 Aug 2025 21:31:31 +0800
Subject: [PATCH] device: Recreate paired device from storage if previous
restoration failed
When a USB Bluetooth adapter is resumed from S4 suspend, the kernel may trigger
an "index remove" followed by an "index add". BlueZ responds by removing all
devices and attempting to recreate them from stored configuration (storage).
However, if a connected A2DP device disconnects just before suspend, BlueZ may
have started a disconnect timer (via set_disconnect_timer) but not yet freed
the session. During this period:
- The session pointer is set to NULL and becomes inaccessible.
- The session still holds a reference to the device, preventing it from being freed.
- As a result, the "index add" event fails to recreate the device from storage (due to
D-Bus path conflict or incomplete cleanup).
- Later, when the timer expires, a new device is created from discovery data, bypassing
storage and causing it to appear as unpaired.
This leads to loss of pairing information and confuses desktop applications that
rely on paired/unpaired state.
This patch improves the device creation logic during discovery: when creating a
device from scan data, it checks whether the remote address corresponds to a
known paired device in storage. If so, and if no valid device instance exists,
it forces recreation from storage to restore the correct paired state.
This ensures that devices are properly restored after suspend/resume cycles,
even in race conditions involving delayed session cleanup.
Signed-off-by: xinpeng.wang <wangxinpeng@uniontech.com>
---
src/adapter.c | 111 ++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 99 insertions(+), 12 deletions(-)
diff --git a/src/adapter.c b/src/adapter.c
index 549a6c0b8..c2ac19f32 100644
--- a/src/adapter.c
+++ b/src/adapter.c
@@ -342,6 +342,8 @@ struct btd_adapter {
struct queue *exp_pending;
struct queue *exps;
+
+ GSList *pending_restore_device_addrs;
};
static char *adapter_power_state_str(uint32_t power_state)
@@ -1400,17 +1402,7 @@ static void adapter_add_device(struct btd_adapter *adapter,
static struct btd_device *adapter_create_device(struct btd_adapter *adapter,
const bdaddr_t *bdaddr,
- uint8_t bdaddr_type)
-{
- struct btd_device *device;
-
- device = device_create(adapter, bdaddr, bdaddr_type);
- if (!device)
- return NULL;
-
- adapter_add_device(adapter, device);
- return device;
-}
+ uint8_t bdaddr_type);
static void service_auth_cancel(struct service_auth *auth)
{
@@ -4969,6 +4961,95 @@ done:
mgmt_tlv_list_free(list);
}
+static struct btd_device *adapter_create_device(struct btd_adapter *adapter,
+ const bdaddr_t *bdaddr,
+ uint8_t bdaddr_type)
+{
+ struct btd_device *device;
+ char addr[18];
+ GSList *l;
+ GKeyFile *key_file = NULL;
+
+ ba2str(bdaddr, addr);
+
+ l = g_slist_find_custom(adapter->pending_restore_device_addrs, addr,
+ (GCompareFunc)strcasecmp);
+ if (l != NULL) {
+ char filename[PATH_MAX];
+ GError *gerr = NULL;
+
+ create_filename(filename, PATH_MAX, "/%s/%s/info",
+ btd_adapter_get_storage_dir(adapter),
+ addr);
+
+ key_file = g_key_file_new();
+ if (!g_key_file_load_from_file(key_file, filename, 0, &gerr)) {
+ error("Unable to load key file from %s: (%s)", filename,
+ gerr->message);
+ g_clear_error(&gerr);
+ adapter->pending_restore_device_addrs =
+ g_slist_delete_link(adapter->pending_restore_device_addrs, l);
+ g_free(l->data);
+ key_file = NULL;
+ }
+ }
+
+ if (key_file != NULL) {
+ struct link_key_info *key_info;
+ struct smp_ltk_info *ltk_info;
+ struct smp_ltk_info *peripheral_ltk_info;
+ struct irk_info *irk_info;
+
+ DBG("Found device %s but restoring from storage", addr);
+ device = device_create_from_storage(adapter, addr, key_file);
+ if (!device) {
+ g_key_file_free(key_file);
+ return NULL;
+ }
+ adapter->pending_restore_device_addrs =
+ g_slist_delete_link(adapter->pending_restore_device_addrs, l);
+ g_free(l->data);
+
+ if (bdaddr_type == BDADDR_BREDR)
+ device_set_bredr_support(device);
+ else
+ device_set_le_support(device, bdaddr_type);
+
+ key_info = get_key_info(key_file, addr, bdaddr_type);
+ ltk_info = get_ltk_info(key_file, addr, bdaddr_type);
+ peripheral_ltk_info = get_peripheral_ltk_info(key_file,
+ addr, bdaddr_type);
+ irk_info = get_irk_info(key_file, addr, bdaddr_type);
+ if (key_info) {
+ device_set_paired(device, BDADDR_BREDR);
+ device_set_bonded(device, BDADDR_BREDR);
+ }
+ if (ltk_info || peripheral_ltk_info) {
+ struct smp_ltk_info *info = ltk_info ? ltk_info : peripheral_ltk_info;
+ device_set_paired(device, info->bdaddr_type);
+ device_set_bonded(device, info->bdaddr_type);
+
+ device_set_ltk(device, info->val, info->central,
+ info->enc_size);
+ }
+ if (irk_info)
+ device_set_rpa(device, true);
+
+ btd_device_set_temporary(device, false);
+ g_free(key_info);
+ g_free(ltk_info);
+ g_free(peripheral_ltk_info);
+ g_free(irk_info);
+ } else {
+ device = device_create(adapter, bdaddr, bdaddr_type);
+ if (!device)
+ return NULL;
+ }
+
+ adapter_add_device(adapter, device);
+ return device;
+}
+
static void load_devices(struct btd_adapter *adapter)
{
char dirname[PATH_MAX];
@@ -5087,8 +5168,11 @@ static void load_devices(struct btd_adapter *adapter)
device = device_create_from_storage(adapter, entry->d_name,
key_file);
- if (!device)
+ if (!device) {
+ char *addr_copy = g_strdup(entry->d_name);
+ adapter->pending_restore_device_addrs = g_slist_append(adapter->pending_restore_device_addrs, addr_copy);
goto free;
+ }
if (irk_info)
device_set_rpa(device, true);
@@ -7085,6 +7169,9 @@ static void adapter_remove(struct btd_adapter *adapter)
adapter->msd_callbacks = NULL;
queue_remove_all(adapter->exp_pending, NULL, NULL, cancel_exp_pending);
+
+ g_slist_free_full(adapter->pending_restore_device_addrs, g_free);
+ adapter->pending_restore_device_addrs = NULL;
}
const char *adapter_get_path(struct btd_adapter *adapter)
--
2.20.1
^ permalink raw reply related [flat|nested] 1651+ messages in thread* (no subject)
@ 2025-09-15 19:52 Yury Norov (NVIDIA)
2025-09-16 14:48 ` Simon Horman
0 siblings, 1 reply; 1651+ messages in thread
From: Yury Norov (NVIDIA) @ 2025-09-15 19:52 UTC (permalink / raw)
To: Yoshihiro Shimoda, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Nikita Yushchenko,
Michal Swiatkowski, Geert Uytterhoeven, Uwe Kleine-König,
Yury Norov (NVIDIA), Simon Horman
Cc: netdev, linux-renesas-soc, linux-kernel
Subject: [PATCH net-next v2] net: renesas: rswitch: simplify rswitch_stop()
rswitch_stop() opencodes for_each_set_bit().
CC: Simon Horman <horms@kernel.org>
Reviewed-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
---
v1: https://lore.kernel.org/all/20250913181345.204344-1-yury.norov@gmail.com/
v2: Rebase on top of net-next/main
drivers/net/ethernet/renesas/rswitch_main.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/renesas/rswitch_main.c b/drivers/net/ethernet/renesas/rswitch_main.c
index ff5f966c98a9..69676db20fec 100644
--- a/drivers/net/ethernet/renesas/rswitch_main.c
+++ b/drivers/net/ethernet/renesas/rswitch_main.c
@@ -1656,9 +1656,7 @@ static int rswitch_stop(struct net_device *ndev)
if (bitmap_empty(rdev->priv->opened_ports, RSWITCH_NUM_PORTS))
iowrite32(GWCA_TS_IRQ_BIT, rdev->priv->addr + GWTSDID);
- for (tag = find_first_bit(rdev->ts_skb_used, TS_TAGS_PER_PORT);
- tag < TS_TAGS_PER_PORT;
- tag = find_next_bit(rdev->ts_skb_used, TS_TAGS_PER_PORT, tag + 1)) {
+ for_each_set_bit(tag, rdev->ts_skb_used, TS_TAGS_PER_PORT) {
ts_skb = xchg(&rdev->ts_skb[tag], NULL);
clear_bit(tag, rdev->ts_skb_used);
if (ts_skb)
--
2.43.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread* Re:
2025-09-15 19:52 Yury Norov (NVIDIA)
@ 2025-09-16 14:48 ` Simon Horman
2025-09-16 15:22 ` Re: Yury Norov
0 siblings, 1 reply; 1651+ messages in thread
From: Simon Horman @ 2025-09-16 14:48 UTC (permalink / raw)
To: Yury Norov (NVIDIA)
Cc: Yoshihiro Shimoda, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Nikita Yushchenko,
Michal Swiatkowski, Geert Uytterhoeven, Uwe Kleine-König,
netdev, linux-renesas-soc, linux-kernel
On Mon, Sep 15, 2025 at 03:52:31PM -0400, Yury Norov (NVIDIA) wrote:
> Subject: [PATCH net-next v2] net: renesas: rswitch: simplify rswitch_stop()
>
> rswitch_stop() opencodes for_each_set_bit().
>
> CC: Simon Horman <horms@kernel.org>
> Reviewed-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> ---
> v1: https://lore.kernel.org/all/20250913181345.204344-1-yury.norov@gmail.com/
> v2: Rebase on top of net-next/main
>
> drivers/net/ethernet/renesas/rswitch_main.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
Hi Yury,
I see this marked as Changes Requested in Patchwork.
But no response on the netdev ML. So I'll provide one.
Unfortunately it seems that the posting is slightly mangled,
there was no Subject in the header (or an empty one), and what
was supposed to be the Subject ended up at the top of the body.
I'm wondering if you could repost with that addressed,
being sure to observe the 24h delay between postings.
Thanks!
...
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-09-16 14:48 ` Simon Horman
@ 2025-09-16 15:22 ` Yury Norov
0 siblings, 0 replies; 1651+ messages in thread
From: Yury Norov @ 2025-09-16 15:22 UTC (permalink / raw)
To: Simon Horman
Cc: Yoshihiro Shimoda, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Nikita Yushchenko,
Michal Swiatkowski, Geert Uytterhoeven, Uwe Kleine-König,
netdev, linux-renesas-soc, linux-kernel
On Tue, Sep 16, 2025 at 03:48:13PM +0100, Simon Horman wrote:
> On Mon, Sep 15, 2025 at 03:52:31PM -0400, Yury Norov (NVIDIA) wrote:
> > Subject: [PATCH net-next v2] net: renesas: rswitch: simplify rswitch_stop()
> >
> > rswitch_stop() opencodes for_each_set_bit().
> >
> > CC: Simon Horman <horms@kernel.org>
> > Reviewed-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
> > Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
> > ---
> > v1: https://lore.kernel.org/all/20250913181345.204344-1-yury.norov@gmail.com/
> > v2: Rebase on top of net-next/main
> >
> > drivers/net/ethernet/renesas/rswitch_main.c | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
>
> Hi Yury,
>
> I see this marked as Changes Requested in Patchwork.
> But no response on the netdev ML. So I'll provide one.
>
> Unfortunately it seems that the posting is slightly mangled,
Yeah, bad luck.
> there was no Subject in the header (or an empty one), and what
> was supposed to be the Subject ended up at the top of the body.
>
> I'm wondering if you could repost with that addressed,
> being sure to observe the 24h delay between postings.
Sure, will do shortly.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-09-16 21:23 Jay Vosburgh
2025-09-16 21:56 ` Jay Vosburgh
0 siblings, 1 reply; 1651+ messages in thread
From: Jay Vosburgh @ 2025-09-16 21:23 UTC (permalink / raw)
To: netdev; +Cc: Jamal Hadi Salim, Stephen Hemminger, David Ahern
Subject: [PATCH v2 0/4 iproute2-next] tc/police: Allow 64 bit burst size
In summary, this patchset changes the user space handling of the
tc police burst parameter to permit burst sizes that exceed 4 GB when the
specified rate is high enough that the kernel API for burst can accommodate
such.
Additionally, if the burst exceeds the upper limit of the kernel
API, this is now flagged as an error. The existing behavior silently
overflows, resulting in arbitrary values passed to the kernel.
In detail, as presently implemented, the tc police burst option
limits the size of the burst to 4 GB, i.e., UINT_MAX for a 32 bit
unsigned int. This is a reasonable limit for the low rates common when
this was developed. However, the underlying implementation of burst is
computed as "time at the specified rate," and for higher rates, a burst
size exceeding 4 GB is feasible without modification to the kernel.
The burst size provided on the command line is translated into a
duration, representing how much time is required at the specified rate to
transmit the given burst size.
This time is calculated in units of "psched ticks," each of which
is 64 nsec[0]. The computed number of psched ticks is sent to the kernel
as a __u32 value.
Because burst is ultimately calculated as a time duration, the
real upper limit for a burst is UINT_MAX psched ticks, i.e.,
UINT_MAX * psched tick duration / NSEC_PER_SEC
(2^32-1) * 64 / 1E9
which is roughly 274.88 seconds (274.8779...).
At low rates, e.g., 5 Mbit/sec, UINT_MAX psched ticks does not
correspond to a burst size in excess of 4 GB, so the above is moot, e.g.,
5 Mbit/sec / 8 = 625000 bytes/sec
625000 * ~274.88 seconds = ~171800000 max burst size, below UINT_MAX
Thus, the burst size at 5Mbit/sec is limited by the __u32 size of
the psched tick field in the kernel API, not the 4 GB limit of the tc
police burst user space API.
However, at higher rates, e.g., 10 Gbit/sec, the burst size is
currently limited by the 4 GB maximum for the burst command line parameter
value, rather than UINT_MAX psched ticks:
10 Gbit/sec / 8 = 1250000000 bytes/sec
1250000000 * ~274.88 seconds = ~343600000000, more than UINT_MAX
Here, the maximum duration of a burst the kernel can handle
exceeds 4 GB of burst size.
While the above maximum may be an excessively large burst value,
at 10 Gbit/sec, a 4 GB burst size corresponds to just under 3.5 seconds in
duration:
2^32 bytes / 10 Gbit/sec
2^32 bytes / 1250000000 bytes/sec
equals ~3.43 sec
So, at higher rates, burst sizes exceeding 4 GB are both
reasonable and feasible, up to the UINT_MAX limit for psched ticks.
Enabling this requires changes only to the user space processing of the
burst size parameter in tc.
In principle, the other packet schedulers utilizing psched ticks
for burst sizing, htb and tbf, could be similarly changed to permit larger
burst sizes, but this patch set does not do so.
Separately, for the burst duration calculation overflow (i.e.,
that the number of psched ticks exceeds UINT_MAX), under the current
implementation, one example of overflow is as follows:
# /sbin/tc filter add dev eth0 protocol ip prio 1 parent ffff: handle 1 fw police rate 1Mbit peakrate 10Gbit burst 34375000 mtu 64Kb conform-exceed reclassify
# /sbin/tc -raw filter get dev eth0 ingress protocol ip pref 1 handle 1 fw
filter ingress protocol ip pref 1 fw chain 0 handle 0x1 police 0x1 rate 1Mbit burst 15261b mtu 64Kb [001d1bf8] peakrate 10Gbit action reclassify overhead 0b
ref 1 bind 1
Note that the returned burst value is 15261b, which does not match
the supplied value of 34375000. With this patch set applied, this
situation is flagged as an error.
[0] psched ticks are defined in the kernel in include/net/pkt_sched.h:
#define PSCHED_SHIFT 6
#define PSCHED_TICKS2NS(x) ((s64)(x) << PSCHED_SHIFT)
#define PSCHED_NS2TICKS(x) ((x) >> PSCHED_SHIFT)
#define PSCHED_TICKS_PER_SEC PSCHED_NS2TICKS(NSEC_PER_SEC)
where PSCHED_TICKS_PER_SEC is 15625000.
These values are exported to user space via /proc/net/psched, the
second field being PSCHED_TICKS2NS(1), which at present is 64 (0x40). tc
uses this value to compute its internal "tick_in_usec" variable containing
the number of psched ticks per usec (15.625) used for the psched tick
computations.
Lastly, note that PSCHED_SHIFT was previously 10, and changed to 6
in commit a4a710c4a7490 in 2009. I have not tested backwards
compatibility of these changes with kernels of that era.
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2025-09-16 21:23 Jay Vosburgh
@ 2025-09-16 21:56 ` Jay Vosburgh
0 siblings, 0 replies; 1651+ messages in thread
From: Jay Vosburgh @ 2025-09-16 21:56 UTC (permalink / raw)
Cc: netdev, Jamal Hadi Salim, Stephen Hemminger, David Ahern
Jay Vosburgh <jay.vosburgh@canonical.com> wrote:
>
>
>Subject: [PATCH v2 0/4 iproute2-next] tc/police: Allow 64 bit burst size
>
> In summary, this patchset changes the user space handling of the
>tc police burst parameter to permit burst sizes that exceed 4 GB when the
>specified rate is high enough that the kernel API for burst can accommodate
>such.
Ignore this, will fix and repost.
-J
---
-Jay Vosburgh, jay.vosburgh@canonical.com
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-10-05 14:16 ssrane_b23
2025-10-05 14:16 ` syzbot
0 siblings, 1 reply; 1651+ messages in thread
From: ssrane_b23 @ 2025-10-05 14:16 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu, Mathieu Desnoyers, linux-kernel,
linux-trace-kernel, syzbot+c530b4d95ec5cd4f33a7
#syz test on: linux-next
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH] counter: microchip-tcb-capture: Allow shared IRQ for multi-channel TCBs
@ 2025-10-06 10:51 Dharma Balasubiramani
2025-10-08 7:06 ` Kamel Bouhara
0 siblings, 1 reply; 1651+ messages in thread
From: Dharma Balasubiramani @ 2025-10-06 10:51 UTC (permalink / raw)
To: Kamel Bouhara, William Breathitt Gray, Bence Csókás
Cc: linux-arm-kernel, linux-iio, linux-kernel, Dharma Balasubiramani
Mark the interrupt as IRQF_SHARED to permit multiple counter channels to
share the same TCB IRQ line.
Each Timer/Counter Block (TCB) instance shares a single IRQ line among its
three internal channels. When multiple counter channels (e.g., counter@0
and counter@1) within the same TCB are enabled, the second call to
devm_request_irq() fails because the IRQ line is already requested by the
first channel.
Fixes: e5d581396821 ("counter: microchip-tcb-capture: Add IRQ handling")
Signed-off-by: Dharma Balasubiramani <dharma.b@microchip.com>
---
drivers/counter/microchip-tcb-capture.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/counter/microchip-tcb-capture.c b/drivers/counter/microchip-tcb-capture.c
index 1a299d1f350b..19d457ae4c3b 100644
--- a/drivers/counter/microchip-tcb-capture.c
+++ b/drivers/counter/microchip-tcb-capture.c
@@ -451,7 +451,7 @@ static void mchp_tc_irq_remove(void *ptr)
static int mchp_tc_irq_enable(struct counter_device *const counter, int irq)
{
struct mchp_tc_data *const priv = counter_priv(counter);
- int ret = devm_request_irq(counter->parent, irq, mchp_tc_isr, 0,
+ int ret = devm_request_irq(counter->parent, irq, mchp_tc_isr, IRQF_SHARED,
dev_name(counter->parent), counter);
if (ret < 0)
---
base-commit: fd94619c43360eb44d28bd3ef326a4f85c600a07
change-id: 20251006-microchip-tcb-edd8aeae36c4
Best regards,
--
Dharma Balasubiramani <dharma.b@microchip.com>
^ permalink raw reply related [flat|nested] 1651+ messages in thread* Re: [PATCH] counter: microchip-tcb-capture: Allow shared IRQ for multi-channel TCBs
2025-10-06 10:51 [PATCH] counter: microchip-tcb-capture: Allow shared IRQ for multi-channel TCBs Dharma Balasubiramani
@ 2025-10-08 7:06 ` Kamel Bouhara
2025-10-08 20:46 ` Bence Csókás
0 siblings, 1 reply; 1651+ messages in thread
From: Kamel Bouhara @ 2025-10-08 7:06 UTC (permalink / raw)
To: Dharma Balasubiramani, g
Cc: William Breathitt Gray, Bence Csókás, linux-arm-kernel,
linux-iio, linux-kernel
On Mon, Oct 06, 2025 at 04:21:50PM +0530, Dharma Balasubiramani wrote:
Hello Dharma,
> Mark the interrupt as IRQF_SHARED to permit multiple counter channels to
> share the same TCB IRQ line.
>
> Each Timer/Counter Block (TCB) instance shares a single IRQ line among its
> three internal channels. When multiple counter channels (e.g., counter@0
> and counter@1) within the same TCB are enabled, the second call to
> devm_request_irq() fails because the IRQ line is already requested by the
> first channel.
>
> Fixes: e5d581396821 ("counter: microchip-tcb-capture: Add IRQ handling")
> Signed-off-by: Dharma Balasubiramani <dharma.b@microchip.com>
> ---
> drivers/counter/microchip-tcb-capture.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/counter/microchip-tcb-capture.c b/drivers/counter/microchip-tcb-capture.c
> index 1a299d1f350b..19d457ae4c3b 100644
> --- a/drivers/counter/microchip-tcb-capture.c
> +++ b/drivers/counter/microchip-tcb-capture.c
> @@ -451,7 +451,7 @@ static void mchp_tc_irq_remove(void *ptr)
> static int mchp_tc_irq_enable(struct counter_device *const counter, int irq)
> {
> struct mchp_tc_data *const priv = counter_priv(counter);
> - int ret = devm_request_irq(counter->parent, irq, mchp_tc_isr, 0,
> + int ret = devm_request_irq(counter->parent, irq, mchp_tc_isr, IRQF_SHARED,
> dev_name(counter->parent), counter);
>
> if (ret < 0)
>
> ---
> base-commit: fd94619c43360eb44d28bd3ef326a4f85c600a07
> change-id: 20251006-microchip-tcb-edd8aeae36c4
>
This change makes sense, thanks !
Reviewed-by: Kamel Bouhara <kamel.bouhara@bootlin.com>
> Best regards,
> --
> Dharma Balasubiramani <dharma.b@microchip.com>
>
--
Kamel Bouhara, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re:
2025-10-08 7:06 ` Kamel Bouhara
@ 2025-10-08 20:46 ` Bence Csókás
0 siblings, 0 replies; 1651+ messages in thread
From: Bence Csókás @ 2025-10-08 20:46 UTC (permalink / raw)
To: Kamel Bouhara, Dharma Balasubiramani, g
Cc: William Breathitt Gray, Bence Csókás, linux-arm-kernel,
linux-iio, linux-kernel
Hi,
> On Mon, Oct 06, 2025 at 04:21:50PM +0530, Dharma Balasubiramani wrote:
>
> Hello Dharma,
>
>> Mark the interrupt as IRQF_SHARED to permit multiple counter channels to
>> share the same TCB IRQ line.
>>
>> Each Timer/Counter Block (TCB) instance shares a single IRQ line among its
>> three internal channels. When multiple counter channels (e.g., counter@0
>> and counter@1) within the same TCB are enabled, the second call to
>> devm_request_irq() fails because the IRQ line is already requested by the
>> first channel.
>>
>> Fixes: e5d581396821 ("counter: microchip-tcb-capture: Add IRQ handling")
>> Signed-off-by: Dharma Balasubiramani <dharma.b@microchip.com>
>> ---
>> drivers/counter/microchip-tcb-capture.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/counter/microchip-tcb-capture.c b/drivers/counter/microchip-tcb-capture.c
>> index 1a299d1f350b..19d457ae4c3b 100644
>> --- a/drivers/counter/microchip-tcb-capture.c
>> +++ b/drivers/counter/microchip-tcb-capture.c
>> @@ -451,7 +451,7 @@ static void mchp_tc_irq_remove(void *ptr)
>> static int mchp_tc_irq_enable(struct counter_device *const counter, int irq)
>> {
>> struct mchp_tc_data *const priv = counter_priv(counter);
>> - int ret = devm_request_irq(counter->parent, irq, mchp_tc_isr, 0,
>> + int ret = devm_request_irq(counter->parent, irq, mchp_tc_isr, IRQF_SHARED,
>> dev_name(counter->parent), counter);
>>
>> if (ret < 0)
>>
>> ---
>> base-commit: fd94619c43360eb44d28bd3ef326a4f85c600a07
>> change-id: 20251006-microchip-tcb-edd8aeae36c4
>>
>
> This change makes sense, thanks!
>
> Reviewed-by: Kamel Bouhara <kamel.bouhara@bootlin.com>
>
>> Best regards,
>> --
>> Dharma Balasubiramani <dharma.b@microchip.com>
>>
>
> --
> Kamel Bouhara, Bootlin
> Embedded Linux and kernel engineering
> https://bootlin.com
Looks reasonable to me as well.
Reviewed-by: Bence Csókás <bence98@sch.bme.hu>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-11-04 9:22 Michael Roach
2025-11-04 10:24 ` Kristoffer Haugsbakk
0 siblings, 1 reply; 1651+ messages in thread
From: Michael Roach @ 2025-11-04 9:22 UTC (permalink / raw)
To: git
Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.
What did you do before the bug happened? (Steps to reproduce your issue)
I ran:
$ git grep -rl --color=never '#!/.*/env ruby'
What did you expect to happen? (Expected behavior)
Get a list of filenames where the contents matched the regex
What happened instead? (Actual behavior)
One of my files, named `ensure-string-env.rb`, was printed with part of the path in colour
and with the first dash of the filename replaced with a colon.
What's different between what you expected and what actually happened?
I expected:
images/deploy_tools/scripts/ensure-string-env.rb
I got:
images/deploy_tools/scripts/ensure:string-env.rb
The string up to the colon was printed in pink
Anything else you want to add:
I tested with both zsh and bash with the same result.
I tested with git 2.47.3 and 2.49.1; everything worked correctly.
I also tried other terminals to rule that out.
2.51.1 is the current version on Fedora 42
I was able to reproduce this with:
$ echo foo > test-one
$ git add test-one
$ git grep -l foo
test:one
This seems to be reproducible with any filename containing a dash
[System Info]
git version:
git version 2.51.1
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
libcurl: 8.11.1
OpenSSL: OpenSSL 3.2.6 30 Sep 2025
zlib-ng: 2.2.5
SHA-1: SHA1_DC
SHA-256: SHA256_BLK
default-ref-format: files
default-hash: sha1
uname: Linux 6.17.5-200.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct 24 14:10:01 UTC 2025 x86_64
compiler info: gnuc: 15.2
libc info: glibc: 2.41
$SHELL (typically, interactive shell): /bin/zsh
[Enabled Hooks]
pre-commit
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-11-04 9:22 Michael Roach
@ 2025-11-04 10:24 ` Kristoffer Haugsbakk
2025-11-05 14:55 ` Re: Lucas Seiki Oshiro
0 siblings, 1 reply; 1651+ messages in thread
From: Kristoffer Haugsbakk @ 2025-11-04 10:24 UTC (permalink / raw)
To: Michael Roach, git
On Tue, Nov 4, 2025, at 10:22, Michael Roach wrote:
>[snip]
> For one of my files named `ensure-string-env.rb` was printed with part
> of the path in colour,
> and the first dash of the filename replaced with a colon.
I have seen something similar when using the Delta pager. I’m pretty
sure that it replaced a hyphen with a colon.
https://github.com/dandavison/delta
I don’t think I’ve seen this behavior with `git --no-pager`.
Don’t know about the coloring part (despite `--color=never`).
>[snip]
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-11-04 10:24 ` Kristoffer Haugsbakk
@ 2025-11-05 14:55 ` Lucas Seiki Oshiro
2025-11-05 15:01 ` Re: Kristoffer Haugsbakk
0 siblings, 1 reply; 1651+ messages in thread
From: Lucas Seiki Oshiro @ 2025-11-05 14:55 UTC (permalink / raw)
To: Kristoffer Haugsbakk, Michael Roach; +Cc: git
> I have seen something similar when using the Delta pager. I’m pretty
> sure that it replaced a hyphen with a colon.
It's a known bug in Delta:
https://github.com/dandavison/delta/issues/1259
> I don’t think I’ve seen this behavior with `git --no-pager`.
I think it is a good idea to also see what happens when using another
pager, for example, less (`git -c core.pager=less ...`) or cat
(`git -c core.pager=cat ...`).
Michael, can you run with those three mentioned options and see what
happens? Last year I spent some hours trying to find the cause of the
same bug in Git but then I found out that it was actually a bug in
Delta.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-11-05 14:55 ` Re: Lucas Seiki Oshiro
@ 2025-11-05 15:01 ` Kristoffer Haugsbakk
2025-11-06 0:05 ` Re: Lucas Seiki Oshiro
0 siblings, 1 reply; 1651+ messages in thread
From: Kristoffer Haugsbakk @ 2025-11-05 15:01 UTC (permalink / raw)
To: Lucas Seiki Oshiro, Michael Roach; +Cc: git
On Wed, Nov 5, 2025, at 15:55, Lucas Seiki Oshiro wrote:
>> I have seen something similar when using the Delta pager. I’m pretty
>> sure that it replaced a hyphen with a colon.
>
> It's a known bug in Delta:
>
> https://github.com/dandavison/delta/issues/1259
>
>> I don’t think I’ve seen this behavior with `git --no-pager`.
>
> I think it is a good idea to also see what happens when using another
> pager, for example, less (`git -c core.pager=less ...`) or cat
> (`git -c core.pager=cat ...`).
>
> Michael, can you run with those three mentioned options and see what
> happens? Last year I spent some hours trying to find the cause of the
> same bug in Git but then I found out that it was actually a bug in
> Delta.
Sorry, I didn’t see that he only replied to me previously:
On Tue, Nov 4, 2025, at 12:15, Michael Roach wrote:
> Just when I thought I had considered all the factors before reporting
> this, you got it.
> I am using Delta as my pager. My tests with other git versions were via
> Docker, so there was no pager.
>
> I confirmed that using `git --no-pager grep` doesn't have this issue.
>
> Sorry for the bug report noise and thanks for figuring that out.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2025-11-05 3:38 niklaus.liu
2025-11-05 8:56 ` AngeloGioacchino Del Regno
0 siblings, 1 reply; 1651+ messages in thread
From: niklaus.liu @ 2025-11-05 3:38 UTC (permalink / raw)
To: Matthias Brugger, AngeloGioacchino Del Regno
Cc: linux-kernel, linux-arm-kernel, linux-mediatek,
Project_Global_Chrome_Upstream_Group, sirius.wang, vince-wl.liu,
jh.hsu, zhigang.qin, sen.chu, Niklaus Liu, niklaus.liu
Refer to the discussion in the link:
v3: https://patchwork.kernel.org/project/linux-mediatek/patch/20251104071252.12539-2-Niklaus.Liu@mediatek.com/
Subject: [PATCH v4 0/1] soc: mediatek: mtk-regulator-coupler: Add support for MT8189
changes in v4:
- reply comment:
1. MTK hardware requires that vsram_gpu must be higher than vgpu; this rule must be satisfied.
2. When the GPU powers on, the mtcmos driver first calls regulator_enable to turn on vgpu, then calls regulator_enable to
turn on vsram_gpu. When enabling vgpu, mediatek_regulator_balance_voltage sets the voltages for both vgpu and vsram_gpu.
However, when enabling vsram_gpu, mediatek_regulator_balance_voltage is also executed, and at this time, the vsram_gpu voltage
is set to the minimum voltage specified in the DTS, which does not comply with the requirement that vsram_gpu must be higher than vgpu.
3. During suspend, the voltages of vgpu and vsram_gpu should remain unchanged, and when resuming, vgpu and vsram_gpu should be
restored to their previous voltages. When the vgpu voltage is adjusted, mediatek_regulator_balance_voltage already synchronizes the
adjustment of vsram_gpu voltage. Therefore, adjusting the vsram_gpu voltage again in mediatek_regulator_balance_voltage is redundant.
changes in v3:
- modify for comment[add the new entry by alphabetical order]
changes in v2:
- change title for patch
- reply comment: This is a software regulator coupler mechanism, and the regulator-coupled-with
configuration has been added in the MT8189 device tree. This patch addresses an issue reported by a
Chromebook customer. When the GPU regulator is turned on, mediatek_regulator_balance_voltage already
sets both the GPU and GPU_SRAM voltages at the same time, so there is no need to adjust the GPU_SRAM
voltage again in a second round. Therefore, a return is set for MT8189.
If the user calls mediatek_regulator_balance_voltage again for GPU_SRAM, it may cause abnormal behavior of GPU_SRAM.
changes in v1:
- mediatek-regulator-coupler mechanism for platform MT8189
*** BLURB HERE ***
Niklaus Liu (1):
soc: mediatek: mtk-regulator-coupler: Add support for MT8189
drivers/soc/mediatek/mtk-regulator-coupler.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
--
2.46.0
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2025-11-05 3:38 niklaus.liu
@ 2025-11-05 8:56 ` AngeloGioacchino Del Regno
0 siblings, 0 replies; 1651+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-11-05 8:56 UTC (permalink / raw)
To: niklaus.liu, Matthias Brugger
Cc: linux-kernel, linux-arm-kernel, linux-mediatek,
Project_Global_Chrome_Upstream_Group, sirius.wang, vince-wl.liu,
jh.hsu, zhigang.qin, sen.chu
On 05/11/25 04:38, niklaus.liu wrote:
> Refer to the discussion in the link:
> v3: https://patchwork.kernel.org/project/linux-mediatek/patch/20251104071252.12539-2-Niklaus.Liu@mediatek.com/
>
> Subject: [PATCH v4 0/1] soc: mediatek: mtk-regulator-coupler: Add support for MT8189
The subject of this email is .. empty. That's really bad, and it's the second time
that it happens. Please make sure that you're sending emails the right way, and/or
fix your client.
While at it, please also fix your name, as it should appear as "Niklaus Liu" and
not as "niklaus.liu".
>
> changes in v4:
> - reply comment:
Niklaus, please just reply inline to the emails instead of sending an entirely new
version just for a reply: it's easier for everyone to follow, and it's also easier
for me to read, and for you to send a reply by clicking one button :-)
> 1. MTK hardware requires that vsram_gpu must be higher than vgpu; this rule must be satisfied.
>
> 2. When the GPU powers on, the mtcmos driver first calls regulator_enable to turn on vgpu, then calls regulator_enable to
> turn on vsram_gpu. When enabling vgpu, mediatek_regulator_balance_voltage sets the voltages for both vgpu and vsram_gpu.
> However, when enabling vsram_gpu, mediatek_regulator_balance_voltage is also executed, and at this time, the vsram_gpu voltage
> is set to the minimum voltage specified in the DTS, which does not comply with the requirement that vsram_gpu must be higher than vgpu.
>
2. -> There's your problem! VSRAM_GPU has to be turned on *first*, VGPU has to be
turned on *last* instead.
Logically, you need SRAM up before the GPU is up as if the GPU tries to use SRAM
it'll produce unpowered access issues: even though it's *very* unlikely for that
to happen on older Mali, it's still a logical mistake that might, one day, come
back at us and create instabilities.
Now, the easy fix is to just invert the regulators in MFG nodes. As I explained
*multiple* times, you have a misconfiguration in your DT.
GPU subsystem main MFG -> VSRAM
GPU core complex MFG -> VGPU
GPU per-core MFG -> nothing
> 3.During suspend, the voltages of vgpu and vsram_gpu should remain unchanged, and when resuming, vgpu and vsram_gpu should be
> restored to their previous voltages. When the vgpu voltage is adjusted, mediatek_regulator_balance_voltage already synchronizes the
> adjustment of vsram_gpu voltage. Therefore, adjusting the vsram_gpu voltage again in mediatek_regulator_balance_voltage is redundant.
If you fix your DT, N.3 won't happen.
Regards,
Angelo
>
>
> changes in v3:
> - modify for comment[add the new entry by alphabetical order]
>
> changes in v2:
> - change title for patch
> - reply comment: This is a software regulator coupler mechanism, and the regulator-coupled-with
> configuration has been added in the MT8189 device tree. This patch addresses an issue reported by a
> Chromebook customer. When the GPU regulator is turned on, mediatek_regulator_balance_voltage already
> sets both the GPU and GPU_SRAM voltages at the same time, so there is no need to adjust the GPU_SRAM
> voltage again in a second round. Therefore, a return is set for MT8189.
> If the user calls mediatek_regulator_balance_voltage again for GPU_SRAM, it may cause abnormal behavior of GPU_SRAM.
>
>
> changes in v1:
> - mediatek-regulator-coupler mechanism for platform MT8189
>
> *** BLURB HERE ***
>
> Niklaus Liu (1):
> soc: mediatek: mtk-regulator-coupler: Add support for MT8189
>
> drivers/soc/mediatek/mtk-regulator-coupler.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2026-01-11 21:10 Wesley B
2026-01-12 13:28 ` Miguel Ojeda
0 siblings, 1 reply; 1651+ messages in thread
From: Wesley B @ 2026-01-11 21:10 UTC (permalink / raw)
To: rust-for-linux
From 71c099600448cdb639136bb15bdd40767dbfc0fd Mon Sep 17 00:00:00 2001
From: Wesley Bott <atticusfinch570@gmail.com>
Date: Sat, 10 Jan 2026 18:26:47 -0700
Subject: [PATCH] rust: Restore __new INVARIANT comment, updated docs as needed
Add back the // INVARIANT: comment in __new to reflect the
precondition that VALUE must fit within N bits. Update the docs
to reflect the INVARIANT comment.
Link: https://github.com/Rust-for-Linux/linux/issues/1217
Suggested-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Wesley Bott <atticusfinch570@gmail.com>
---
rust/kernel/num/bounded.rs | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/rust/kernel/num/bounded.rs b/rust/kernel/num/bounded.rs
index f870080af8ac..07ce6a1dd6e3 100644
--- a/rust/kernel/num/bounded.rs
+++ b/rust/kernel/num/bounded.rs
@@ -282,10 +282,11 @@ impl<T, const N: u32> Bounded<T, N>
 /// All instances of [`Bounded`] must be created through this method as it enforces most of the
 /// type invariants.
 ///
- /// The caller remains responsible for checking, either statically or dynamically, that `value`
- /// can be represented as a `T` using at most `N` bits.
+ /// The caller must ensure checking, either statically or dynamically, that `value`
+ /// can be represented as a `T` using at most `N` bits (0 < N <= T::BITS).
 const fn __new(value: T) -> Self {
 // Enforce the type invariants.
+ // INVARIANT: 'value' T can be represented using at most N bits (0 < N <= T::BITS)
 const {
 // `N` cannot be zero.
 assert!(N != 0);
--
2.43.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2026-01-11 21:10 Wesley B
@ 2026-01-12 13:28 ` Miguel Ojeda
0 siblings, 0 replies; 1651+ messages in thread
From: Miguel Ojeda @ 2026-01-12 13:28 UTC (permalink / raw)
To: Wesley B; +Cc: rust-for-linux
On Sun, Jan 11, 2026 at 10:10 PM Wesley B <atticusfinch570@gmail.com> wrote:
>
> From 71c099600448cdb639136bb15bdd40767dbfc0fd Mon Sep 17 00:00:00 2001
> From: Wesley Bott <atticusfinch570@gmail.com>
> Date: Sat, 10 Jan 2026 18:26:47 -0700
> Subject: [PATCH] rust: Restore __new INVARIANT comment, updated docs as needed
These headers should be used to send an email with that subject etc.,
rather than embedding it. I would suggest trying to use `git
send-email` or `b4`.
> Link: https://github.com/Rust-for-Linux/linux/issues/1217
> Suggested-by: Miguel Ojeda <ojeda@kernel.org>
The suggestion was to remove a paragraph, not to edit it. Perhaps the
confusion is that the suggestion goes on top of the `rust-fixes`
branch, not mainline.
Thanks!
Cheers,
Miguel
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2026-02-02 10:53 Anshumali Gaur
2026-02-03 0:34 ` Jacob Keller
0 siblings, 1 reply; 1651+ messages in thread
From: Anshumali Gaur @ 2026-02-02 10:53 UTC (permalink / raw)
To: Jacob Keller
Cc: netdev, linux-kernel, Sunil Goutham, Linu Cherian,
Geetha sowjanya, Jerin Jacob, hariprasad, Subbaraya Sundeep,
Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
On 2026-01-29 at 23:02:43, Jacob Keller (jacob.e.keller@intel.com) wrote:
>
>
> On 1/29/2026 1:19 AM, Anshumali Gaur wrote:
> > When both AF and PF drivers are built as modules, the PF driver in the
> > kexec kernel may probe before the AF driver is ready. This leads to
> > a crash due to uninitialized hardware state.
> >
> > This patch ensures the PF driver properly detects and waits for AF
> > driver readiness before proceeding with initialization.
> >
>
> To me, the patch description is not sufficient to describe the what and why
> of this change.
>
> Could you please provide a better explanation of how the addition of the
> provided shutdown handler fixes initialization?
>
Hi Jacob,
The issue being addressed here is specific to kexec and persistent AF
hardware state across kernel transitions. When both AF and PF drivers
are built as modules and a kexec kernel is performed, the PF driver in
the new kernel may probe before the AF driver has completed probing and
reinitializing the RVU hardware. In this scenario, the hardware state
left behind by the AF driver in the old kernel is still visible to the
PF driver in the new kernel, resulting in a crash due to stale state.
> > Fixes: 54494aa5d1e6 ("octeontx2-af: Add Marvell OcteonTX2 RVU AF driver")
> > Signed-off-by: Anshumali Gaur <agaur@marvell.com>
> > ---
> > drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> > index 747fbdf2a908..8530df8b3fda 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
> > @@ -3632,11 +3632,22 @@ static void rvu_remove(struct pci_dev *pdev)
> > devm_kfree(&pdev->dev, rvu);
> > }
> > +static void rvu_shutdown(struct pci_dev *pdev)
> > +{
> > + struct rvu *rvu = pci_get_drvdata(pdev);
> > +
> > + if (!rvu)
> > + return;
> > +
> > + rvu_clear_rvum_blk_revid(rvu);
>
> Here, I guess you are clearing some data about the device status. Does that
> mean that when you initialize later you will wait for the AF driver to
> finish probing and configure this? It would be nice to explain how this
> change fixes initialization.
>
The RVUM block revision field acts as an implicit indication that the AF
driver has completed its initialization. If this value is left uncleared
during kexec kernel booting, the PF driver may observe a non-zero/valid
RVUM block revision and incorrectly assume that the AF is already
initialized and ready, even though the AF driver in the kexec kernel has
not yet probed. This leads to PF initialization proceeding against
partially initialized hardware, resulting in a crash.
> > +}
> > +
> > static struct pci_driver rvu_driver = {
> > .name = DRV_NAME,
> > .id_table = rvu_id_table,
> > .probe = rvu_probe,
> > .remove = rvu_remove,
> > + .shutdown = rvu_shutdown,
>
> This is the shutdown handler:
> >
> > * @shutdown: Hook into reboot_notifier_list (kernel/sys.c).
> > * Intended to stop any idling DMA operations.
> > * Useful for enabling wake-on-lan (NIC) or changing
> > * the power state of a device before reboot.
> > * e.g. drivers/net/e100.c.
>
> How does this have anything to do with initialization?
>
> > };
> > static int __init rvu_init_module(void)
>
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2026-02-02 10:53 Anshumali Gaur
@ 2026-02-03 0:34 ` Jacob Keller
0 siblings, 0 replies; 1651+ messages in thread
From: Jacob Keller @ 2026-02-03 0:34 UTC (permalink / raw)
To: Anshumali Gaur
Cc: netdev, linux-kernel, Sunil Goutham, Linu Cherian,
Geetha sowjanya, Jerin Jacob, hariprasad, Subbaraya Sundeep,
Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
On 2/2/2026 2:53 AM, Anshumali Gaur wrote:
> On 2026-01-29 at 23:02:43, Jacob Keller (jacob.e.keller@intel.com) wrote:
>>
>>
>> On 1/29/2026 1:19 AM, Anshumali Gaur wrote:
>>> When both AF and PF drivers are built as modules, the PF driver in the
>>> kexec kernel may probe before the AF driver is ready. This leads to
>>> a crash due to uninitialized hardware state.
>>>
>>> This patch ensures the PF driver properly detects and waits for AF
>>> driver readiness before proceeding with initialization.
>>>
>>
>> To me, the patch description is not sufficient to describe the what and why
>> of this change.
>>
>> Could you please provide a better explanation of how the addition of the
>> provided shutdown handler fixes initialization?
>>
> Hi Jacob,
> The issue being addressed here is specific to kexec and persistent AF
> hardware state across kernel transitions. When both AF and PF drivers
> are built as modules and a kexec kernel is performed, the PF driver in
> the new kernel may probe before the AF driver has completed probing and
> reinitializing the RVU hardware. In this scenario, the hardware state
> left behind by the AF driver in the old kernel is still visible to the
> PF driver in the new kernel, resulting in a crash due to stale state.
>>> Fixes: 54494aa5d1e6 ("octeontx2-af: Add Marvell OcteonTX2 RVU AF driver")
>>> Signed-off-by: Anshumali Gaur <agaur@marvell.com>
>>> ---
>>> drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 11 +++++++++++
>>> 1 file changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
>>> index 747fbdf2a908..8530df8b3fda 100644
>>> --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
>>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu.c
>>> @@ -3632,11 +3632,22 @@ static void rvu_remove(struct pci_dev *pdev)
>>> devm_kfree(&pdev->dev, rvu);
>>> }
>>> +static void rvu_shutdown(struct pci_dev *pdev)
>>> +{
>>> + struct rvu *rvu = pci_get_drvdata(pdev);
>>> +
>>> + if (!rvu)
>>> + return;
>>> +
>>> + rvu_clear_rvum_blk_revid(rvu);
>>
>> Here, I guess you are clearing some data about the device status. Does that
>> mean that when you initialize later you will wait for the AF driver to
>> finish probing and configure this? It would be nice to explain how this
>> change fixes initialization.
>>
> The RVUM block revision field acts as an implicit indication that the AF
> driver has completed its initialization. If this value is left uncleared
> during kexec kernel booting, the PF driver may observe a non-zero/valid
> RVUM block revision and incorrectly assume that the AF is already
> initialized and ready, even though the AF driver in the kexec kernel has
> not yet probed. This leads to PF initialization proceeding against
> partially initialized hardware, resulting in a crash.
Makes sense. When shutting down you need to explicitly clear the stale
data so that booting up (without a powercycle as in the kexec case) does
not lead to stale data.
I'd appreciate a little more of this detail in the commit message
personally. However, functionally it makes sense, so:
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH v5 0/3] iio: frequency: ad9523: fix checkpatch warnings
@ 2026-02-25 19:40 Bhargav Joshi
2026-02-25 19:40 ` Bhargav Joshi
0 siblings, 1 reply; 1651+ messages in thread
From: Bhargav Joshi @ 2026-02-25 19:40 UTC (permalink / raw)
To: lars, Michael.Hennerich, jic23
Cc: dlechner, nuno.sa, andy, linux-iio, linux-kernel, rougueprince47
These patches address several checkpatch warnings in the ad9523 driver.
Patch 1: Updated the macros to properly use their argument x.
Patch 2: Fixed the multi-line pointer dereferences.
Patch 3: Updated symbolic permissions to octal (0444/0200).
Changes in v5:
- Reflowed all commit messages to properly wrap near 72 characters as
requested.
- Collected Reviewed-by tags for patches 1 and 3.
Changes in v4:
- Used full name for SoB
Changes in v3:
- Patch 1: updated macros to use '(x)' instead of 'x'.
- Patch 2: collected reviewed-by (no code changes).
- Patch 3: fixed vertical spacing and broken indentation.
bhargav (3):
iio: frequency: ad9523: fix implicit variable usage in macros
iio: frequency: ad9523: avoid multiple line dereferences for pdata
iio: frequency: ad9523: fix checkpatch warnings for symbolic
permissions
drivers/iio/frequency/ad9523.c | 88 +++++++++++++---------------------
1 file changed, 33 insertions(+), 55 deletions(-)
--
2.53.0
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2026-02-25 19:40 [PATCH v5 0/3] iio: frequency: ad9523: fix checkpatch warnings Bhargav Joshi
@ 2026-02-25 19:40 ` Bhargav Joshi
2026-02-25 19:43 ` Andy Shevchenko
0 siblings, 1 reply; 1651+ messages in thread
From: Bhargav Joshi @ 2026-02-25 19:40 UTC (permalink / raw)
To: lars, Michael.Hennerich, jic23
Cc: dlechner, nuno.sa, andy, linux-iio, linux-kernel, rougueprince47,
Andy Shevchenko
Subject: [PATCH v5 3/3] iio: frequency: ad9523: fix checkpatch warnings
for symbolic permissions
The driver currently defines device attributes using symbolic permission
flags (S_IRUGO and S_IWUSR). Update these to use octal permissions (0444
and 0200) to resolve checkpatch warnings.
Signed-off-by: Bhargav Joshi <rougueprince47@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
---
Changes in v5:
- Reflowed all commit messages to properly wrap near 72 characters as
requested.
- Collected Reviewed-by tag.
Changes in v4:
- Used full name for SoB
Changes in v3:
- Fixed vertical spacing and broken indentation.
drivers/iio/frequency/ad9523.c | 78 +++++++++++++---------------------
1 file changed, 29 insertions(+), 49 deletions(-)
diff --git a/drivers/iio/frequency/ad9523.c b/drivers/iio/frequency/ad9523.c
index 6daa2ea354a8..ad32eb66edca 100644
--- a/drivers/iio/frequency/ad9523.c
+++ b/drivers/iio/frequency/ad9523.c
@@ -558,55 +558,35 @@ static ssize_t ad9523_show(struct device *dev,
return ret;
}
-static IIO_DEVICE_ATTR(pll1_locked, S_IRUGO,
- ad9523_show,
- NULL,
- AD9523_STAT_PLL1_LD);
-
-static IIO_DEVICE_ATTR(pll2_locked, S_IRUGO,
- ad9523_show,
- NULL,
- AD9523_STAT_PLL2_LD);
-
-static IIO_DEVICE_ATTR(pll1_reference_clk_a_present, S_IRUGO,
- ad9523_show,
- NULL,
- AD9523_STAT_REFA);
-
-static IIO_DEVICE_ATTR(pll1_reference_clk_b_present, S_IRUGO,
- ad9523_show,
- NULL,
- AD9523_STAT_REFB);
-
-static IIO_DEVICE_ATTR(pll1_reference_clk_test_present, S_IRUGO,
- ad9523_show,
- NULL,
- AD9523_STAT_REF_TEST);
-
-static IIO_DEVICE_ATTR(vcxo_clk_present, S_IRUGO,
- ad9523_show,
- NULL,
- AD9523_STAT_VCXO);
-
-static IIO_DEVICE_ATTR(pll2_feedback_clk_present, S_IRUGO,
- ad9523_show,
- NULL,
- AD9523_STAT_PLL2_FB_CLK);
-
-static IIO_DEVICE_ATTR(pll2_reference_clk_present, S_IRUGO,
- ad9523_show,
- NULL,
- AD9523_STAT_PLL2_REF_CLK);
-
-static IIO_DEVICE_ATTR(sync_dividers, S_IWUSR,
- NULL,
- ad9523_store,
- AD9523_SYNC);
-
-static IIO_DEVICE_ATTR(store_eeprom, S_IWUSR,
- NULL,
- ad9523_store,
- AD9523_EEPROM);
+static IIO_DEVICE_ATTR(pll1_locked, 0444, ad9523_show, NULL,
+ AD9523_STAT_PLL1_LD);
+
+static IIO_DEVICE_ATTR(pll2_locked, 0444, ad9523_show, NULL,
+ AD9523_STAT_PLL2_LD);
+
+static IIO_DEVICE_ATTR(pll1_reference_clk_a_present, 0444, ad9523_show, NULL,
+ AD9523_STAT_REFA);
+
+static IIO_DEVICE_ATTR(pll1_reference_clk_b_present, 0444, ad9523_show, NULL,
+ AD9523_STAT_REFB);
+
+static IIO_DEVICE_ATTR(pll1_reference_clk_test_present, 0444, ad9523_show, NULL,
+ AD9523_STAT_REF_TEST);
+
+static IIO_DEVICE_ATTR(vcxo_clk_present, 0444, ad9523_show, NULL,
+ AD9523_STAT_VCXO);
+
+static IIO_DEVICE_ATTR(pll2_feedback_clk_present, 0444, ad9523_show, NULL,
+ AD9523_STAT_PLL2_FB_CLK);
+
+static IIO_DEVICE_ATTR(pll2_reference_clk_present, 0444, ad9523_show, NULL,
+ AD9523_STAT_PLL2_REF_CLK);
+
+static IIO_DEVICE_ATTR(sync_dividers, 0200, NULL, ad9523_store,
+ AD9523_SYNC);
+
+static IIO_DEVICE_ATTR(store_eeprom, 0200, NULL, ad9523_store,
+ AD9523_EEPROM);
static struct attribute *ad9523_attributes[] = {
&iio_dev_attr_sync_dividers.dev_attr.attr,
--
2.53.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2026-02-25 19:40 ` Bhargav Joshi
@ 2026-02-25 19:43 ` Andy Shevchenko
0 siblings, 0 replies; 1651+ messages in thread
From: Andy Shevchenko @ 2026-02-25 19:43 UTC (permalink / raw)
To: Bhargav Joshi
Cc: lars, Michael.Hennerich, jic23, dlechner, nuno.sa, andy,
linux-iio, linux-kernel, Andy Shevchenko
On Wed, Feb 25, 2026 at 9:42 PM Bhargav Joshi <rougueprince47@gmail.com> wrote:
>
> Subject: [PATCH v5 3/3] iio: frequency: ad9523: fix checkpatch warnings
> for symbolic permissions
Something went wrong...
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: ezcap401 needs USB_QUIRK_NO_BOS to function on 10gbs usb speed
@ 2026-03-13 10:11 Greg KH
2026-03-13 11:01 ` Vyacheslav Vahnenko
0 siblings, 1 reply; 1651+ messages in thread
From: Greg KH @ 2026-03-13 10:11 UTC (permalink / raw)
To: CaTaTo; +Cc: linux-usb
On Fri, Mar 13, 2026 at 12:54:35PM +0300, CaTaTo wrote:
> Sorry again, last message went probably to Greg's address rather than this
> topic. I managed to make a patch, hopefully I didn't miss anything critical.
No need to attach it, just use 'git send-email' to send it directly.
Also, some comments on the patch:
> From 1ad243ebd0211a591665383d1382615bb9e3dc3a Mon Sep 17 00:00:00 2001
> From: Vyacheslav Vahnenko <vahnenko2003@gmail.com>
> Date: Fri, 13 Mar 2026 12:12:26 +0300
> Subject: [PATCH] Add USB_QUIRK_NO_BOS for ezcap401 usb capture card
>
> Signed-off-by: Vyacheslav Vahnenko <vahnenko2003@gmail.com>
You need to have some changelog text, we can't take it without any
wording.
> ---
> drivers/usb/core/quirks.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
> index 9e7e49712..8ef876315 100644
> --- a/drivers/usb/core/quirks.c
> +++ b/drivers/usb/core/quirks.c
> @@ -583,6 +583,9 @@ static const struct usb_device_id usb_quirk_list[] = {
> /* INTEL VALUE SSD */
> { USB_DEVICE(0x8086, 0xf1a5), .driver_info = USB_QUIRK_RESET_RESUME },
>
> + /* ezcap401 - BOS descriptor fetch hangs at SuperSpeed Plus */
> + { USB_DEVICE(0x32ed, 0x0401), .driver_info = USB_QUIRK_NO_BOS },
The list should be sorted by id. Please put this in the proper place in
the list.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
2026-03-13 10:11 ezcap401 needs USB_QUIRK_NO_BOS to function on 10gbs usb speed Greg KH
@ 2026-03-13 11:01 ` Vyacheslav Vahnenko
2026-03-13 12:04 ` Greg KH
0 siblings, 1 reply; 1651+ messages in thread
From: Vyacheslav Vahnenko @ 2026-03-13 11:01 UTC (permalink / raw)
To: gregkh; +Cc: linux-usb, vahnenko2003
Add USB_QUIRK_NO_BOS for the ezcap401 capture card; without it, dmesg will show "unable to get BOS descriptor or descriptor too short"
and "unable to read config index 0 descriptor/start: -71" errors, and the device will not be able to work at full speed at 10gbs
Subject: ezcap401 needs USB_QUIRK_NO_BOS to function on 10gbs usb speed
Signed-off-by: Vyacheslav Vahnenko <vahnenko2003@gmail.com>
---
drivers/usb/core/quirks.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
index 9e7e49712..0010f41a3 100644
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -574,6 +574,9 @@ static const struct usb_device_id usb_quirk_list[] = {
/* Alcor Link AK9563 SC Reader used in 2022 Lenovo ThinkPads */
{ USB_DEVICE(0x2ce3, 0x9563), .driver_info = USB_QUIRK_NO_LPM },
+ /* ezcap401 - BOS descriptor fetch hangs at SuperSpeed Plus */
+ { USB_DEVICE(0x32ed, 0x0401), .driver_info = USB_QUIRK_NO_BOS },
+
/* DELL USB GEN2 */
{ USB_DEVICE(0x413c, 0xb062), .driver_info = USB_QUIRK_NO_LPM | USB_QUIRK_RESET_RESUME },
--
2.53.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread
* Re:
2026-03-13 11:01 ` Vyacheslav Vahnenko
@ 2026-03-13 12:04 ` Greg KH
0 siblings, 0 replies; 1651+ messages in thread
From: Greg KH @ 2026-03-13 12:04 UTC (permalink / raw)
To: Vyacheslav Vahnenko; +Cc: linux-usb
On Fri, Mar 13, 2026 at 02:01:41PM +0300, Vyacheslav Vahnenko wrote:
> Add USB_QUIRK_NO_BOS for the ezcap401 capture card; without it, dmesg will show "unable to get BOS descriptor or descriptor too short"
> and "unable to read config index 0 descriptor/start: -71" errors, and the device will not be able to work at full speed at 10gbs
>
> Subject: ezcap401 needs USB_QUIRK_NO_BOS to function on 10gbs usb speed
Close, this should actually be the subject line, you forgot to have one
entirely in your email :)
And can you wrap the changelog text at 72 columns, like your editor
should have asked you to when making this change? If you run your patch
through scripts/checkpatch.pl before sending it, it should tell you
about these types of things.
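Greg's point can be demonstrated with even a trivial local check. The snippet below is illustrative only (`check_cols` is a made-up helper, not part of the kernel tree); it flags changelog lines over 72 columns, one of the many things checkpatch.pl reports:

```shell
# Toy stand-in for one checkpatch.pl rule: flag lines longer than 72
# columns. The real script checks far more (Subject line, tags, style).
check_cols() {
    awk 'length($0) > 72 { printf "line %d: %d columns\n", NR, length($0) }'
}

# Example: the second line below is 80 characters wide, so it is flagged.
printf '%s\n%s\n' "short line" "$(printf 'x%.0s' $(seq 1 80))" | check_cols
# prints: line 2: 80 columns
```

In a real kernel tree one would instead run `./scripts/checkpatch.pl --strict` on the patch file before sending it.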
> Signed-off-by: Vyacheslav Vahnenko <vahnenko2003@gmail.com>
> ---
> drivers/usb/core/quirks.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c
> index 9e7e49712..0010f41a3 100644
> --- a/drivers/usb/core/quirks.c
> +++ b/drivers/usb/core/quirks.c
> @@ -574,6 +574,9 @@ static const struct usb_device_id usb_quirk_list[] = {
> /* Alcor Link AK9563 SC Reader used in 2022 Lenovo ThinkPads */
> { USB_DEVICE(0x2ce3, 0x9563), .driver_info = USB_QUIRK_NO_LPM },
>
> + /* ezcap401 - BOS descriptor fetch hangs at SuperSpeed Plus */
> + { USB_DEVICE(0x32ed, 0x0401), .driver_info = USB_QUIRK_NO_BOS },
This looks good, thanks for putting it in the right place.
Almost there!
greg k-h
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] net: ns83820: check DMA mapping errors in hard_start_xmit
@ 2026-03-31 10:05 Paolo Abeni
2026-03-31 11:14 ` Wang Jun
0 siblings, 1 reply; 1651+ messages in thread
From: Paolo Abeni @ 2026-03-31 10:05 UTC (permalink / raw)
To: Wang Jun, kuba
Cc: andrew+netdev, davem, edumazet, simon.horman, netdev,
linux-kernel, gszhai, 25125332, 25125283, 23120469, stable
On 3/27/26 2:16 AM, Wang Jun wrote:
> The ns83820 driver currently ignores the return values of dma_map_single()
> and skb_frag_dma_map() in the transmit path. If DMA mapping fails due to
> IOMMU exhaustion or SWIOTLB pressure, the driver may proceed with invalid
> DMA addresses, potentially causing hardware errors, data corruption, or
> system instability.
>
> Additionally, if mapping fails midway through processing fragmented
> packets, previously mapped DMA resources are not released, leading
> to DMA resource leaks.
>
> Fix this by:
> 1. Checking dma_mapping_error() after each DMA mapping call.
> 2. Implementing an error handling path to unmap successfully
> mapped buffers(both linear and fragments) using
> dma_unmap_single().
> 3. Freeing the skb using dev_kfree_skb_any() to safely handle
> both process and softirq contexts, as hard_start_xmit may be
> called with IRQs enabled.
> 4. Returning NETDEV_TX_OK to drop the packet gracefully and
> prevent TX queue stagnation.
>
> This ensures compliance with the DMA API guidelines and
> improves driver stability under memory pressure.
>
> Fixes: fd9e4d6fec15 ("natsemi: switch from 'pci_' to 'dma_' API")
The Fixes tag looks wrong, as the above commit was just a mechanical
substitution of the used APIs. The issue is pre-existing.
> Cc: stable@vger.kernel.org
> Signed-off-by: Wang Jun <1742789905@qq.com>
> ---
> drivers/net/ethernet/natsemi/ns83820.c | 32 ++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/drivers/net/ethernet/natsemi/ns83820.c b/drivers/net/ethernet/natsemi/ns83820.c
> index cdbf82affa7b..90465e4977c3 100644
> --- a/drivers/net/ethernet/natsemi/ns83820.c
> +++ b/drivers/net/ethernet/natsemi/ns83820.c
> @@ -1051,6 +1051,12 @@ static netdev_tx_t ns83820_hard_start_xmit(struct sk_buff *skb,
> int stopped = 0;
> int do_intr = 0;
> volatile __le32 *first_desc;
> + dma_addr_t frag_dma_addr[MAX_SKB_FRAGS];
> + unsigned int frag_dma_len[MAX_SKB_FRAGS];
> + int frag_mapped_count = 0;
> + dma_addr_t main_buf = 0;
> + unsigned int main_len = 0;
> + int i;
Please respect the reverse christmas tree order above.
> dprintk("ns83820_hard_start_xmit\n");
>
> @@ -1121,6 +1127,13 @@ static netdev_tx_t ns83820_hard_start_xmit(struct sk_buff *skb,
> buf = dma_map_single(&dev->pci_dev->dev, skb->data, len,
> DMA_TO_DEVICE);
>
> + if (dma_mapping_error(&dev->pci_dev->dev, buf)) {
> + dev_kfree_skb_any(skb);
> + return NETDEV_TX_OK;
> + }
> + main_buf = buf;
> + main_len = len;
> +
> first_desc = dev->tx_descs + (free_idx * DESC_SIZE);
>
> for (;;) {
> @@ -1144,6 +1157,15 @@ static netdev_tx_t ns83820_hard_start_xmit(struct sk_buff *skb,
>
> buf = skb_frag_dma_map(&dev->pci_dev->dev, frag, 0,
> skb_frag_size(frag), DMA_TO_DEVICE);
> + if (dma_mapping_error(&dev->pci_dev->dev, buf))
> + goto dma_map_error;
> +
> + if (frag_mapped_count < MAX_SKB_FRAGS) {
> + frag_dma_addr[frag_mapped_count] = buf;
> + frag_dma_len[frag_mapped_count] = skb_frag_size(frag);
> + frag_mapped_count++;
> + }
> +
> dprintk("frag: buf=%08Lx page=%08lx offset=%08lx\n",
> (long long)buf, (long) page_to_pfn(frag->page),
> frag->page_offset);
> @@ -1166,6 +1188,16 @@ static netdev_tx_t ns83820_hard_start_xmit(struct sk_buff *skb,
> if (stopped && (dev->tx_done_idx != tx_done_idx) && start_tx_okay(dev))
> netif_start_queue(ndev);
>
> + return NETDEV_TX_OK;
> +dma_map_error:
> + dma_unmap_single(&dev->pci_dev->dev, main_buf, main_len, DMA_TO_DEVICE);
> + for (i = 0; i < frag_mapped_count; i++) {
> + dma_unmap_single(&dev->pci_dev->dev, frag_dma_addr[i], frag_dma_len[i],
> + DMA_TO_DEVICE);
> + }
> +
> + dev_kfree_skb_any(skb);
AI review says:
If nr_free < MIN_TX_DESC_FREE earlier in the function, netif_stop_queue
is called and the stopped flag is set to 1. By returning NETDEV_TX_OK
directly from the error path, we skip the race check at the end of the
normal flow:
if (stopped && (dev->tx_done_idx != tx_done_idx) && start_tx_okay(dev))
netif_start_queue(ndev);
If all pending packets completed concurrently right before the queue was
stopped, could this cause the TX queue to stall indefinitely until the
tx watchdog fires?
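The unwind-on-failure pattern being reviewed can be sketched in plain userspace C. Here `map_buf()`/`unmap_buf()` are stand-ins for `dma_map_single()`/`dma_unmap_single()`, and none of this is the actual ns83820 code:

```c
#include <assert.h>

#define MAX_FRAGS 4

static int fail_at = -1;           /* index where map_buf() fails; -1 = never */
static int mapped[MAX_FRAGS + 1];  /* outstanding mappings, index 0 = linear */

static int map_buf(int idx)
{
	if (idx == fail_at)
		return -1;         /* simulated dma_mapping_error() */
	mapped[idx] = 1;
	return 0;
}

static void unmap_buf(int idx)
{
	mapped[idx] = 0;
}

/* Always returns 0 (the analogue of NETDEV_TX_OK); on a mapping failure
 * every mapping made so far is released before returning. */
static int xmit(int nfrags)
{
	int i, n = 0;

	if (map_buf(0))            /* linear part: nothing to unwind yet */
		return 0;

	for (i = 1; i <= nfrags; i++) {
		if (map_buf(i))
			goto err_unwind;
		n++;
	}
	return 0;

err_unwind:
	unmap_buf(0);              /* the linear buffer */
	for (i = 1; i <= n; i++)   /* only the fragments that actually mapped */
		unmap_buf(i);
	return 0;
}
```

The queue-restart race raised in the review above is orthogonal to this unwind pattern and belongs with the caller's exit path, which is where v2 of the patch puts it via a common label.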
^ permalink raw reply [flat|nested] 1651+ messages in thread* (no subject)
2026-03-31 10:05 [PATCH] net: ns83820: check DMA mapping errors in hard_start_xmit Paolo Abeni
@ 2026-03-31 11:14 ` Wang Jun
2026-03-31 12:09 ` Eric Dumazet
0 siblings, 1 reply; 1651+ messages in thread
From: Wang Jun @ 2026-03-31 11:14 UTC (permalink / raw)
To: Andrew Lunn, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: netdev, linux-kernel, gszhai, 25125332, 25125283, 23120469,
Wang Jun
Hi Paolo Abeni,
This is v2 of the DMA mapping error handling fix for ns83820. Changes since v1:
- Added queue restart check in error path to avoid potential TX queue stall
(as pointed out by the AI review)
- Adjusted variable declarations to follow reverse christmas tree order
- Switched from dma_unmap_single to dma_unmap_page for fragments
Thanks to reviewers for the feedback.
Subject: [PATCH v2] net: ns83820: fix DMA mapping error handling in
hard_start_xmit
The ns83820 driver currently ignores the return values of dma_map_single()
and skb_frag_dma_map() in the transmit path. If DMA mapping fails due to
IOMMU exhaustion or SWIOTLB pressure, the driver may proceed with invalid
DMA addresses, potentially causing hardware errors, data corruption, or
system instability.
Additionally, if mapping fails midway through processing fragmented
packets, previously mapped DMA resources are not released, leading to
DMA resource leaks.
Fix this by:
1. Checking dma_mapping_error() after each DMA mapping call.
2. Implementing an error handling path to unmap successfully mapped
buffers (both linear and fragments) using dma_unmap_single() /
dma_unmap_page().
3. Freeing the skb using dev_kfree_skb_any() to safely handle both
process and softirq contexts.
4. Returning NETDEV_TX_OK to drop the packet gracefully and prevent
TX queue stagnation.
5. Moving the queue restart check (for race conditions when the queue
was stopped) into a common label, so that the error path also
executes it, avoiding a possible permanent TX queue stall.
This ensures compliance with the DMA API guidelines and improves driver
stability under memory pressure.
Signed-off-by: Wang Jun <1742789905@qq.com>
---
drivers/net/ethernet/natsemi/ns83820.c | 31 +++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/natsemi/ns83820.c b/drivers/net/ethernet/natsemi/ns83820.c
index cdbf82affa7b..f8d037db4ffb 100644
--- a/drivers/net/ethernet/natsemi/ns83820.c
+++ b/drivers/net/ethernet/natsemi/ns83820.c
@@ -1051,6 +1051,12 @@ static netdev_tx_t ns83820_hard_start_xmit(struct sk_buff *skb,
int stopped = 0;
int do_intr = 0;
volatile __le32 *first_desc;
+ int i;
+ int frag_mapped_count = 0;
+ unsigned int main_len = 0;
+ unsigned int frag_dma_len[MAX_SKB_FRAGS];
+ dma_addr_t main_buf = 0;
+ dma_addr_t frag_dma_addr[MAX_SKB_FRAGS];
dprintk("ns83820_hard_start_xmit\n");
@@ -1120,6 +1126,12 @@ static netdev_tx_t ns83820_hard_start_xmit(struct sk_buff *skb,
len -= skb->data_len;
buf = dma_map_single(&dev->pci_dev->dev, skb->data, len,
DMA_TO_DEVICE);
+ if (dma_mapping_error(&dev->pci_dev->dev, buf)) {
+ dev_kfree_skb_any(skb);
+ goto check_queue_and_return;
+ }
+ main_buf = buf;
+ main_len = len;
first_desc = dev->tx_descs + (free_idx * DESC_SIZE);
@@ -1144,6 +1156,15 @@ static netdev_tx_t ns83820_hard_start_xmit(struct sk_buff *skb,
buf = skb_frag_dma_map(&dev->pci_dev->dev, frag, 0,
skb_frag_size(frag), DMA_TO_DEVICE);
+ if (dma_mapping_error(&dev->pci_dev->dev, buf))
+ goto dma_map_error;
+
+ if (frag_mapped_count < MAX_SKB_FRAGS) {
+ frag_dma_addr[frag_mapped_count] = buf;
+ frag_dma_len[frag_mapped_count] = skb_frag_size(frag);
+ frag_mapped_count++;
+ }
+
dprintk("frag: buf=%08Lx page=%08lx offset=%08lx\n",
(long long)buf, (long) page_to_pfn(frag->page),
frag->page_offset);
@@ -1161,12 +1182,20 @@ static netdev_tx_t ns83820_hard_start_xmit(struct sk_buff *skb,
spin_unlock_irq(&dev->tx_lock);
kick_tx(dev);
-
+check_queue_and_return:
/* Check again: we may have raced with a tx done irq */
if (stopped && (dev->tx_done_idx != tx_done_idx) && start_tx_okay(dev))
netif_start_queue(ndev);
return NETDEV_TX_OK;
+dma_map_error:
+ dma_unmap_single(&dev->pci_dev->dev, main_buf, main_len, DMA_TO_DEVICE);
+ for (i = 0; i < frag_mapped_count; i++) {
+ dma_unmap_page(&dev->pci_dev->dev, frag_dma_addr[i],
+ frag_dma_len[i], DMA_TO_DEVICE);
+ }
+ dev_kfree_skb_any(skb);
+ goto check_queue_and_return;
}
static void ns83820_update_stats(struct ns83820 *dev)
--
2.43.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread* Re:
2026-03-31 11:14 ` Wang Jun
@ 2026-03-31 12:09 ` Eric Dumazet
0 siblings, 0 replies; 1651+ messages in thread
From: Eric Dumazet @ 2026-03-31 12:09 UTC (permalink / raw)
To: Wang Jun
Cc: Andrew Lunn, David S . Miller, Jakub Kicinski, Paolo Abeni,
netdev, linux-kernel, gszhai, 25125332, 25125283, 23120469
On Tue, Mar 31, 2026 at 4:15 AM Wang Jun <1742789905@qq.com> wrote:
>
> Hi Paolo Abeni,
>
> This is v2 of the DMA mapping error handling fix for ns83820. Changes since v1:
>
> - Added queue restart check in error path to avoid potential TX queue stall
> (as pointed out by the AI review)
> - Adjusted variable declarations to follow reverse christmas tree order
> - Switched from dma_unmap_single to dma_unmap_page for fragments
>
> Thanks to reviewers for the feedback.
>
> Subject: [PATCH v2] net: ns83820: fix DMA mapping error handling in
> hard_start_xmit
>
This is not how a new version of a patch needs to be submitted.
Look at https://patchwork.kernel.org/project/netdevbpf/list/ and
compare your patch with others...
Thanks.
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re;
@ 2026-04-12 6:24 Erick Lorch
0 siblings, 0 replies; 1651+ messages in thread
From: Erick Lorch @ 2026-04-12 6:24 UTC (permalink / raw)
To: bridge
Good day
I reviewed your reputable profile which gives me the intuition that you will be a potential business partner in a crude oil venture with my company worth two Million (2,000 000) barrels monthly involving the NATIONAL OIL CORPORATION OF LIBYA (NOC) and our Oil Refinery Company. This crude oil venture will gain commissions value in revenue approximately per trade with the NOC;
REPLY IF YOU ARE INTERESTED
Regards,
Erick Lorch
Procurement Supervisor
^ permalink raw reply [flat|nested] 1651+ messages in thread
* [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt
@ 2026-04-23 16:06 Yongchao Wu
2026-04-27 1:22 ` Peter Chen (CIX)
0 siblings, 1 reply; 1651+ messages in thread
From: Yongchao Wu @ 2026-04-23 16:06 UTC (permalink / raw)
To: peter.chen, pawell; +Cc: rogerq, gregkh, linux-usb, stable, Yongchao Wu
According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
causes the DMA engine to reposition its internal pointer to the next
Transfer Descriptor (TD) if it was already processing one.
This issue is consistently observed during the ADB identification
process on macOS hosts, where the host issues a Clear_Halt. Although
commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before reset
endpoint") attempted to avoid DMA advance by toggling the cycle bit,
trace logs show that on certain hosts like macOS, the DMA pointer
(EP_TRADDR) still shifts after EPRST:
cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <-- Should be f9c04000
cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
As shown above, the DMA pointer jumped to index 3 (offset 0x30), causing
the controller to skip the initial TRBs of the request. This leads to
data misalignment and ADB protocol hangs on macOS.
Fix this by manually restoring the EP_TRADDR register to the starting
physical address of the current request after the EPRST operation is
complete.
Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
Cc: stable@vger.kernel.org
Cc: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Yongchao Wu <yongchao.wu@autochips.com>
---
drivers/usb/cdns3/cdns3-gadget.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index d59a60a16ec77..96653c7d18f20 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -2814,9 +2814,19 @@ int __cdns3_gadget_ep_clear_halt(struct cdns3_endpoint *priv_ep)
priv_ep->flags &= ~(EP_STALLED | EP_STALL_PENDING);
if (request) {
- if (trb)
+ if (trb) {
*trb = trb_tmp;
+ /*
+ * Per datasheet, EPRST causes DMA to reposition to the next TD.
+ * Manually reset EP_TRADDR to the current TRB to prevent
+ * the hardware from skipping the interrupted request.
+ */
+ writel(EP_TRADDR_TRADDR(priv_ep->trb_pool_dma +
+ priv_req->start_trb * TRB_SIZE),
+ &priv_dev->regs->ep_traddr);
+ }
+
cdns3_rearm_transfer(priv_ep, 1);
}
base-commit: 46b513250491a7bfc97d98791dbe6a10bcc8129d
--
2.43.0
^ permalink raw reply related [flat|nested] 1651+ messages in thread* Re: [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt
2026-04-23 16:06 [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt Yongchao Wu
@ 2026-04-27 1:22 ` Peter Chen (CIX)
2026-04-27 9:01 ` Pawel Laszczak
0 siblings, 1 reply; 1651+ messages in thread
From: Peter Chen (CIX) @ 2026-04-27 1:22 UTC (permalink / raw)
To: Yongchao Wu, pawell; +Cc: rogerq, gregkh, linux-usb, stable
On 26-04-24 00:06:01, Yongchao Wu wrote:
> According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
> causes the DMA engine to reposition its internal pointer to the next
> Transfer Descriptor (TD) if it was already processing one.
>
> This issue is consistently observed during the ADB identification
> process on macOS hosts, where the host issues a Clear_Halt. Although
> commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before reset
> endpoint") attempted to avoid DMA advance by toggling the cycle bit,
> trace logs show that on certain hosts like macOS, the DMA pointer
> (EP_TRADDR) still shifts after EPRST:
>
> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <-- Should be f9c04000
> cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
>
> As shown above, the DMA pointer jumped to index 3 (offset 0x30), causing
> the controller to skip the initial TRBs of the request. This leads to
> data misalignment and ADB protocol hangs on macOS.
Pawel, Is it a hardware issue? The cycle bit has already been toggled
before the endpoint has been reset, why the DMA pointer still advances?
Peter
>
> Fix this by manually restoring the EP_TRADDR register to the starting
> physical address of the current request after the EPRST operation is
> complete.
>
> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
> Cc: stable@vger.kernel.org
> Cc: Peter Chen <peter.chen@kernel.org>
> Signed-off-by: Yongchao Wu <yongchao.wu@autochips.com>
> ---
> drivers/usb/cdns3/cdns3-gadget.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
> index d59a60a16ec77..96653c7d18f20 100644
> --- a/drivers/usb/cdns3/cdns3-gadget.c
> +++ b/drivers/usb/cdns3/cdns3-gadget.c
> @@ -2814,9 +2814,19 @@ int __cdns3_gadget_ep_clear_halt(struct cdns3_endpoint *priv_ep)
> priv_ep->flags &= ~(EP_STALLED | EP_STALL_PENDING);
>
> if (request) {
> - if (trb)
> + if (trb) {
> *trb = trb_tmp;
>
> + /*
> + * Per datasheet, EPRST causes DMA to reposition to the next TD.
> + * Manually reset EP_TRADDR to the current TRB to prevent
> + * the hardware from skipping the interrupted request.
> + */
> + writel(EP_TRADDR_TRADDR(priv_ep->trb_pool_dma +
> + priv_req->start_trb * TRB_SIZE),
> + &priv_dev->regs->ep_traddr);
> + }
> +
> cdns3_rearm_transfer(priv_ep, 1);
> }
>
>
> base-commit: 46b513250491a7bfc97d98791dbe6a10bcc8129d
> --
> 2.43.0
>
>
--
Best regards,
Peter
^ permalink raw reply [flat|nested] 1651+ messages in thread* RE: [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt
2026-04-27 1:22 ` Peter Chen (CIX)
@ 2026-04-27 9:01 ` Pawel Laszczak
2026-04-27 22:59 ` Peter Chen (CIX)
0 siblings, 1 reply; 1651+ messages in thread
From: Pawel Laszczak @ 2026-04-27 9:01 UTC (permalink / raw)
To: Peter Chen (CIX), Yongchao Wu
Cc: rogerq@kernel.org, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, stable@vger.kernel.org
>
>
>On 26-04-24 00:06:01, Yongchao Wu wrote:
>> According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
>> causes the DMA engine to reposition its internal pointer to the next
>> Transfer Descriptor (TD) if it was already processing one.
>>
>> This issue is consistently observed during the ADB identification
>> process on macOS hosts, where the host issues a Clear_Halt. Although
>> commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before
>> reset
>> endpoint") attempted to avoid DMA advance by toggling the cycle bit,
>> trace logs show that on certain hosts like macOS, the DMA pointer
>> (EP_TRADDR) still shifts after EPRST:
>>
>> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <-- Should be f9c04000
>> cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
>>
>> As shown above, the DMA pointer jumped to index 3 (offset 0x30),
>> causing the controller to skip the initial TRBs of the request. This
>> leads to data misalignment and ADB protocol hangs on macOS.
>
>Pawel, Is it a hardware issue? The cycle bit has already been toggled before the
>endpoint has been reset, why the DMA pointer still advances?
Peter, do you remember what the TD looked like in your case?
Maybe that’s where the difference lies. The patch description states that
it jumps from 0xf9c04000 to 0xf9c04030, which would suggest that the TD
consists of three TRBs. The driver only changes the cycle bit on the
first one.
I’m not entirely sure how the controller assembles this TD.
I need some time to try to explain the controller's behavior in this case.
Yongchao, could you confirm if the TD consists of three TRBs?
Pawel
>
>Peter
>
>>
>> Fix this by manually restoring the EP_TRADDR register to the starting
>> physical address of the current request after the EPRST operation is
>> complete.
>>
>> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
>> Cc: stable@vger.kernel.org
>> Cc: Peter Chen <peter.chen@kernel.org>
>> Signed-off-by: Yongchao Wu <yongchao.wu@autochips.com>
>> ---
>> drivers/usb/cdns3/cdns3-gadget.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/cdns3/cdns3-gadget.c
>> b/drivers/usb/cdns3/cdns3-gadget.c
>> index d59a60a16ec77..96653c7d18f20 100644
>> --- a/drivers/usb/cdns3/cdns3-gadget.c
>> +++ b/drivers/usb/cdns3/cdns3-gadget.c
>> @@ -2814,9 +2814,19 @@ int __cdns3_gadget_ep_clear_halt(struct
>cdns3_endpoint *priv_ep)
>> priv_ep->flags &= ~(EP_STALLED | EP_STALL_PENDING);
>>
>> if (request) {
>> - if (trb)
>> + if (trb) {
>> *trb = trb_tmp;
>>
>> + /*
>> + * Per datasheet, EPRST causes DMA to reposition to the
>next TD.
>> + * Manually reset EP_TRADDR to the current TRB to
>prevent
>> + * the hardware from skipping the interrupted request.
>> + */
>> + writel(EP_TRADDR_TRADDR(priv_ep->trb_pool_dma +
>> + priv_req->start_trb *
>TRB_SIZE),
>> + &priv_dev->regs->ep_traddr);
>> + }
>> +
>> cdns3_rearm_transfer(priv_ep, 1);
>> }
>>
>>
>> base-commit: 46b513250491a7bfc97d98791dbe6a10bcc8129d
>> --
>> 2.43.0
>>
>>
>
>--
>
>Best regards,
>Peter
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re: [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt
2026-04-27 9:01 ` Pawel Laszczak
@ 2026-04-27 22:59 ` Peter Chen (CIX)
2026-04-27 23:59 ` Yongchao Wu
0 siblings, 1 reply; 1651+ messages in thread
From: Peter Chen (CIX) @ 2026-04-27 22:59 UTC (permalink / raw)
To: Pawel Laszczak
Cc: Yongchao Wu, rogerq@kernel.org, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, stable@vger.kernel.org
On 26-04-27 09:01:47, Pawel Laszczak wrote:
> >
> >
> >On 26-04-24 00:06:01, Yongchao Wu wrote:
> >> According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
> >> causes the DMA engine to reposition its internal pointer to the next
> >> Transfer Descriptor (TD) if it was already processing one.
> >>
> >> This issue is consistently observed during the ADB identification
> >> process on macOS hosts, where the host issues a Clear_Halt. Although
> >> commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before
> >> reset
> >> endpoint") attempted to avoid DMA advance by toggling the cycle bit,
> >> trace logs show that on certain hosts like macOS, the DMA pointer
> >> (EP_TRADDR) still shifts after EPRST:
> >>
> >> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
> >> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <-- Should be f9c04000
> >> cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
> >>
> >> As shown above, the DMA pointer jumped to index 3 (offset 0x30),
> >> causing the controller to skip the initial TRBs of the request. This
> >> leads to data misalignment and ADB protocol hangs on macOS.
> >
> >Pawel, Is it a hardware issue? The cycle bit has already been toggled before the
> >endpoint has been reset, why the DMA pointer still advances?
>
> Peter, do you remember what the TD looked like in your case?
Sorry, it was almost 6 years ago, I could not remember it well.
> Maybe that’s where the difference lies. The patch description states that
> it jumps from 0xf9c04000 to 0xf9c04030, which would suggest that the TD
> consists of three TRBs. The driver only changes the cycle bit on the
> first one.
According to Yongchao's statement:
According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
causes the DMA engine to reposition its internal pointer to the next
Transfer Descriptor (TD) if it was already processing one.
My point is that the cycle bit had already been toggled by SW before the
EPRST command, and the EPRST command would reset the DMA pointer to the
1st TRB within the TD. Since EPRST executes after the cycle bit toggle,
why could the hardware go on handling TRBs even though the first TRB's
cycle bit belongs to software?
Peter
> I’m not entirely sure how the controller assembles this TD.
> I need some time to try explain the controller's behavior in this case.
>
> Yongchao, could you confirm if the TD consists of three TRBs?
>
> Pawel
--
Best regards,
Peter
^ permalink raw reply [flat|nested] 1651+ messages in thread* Re: [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt
2026-04-27 22:59 ` Peter Chen (CIX)
@ 2026-04-27 23:59 ` Yongchao Wu
2026-04-28 9:58 ` Pawel Laszczak
0 siblings, 1 reply; 1651+ messages in thread
From: Yongchao Wu @ 2026-04-27 23:59 UTC (permalink / raw)
To: Peter Chen (CIX), Pawel Laszczak
Cc: rogerq@kernel.org, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, stable@vger.kernel.org
On 26-04-27 09:01:47, Pawel Laszczak wrote:
>>
>>
>> On 26-04-24 00:06:01, Yongchao Wu wrote:
>>> According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
>>> causes the DMA engine to reposition its internal pointer to the next
>>> Transfer Descriptor (TD) if it was already processing one.
>>>
>>> This issue is consistently observed during the ADB identification
>>> process on macOS hosts, where the host issues a Clear_Halt. Although
>>> commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before
>>> reset
>>> endpoint") attempted to avoid DMA advance by toggling the cycle bit,
>>> trace logs show that on certain hosts like macOS, the DMA pointer
>>> (EP_TRADDR) still shifts after EPRST:
>>>
>>> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <- Should be f9c04000
>>> cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
>>>
>>> As shown above, the DMA pointer jumped to index 3 (offset 0x30),
>>> causing the controller to skip the initial TRBs of the request. This
>>> leads to data misalignment and ADB protocol hangs on macOS.
>>
>> Pawel, Is it a hardware issue? The cycle bit has already been
toggled before the
>> endpoint has been reset, why the DMA pointer still advances?
>
> Yongchao, could you confirm if the TD consists of three TRBs?
In our case, each TD consists of 4 TRBs.
The DMA pointer appears to advance within the same TD after EPRST.
Each 16KB request is split into 4 TRBs (4KB each):
- TRB0 - TRB2: CHAIN
- TRB3: IOC (last TRB of the TD)
After enqueue, the initial EP_TRADDR points to the first TRB:
EP_TRADDR = 0xf9c04000 (TRB0)
After Clear_Halt (EPRST), it becomes:
EP_TRADDR = 0xf9c04030 (TRB4, the first TRB of the next TD)
Since each TRB is 12 bytes, the offset 0x30 corresponds to 4 TRBs.
This indicates that after EPRST, the DMA pointer skipped the entire
current Request and jumped directly to the start of the next Request
at 0xf9c04030
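The offset arithmetic above can be checked directly. This is a throwaway sketch; `trb_index` is an illustrative helper, and only the 12-byte TRB size comes from the driver:

```c
#include <assert.h>
#include <stdint.h>

#define TRB_SIZE 12u  /* a cdns3 TRB is three 32-bit words */

/* How many TRBs past the ring base a given EP_TRADDR value points. */
static unsigned int trb_index(uint32_t traddr, uint32_t pool_base)
{
	return (traddr - pool_base) / TRB_SIZE;
}
```

With the trace values, trb_index(0xf9c04030, 0xf9c04000) is 4, i.e. one full 4-TRB TD past the original position.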
Below is the relevant trace (trimmed):
// enqueue request (16KB -> 4 TRBs)
cdns3_prepare_trb: dma buf: 0xf7abc000, size: 4096, ctrl: 0x00200415
cdns3_prepare_trb: dma buf: 0xf7abd000, size: 4096, ctrl: 0x00000415
cdns3_prepare_trb: dma buf: 0xf7abe000, size: 4096, ctrl: 0x00000415
cdns3_prepare_trb: dma buf: 0xf7abf000, size: 4096, ctrl: 0x00000425
cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04000
// Clear_Halt
cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030
This behavior is consistently observed on macOS hosts during
ADB enumeration.
So even though the cycle bit is toggled on the first TRB, the
controller still appears to advance the DMA pointer within the same TD
after EPRST.
Please let me know if you need more detailed logs or a full TRB
ring dump. I'd also appreciate any insight into how the controller
determines the next position after EPRST in this case.
Best regards,
Yongchao
^ permalink raw reply [flat|nested] 1651+ messages in thread* RE: [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt
2026-04-27 23:59 ` Yongchao Wu
@ 2026-04-28 9:58 ` Pawel Laszczak
2026-04-28 14:48 ` Yongchao Wu
0 siblings, 1 reply; 1651+ messages in thread
From: Pawel Laszczak @ 2026-04-28 9:58 UTC (permalink / raw)
To: Yongchao Wu, Peter Chen (CIX)
Cc: rogerq@kernel.org, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, stable@vger.kernel.org
>
>On 26-04-27 09:01:47, Pawel Laszczak wrote:
> >>
> >>
> >> On 26-04-24 00:06:01, Yongchao Wu wrote:
> >>> According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
> >>> causes the DMA engine to reposition its internal pointer to the next
> >>> Transfer Descriptor (TD) if it was already processing one.
> >>>
> >>> This issue is consistently observed during the ADB identification
> >>> process on macOS hosts, where the host issues a Clear_Halt. Although
> >>> commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before
> >>> reset endpoint") attempted to avoid DMA advance by toggling the cycle bit,
> >>> trace logs show that on certain hosts like macOS, the DMA pointer
> >>> (EP_TRADDR) still shifts after EPRST:
> >>>
> >>> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
> >>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <- Should be f9c04000
> >>> cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
> >>>
> >>> As shown above, the DMA pointer jumped to index 3 (offset 0x30),
> >>> causing the controller to skip the initial TRBs of the request. This
> >>> leads to data misalignment and ADB protocol hangs on macOS.
> >>
> >> Pawel, Is it a hardware issue? The cycle bit has already been toggled before
> >> the endpoint has been reset, why the DMA pointer still advances?
> >
> > Yongchao, could you confirm if the TD consists of three TRBs?
>In our case, each TD consists of 4 TRBs.
>The DMA pointer appears to advance within the same TD after EPRST.
>
>Each 16KB request is split into 4 TRBs (4KB each):
>- TRB0 - TRB2: CHAIN
>- TRB3: IOC (last TRB of the TD)
>
>After enqueue, the initial EP_TRADDR points to the first TRB:
> EP_TRADDR = 0xf9c04000 (TRB0)
>
>After Clear_Halt (EPRST), it becomes:
> EP_TRADDR = 0xf9c04030 (TRB3)
>
>Since each TRB is 12 bytes, the offset 0x30 corresponds to 4 TRBs.
>This indicates that after EPRST, the DMA pointer skipped the entire current
>Request and jumped directly to the start of the next Request at 0xf9c04030
>
>Below is the relevant trace (trimmed):
>
>// enqueue request (16KB -> 4 TRBs)
>cdns3_prepare_trb: dma buf: 0xf7abc000, size: 4096, ctrl: 0x00200415
>cdns3_prepare_trb: dma buf: 0xf7abd000, size: 4096, ctrl: 0x00000415
>cdns3_prepare_trb: dma buf: 0xf7abe000, size: 4096, ctrl: 0x00000415
>cdns3_prepare_trb: dma buf: 0xf7abf000, size: 4096, ctrl: 0x00000425
>
>cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04000
>
>// Clear_Halt
>cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030
>
>This behavior is consistently observed on macOS hosts during ADB
>enumeration.
>
>So even though the cycle bit is toggled on the first TRB, the controller still
>appears to advance the DMA pointer within the same TD after EPRST.
>
>Please let me know if you need more detailed logs or a full TRB ring dump. I'd
>also appreciate any insight into how the controller determines the next
>position after EPRST in this case.
>
Can you confirm whether the host had already sent some data for this TD
prior to the endpoint reset operation?
Pawel
>Best regards,
>Yongchao
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re: [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt
2026-04-28 9:58 ` Pawel Laszczak
@ 2026-04-28 14:48 ` Yongchao Wu
2026-05-04 9:15 ` Pawel Laszczak
0 siblings, 1 reply; 1651+ messages in thread
From: Yongchao Wu @ 2026-04-28 14:48 UTC (permalink / raw)
To: Pawel Laszczak, Peter Chen (CIX)
Cc: rogerq@kernel.org, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, stable@vger.kernel.org
On 4/28/2026 5:58 PM, Pawel Laszczak wrote:
>
>>
>> On 26-04-27 09:01:47, Pawel Laszczak wrote:
>>>>
>>>>
>>>> On 26-04-24 00:06:01, Yongchao Wu wrote:
>>>>> According to the cdns3 datasheet, the EPRST (Endpoint Reset) command
>>>>> causes the DMA engine to reposition its internal pointer to the next
>>>>> Transfer Descriptor (TD) if it was already processing one.
>>>>>
>>>>> This issue is consistently observed during the ADB identification
>>>>> process on macOS hosts, where the host issues a Clear_Halt. Although
>>>>> commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before
>>>>> reset endpoint") attempted to avoid DMA advance by toggling the cycle bit,
>>>>> trace logs show that on certain hosts like macOS, the DMA pointer
>>>>> (EP_TRADDR) still shifts after EPRST:
>>>>>
>>>>> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>>>>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <- Should be f9c04000
>>>>> cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
>>>>>
>>>>> As shown above, the DMA pointer jumped to index 3 (offset 0x30),
>>>>> causing the controller to skip the initial TRBs of the request. This
>>>>> leads to data misalignment and ADB protocol hangs on macOS.
>>>>
>>>> Pawel, Is it a hardware issue? The cycle bit has already been toggled before the
>>>> endpoint has been reset, why the DMA pointer still advances?
>>>
>>> Yongchao, could you confirm if the TD consists of three TRBs?
>> In our case, each TD consists of 4 TRBs.
>> The DMA pointer appears to advance within the same TD after EPRST.
>>
>> Each 16KB request is split into 4 TRBs (4KB each):
>> - TRB0 - TRB2: CHAIN
>> - TRB3: IOC (last TRB of the TD)
>>
>> After enqueue, the initial EP_TRADDR points to the first TRB:
>> EP_TRADDR = 0xf9c04000 (TRB0)
>>
>> After Clear_Halt (EPRST), it becomes:
>> EP_TRADDR = 0xf9c04030 (TRB3)
>>
>> Since each TRB is 12 bytes, the offset 0x30 corresponds to 4 TRBs.
>> This indicates that after EPRST, the DMA pointer skipped the entire current
>> Request and jumped directly to the start of the next Request at 0xf9c04030
>>
>> Below is the relevant trace (trimmed):
>>
>> // enqueue request (16KB -> 4 TRBs)
>> cdns3_prepare_trb: dma buf: 0xf7abc000, size: 4096, ctrl: 0x00200415
>> cdns3_prepare_trb: dma buf: 0xf7abd000, size: 4096, ctrl: 0x00000415
>> cdns3_prepare_trb: dma buf: 0xf7abe000, size: 4096, ctrl: 0x00000415
>> cdns3_prepare_trb: dma buf: 0xf7abf000, size: 4096, ctrl: 0x00000425
>>
>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04000
>>
>> // Clear_Halt
>> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030
>>
>
> Can you confirm whether the host had already sent some data for this TD
> prior to the endpoint reset operation?
>
I confirm that the host sent no data prior to or during the EPRST operation.
TotalPhase Trace:
0,HS,2700,0:06.078.671,2.057.666 ms,0 B,,13,00,Set Configuration,Configuration=1
0,HS,2710,0:06.080.811,1.125.266 ms,,,,,[10 SOF],[Frames: 1243.7 - 1245.0]
0,HS,2711,0:06.080.955,992.550 us,2 B,,13,00,Get String Descriptor,Index=5 Length=2
0,HS,2733,0:06.082.061,125.083 us,,,,,[2 SOF],[Frames: 1245.1 - 1245.2]
0,HS,2734,0:06.082.119,104.566 us,28 B,,13,00,Get String Descriptor,Index=5 Length=28
0,HS,2756,0:06.082.311,355.935.283 ms,,,,,[2848 SOF],[Frames: 1245.3 - 1601.2]
0,HS,2757,0:06.438.196,105.033 us,4 B,,13,00,Get String Descriptor,Index=0 Length=256
0,HS,2778,0:06.438.371,875.233 us,,,,,[8 SOF],[Frames: 1601.3 - 1602.2]
//1. Host issues Clear_Halt
0,HS,2779,0:06.439.278,51.433 us,0 B,,13,00,Clear Endpoint Feature,Halt Endpoint 01 OUT
0,HS,2789,0:06.439.371,500.150 us,,,,,[5 SOF],[Frames: 1602.3 - 1602.7]
0,HS,2790,0:06.439.874,51.416 us,0 B,,13,00,Clear Endpoint Feature,Halt Endpoint 01 IN
0,HS,2800,0:06.439.996,250.116 us,,,,,[3 SOF],[Frames: 1603.0 - 1603.2]
//2. First OUT transaction happens
0,HS,2801,0:06.440.350,1.066 us,24 B,,13,01,OUT txn,43 4E 58 4E 01 00 00 01 00 00 10 00..
0,HS,2805,0:06.440.371,66 ns,,,,,[1 SOF],[Frame: 1603.3]
0,HS,2806,0:06.440.453,4.283 us,218 B,,13,01,OUT txn,68 6F 73 74 3A 3A 66 65 61 74 75 72..
> Pawel
>
>> Best regards,
>> Yongchao
^ permalink raw reply [flat|nested] 1651+ messages in thread
* RE:
2026-04-28 14:48 ` Yongchao Wu
@ 2026-05-04 9:15 ` Pawel Laszczak
0 siblings, 0 replies; 1651+ messages in thread
From: Pawel Laszczak @ 2026-05-04 9:15 UTC (permalink / raw)
To: Yongchao Wu, Peter Chen (CIX)
Cc: rogerq@kernel.org, gregkh@linuxfoundation.org,
linux-usb@vger.kernel.org, stable@vger.kernel.org
>>
>>>
>>> On 26-04-27 09:01:47, Pawel Laszczak wrote:
>>>>>
>>>>>
>>>>> On 26-04-24 00:06:01, Yongchao Wu wrote:
>>>>>> According to the cdns3 datasheet, the EPRST (Endpoint Reset)
>>>>>> command causes the DMA engine to reposition its internal pointer
>>>>>> to the next Transfer Descriptor (TD) if it was already processing one.
>>>>>>
>>>>>> This issue is consistently observed during the ADB identification
>>>>>> process on macOS hosts, where the host issues a Clear_Halt. Although
>>>>>> commit 4bf2dd65135a ("usb: cdns3: gadget: toggle cycle bit before
>>>>>> reset endpoint") attempted to avoid DMA advance by toggling the
>>>>>> cycle bit, trace logs show that on certain hosts like macOS, the
>>>>>> DMA pointer
>>>>>> (EP_TRADDR) still shifts after EPRST:
>>>>>>
>>>>>> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>>>>>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030 <- Should be f9c04000
>>>>>> cdns3_gadget_giveback: ep1out: req: ... length: 16384/16384
>>>>>>
>>>>>> As shown above, the DMA pointer jumped to index 3 (offset 0x30),
>>>>>> causing the controller to skip the initial TRBs of the request.
>>>>>> This leads to data misalignment and ADB protocol hangs on macOS.
>>>>>
>>>>> Pawel, Is it a hardware issue? The cycle bit has already been
>>>>> toggled before the endpoint has been reset, why the DMA pointer still
>>>>> advances?
>>>>
>>>> Yongchao, could you confirm if the TD consists of three TRBs?
>>> In our case, each TD consists of 4 TRBs.
>>> The DMA pointer appears to advance within the same TD after EPRST.
>>>
>>> Each 16KB request is split into 4 TRBs (4KB each):
>>> - TRB0 - TRB2: CHAIN
>>> - TRB3: IOC (last TRB of the TD)
>>>
>>> After enqueue, the initial EP_TRADDR points to the first TRB:
>>> EP_TRADDR = 0xf9c04000 (TRB0)
>>>
>>> After Clear_Halt (EPRST), it becomes:
>>> EP_TRADDR = 0xf9c04030 (TRB3)
>>>
>>> Since each TRB is 12 bytes, the offset 0x30 corresponds to 4 TRBs.
>>> This indicates that after EPRST, the DMA pointer skipped the entire
>>> current Request and jumped directly to the start of the next Request
>>> at 0xf9c04030
>>>
>>> Below is the relevant trace (trimmed):
>>>
>>> // enqueue request (16KB -> 4 TRBs)
>>> cdns3_prepare_trb: dma buf: 0xf7abc000, size: 4096, ctrl: 0x00200415
>>> cdns3_prepare_trb: dma buf: 0xf7abd000, size: 4096, ctrl: 0x00000415
>>> cdns3_prepare_trb: dma buf: 0xf7abe000, size: 4096, ctrl: 0x00000415
>>> cdns3_prepare_trb: dma buf: 0xf7abf000, size: 4096, ctrl: 0x00000425
>>>
>>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04000
>>>
>>> // Clear_Halt
>>> cdns3_ctrl_req: Clear Endpoint Feature(Halt ep1out)
>>> cdns3_doorbell_epx: ep1out, ep_trbaddr f9c04030
>>>
>>
>> Can you confirm whether the host had already sent some data for this
>> TD prior to the endpoint reset operation?
>>
>
>I confirm that the host sent no data prior to or during the EPRST operation.
According to the specification, the controller may fetch TRB descriptors after
the endpoint has been initialized.
In complex Transfer Descriptors (TDs) consisting of several TRBs with the CH=1
bit set, the controller may fetch additional TRBs because it treats them as a
single logical entity.
I have not been able to determine exactly how many TRBs can be prefetched
in such a situation.
According to the description of the EPRST bit:
After an endpoint reset, software is responsible for re-setting the Endpoint
TRADDR.
This fix looks correct to me.
Can you confirm which controller version you have in the usb_cap6 register?
Pawel
>
>TotalPhase Trace:
>0,HS,2700,0:06.078.671,2.057.666 ms,0 B,,13,00,Set
>Configuration,Configuration=1
>0,HS,2710,0:06.080.811,1.125.266 ms,,,,,[10 SOF],[Frames: 1243.7 - 1245.0]
>0,HS,2711,0:06.080.955,992.550 us,2 B,,13,00,Get String Descriptor,Index=5
>Length=2
>0,HS,2733,0:06.082.061,125.083 us,,,,,[2 SOF],[Frames: 1245.1 - 1245.2]
>0,HS,2734,0:06.082.119,104.566 us,28 B,,13,00,Get String Descriptor,Index=5
>Length=28
>0,HS,2756,0:06.082.311,355.935.283 ms,,,,,[2848 SOF],[Frames: 1245.3 -
>1601.2]
>0,HS,2757,0:06.438.196,105.033 us,4 B,,13,00,Get String Descriptor,Index=0
>Length=256
>0,HS,2778,0:06.438.371,875.233 us,,,,,[8 SOF],[Frames: 1601.3 - 1602.2]
>//1. Host issues Clear_Halt
>0,HS,2779,0:06.439.278,51.433 us,0 B,,13,00,Clear Endpoint Feature,Halt
>Endpoint 01 OUT
>0,HS,2789,0:06.439.371,500.150 us,,,,,[5 SOF],[Frames: 1602.3 - 1602.7]
>0,HS,2790,0:06.439.874,51.416 us,0 B,,13,00,Clear Endpoint Feature,Halt
>Endpoint 01 IN
>0,HS,2800,0:06.439.996,250.116 us,,,,,[3 SOF],[Frames: 1603.0 - 1603.2]
>//2. First OUT transaction happens
>0,HS,2801,0:06.440.350,1.066 us,24 B,,13,01,OUT txn,43 4E 58 4E 01 00 00 01
>00 00 10 00..
>0,HS,2805,0:06.440.371,66 ns,,,,,[1 SOF],[Frame: 1603.3]
>0,HS,2806,0:06.440.453,4.283 us,218 B,,13,01,OUT txn,68 6F 73 74 3A 3A 66 65
>61 74 75 72..
>
>> Pawel
>>
>>> Best regards,
>>> Yongchao
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2026-04-28 18:24 Fabio M. De Francesco
2026-05-01 22:01 ` Dave Jiang
0 siblings, 1 reply; 1651+ messages in thread
From: Fabio M. De Francesco @ 2026-04-28 18:24 UTC (permalink / raw)
To: linux-cxl
Cc: Davidlohr Bueso, Jonathan Cameron, Dave Jiang, Alison Schofield,
Vishal Verma, Ira Weiny, Dan Williams, Bjorn Helgaas,
linux-kernel, linux-pci, Fabio M. De Francesco
Subject: [PATCH 0/2] PCI/CXL: Recover CXL Downstream Ports from PM Init failure
CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which a
Secondary Bus Reset, a Link Down, or Downstream Port Containment on a
CXL Downstream Port prevents Port PM Init from completing when ACS
Source Validation is enabled on the Downstream Port. The spec states
that another SBR alone does not recover the port and describes a
software recovery sequence.
Patch 1 extends cxl_reset_bus_function(), the helper backing the cxl_bus
PCI/CXL reset method exposed to userspace via sysfs. It saves, clears,
and restores ACS Source Validation and Bus Master Enable on the CXL
Downstream Port around the SBR it issues. This keeps the userspace
cxl_bus reset path from leaving the port unable to complete PM Init.
Patch 2 adds a recovery pass during CXL enumeration. For each CXL
Downstream Port in a memdev's ancestry, the CXL core checks whether PM
Init has completed. If it has not, regardless of what caused the
failure, it invokes cxl_reset_bus_function() on the child below the port
in the hope of restoring the port to a usable state. CXL enumeration
re-runs after events that tear down and re-probe the memdev, including
DPC, AER, and Link Down, so those paths reach this recovery.
This small series is developed from an old RFC v3:
https://lore.kernel.org/linux-cxl/20260330193347.25072-1-fabio.m.de.francesco@linux.intel.com/
Fabio M. De Francesco (2):
PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled
cxl/core: Recover from PM Init failure via cxl_reset_bus_function()
drivers/cxl/core/pci.c | 30 ++++++++++++++++++++
drivers/cxl/core/port.c | 22 +++++++++++++++
drivers/cxl/cxlpci.h | 3 ++
drivers/pci/pci.c | 52 ++++++++++++++++++++++++++++++++++-
include/linux/pci.h | 1 +
include/uapi/linux/pci_regs.h | 2 ++
6 files changed, 109 insertions(+), 1 deletion(-)
--
2.53.0
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2026-04-28 18:24 Fabio M. De Francesco
@ 2026-05-01 22:01 ` Dave Jiang
0 siblings, 0 replies; 1651+ messages in thread
From: Dave Jiang @ 2026-05-01 22:01 UTC (permalink / raw)
To: Fabio M. De Francesco, linux-cxl
Cc: Davidlohr Bueso, Jonathan Cameron, Alison Schofield, Vishal Verma,
Ira Weiny, Dan Williams, Bjorn Helgaas, linux-kernel, linux-pci
On 4/28/26 11:24 AM, Fabio M. De Francesco wrote:
> Subject: [PATCH 0/2] PCI/CXL: Recover CXL Downstream Ports from PM Init failure
>
> CXL r4.0 sec 8.1.5.1 Implementation Note describes a scenario in which a
> Secondary Bus Reset, a Link Down, or Downstream Port Containment on a
I'm not sure if this series covers a Link Down event (i.e. hotplug). As I recall, cxl_reset_bus_function() only happens via sysfs trigger.
DJ
> CXL Downstream Port prevents Port PM Init from completing when ACS
> Source Validation is enabled on the Downstream Port. The spec states
> that another SBR alone does not recover the port and describes a
> software recovery sequence.
>
> Patch 1 extends cxl_reset_bus_function(), the helper backing the cxl_bus
> PCI/CXL reset method exposed to userspace via sysfs. It saves, clears,
> and restores ACS Source Validation and Bus Master Enable on the CXL
> Downstream Port around the SBR it issues. This keeps the userspace
> cxl_bus reset path from leaving the port unable to complete PM Init.
>
> Patch 2 adds a recovery pass during CXL enumeration. For each CXL
> Downstream Port in a memdev's ancestry, the CXL core checks whether PM
> Init has completed. If it has not, regardless of what caused the
> failure, it invokes cxl_reset_bus_function() on the child below the port
> in the hope of restoring the port to a usable state. CXL enumeration
> re-runs after events that tear down and re-probe the memdev, including
> DPC, AER, and Link Down, so those paths reach this recovery.
>
> This small series is developed from an old RFC v3:
> https://lore.kernel.org/linux-cxl/20260330193347.25072-1-fabio.m.de.francesco@linux.intel.com/
>
> Fabio M. De Francesco (2):
> PCI/CXL: Allow PM Init to complete on cxl_bus reset if ACS SV enabled
> cxl/core: Recover from PM Init failure via cxl_reset_bus_function()
>
> drivers/cxl/core/pci.c | 30 ++++++++++++++++++++
> drivers/cxl/core/port.c | 22 +++++++++++++++
> drivers/cxl/cxlpci.h | 3 ++
> drivers/pci/pci.c | 52 ++++++++++++++++++++++++++++++++++-
> include/linux/pci.h | 1 +
> include/uapi/linux/pci_regs.h | 2 ++
> 6 files changed, 109 insertions(+), 1 deletion(-)
>
^ permalink raw reply [flat|nested] 1651+ messages in thread
* (no subject)
@ 2026-05-09 18:01 Andrea Righi
2026-05-09 18:07 ` Andrea Righi
0 siblings, 1 reply; 1651+ messages in thread
From: Andrea Righi @ 2026-05-09 18:01 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Christian Loehle, Phil Auld,
Koba Ko, Felix Abecassis, Balbir Singh, Joel Fernandes,
Shrikanth Hegde, linux-kernel
This series attempts to improve SD_ASYM_CPUCAPACITY scheduling by introducing
SMT awareness.
= Problem =
Nominal per-logical-CPU capacity can overstate usable compute when an SMT
sibling is busy, because the physical core doesn't deliver its full nominal
capacity. As a result, several asym-cpu-capacity paths may pick high-capacity
idle CPUs that are not actually good destinations.
= Solution =
This patch set aligns those paths with a simple rule already used elsewhere:
when SMT is active, prefer fully idle cores and avoid treating partially idle
SMT siblings as full-capacity targets where that would mislead load balance.
Patch set summary:
- Attach sched_domain_shared to sd_asym_cpucapacity in SD_ASYM_CPUCAPACITY to
use has_idle_cores hint consistently in the wakeup idle scan and rename
sd_llc_shared -> sd_balance_shared.
- Prefer fully-idle SMT cores in asym-capacity idle selection: in the wakeup
fast path, extend select_idle_capacity() / asym_fits_cpu() so idle
selection can prefer CPUs on fully idle cores.
- Reject misfit pulls onto busy SMT siblings on SD_ASYM_CPUCAPACITY.
- Add SIS_UTIL support to select_idle_capacity(): add to select_idle_capacity()
the same SIS_UTIL-controlled idle-scan mechanism, already used by
select_idle_cpu().
This patch set has been tested on the new NVIDIA Vera Rubin platform, where SMT
is enabled and the firmware exposes small frequency variations (+/-~5%) as
differences in CPU capacity, resulting in SD_ASYM_CPUCAPACITY being set.
Without these patches, performance can drop by up to ~2x with CPU-intensive
workloads, because the SD_ASYM_CPUCAPACITY idle selection policy does not
account for busy SMT siblings.
Alternative approaches have been evaluated, such as equalizing CPU capacities,
either by exposing uniform values via firmware or normalizing them in the kernel
by grouping CPUs within a small capacity window (+/-5%).
However, the SMT-aware SD_ASYM_CPUCAPACITY approach has shown better results so
far. Improving this policy also seems worthwhile in general, as future platforms
may enable SMT with asymmetric CPU topologies.
Performance results on Vera Rubin with SD_ASYM_CPUCAPACITY (mainline) vs
SD_ASYM_CPUCAPACITY + SMT
- NVBLAS benchblas (one task / SMT core):
+---------------------------------+--------+
| Configuration | gflops |
+---------------------------------+--------+
| ASYM (mainline) + SIS_UTIL | 5478 |
| ASYM (mainline) + NO_SIS_UTIL | 5491 |
| | |
| NO ASYM + SIS_UTIL | 8912 |
| NO ASYM + NO_SIS_UTIL | 8978 |
| | |
| ASYM + SMT + SIS_UTIL | 9259 |
| ASYM + SMT + NO_SIS_UTIL | 9291 |
+---------------------------------+--------+
- DCPerf MediaWiki (all CPUs):
+---------------------------------+--------+--------+--------+--------+
| Configuration | rps | p50 | p95 | p99 |
+---------------------------------+--------+--------+--------+--------+
| ASYM (mainline) + SIS_UTIL | 7994 | 0.052 | 0.223 | 0.246 |
| ASYM (mainline) + NO_SIS_UTIL | 7993 | 0.052 | 0.221 | 0.245 |
| | | | | |
| NO ASYM + SIS_UTIL | 8113 | 0.067 | 0.184 | 0.225 |
| NO ASYM + NO_SIS_UTIL | 8093 | 0.068 | 0.184 | 0.223 |
| | | | | |
| ASYM + SMT + SIS_UTIL | 8129 | 0.076 | 0.149 | 0.188 |
| ASYM + SMT + NO_SIS_UTIL | 8138 | 0.076 | 0.148 | 0.186 |
+---------------------------------+--------+--------+--------+--------+
In the MediaWiki case SMT awareness is less impactful, because for the majority
of the run all CPUs are used, but it still seems to provide some benefit in
reducing tail latency.
Tests have also been conducted on NVIDIA Grace (which does not support SMT) to
ensure that SIS_UTIL support in select_idle_capacity() does not introduce
regressions and results show slight improvements under the same workloads.
See also:
- https://lore.kernel.org/lkml/20260324005509.1134981-1-arighi@nvidia.com
- https://lore.kernel.org/lkml/20260318092214.130908-1-arighi@nvidia.com
Changes in v6:
- Simplify the SIS_UTIL early-exit in select_idle_capacity(): drop the
best_fits == ASYM_IDLE_CORE_UCLAMP_MISFIT guard and exit once the scan
budget is exhausted, matching select_idle_cpu() (Dietmar Eggemann)
- Move the sd_llc_shared -> sd_balance_shared rename in nohz_balancer_kick()
from the NOHZ RCU cleanup patch into the sd_asym_cpucapacity attach patch,
where it logically belongs (Dietmar Eggemann)
- Rename prefers_idle_core to has_idle_core in select_idle_capacity()
(Dietmar Eggemann)
- Use ASYM_IDLE_CORE_COMPLETE_MISFIT instead of ASYM_IDLE_CORE_BIAS in the
select_idle_capacity() comments (Vincent Guittot)
- Expand the asym_fits_state docstring with a per-rank table and an
explanation of ASYM_IDLE_CORE_BIAS as an offset rather than a state
- Small code comment adjustments based on previous reviews
- Link to v5: https://lore.kernel.org/all/20260428144352.3575863-1-arighi@nvidia.com
Changes in v5:
- Drop redundant RCU protection in nohz_balancer_kick() (Prateek Nayak)
- Do not remove CPU capacity asymmetry / SMT warning (Prateek Nayak)
- Link to v4: https://lore.kernel.org/all/20260428051720.3180182-1-arighi@nvidia.com
Changes in v4:
- Rename sd_llc_shared -> sd_balance_shared
- Add preliminary cleanup patch to use guard(rcu)() for sched_domain RCU
(Prateek Nayak)
- Apply SIS_UTIL scan cap only with !prefers_idle_core, matching
select_idle_cpu() / has_idle_core logic (Vincent Guittot)
- Cache env->dst_cpu idle state to reduce is_core_idle() calls (Prateek Nayak)
- Remove warning about CPU capacity asymmetry not supporting SMT
- Link to v3: https://lore.kernel.org/all/20260423074135.380390-1-arighi@nvidia.com
Changes in v3:
- Add SIS_UTIL support to select_idle_capacity() (K Prateek Nayak)
- Attach sched_domain_shared to sd_asym_cpucapacity (K Prateek Nayak)
- Add enum for the different fit state (K Prateek Nayak)
- Update has_idle_cores hint (Vincent Guittot)
- Link to v2: https://lore.kernel.org/all/20260403053654.1559142-1-arighi@nvidia.com
Changes in v2:
- Rework SMT awareness logic in select_idle_capacity() (K Prateek Nayak)
- Drop EAS and find_new_ilb() changes for now
- Link to v1: https://lore.kernel.org/all/20260326151211.1862600-1-arighi@nvidia.com
Git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arighi/linux.git sched-asym-smt-v6
Andrea Righi (3):
sched/fair: Drop redundant RCU read lock in NOHZ kick path
sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection
sched/fair: Reject misfit pulls onto busy SMT siblings on asym-capacity
K Prateek Nayak (2):
sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity
sched/fair: Add SIS_UTIL support to select_idle_capacity()
kernel/sched/fair.c | 206 ++++++++++++++++++++++++++++++++++++++----------
kernel/sched/sched.h | 2 +-
kernel/sched/topology.c | 95 +++++++++++++++++++---
3 files changed, 248 insertions(+), 55 deletions(-)
^ permalink raw reply [flat|nested] 1651+ messages in thread
* Re:
2026-05-09 18:01 Andrea Righi
@ 2026-05-09 18:07 ` Andrea Righi
0 siblings, 0 replies; 1651+ messages in thread
From: Andrea Righi @ 2026-05-09 18:07 UTC (permalink / raw)
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot
Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Valentin Schneider, K Prateek Nayak, Christian Loehle, Phil Auld,
Koba Ko, Felix Abecassis, Balbir Singh, Joel Fernandes,
Shrikanth Hegde, linux-kernel
On Sat, May 09, 2026 at 08:01:20PM +0200, Andrea Righi wrote:
> This series attempts to improve SD_ASYM_CPUCAPACITY scheduling by introducing
> SMT awareness.
Somehow I messed up the subject in the cover letter, I'll re-send.
Ignore this one and sorry for the noise.
-Andrea
^ permalink raw reply [flat|nested] 1651+ messages in thread
end of thread, other threads:[~2026-05-09 18:07 UTC | newest]
Thread overview: 1651+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-09 13:14 [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
2007-08-09 12:41 ` Arnd Bergmann
2007-08-09 14:29 ` Chris Snook
2007-08-09 15:30 ` Arnd Bergmann
2007-08-14 22:31 ` Christoph Lameter
2007-08-14 22:45 ` Chris Snook
2007-08-14 22:51 ` Christoph Lameter
2007-08-14 23:08 ` Satyam Sharma
2007-08-14 23:04 ` Chris Snook
2007-08-14 23:14 ` Christoph Lameter
2007-08-15 6:49 ` Herbert Xu
2007-08-15 6:49 ` Herbert Xu
2007-08-15 8:18 ` Heiko Carstens
2007-08-15 13:53 ` Stefan Richter
2007-08-15 14:35 ` Satyam Sharma
2007-08-15 14:52 ` Herbert Xu
2007-08-15 16:09 ` Stefan Richter
2007-08-15 16:27 ` Paul E. McKenney
2007-08-15 17:13 ` Satyam Sharma
2007-08-15 18:31 ` Segher Boessenkool
2007-08-15 18:57 ` Paul E. McKenney
2007-08-15 19:54 ` Satyam Sharma
2007-08-15 20:17 ` Paul E. McKenney
2007-08-15 20:52 ` Segher Boessenkool
2007-08-15 22:42 ` Paul E. McKenney
2007-08-15 20:47 ` Segher Boessenkool
2007-08-16 0:36 ` Satyam Sharma
2007-08-16 0:32 ` your mail Herbert Xu
2007-08-16 0:58 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Satyam Sharma
2007-08-16 0:51 ` Herbert Xu
2007-08-16 1:18 ` Satyam Sharma
2007-08-16 1:38 ` Segher Boessenkool
2007-08-15 21:05 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
2007-08-15 22:44 ` Paul E. McKenney
2007-08-16 1:23 ` Segher Boessenkool
2007-08-16 2:22 ` Paul E. McKenney
2007-08-15 19:58 ` Stefan Richter
2007-08-16 0:39 ` [PATCH] i386: Fix a couple busy loops in mach_wakecpu.h:wait_for_init_deassert() Satyam Sharma
2007-08-24 11:59 ` Denys Vlasenko
2007-08-24 12:07 ` Andi Kleen
2007-08-24 12:12 ` Kenn Humborg
2007-08-24 12:12 ` Kenn Humborg
2007-08-24 14:25 ` Denys Vlasenko
2007-08-24 17:34 ` Linus Torvalds
2007-08-24 13:30 ` Satyam Sharma
2007-08-24 17:06 ` Christoph Lameter
2007-08-24 20:26 ` Denys Vlasenko
2007-08-24 20:34 ` Chris Snook
2007-08-24 16:19 ` Luck, Tony
2007-08-24 16:19 ` Luck, Tony
2007-08-15 16:13 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Chris Snook
2007-08-15 23:40 ` Herbert Xu
2007-08-15 23:51 ` Paul E. McKenney
2007-08-16 1:30 ` Segher Boessenkool
2007-08-16 2:30 ` Paul E. McKenney
2007-08-16 19:33 ` Segher Boessenkool
2007-08-16 1:26 ` Segher Boessenkool
2007-08-16 2:23 ` Nick Piggin
2007-08-16 19:32 ` Segher Boessenkool
2007-08-17 2:19 ` Nick Piggin
2007-08-17 3:16 ` Paul Mackerras
2007-08-17 3:32 ` Nick Piggin
2007-08-17 3:50 ` Linus Torvalds
2007-08-17 23:59 ` Paul E. McKenney
2007-08-18 0:09 ` Herbert Xu
2007-08-18 1:08 ` Paul E. McKenney
2007-08-18 1:24 ` Christoph Lameter
2007-08-18 1:41 ` Satyam Sharma
2007-08-18 4:13 ` Linus Torvalds
2007-08-18 13:36 ` Satyam Sharma
2007-08-18 21:54 ` Paul E. McKenney
2007-08-18 22:41 ` Linus Torvalds
2007-08-18 23:19 ` Paul E. McKenney
2007-08-24 12:19 ` Denys Vlasenko
2007-08-24 17:19 ` Linus Torvalds
2007-08-18 21:56 ` Paul E. McKenney
2007-08-20 13:31 ` Chris Snook
2007-08-20 22:04 ` Segher Boessenkool
2007-08-20 22:48 ` Russell King
2007-08-20 23:02 ` Segher Boessenkool
2007-08-21 0:05 ` Paul E. McKenney
2007-08-21 7:08 ` Russell King
2007-08-21 7:05 ` Russell King
2007-08-21 9:33 ` Paul Mackerras
2007-08-21 11:37 ` Andi Kleen
2007-08-21 14:48 ` Segher Boessenkool
2007-08-21 16:16 ` Paul E. McKenney
2007-08-21 22:51 ` Valdis.Kletnieks
2007-08-22 0:50 ` Paul E. McKenney
2007-08-22 21:38 ` Adrian Bunk
2007-08-21 14:39 ` Segher Boessenkool
2007-08-17 3:42 ` Linus Torvalds
2007-08-17 5:18 ` Paul E. McKenney
2007-08-17 5:56 ` Satyam Sharma
2007-08-17 7:26 ` Nick Piggin
2007-08-17 8:47 ` Satyam Sharma
2007-08-17 9:15 ` Nick Piggin
2007-08-17 10:12 ` Satyam Sharma
2007-08-17 12:14 ` Nick Piggin
2007-08-17 13:05 ` Satyam Sharma
2007-08-17 9:48 ` Paul Mackerras
2007-08-17 10:23 ` Satyam Sharma
2007-08-17 22:49 ` Segher Boessenkool
2007-08-17 23:51 ` Satyam Sharma
2007-08-17 23:55 ` Segher Boessenkool
2007-08-17 6:42 ` Geert Uytterhoeven
2007-08-17 8:52 ` Andi Kleen
2007-08-17 10:08 ` Satyam Sharma
2007-08-17 22:29 ` Segher Boessenkool
2007-08-17 17:37 ` Segher Boessenkool
2007-08-14 23:26 ` Paul E. McKenney
2007-08-15 10:35 ` Stefan Richter
2007-08-15 12:04 ` Herbert Xu
2007-08-15 12:31 ` Satyam Sharma
2007-08-15 13:08 ` Stefan Richter
2007-08-15 13:11 ` Stefan Richter
2007-08-15 13:47 ` Satyam Sharma
2007-08-15 14:25 ` Paul E. McKenney
2007-08-15 15:33 ` Herbert Xu
2007-08-15 16:08 ` Paul E. McKenney
2007-08-15 17:18 ` Satyam Sharma
2007-08-15 17:33 ` Paul E. McKenney
2007-08-15 18:05 ` Satyam Sharma
2007-08-15 18:19 ` David Howells
2007-08-15 18:45 ` Paul E. McKenney
2007-08-15 23:41 ` Herbert Xu
2007-08-15 23:53 ` Paul E. McKenney
2007-08-16 0:12 ` Herbert Xu
2007-08-16 0:23 ` Paul E. McKenney
2007-08-16 0:30 ` Herbert Xu
2007-08-16 0:49 ` Paul E. McKenney
2007-08-16 0:53 ` Herbert Xu
2007-08-16 1:14 ` Paul E. McKenney
2007-08-15 17:55 ` Satyam Sharma
2007-08-15 19:07 ` Paul E. McKenney
2007-08-15 21:07 ` Segher Boessenkool
2007-08-15 20:58 ` Segher Boessenkool
2007-08-15 18:31 ` Segher Boessenkool
2007-08-15 19:40 ` Satyam Sharma
2007-08-15 20:42 ` Segher Boessenkool
2007-08-16 1:23 ` Satyam Sharma
2007-08-15 23:22 ` Paul Mackerras
2007-08-16 0:26 ` Christoph Lameter
2007-08-16 0:34 ` Paul Mackerras
2007-08-16 0:40 ` Christoph Lameter
2007-08-16 0:39 ` Paul E. McKenney
2007-08-16 0:42 ` Christoph Lameter
2007-08-16 0:53 ` Paul E. McKenney
2007-08-16 0:59 ` Christoph Lameter
2007-08-16 1:14 ` Paul E. McKenney
2007-08-16 1:41 ` Christoph Lameter
2007-08-16 2:15 ` Satyam Sharma
2007-08-16 2:08 ` Herbert Xu
2007-08-16 2:18 ` Christoph Lameter
2007-08-16 3:23 ` Paul Mackerras
2007-08-16 3:33 ` Herbert Xu
2007-08-16 3:48 ` Paul Mackerras
2007-08-16 4:03 ` Herbert Xu
2007-08-16 4:34 ` Paul Mackerras
2007-08-16 5:37 ` Herbert Xu
2007-08-16 6:00 ` Paul Mackerras
2007-08-16 18:50 ` Christoph Lameter
2007-08-16 18:48 ` Christoph Lameter
2007-08-16 19:44 ` Segher Boessenkool
2007-08-16 2:18 ` Chris Friesen
2007-08-16 2:32 ` Paul E. McKenney
2007-08-16 1:51 ` Paul Mackerras
2007-08-16 2:00 ` Herbert Xu
2007-08-16 2:05 ` Paul Mackerras
2007-08-16 2:11 ` Herbert Xu
2007-08-16 2:35 ` Paul E. McKenney
2007-08-16 3:15 ` Paul Mackerras
2007-08-16 3:43 ` Herbert Xu
2007-08-16 2:15 ` Christoph Lameter
2007-08-16 2:17 ` Christoph Lameter
2007-08-16 2:33 ` Satyam Sharma
2007-08-16 3:01 ` Satyam Sharma
2007-08-16 4:11 ` Paul Mackerras
2007-08-16 5:39 ` Herbert Xu
2007-08-16 6:56 ` Paul Mackerras
2007-08-16 7:09 ` Herbert Xu
2007-08-16 8:06 ` Stefan Richter
2007-08-16 8:10 ` Herbert Xu
2007-08-16 9:54 ` Stefan Richter
2007-08-16 10:31 ` Stefan Richter
2007-08-16 10:42 ` Herbert Xu
2007-08-16 16:34 ` Paul E. McKenney
2007-08-16 23:59 ` Herbert Xu
2007-08-17 1:01 ` Paul E. McKenney
2007-08-17 7:39 ` Satyam Sharma
2007-08-17 14:31 ` Paul E. McKenney
2007-08-17 18:31 ` Satyam Sharma
2007-08-17 18:56 ` Paul E. McKenney
2007-08-17 3:15 ` Nick Piggin
2007-08-17 4:02 ` Paul Mackerras
2007-08-17 4:39 ` Nick Piggin
2007-08-17 7:25 ` Stefan Richter
2007-08-17 8:06 ` Nick Piggin
2007-08-17 8:58 ` Satyam Sharma
2007-08-17 9:15 ` Nick Piggin
2007-08-17 10:03 ` Satyam Sharma
2007-08-17 11:50 ` Nick Piggin
2007-08-17 12:50 ` Satyam Sharma
2007-08-17 12:56 ` Nick Piggin
2007-08-18 2:15 ` Satyam Sharma
2007-08-17 10:48 ` Stefan Richter
2007-08-17 10:58 ` Stefan Richter
2007-08-18 14:35 ` LDD3 pitfalls (was Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures) Stefan Richter
2007-08-20 13:28 ` Chris Snook
2007-08-17 22:14 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Segher Boessenkool
2007-08-17 5:04 ` Paul Mackerras
2007-08-16 10:35 ` Herbert Xu
2007-08-16 19:48 ` Chris Snook
2007-08-17 0:02 ` Herbert Xu
2007-08-17 2:04 ` Chris Snook
2007-08-17 2:13 ` Herbert Xu
2007-08-17 2:31 ` Nick Piggin
2007-08-17 5:09 ` Paul Mackerras
2007-08-17 5:32 ` Herbert Xu
2007-08-17 5:41 ` Paul Mackerras
2007-08-17 8:28 ` Satyam Sharma
2007-08-16 14:48 ` Ilpo Järvinen
2007-08-16 16:19 ` Stefan Richter
2007-08-16 19:55 ` Chris Snook
2007-08-16 20:20 ` Christoph Lameter
2007-08-17 1:02 ` Paul E. McKenney
2007-08-17 1:28 ` Herbert Xu
2007-08-17 5:07 ` Paul E. McKenney
2007-08-17 2:16 ` Paul Mackerras
2007-08-17 3:03 ` Linus Torvalds
2007-08-17 3:43 ` Paul Mackerras
2007-08-17 3:53 ` Herbert Xu
2007-08-17 6:26 ` Satyam Sharma
2007-08-17 8:38 ` Nick Piggin
2007-08-17 9:14 ` Satyam Sharma
2007-08-17 9:31 ` Nick Piggin
2007-08-17 10:55 ` Satyam Sharma
2007-08-17 12:39 ` Nick Piggin
2007-08-17 13:36 ` Satyam Sharma
2007-08-17 16:48 ` Linus Torvalds
2007-08-17 18:50 ` Chris Friesen
2007-08-17 18:54 ` Arjan van de Ven
2007-08-17 19:49 ` Paul E. McKenney
2007-08-17 19:49 ` Arjan van de Ven
2007-08-17 20:12 ` Paul E. McKenney
2007-08-17 19:08 ` Linus Torvalds
2007-08-20 13:15 ` Chris Snook
2007-08-20 13:32 ` Herbert Xu
2007-08-20 13:38 ` Chris Snook
2007-08-20 22:07 ` Segher Boessenkool
2007-08-21 5:46 ` Linus Torvalds
2007-08-21 7:04 ` David Miller
2007-08-21 13:50 ` Chris Snook
2007-08-21 14:59 ` Segher Boessenkool
2007-08-21 16:31 ` Satyam Sharma
2007-08-21 16:43 ` Linus Torvalds
2007-09-09 18:02 ` Denys Vlasenko
2007-09-09 18:18 ` Arjan van de Ven
2007-09-10 10:56 ` Denys Vlasenko
2007-09-10 11:15 ` Herbert Xu
2007-09-10 12:22 ` Kyle Moffett
2007-09-10 13:38 ` Denys Vlasenko
2007-09-10 14:16 ` Denys Vlasenko
2007-09-10 15:09 ` Linus Torvalds
2007-09-10 16:46 ` Denys Vlasenko
2007-09-10 19:59 ` Kyle Moffett
2007-09-10 18:59 ` Christoph Lameter
2007-09-10 23:19 ` [PATCH] Document non-semantics of atomic_read() and atomic_set() Chris Snook
2007-09-10 23:44 ` Paul E. McKenney
2007-09-11 19:35 ` Christoph Lameter
2007-09-10 14:51 ` [PATCH 0/24] make atomic_read() behave consistently across all architectures Arjan van de Ven
2007-09-10 14:38 ` Denys Vlasenko
2007-09-10 17:02 ` Arjan van de Ven
2007-08-17 11:08 ` Stefan Richter
2007-08-17 22:09 ` Segher Boessenkool
2007-08-17 17:41 ` Segher Boessenkool
2007-08-17 18:38 ` Satyam Sharma
2007-08-17 23:17 ` Segher Boessenkool
2007-08-17 23:55 ` Satyam Sharma
2007-08-18 0:04 ` Segher Boessenkool
2007-08-18 1:56 ` Satyam Sharma
2007-08-18 2:15 ` Segher Boessenkool
2007-08-18 3:33 ` Satyam Sharma
2007-08-18 5:18 ` Segher Boessenkool
2007-08-18 13:20 ` Satyam Sharma
2007-09-10 18:59 ` Christoph Lameter
2007-09-10 20:54 ` Paul E. McKenney
2007-09-10 21:36 ` Christoph Lameter
2007-09-10 21:50 ` Paul E. McKenney
2007-09-11 2:27 ` Segher Boessenkool
2007-08-16 21:08 ` Luck, Tony
2007-08-16 21:08 ` Luck, Tony
2007-08-16 19:55 ` Chris Snook
2007-08-16 18:54 ` Christoph Lameter
2007-08-16 20:07 ` Paul E. McKenney
2007-08-16 3:05 ` Paul Mackerras
2007-08-16 19:39 ` Segher Boessenkool
2007-08-16 2:07 ` Segher Boessenkool
2007-08-24 12:50 ` Denys Vlasenko
2007-08-24 17:15 ` Christoph Lameter
2007-08-24 20:21 ` Denys Vlasenko
2007-08-16 3:37 ` Bill Fink
2007-08-16 5:20 ` Satyam Sharma
2007-08-16 5:57 ` Satyam Sharma
2007-08-16 9:25 ` Satyam Sharma
2007-08-16 21:00 ` Segher Boessenkool
2007-08-17 4:32 ` Satyam Sharma
2007-08-17 22:38 ` Segher Boessenkool
2007-08-18 14:42 ` Satyam Sharma
2007-08-16 20:50 ` Segher Boessenkool
2007-08-16 22:40 ` David Schwartz
2007-08-17 4:36 ` Satyam Sharma
2007-08-17 4:24 ` Satyam Sharma
2007-08-17 22:34 ` Segher Boessenkool
2007-08-15 19:59 ` Christoph Lameter
-- strict thread matches above, loose matches on Subject: below --
2013-06-09 22:06 Abraham Lincon
2013-06-09 22:06 RE: Abraham Lincon
2013-06-27 21:21 emirates
2013-06-28 10:12 Re: emirates
2013-06-28 10:14 Re: emirates
2013-07-08 21:52 Jeffrey (Sheng-Hui) Chu
2013-07-08 22:04 ` Joe Perches
2013-07-09 13:22 ` Re: Arend van Spriel
2013-07-10 9:12 ` Re: Samuel Ortiz
2013-07-28 14:21 piuvatsa
2013-07-28 9:49 ` Tomas Pospisek
2013-08-13 9:56 (unknown), Christian König
2013-08-13 14:47 ` Alex Deucher
2013-08-23 6:18 info
[not found] <B719EF0A9FB7A247B5147CD67A83E60E011FEB76D1@EXCH10-MB3.paterson.k12.nj.us>
2013-08-23 10:47 ` Ruiz, Irma
2013-08-23 10:47 ` RE: Ruiz, Irma
2013-08-28 11:07 Marc Murphy
2013-08-28 11:23 ` Sedat Dilek
2013-09-03 23:50 (unknown), Matthew Garrett
[not found] ` <1378252218-18798-1-git-send-email-matthew.garrett-05XSO3Yj/JvQT0dZR+AlfA@public.gmane.org>
2013-09-04 15:53 ` Kees Cook
[not found] ` <CAGXu5jLCTU1MG4fYDzpT=TAP9DRAUuRuhZNB+edJsOzN4iXbDw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-09-04 16:05 ` Re: Josh Boyer
2013-10-10 14:38 陶治江
2013-10-10 14:46 ` Lucas De Marchi
2013-11-09 5:14 reply15
2013-12-20 11:49 RE: Unify Loan Company
2013-12-21 16:48 (unknown), Alex Barattini
2013-12-23 1:44 ` Aaron Lu
2013-12-23 16:24 ` Re: Alex Barattini
2014-01-20 9:24 Re: Mark Reyes Guus
2014-01-20 9:35 Re: Mark Reyes Guus
2014-02-06 11:54 "Sparse checkout leaves no entry on working directory" all the time on Windows 7 on Git 1.8.5.2.msysgit.0 konstunn
2014-02-06 13:20 ` Johannes Sixt
2014-02-06 19:56 ` Constantine Gorbunov
2014-02-10 14:35 Viswanatham, RaviTeja
2014-02-10 18:35 ` Marcel Holtmann
2014-02-11 7:13 ` Re: Andrei Emeltchenko
2014-02-23 16:22 tigran.mkrtchyan
2014-02-23 16:41 ` Trond Myklebust
2014-02-23 18:04 ` Re: Mkrtchyan, Tigran
2014-03-01 6:56 Re: Anton 'EvilMan' Danilov
2014-03-16 12:01 Re; Nieuwenhuis,Sonja S.B.M.
2014-04-13 21:01 (unknown), Marcus White
2014-04-15 0:59 ` Marcus White
2014-04-16 21:17 ` Re: Marcelo Tosatti
2014-04-17 21:33 ` Re: Marcus White
2014-04-21 21:49 ` Re: Marcelo Tosatti
[not found] <blk-mq updates>
[not found] ` <1397464212-4454-1-git-send-email-hch@lst.de>
2014-04-15 20:16 ` Re: Jens Axboe
2014-05-02 9:42 "csum failed" that was not detected by scrub Jaap Pieroen
2014-05-02 10:20 ` Duncan
2014-05-02 17:48 ` Jaap Pieroen
2014-05-03 13:31 ` Re: Frank Holton
2014-06-15 20:36 Re: Angela D.Dawes
2014-06-16 7:10 Re: Angela D.Dawes
2014-07-03 16:30 W. Cheung
2014-07-24 8:35 Richard Wong
2014-07-24 8:35 Re: Richard Wong
2014-07-24 8:36 Re: Richard Wong
2014-07-24 8:37 Re: Richard Wong
2014-08-06 12:06 (unknown), Daniel Smedegaard Buus
2014-08-06 17:10 ` Slava Pestov
2014-08-06 17:50 ` Re: Daniel Smedegaard Buus
2014-08-18 15:38 Re: Mrs. Hajar Vaserman.
2014-08-24 13:14 (unknown), Christian König
2014-08-24 13:34 ` Mike Lothian
2014-08-25 9:10 ` Re: Christian König
2014-09-07 13:24 ` Re: Markus Trippelsdorf
2014-09-08 3:47 ` Re: Alex Deucher
2014-09-08 7:13 ` Re: Markus Trippelsdorf
2014-09-08 11:36 (unknown), R. Klomp
[not found] ` <CAOqJoqGSRUw_UT4LhqpYX-WX6AEd2ReAWjgNS76Cra-SMKw3NQ@mail.gmail.com>
2014-09-08 14:36 ` R. Klomp
2014-09-10 0:00 ` Re: David Aguilar
2014-09-15 15:10 ` Re: R. Klomp
[not found] <6A286AB51AD8EC4180C4B2E9EF1D0A027AAD7EFF1E@exmb01.wrschool.net>
2014-09-08 17:36 ` Deborah Mayher
2014-09-08 17:36 ` RE: Deborah Mayher
2014-10-13 6:18 geohughes
2014-10-13 6:18 Re: geohughes-q6EoVN9bke7vnOemgxGiVw
[not found] <BEC3AE959B8BB340894B239B5A7882B929B02748@LPPTCPMXMBX02.LPCH.NET>
2014-10-30 9:26 ` Tarzon, Megan
2014-11-14 18:56 milke-Bd11Sj57+SE
2014-11-14 18:56 re: milke
2014-11-14 20:49 salim-Re5JQEeQqe8AvxtiuMwx3w
2014-11-14 20:50 Re: salim
2014-11-17 20:11 Re: salim
2014-11-26 18:38 (unknown), Travis Williams
2014-11-26 20:49 ` NeilBrown
2014-11-29 15:08 ` Re: Peter Grandi
2014-12-01 13:02 Re: Quan Han
2014-12-06 13:18 Re: Quan Han
2015-01-28 7:15 "brcmfmac: brcmf_sdio_htclk: HT Avail timeout" on Thinkpad Tablet 10 Sébastien Bourdeauducq
2015-01-30 14:40 ` Arend van Spriel
2015-09-09 16:55 ` Oleg Kostyuchenko
2015-02-28 11:21 Jonathan Cameron
2015-02-28 11:22 ` Jonathan Cameron
2015-03-13 1:34 (unknown) cody.taylor
2015-03-13 2:00 ` Duy Nguyen
[not found] <CANSxx61FaNp5SBXJ8Y+pWn0eDcunmibKR5g8rttnWGdGwEMHCA@mail.gmail.com>
2015-03-18 20:45 ` Re: Junio C Hamano
2015-03-18 21:06 ` Re: Stefan Beller
2015-03-18 21:17 ` Re: Jeff King
2015-03-18 21:28 ` Re: Jeff King
2015-03-18 21:33 ` Re: Junio C Hamano
2015-03-18 21:45 ` Re: Stefan Beller
2015-04-08 20:44 (unknown), Mamta Upadhyay
2015-04-08 21:58 ` Thomas Braun
2015-04-09 11:27 ` Re: Konstantin Khomoutov
[not found] <CA+yqC4Y2oi4ji-FHuOrXEsxLoYsnckFoX2WYHZwqh5ZGuq7snA@mail.gmail.com>
2015-05-12 15:04 ` Re: Sam Leffler
[not found] <E1Yz4NQ-0000Cw-B5@feisty.vs19.net>
2015-05-31 15:37 ` Re: Roman Volkov
2015-05-31 15:53 ` Re: Hans de Goede
[not found] <132D0DB4B968F242BE373429794F35C22559D38329@NHS-PCLI-MBC011.AD1.NHS.NET>
2015-06-08 11:09 ` Practice Trinity (NHS SOUTH SEFTON CCG)
2015-06-08 11:09 ` RE: Practice Trinity (NHS SOUTH SEFTON CCG)
2015-06-10 18:17 RE: Robert Reynolds
[not found] <CAHxZcryF7pNoENh8vpo-uvcEo5HYA5XgkZFWrLEHM5Hhf5ay+Q@mail.gmail.com>
2015-07-05 16:38 ` t0021
[not found] <CACy=+DtdZOUT4soNZ=zz+_qhCfM=C8Oa0D5gjRC7QM3nYi4oEw@mail.gmail.com>
2015-07-11 18:37 ` Re: Mustapha Abiola
2015-07-28 18:54 FREELOTTO-u79uwXL29TY76Z2rM5mHXA, PROMO-u79uwXL29TY76Z2rM5mHXA
2015-08-05 12:47 (unknown) Ivan Chernyavsky
2015-08-15 9:19 ` Duy Nguyen
2015-08-17 17:49 ` Re: Junio C Hamano
2015-08-11 10:57 zso2bytom
2015-08-19 11:09 christain147
2015-08-19 13:01 Re: christain147
2015-08-19 14:04 Re: christain147
2015-08-19 19:41 Re: christain147
2015-09-01 12:01 Re: Zariya
2015-09-01 16:06 Re: Zariya
2015-09-30 12:06 Apple-Free-Lotto
2015-10-24 5:02 RE: JO Bower
2015-10-26 7:30 Davies
2015-10-29 2:40 Unknown,
2015-11-01 20:03 RE: Mario, Franco
[not found] <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D@MGCCCMAIL2010-5.mgccc.cc.ms.us>
[not found] ` <D0613EBE33E8FD439137DAA95CCF59555B7A5A4D-np6RRm/yoI0WMyNdQYMtvx125T75Kgqw2GnX7Qjzz7g@public.gmane.org>
2015-11-24 13:21 ` RE: Amis, Ryann
2016-01-13 11:34 Alexey Ivanov
2016-01-13 13:12 ` Michal Kazior
[not found] ` <CAGvpMW9d8RZGpfBd2H0W35fVUQoi9jcZvQmTC7ztW+dPVcxOhg@mail.gmail.com>
2016-01-13 14:05 ` Re: Michal Kazior
2016-01-13 14:45 ` Re: Alexey Ivanov
2016-01-13 14:54 ` Re: Michal Kazior
2016-01-14 5:36 ` Re: Alexey Ivanov
2016-01-14 7:21 ` Re: Michal Kazior
2016-01-14 11:14 ` Re: Alexey Ivanov
2016-01-14 11:26 ` Re: Shajakhan, Mohammed Shafi (Mohammed Shafi)
2016-01-14 12:33 ` Re: Alexey Ivanov
2016-01-14 17:45 ` Re: Peter Oh
2016-01-13 12:46 Adam Richter
[not found] <569A640D.801@gmail.com>
2016-01-22 7:40 ` (unknown) mr. sindar
2016-01-22 9:24 ` Ralf Mardorf
2016-02-26 1:19 Fredrick Prashanth John Berchmans
2016-02-26 7:37 ` Richard Weinberger
2016-04-11 19:04 (unknown), miwilliams
2016-04-12 4:33 ` Stefan Beller
2016-04-22 8:25 (unknown) Daniel Lezcano
2016-04-22 8:27 ` Daniel Lezcano
[not found] <E5ACCB586875944EB0AE0E3EFA32B4F526FAD24C@exchange0.winona.edu>
2016-05-16 23:02 ` Weichert, Brian
2016-05-18 16:26 (unknown), Warner Losh
[not found] ` <CANCZdfow154vh3kHqUNUM6CoBsC9Vu3_+SEjFG1dz=FOkc9vsg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-05-18 18:02 ` Rob Herring
[not found] ` <CAL_Jsq+s3PjzKCaT03EaqNCoyuwDQ6dXHDF808+U=hjvvfRYdg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2016-05-18 22:01 ` Re: Warner Losh
2016-06-14 7:06 Raphael Poggi
2016-06-24 8:17 ` Raphaël Poggi
2016-06-24 11:49 ` Re: Sascha Hauer
2016-07-15 18:16 Re: Arnold Zeigler
2016-09-01 2:02 Fennec Fox
2016-09-01 3:10 ` Jeff Mahoney
2016-09-01 19:32 ` Re: Kai Krakow
2016-09-10 21:51 Re: Michelle Ouellette
2016-09-27 16:50 Rajat Jain
2016-09-27 16:57 ` Rajat Jain
2016-11-06 21:00 (unknown), Dennis Dataopslag
2016-11-07 16:50 ` Wols Lists
2016-11-07 17:13 ` Re: Wols Lists
2016-11-17 20:33 ` Re: Dennis Dataopslag
2016-11-17 22:12 ` Re: Wols Lists
2016-11-09 17:55 bepi
2016-11-10 6:57 ` Alex Powell
2016-11-10 13:00 ` Re: bepi
2016-11-15 4:40 Apply
2017-01-25 0:11 [PATCH 7/7] completion: recognize more long-options Cornelius Weig
2017-01-25 0:21 ` Stefan Beller
2017-01-25 0:43 ` Cornelius Weig
2017-01-25 0:52 ` Re: Stefan Beller
2017-01-25 0:54 ` Re: Linus Torvalds
2017-01-25 1:32 ` Re: Eric Wong
2017-02-16 19:41 simran singhal
2017-02-16 19:44 ` SIMRAN SINGHAL
2017-02-23 15:09 Qin's Yanjun
2017-02-23 15:09 RE: Qin's Yanjun
2017-02-23 15:09 ` RE: Qin's Yanjun
2017-02-23 15:10 RE: Qin's Yanjun
2017-02-23 15:13 RE: Qin's Yanjun
2017-02-23 15:15 RE: Qin's Yanjun
2017-03-19 15:00 Ilan Schwarts
2017-03-23 17:12 ` Jeff Mahoney
2017-04-01 5:31 Re: USPS Delivery
2017-04-10 3:17 Qin Yan jun
2017-04-11 14:37 USPS Priority Delivery
2017-04-13 15:58 (unknown), Scott Ellentuch
[not found] ` <CAK2H+efb3iKA5P3yd7uRqJomci6ENvrB1JRBBmtQEpEvyPMe7w@mail.gmail.com>
2017-04-13 16:38 ` Scott Ellentuch
[not found] <CALDO+SZPQGmp4VH0LvCh95uXWvwzAoj+wN-rm0pGu5e0wCcyNw@mail.gmail.com>
2017-04-19 18:13 ` Re: Joe Stringer
2017-04-28 8:20 (unknown), Anatolij Gustschin
2017-04-28 8:43 ` Linus Walleij
2017-04-28 9:26 ` Re: Anatolij Gustschin
2017-04-28 18:27 Re: USPS Ground Support
2017-04-29 22:53 Re: USPS Station Management
2017-05-03 5:59 Re: H.A
2017-05-03 6:23 Re: H.A
2017-05-03 11:26 Re: Paul Lopez-Bravo
[not found] <CAMj-D2DO_CfvD77izsGfggoKP45HSC9aD6auUPAYC9Yeq_aX7w@mail.gmail.com>
2017-05-04 16:44 ` Re: gengdongjiu
2017-05-04 23:57 Re: Tammy
2017-05-16 22:46 Re: USPS Parcels Delivery
[not found] <20170519213731.21484-1-mrugiero@gmail.com>
2017-05-20 8:48 ` Re: Boris Brezillon
2017-07-07 17:04 Mrs Alice Walton
2017-07-19 8:03 Lynne Smith
2017-07-19 8:03 Re: Lynne Smith
2017-07-27 1:12 Re: Marie Angèle Ouattara
2017-09-24 16:59 Estrin, Alex
[not found] ` <F3529576D8E232409F431C309E29399336CD9886-8k97q/ur5Z1cIJlls4ac1rfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2017-09-25 5:48 ` Leon Romanovsky
2017-10-01 10:53 Pierre
2017-10-18 14:31 Mrs. Marie Angèle O
2017-11-01 14:57 Mrs Hsu Wealther
2017-11-12 2:21 hsed
2017-11-13 18:56 ` Stefan Beller
2017-11-13 14:42 Re: Amos Kalonzo
2017-11-13 14:44 Re: Amos Kalonzo
2017-11-13 14:55 Re: Amos Kalonzo
2017-11-13 14:55 ` Re: Amos Kalonzo
2017-11-13 14:56 Re: Amos Kalonzo
2017-11-13 14:57 Re: Amos Kalonzo
2017-11-13 15:00 Re: Amos Kalonzo
2017-11-13 15:01 Re: Amos Kalonzo
2017-11-13 15:04 Re: Amos Kalonzo
2017-11-20 15:10 Viet Nguyen
2017-11-20 20:07 ` Stefan Beller
2018-01-24 19:54 Re: Amy Riddering
2018-01-24 22:11 Re: Amy Riddering
2018-01-27 3:56 Re: Emile Kenold
2018-02-05 5:28 Re: Fahama Vaserman
2018-02-27 1:18 Alan Gage
2018-02-27 10:26 ` René Scharfe
2018-02-27 13:39 [Outreachy kernel] Re: Re: [PATCH] h [Patch] Fixed unnecessary typecasting to in. Error found with checkpatch. Signed-off-by: Nishka Dasgupta <nishka.dasgupta_ug18@ashoka.edu.in> Julia Lawall
2018-02-27 13:53 ` Nishka Dasgupta
2018-02-27 13:58 [Outreachy kernel] Re: Julia Lawall
2018-02-27 14:07 ` Re: Nishka Dasgupta
2018-03-01 19:33 [PATCH v2] staging: vc04_services: bcm2835-camera: Add blank line Greg KH
2018-03-01 20:20 ` Nishka Dasgupta
2018-03-01 20:31 ` Re: Greg KH
2018-03-08 18:23 ` Re: Nishka Dasgupta
2018-03-08 18:33 ` Re: Greg KH
2018-03-01 20:04 [Outreachy kernel] [PATCH v2] staging: ks7010: Remove spaces after typecast to int Julia Lawall
2018-03-01 21:20 ` Nishka Dasgupta
2018-03-01 21:17 Re: Nishka Dasgupta
2018-03-02 18:01 [Outreachy kernel] Help with Mutt Julia Lawall
2018-03-03 18:27 ` Nishka Dasgupta
2018-03-03 18:38 ` Re: Julia Lawall
[not found] <[PATCH xf86-video-amdgpu 0/3] Add non-desktop and leasing support>
2018-03-03 4:49 ` (unknown), Keith Packard
[not found] ` <20180303044931.6902-1-keithp-aN4HjG94KOLQT0dZR+AlfA@public.gmane.org>
2018-03-05 10:02 ` Michel Dänzer
[not found] ` <82fc592b-f680-c663-1a0f-7b522ca932d2-otUistvHUpPR7s880joybQ@public.gmane.org>
2018-03-05 16:41 ` Re: Keith Packard
2018-03-05 17:06 (unknown) Meghana Madhyastha
2018-03-05 19:24 ` Noralf Trønnes
[not found] <CABxXbAeQTGbiAEaFHK4RUTFGxt0A+KnCCmhJNU9XDivW5=SL-Q@mail.gmail.com>
2018-03-08 18:23 ` Ivan Lapuz
2018-03-08 18:33 ` Tommy Bowditch
2018-03-08 18:36 ` Re: Ibrahim Tachijian
[not found] <CAAEAJfB76xseRqnYQfRihXY6g0Jyqwt8zfddU1W7CXDg3xEFFg@mail.gmail.com>
2018-04-02 11:20 ` Re: Ratheendran R
2018-04-02 17:19 ` Re: Steve deRosier
2018-04-04 7:31 ` Re: Arend van Spriel
2018-04-27 0:54 [PATCH v3 2/3] merge: Add merge.renames config setting Ben Peart
2018-04-27 18:19 ` Elijah Newren
2018-04-30 13:11 ` Ben Peart
2018-04-30 16:12 ` Re: Elijah Newren
2018-05-02 14:33 ` Re: Ben Peart
2018-08-28 17:34 Bills, Jason M
2018-08-28 17:59 ` Brad Bishop
2018-08-28 23:26 ` Bills, Jason M
2018-09-04 20:46 ` Brad Bishop
2018-09-04 21:28 ` Re: Ed Tanous
2018-09-04 22:34 ` Re: Brad Bishop
2018-09-04 23:18 ` Re: Ed Tanous
2018-09-04 23:42 ` Re: Brad Bishop
2018-09-05 21:20 ` Re: Bills, Jason M
2018-10-08 13:33 Netravnen
2018-10-08 13:34 ` Inderpreet Saini
2018-10-21 16:25 (unknown), Michael Tirado
2018-10-22 0:26 ` Dave Airlie
2018-10-21 20:23 ` Re: Michael Tirado
2018-10-22 1:50 ` Re: Dave Airlie
2018-10-21 22:20 ` Re: Michael Tirado
2018-10-23 1:47 ` Re: Michael Tirado
2018-10-23 6:23 ` Re: Dave Airlie
2018-10-26 12:54 Mohanraj B
2018-10-27 16:55 ` Jens Axboe
2018-10-29 14:20 Re: Beierl, Mark
2018-10-29 14:37 ` Re: Mohanraj B
2018-11-06 1:21 RE, Miss Juliet Muhammad
2018-11-11 4:20 RE, Miss Juliet Muhammad
[not found] <CACikiw1uNCYKzo9vjG=AZHpARWv-nzkCX=D-aWBssM7vYZrQdQ@mail.gmail.com>
2018-11-12 10:09 ` Ravi Kumar
2018-11-15 13:11 ` Re: Ondrej Mosnacek
[not found] <CAJUWh6qyHerKg=-oaFN+USa10_Aag5+SYjBOeLCX1qM+WcDUwA@mail.gmail.com>
2018-11-23 7:52 ` Re: Chris Murphy
2018-11-23 9:34 ` Re: Andy Leadbetter
2018-11-24 14:03 RE, Miss Sharifah Ahmad Mustahfa
2018-11-24 14:16 RE, Miss Sharifah Ahmad Mustahfa
2018-11-24 14:19 RE, Miss Sharifah Ahmad Mustahfa
[not found] <20181130011234.32674-1-axboe@kernel.dk>
2018-11-30 2:09 ` Jens Axboe
2018-12-04 2:28 RE, Ms Sharifah Ahmad Mustahfa
2018-12-21 15:22 kenneth johansson
2018-12-22 8:18 ` Richard Weinberger
2019-01-07 17:28 [PATCH] arch/arm/mm: Remove duplicate header Souptick Joarder
2019-01-17 11:23 ` Souptick Joarder
2019-01-17 11:28 ` Mike Rapoport
2019-01-31 5:54 ` Souptick Joarder
2019-01-31 12:58 ` Vladimir Murzin
2019-02-01 12:32 ` Re: Souptick Joarder
2019-02-01 12:36 ` Re: Vladimir Murzin
2019-02-01 12:41 ` Re: Souptick Joarder
2019-02-01 13:02 ` Re: Vladimir Murzin
2019-02-01 15:15 ` Re: Russell King - ARM Linux admin
2019-02-01 15:22 ` Re: Russell King - ARM Linux admin
2019-01-23 10:50 Christopher Hagler
2019-01-23 14:16 ` Cody Kratzer
2019-01-23 14:25 ` Re: Thomas Braun
2019-01-23 16:00 ` Re: Christopher Hagler
2019-01-23 16:35 ` Randall S. Becker
2019-01-24 17:11 ` Johannes Schindelin
2019-02-16 0:08 Re: Graham Loan Firm
2019-02-16 4:17 Re; Richard Wahl
2019-02-18 23:41 Pablo Mancilla
2019-02-19 2:20 Re: Pablo Mancilla
2019-03-05 14:57 [GSoC][PATCH v2 3/3] t3600: use helpers to replace test -d/f/e/s <path> Eric Sunshine
2019-03-05 23:38 ` Rohit Ashiwal
2019-03-19 14:41 No subject Maxim Levitsky
2019-03-20 11:03 ` Felipe Franciosi
2019-03-20 19:08 ` Re: Maxim Levitsky
2019-03-21 16:12 ` Re: Stefan Hajnoczi
2019-03-21 16:21 ` Re: Keith Busch
2019-03-21 16:41 ` Re: Felipe Franciosi
2019-03-21 17:04 ` Re: Maxim Levitsky
2019-03-22 7:54 ` Re: Felipe Franciosi
2019-03-22 10:32 ` Re: Maxim Levitsky
2019-03-22 15:30 ` Re: Keith Busch
2019-03-25 15:44 ` Re: Felipe Franciosi
2019-05-21 0:06 [PATCH v6 0/3] add new ima hook ima_kexec_cmdline to measure kexec boot cmdline args Prakhar Srivastava
2019-05-21 0:06 ` [PATCH v6 2/3] add a new ima template field buf Prakhar Srivastava
2019-05-24 15:12 ` Mimi Zohar
2019-05-24 15:42 ` Roberto Sassu
2019-05-24 15:47 ` Re: Roberto Sassu
2019-05-24 18:09 ` Re: Mimi Zohar
2019-05-24 19:00 ` Re: prakhar srivastava
2019-05-24 19:15 ` Re: Mimi Zohar
[not found] <E1hUrZM-0007qA-Q8@sslproxy01.your-server.de>
2019-05-29 19:54 ` Re: Alex Williamson
2019-06-13 7:02 Re: Erling Persson Foundation
[not found] <DM5PR19MB165765D43BE979AB51A9897E9EEB0@DM5PR19MB1657.namprd19.prod.outlook.com>
2019-06-18 9:41 ` Re: Enrico Weigelt, metux IT consult
[not found] <20190703063132.GA27292@ls3530.dellerweb.de>
2019-07-03 6:38 ` Re: Helge Deller
2019-08-20 17:23 William Baker
2019-08-20 17:27 ` Yagnatinsky, Mark
2019-08-30 19:54 [PATCH] Revert "asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro" Arnd Bergmann
[not found] ` <20190830202959.3539-1-msuchanek@suse.de>
2019-08-30 20:32 ` Arnd Bergmann
[not found] <CAGkTAxsV0zS_E64criQM-WtPKpSyW2PL=+fjACvnx2=m7piwXg@mail.gmail.com>
2019-09-27 6:37 ` Re: Michael Kerrisk (man-pages)
2019-10-27 21:47 Re: Margaret Kwan Wing Han
2019-11-14 11:37 SGV INVESTMENT
2019-11-15 16:03 Martin Nicolay
2019-11-15 16:29 ` Martin Ågren
2019-11-15 16:37 ` Re: Martin Ågren
[not found] <20191205030032.GA26925@ray.huang@amd.com>
2019-12-09 1:26 ` Quan, Evan
2019-12-19 12:31 liming wu
2019-12-20 1:13 ` Andreas Dilger
2019-12-20 17:06 [alsa-devel] [PATCH v5 2/7] ASoC: tegra: Allow 24bit and 32bit samples Ben Dooks
2019-12-22 17:08 ` Dmitry Osipenko
2020-01-05 0:04 ` Ben Dooks
2020-01-05 1:48 ` Dmitry Osipenko
2020-01-05 10:53 ` Ben Dooks
2020-01-06 19:00 ` [alsa-devel] [Linux-kernel] " Ben Dooks
2020-01-07 1:39 ` Dmitry Osipenko
2020-01-23 19:38 ` Ben Dooks
2020-01-24 16:50 ` Jon Hunter
2020-01-27 19:20 ` Dmitry Osipenko
2020-01-28 12:13 ` Mark Brown
2020-01-28 17:42 ` Dmitry Osipenko
2020-01-28 18:19 ` Jon Hunter
2020-01-29 0:17 ` Dmitry Osipenko
[not found] ` <2ff97414-f0a5-7224-0e53-6cad2ed0ccd2-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2020-01-30 8:05 ` Ben Dooks
[not found] <mailman.6.1579205674.8101.b.a.t.m.a.n@lists.open-mesh.org>
2020-01-17 7:44 ` Re: Simon Wunderlich
2020-02-06 2:24 Re: Viviane Jose Pereira
2020-02-06 6:36 Re: Viviane Jose Pereira
2020-02-11 22:34 (unknown) Rajat Jain
[not found] ` <20200211223400.107604-1-rajatja-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2020-02-12 9:30 ` Jarkko Nikula
2020-02-12 9:30 ` Re: Jarkko Nikula
[not found] ` <b3397374-0cb8-cf6c-0555-34541a1c108c-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>
2020-02-12 10:24 ` Re: Andy Shevchenko
2020-02-12 10:24 ` Re: Andy Shevchenko
[not found] <20200224173733.16323-1-axboe@kernel.dk>
2020-02-24 17:38 ` Re: Jens Axboe
2020-02-26 11:57 (no subject) Ville Syrjälä
2020-02-26 12:08 ` Linus Walleij
2020-02-26 12:08 ` Re: Linus Walleij
2020-02-26 14:34 ` Re: Ville Syrjälä
2020-02-26 14:34 ` Re: Ville Syrjälä
2020-02-26 14:56 ` Re: Linus Walleij
2020-02-26 14:56 ` Re: Linus Walleij
2020-02-26 15:08 ` Re: Ville Syrjälä
2020-02-26 15:08 ` Re: Ville Syrjälä
2020-03-03 15:27 Gene Chen
2020-03-04 14:56 ` Matthias Brugger
2020-03-04 14:56 ` Re: Matthias Brugger
2020-03-04 15:15 ` Re: Lee Jones
2020-03-04 15:15 ` Re: Lee Jones
2020-03-04 18:00 ` Re: Matthias Brugger
2020-03-04 18:00 ` Re: Matthias Brugger
2020-03-08 17:19 Re: Francois Pinault
2020-03-08 17:33 Re: Francois Pinault
2020-03-08 17:33 ` Re: Francois Pinault
2020-03-08 19:12 Re: Francois Pinault
2020-03-16 23:07 Sankalp Bhardwaj
2020-03-17 9:13 ` Valdis Klētnieks
2020-03-17 10:10 ` Re: suvrojit
[not found] <CALeDE9OeBx6v6nGVjeydgF1vpfX1Bus319h3M1=49PMETdaCtw@mail.gmail.com>
2020-03-20 11:49 ` Re: Josh Boyer
2020-03-27 8:36 (unknown) chenanqing
2020-03-27 8:59 ` Ilya Dryomov
2020-03-27 9:20 (unknown) chenanqing
[not found] ` <5e7dc543.vYG3wru8B/me1sOV%chenanqing-Oq79sGaMObY@public.gmane.org>
2020-03-27 15:53 ` Lee Duncan
2020-03-27 15:53 ` Re: Lee Duncan
[not found] <CAPXXXSDVGeEK_NCSkDMwTpuvVxYkWGdQk=L=bz+RN4XLiGZmcg@mail.gmail.com>
[not found] ` <CAPXXXSBYcU1QamovmP-gVTXms67Xi_QpMCV=V3570q1nnuWqNw@mail.gmail.com>
2020-04-04 21:05 ` Re: Ruslan Bilovol
2020-04-05 1:27 ` Re: Alan Stern
2020-04-05 1:27 ` Re: Alan Stern
[not found] ` <CAPXXXSBLHYdHNSS4aM2Ax07+GQSB1WbPziOrk0iVWf-LXLmQRg@mail.gmail.com>
[not found] ` <CAPXXXSAajets4AqcBKt8aRd8V1AL4bjAmCyuBOKr8qBG-AHO1A@mail.gmail.com>
2020-04-05 2:51 ` Re: Colin Williams
2020-04-18 12:26 Re: Levi Brown
2020-05-06 5:52 Jiaxun Yang
2020-05-06 17:17 ` Nick Desaulniers
2020-05-14 8:17 Maksim Iushchenko
2020-05-14 10:29 ` fboehm
2020-05-21 0:22 Re: STOREBRAND
2020-06-24 13:54 Re; test02
2020-06-30 17:56 (unknown) Vasiliy Kupriakov
2020-07-10 20:36 ` Andy Shevchenko
2020-07-16 21:22 Mauro Rossi
2020-07-20 9:00 ` Christian König
2020-07-20 9:59 ` Re: Mauro Rossi
2020-07-22 2:51 ` Re: Alex Deucher
2020-07-22 7:56 ` Re: Mauro Rossi
2020-07-24 18:31 ` Re: Alex Deucher
2020-07-26 15:31 ` Re: Mauro Rossi
2020-07-27 18:31 ` Re: Alex Deucher
2020-07-27 19:46 ` Re: Mauro Rossi
2020-07-27 19:54 ` Re: Alex Deucher
2020-08-05 11:02 [PATCH v4] arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium) Amit Pundir
2020-08-06 22:31 ` Konrad Dybcio
2020-08-12 13:37 ` Amit Pundir
2020-08-12 10:54 Re: Alex Anadi
2020-09-15 2:40 Dave Airlie
2020-09-15 7:53 ` Christian König
[not found] <20201008181606.460499-1-sandy.8925@gmail.com>
2020-10-09 6:47 ` Re: Thomas Zimmermann
2020-10-09 7:14 ` Re: Thomas Zimmermann
2020-10-09 7:38 ` Re: Sandeep Raghuraman
2020-10-09 7:51 ` Re: Thomas Zimmermann
2020-10-09 15:48 ` Re: Alex Deucher
2020-11-06 10:44 Luis Gerhorst
2020-11-06 14:34 ` Pavel Begunkov
2020-11-30 10:31 Oleksandr Tyshchenko
2020-11-30 16:21 ` Alex Bennée
2020-12-29 15:32 ` Re: Roger Pau Monné
2020-12-02 1:10 [PATCH] lib/find_bit: Add find_prev_*_bit functions Yun Levi
2020-12-02 9:47 ` Andy Shevchenko
2020-12-02 10:04 ` Rasmus Villemoes
2020-12-02 11:50 ` Yun Levi
[not found] ` <CAAH8bW-jUeFVU-0OrJzK-MuGgKJgZv38RZugEQzFRJHSXFRRDA@mail.gmail.com>
2020-12-02 18:22 ` Yun Levi
2020-12-02 21:26 ` Yury Norov
2020-12-02 22:51 ` Yun Levi
2020-12-03 1:23 ` Yun Levi
2020-12-03 8:33 ` Rasmus Villemoes
2020-12-03 9:47 ` Re: Yun Levi
2020-12-03 18:46 ` Re: Yury Norov
2020-12-03 18:52 ` Re: Willy Tarreau
2020-12-04 1:36 ` Re: Yun Levi
2020-12-04 18:14 ` Re: Yury Norov
2020-12-05 0:45 ` Re: Yun Levi
2020-12-05 11:10 ` Re: Rasmus Villemoes
2020-12-05 18:20 ` Re: Yury Norov
[not found] <CAGMNF6W8baS_zLYL8DwVsbfPWTP2ohzRB7xutW0X=MUzv93pbA@mail.gmail.com>
2020-12-02 17:09 ` Re: Kun Yi
2020-12-02 17:09 ` Re: Kun Yi
2021-01-08 10:32 Misono Tomohiro
2021-01-08 12:30 ` Arnd Bergmann
2021-01-08 12:30 ` Re: Arnd Bergmann
2021-01-08 10:35 misono.tomohiro
[not found] <w2q9lf-sait7s-qswxlnzeof4i-7j13q0-zgu9pt-xk3x5enp994p-kewn2p-o86qyug0mutj-91m157sheva0-4k2l8v20kyjp-heu04baxqdc7op987-9zc0bxi0jcgo-wyl26layz5p9-esqncc-g48ass.1610618007875@email.android.com>
2021-01-14 10:09 ` Alexander Kapshuk
2021-01-19 0:10 David Howells
2021-01-20 14:46 ` Jarkko Sakkinen
[not found] <CAMCTd2kkax9P-OFNHYYz8nKuaKOOkz-zoJ7h2nZ6maUGmjXC-g@mail.gmail.com>
2021-03-16 12:16 ` Re: westjoshuaalan
2021-04-05 0:01 Mitali Borkar
2021-04-06 7:03 ` Arnd Bergmann
2021-04-05 21:12 David Villasana Jiménez
2021-04-06 5:17 ` Greg KH
2021-04-15 13:41 Emmanuel Blot
2021-04-15 16:07 ` Palmer Dabbelt
2021-04-15 22:27 ` Re: Alistair Francis
[not found] <b84772b0-e009-3b68-4e74-525ad8531f95@gmail.com>
2021-04-23 13:57 ` Re: Ivan Koveshnikov
2021-04-23 20:35 ` Re: Kajetan Puchalski
[not found] <CAJr+-6ZR2oH0J4D_Ou13JvX8HLUUK=MKQwD0Kn53cmvAuT99bg@mail.gmail.com>
2021-04-27 7:56 ` Re: Fox Chen
2021-05-15 22:57 Dmitry Baryshkov
2021-06-02 21:45 ` Dmitry Baryshkov
2021-06-02 21:45 ` Re: Dmitry Baryshkov
[not found] <60a57e3a.lbqA81rLGmtH2qoy%Radisson97@gmx.de>
2021-05-21 11:04 ` Re: Alejandro Colomar (man-pages)
2021-06-06 19:19 Davidlohr Bueso
2021-06-07 16:02 ` André Almeida
[not found] <CAFBCWQJX4Xy8Sot7en5JBTuKrzy=_6xFkc+QgOxJEC7G6x+jzg@mail.gmail.com>
2021-06-12 3:43 ` Re: Ammar Faizi
[not found] <CAGsV3ysM+p_HAq+LgOe4db09e+zRtvELHUQzCjF8FVE2UF+3Ow@mail.gmail.com>
2021-06-29 13:52 ` Re: Alex Deucher
2021-07-16 17:07 Subhasmita Swain
2021-07-16 18:15 ` Lukas Bulwahn
2021-07-27 2:59 [PATCH v9] iomap: Support file tail packing Gao Xiang
2021-07-27 15:10 ` Darrick J. Wong
2021-07-27 15:23 ` Andreas Grünbacher
2021-07-27 15:30 ` Re: Gao Xiang
2021-08-21 14:40 TECOB270_Ganesh Pawar
2021-08-21 23:52 ` Jeff King
[not found] <CAKPXbjesQH_k1Z7k4kNwpoAf-jYgbUaPqPCgNTJZ35peVBy_pA@mail.gmail.com>
2021-08-29 12:01 ` Re: Lukas Bulwahn
2021-09-03 20:51 Mr. James Khmalo
2021-10-08 1:24 Dmitry Baryshkov
2021-10-12 23:59 ` Linus Walleij
2021-10-13 3:46 ` Re: Dmitry Baryshkov
2021-10-13 23:39 ` Re: Linus Walleij
2021-10-17 16:54 ` Re: Bjorn Andersson
2021-10-17 21:31 ` Re: Linus Walleij
2021-10-17 21:35 ` Re: Linus Walleij
[not found] <20211011231530.GA22856@t>
2021-10-12 1:23 ` James Bottomley
2021-10-12 2:30 ` Bart Van Assche
2021-11-02 9:48 [PATCH v5 00/11] Add support for X86/ACPI camera sensor/PMIC setup with clk and regulator platform data Hans de Goede
2021-11-02 9:49 ` [PATCH v5 05/11] clk: Introduce clk-tps68470 driver Hans de Goede
[not found] ` <163588780885.2993099.2088131017920983969@swboyd.mtv.corp.google.com>
2021-11-25 15:01 ` Hans de Goede
[not found] <CAGGnn3JZdc3ETS_AijasaFUqLY9e5Q1ZHK3+806rtsEBnAo5Og@mail.gmail.com>
2021-11-23 17:20 ` Re: Christian COMMARMOND
[not found] <20211126221034.21331-1-lukasz.bartosik@semihalf.com--annotate>
2021-11-29 21:59 ` Re: sean.wang
2021-11-29 21:59 ` Re: sean.wang
2021-12-20 6:46 Ralf Beck
2021-12-20 7:55 ` Greg KH
2021-12-20 10:01 ` Re: Oliver Neukum
[not found] <20211229092443.GA10533@L-PF27918B-1352.localdomain>
2022-01-05 6:05 ` Re: Jason Wang
2022-01-05 6:27 ` Re: Jason Wang
2022-01-13 17:53 Varun Sethi
2022-01-14 17:17 ` Fabio Estevam
2022-01-14 10:54 Li RongQing
2022-01-14 10:55 ` Paolo Bonzini
2022-01-14 17:13 ` Re: Sean Christopherson
2022-01-14 17:17 ` Re: Paolo Bonzini
2022-01-20 15:28 Myrtle Shah
2022-01-20 15:37 ` Vitaly Wool
2022-01-20 23:29 ` Re: Damien Le Moal
2022-02-04 21:45 ` Re: Palmer Dabbelt
2022-01-24 12:43 Arınç ÜNAL
2022-01-25 14:03 ` Sergio Paracuellos
2022-01-25 15:24 ` Re: Arınç ÜNAL
2022-01-25 15:50 ` Re: Sergio Paracuellos
[not found] <10b1995b392e490aaa2db645f219015e@dji.com>
2022-01-17 12:54 ` Fwd: Caine Chen
2022-02-03 11:49 ` Daniel Vacek
2022-02-10 7:10 [PATCH] net/failsafe: Fix crash due to global devargs syntax parsing from secondary process madhuker.mythri
2022-02-10 15:00 ` Ferruh Yigit
2022-02-10 16:08 ` Gaëtan Rivet
2022-02-11 15:06 Re: Caine Chen
2022-02-13 22:40 Ronnie Sahlberg
2022-02-14 7:52 ` ronnie sahlberg
2022-03-04 8:47 Re: Harald Hauge
[not found] <20220301070226.2477769-1-jaydeepjd.8914>
2022-03-06 11:10 ` Jaydeep P Das
2022-03-06 11:22 ` Jaydeep Das
[not found] <Yj1hkpyUqJE9sQ2p@redhat.com>
2022-03-25 7:52 ` Re: Jason Wang
2022-03-25 9:10 ` Re: Michael S. Tsirkin
2022-03-25 9:20 ` Re: Jason Wang
2022-03-25 10:09 ` Re: Michael S. Tsirkin
2022-03-28 4:56 ` Re: Jason Wang
2022-03-28 5:59 ` Re: Michael S. Tsirkin
2022-03-28 6:18 ` Re: Jason Wang
2022-03-28 10:40 ` Re: Michael S. Tsirkin
2022-03-29 7:12 ` Re: Jason Wang
2022-03-29 14:08 ` Re: Michael S. Tsirkin
2022-03-30 2:40 ` Re: Jason Wang
2022-03-30 5:14 ` Re: Michael S. Tsirkin
2022-03-30 5:53 ` Re: Jason Wang
2022-03-29 8:35 ` Re: Thomas Gleixner
2022-03-29 14:37 ` Re: Michael S. Tsirkin
2022-03-29 18:13 ` Re: Thomas Gleixner
2022-03-29 22:04 ` Re: Michael S. Tsirkin
2022-03-30 2:38 ` Re: Jason Wang
2022-03-30 5:09 ` Re: Michael S. Tsirkin
2022-03-30 5:53 ` Re: Jason Wang
2022-04-12 6:55 ` Re: Michael S. Tsirkin
2022-04-05 21:41 rcu_sched self-detected stall on CPU Miguel Ojeda
2022-04-06 9:31 ` Zhouyi Zhou
2022-04-06 17:00 ` Paul E. McKenney
2022-04-08 7:23 ` Michael Ellerman
2022-04-08 14:42 ` Michael Ellerman
2022-04-13 5:11 ` Nicholas Piggin
2022-04-22 15:53 ` Thomas Gleixner
2022-04-22 15:53 ` Re: Thomas Gleixner
2022-04-23 2:29 ` Re: Nicholas Piggin
2022-04-23 2:29 ` Re: Nicholas Piggin
2022-04-06 7:51 Christian König
2022-04-06 12:59 ` Daniel Vetter
2022-04-17 17:43 [PATCH v3 00/60] target/arm: Cleanups, new features, new cpus Richard Henderson
2022-04-17 17:43 ` [PATCH v3 06/60] target/arm: Change CPUArchState.aarch64 to bool Richard Henderson
2022-04-19 11:17 ` Alex Bennée
2022-05-15 20:36 [PATCH bpf-next 1/2] cpuidle/rcu: Making arch_cpu_idle and rcu_idle_exit noinstr Jiri Olsa
2023-05-20 9:47 ` Ze Gao
2023-05-21 3:58 ` Yonghong Song
2023-05-21 15:10 ` Re: Ze Gao
2023-05-21 20:26 ` Re: Jiri Olsa
2023-05-22 1:36 ` Re: Masami Hiramatsu
2023-05-22 2:07 ` Re: Ze Gao
2023-05-23 4:38 ` Re: Yonghong Song
2023-05-23 5:30 ` Re: Masami Hiramatsu
2023-05-23 6:59 ` Re: Paul E. McKenney
2023-05-25 0:13 ` Re: Masami Hiramatsu
2023-05-21 8:08 ` Re: Jiri Olsa
2023-05-21 10:09 ` Re: Masami Hiramatsu
2023-05-21 14:19 ` Re: Ze Gao
2022-05-19 9:54 Christian König
2022-05-19 10:50 ` Matthew Auld
2022-05-20 7:11 ` Re: Christian König
2022-06-06 5:33 Fenil Jain
2022-06-06 5:51 ` Greg Kroah-Hartman
2022-08-26 22:03 Zach O'Keefe
2022-08-31 21:47 ` Yang Shi
2022-09-01 0:24 ` Re: Zach O'Keefe
2022-08-28 21:01 Nick Neumann
2022-09-01 17:44 ` Nick Neumann
2022-09-12 12:36 Christian König
2022-09-13 2:04 ` Alex Deucher
2022-09-14 13:12 Amjad Ouled-Ameur
2022-09-14 13:18 ` Amjad Ouled-Ameur
2022-09-14 13:18 ` Re: Amjad Ouled-Ameur
2022-11-09 14:34 Denis Arefev
2022-11-09 14:44 ` Greg Kroah-Hartman
2022-11-18 2:00 Jiamei Xie
2022-11-18 7:47 ` Michal Orzel
2022-11-18 9:02 ` Re: Julien Grall
2022-11-18 18:11 Re: Mr. JAMES
2022-11-18 19:33 Re: Mr. JAMES
2022-11-21 11:11 Denis Arefev
2022-11-21 14:28 ` Jason Yan
2023-01-18 20:59 [PATCH v5 0/5] CXL Poison List Retrieval & Tracing alison.schofield
2023-01-27 1:59 ` Dan Williams
2023-01-27 16:10 ` Alison Schofield
2023-01-27 19:16 ` Re: Dan Williams
2023-01-27 21:36 ` Re: Alison Schofield
2023-01-27 22:04 ` Re: Dan Williams
[not found] <20230122193117.GA28689@Debian-50-lenny-64-minimal>
2023-01-22 21:42 ` Re: Alejandro Colomar
2023-01-24 20:01 ` Re: Helge Kreutzmann
2023-02-28 6:32 Re: Mahmut Akten
2023-03-12 6:52 [PATCH v2] uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2 Greg Kroah-Hartman
2023-03-27 13:54 ` Yaroslav Furman
2023-03-27 14:19 ` Greg Kroah-Hartman
2023-05-11 12:58 Ryan Roberts
2023-05-11 13:13 ` Ryan Roberts
2023-05-30 1:31 RE; Olena Shevchenko
2023-05-30 2:46 RE; Olena Shevchenko
[not found] <CAKEZqKKdQ9EhRobSmq0sV76arfpk6m5XqA-=XQP_M3VRG=M-eg@mail.gmail.com>
2023-06-08 8:13 ` chenlei0x
[not found] <010d01d999f4$257ae020$7070a060$@mirroredgenetworks.com>
[not found] ` <CAEhhANphwWt5iOMc5Yqp1tT1HGoG_GsCuUWBWeVX4zxL6JwUiw@mail.gmail.com>
[not found] ` <CAEhhANom-MGPCqEk5LXufMkxvnoY0YRUrr0r07s0_7F=eCQH5Q@mail.gmail.com>
2023-06-08 10:51 ` Re: Daniel Little
2023-06-27 11:10 Alvaro a-m
2023-06-27 11:15 ` Michael Kjörling
[not found] <64b09dbb.630a0220.e80b9.e2ed@mx.google.com>
2023-07-14 8:05 ` Re: Andy Shevchenko
[not found] <TXJgqLzlM6oCfTXKSqrSBk@txt.att.net>
2023-08-09 5:12 ` Re: Luna Jernberg
[not found] <c8d43894-7e66-4a01-88fc-10708dc53b6b@amd.com>
[not found] ` <878r7z4kb4.ffs@tglx>
[not found] ` <e79dea49-0c07-4ca2-b359-97dd1bc579c8@amd.com>
[not found] ` <87ttqhcotn.ffs@tglx>
[not found] ` <87v8avawe0.ffs@tglx>
[not found] ` <32bcaa8a-0413-4aa4-97a0-189830da8654@amd.com>
[not found] ` <ZTkzYA3w2p3L4SVA@localhost>
[not found] ` <87jzra6235.ffs@tglx>
[not found] ` <875y2u5s8g.ffs@tglx>
2023-10-25 22:11 ` Re: Mario Limonciello
2023-10-26 9:27 ` Re: Thomas Gleixner
2023-11-11 4:21 Andrew Worsley
2023-11-11 8:22 ` Javier Martinez Canillas
2023-11-30 21:51 [NDCTL PATCH v3 2/2] cxl: Add check for regions before disabling memdev Dave Jiang
2024-04-17 6:46 ` Yao Xingtao
2024-04-17 18:14 ` Verma, Vishal L
2024-04-22 7:26 ` Re: Xingtao Yao (Fujitsu)
2023-12-07 4:40 Emma Tebibyte
2023-12-07 5:00 ` Christoph Anton Mitterer
2023-12-07 5:29 ` Re: Lawrence Velázquez
2024-01-16 6:46 meir elisha
2024-01-16 7:05 ` Dan Carpenter
2024-01-18 22:19 [RFC] [PATCH 0/3] xfs: use large folios for buffers Dave Chinner
2024-01-22 10:13 ` Andi Kleen
2024-01-22 11:53 ` Dave Chinner
2024-01-24 0:14 [PATCH v3 0/7] net/gve: RSS Support for GVE Driver Joshua Washington
[not found] ` <20240126173317.2779230-1-joshwash@google.com>
2024-01-31 14:58 ` Ferruh Yigit
2024-03-07 6:07 KR Kim
2024-03-07 8:01 ` Miquel Raynal
2024-03-08 1:27 ` Re: Kyeongrho.Kim
[not found] ` <SE2P216MB210205B301549661575720CC833A2@SE2P216MB2102.KORP216.PROD.OUTLOOK.COM>
2024-03-29 4:41 ` Re: Kyeongrho.Kim
2024-04-19 15:46 George Guo
2024-04-23 16:48 ` Greg KH
2024-04-23 14:21 [PATCH dovetail] x86: irq_pipeline: Add missing definition for !CONFIG_IRQ_PIPELINE Philippe Gerum
2024-04-24 8:58 ` Fabian Scheler
2024-04-24 9:02 ` Scheler, Fabian
[not found] <CGME20240520102002epcas2p3d0944968114a664556cbd74d53beddee@epcas2p3.samsung.com>
2024-05-20 10:09 ` Minwoo Im
2024-05-20 13:34 ` Vincent Fu
2024-05-21 0:00 ` Re: Minwoo Im
2024-06-11 16:54 Jacob Pan
2024-06-12 2:04 ` Sean Christopherson
2024-06-12 2:55 ` Re: Xin Li
2024-06-26 6:11 Totoro W
2024-06-26 7:09 ` Eduard Zingerman
2024-07-06 11:20 [PATCH v2 09/60] i2c: cp2615: reword according to newest specification Wolfram Sang
2024-07-10 6:41 ` [PATCH v3] " Wolfram Sang
2024-07-10 17:51 ` Bence Csókás
2024-07-14 19:59 raschupkin.ri
2024-07-15 20:20 ` Joe Lawrence
2024-07-15 22:45 ` Re: Roman Rashchupkin
2024-07-16 9:28 ` Re: Nicolai Stange
[not found] ` <66963d60.170a0220.70a9a.8866SMTPIN_ADDED_BROKEN@mx.google.com>
2024-07-16 9:53 ` Re: Roman Rashchupkin
2024-07-25 14:52 ` Re: Joe Lawrence
2024-07-16 17:33 ` Re: Song Liu
2024-07-15 21:06 Phil Dennis-Jordan
2024-07-16 6:07 ` Akihiko Odaki
2024-07-17 11:16 ` Re: Phil Dennis-Jordan
2024-08-14 8:03 howard_wang
2024-08-14 15:04 ` Stephen Hemminger
2024-08-16 11:07 Xi Ruoyao
2024-08-19 12:40 ` Huacai Chen
2024-08-19 13:01 ` Re: Jason A. Donenfeld
2024-08-19 15:22 ` Re: Xi Ruoyao
2024-08-19 15:54 ` Re: Xi Ruoyao
2024-08-19 15:22 ` Re: Xi Ruoyao
2024-08-27 9:45 ` Re: Jason A. Donenfeld
2024-08-22 20:54 [PATCH 2/2] scsi: ufs: core: Fix the code for entering hibernation Bao D. Nguyen
2024-08-22 21:08 ` Bart Van Assche
2024-08-23 12:01 ` Manivannan Sadhasivam
2024-08-23 14:23 ` Bart Van Assche
2024-08-23 14:58 ` Manivannan Sadhasivam
2024-08-23 16:07 ` Bart Van Assche
2024-08-23 16:48 ` Manivannan Sadhasivam
2024-08-23 18:05 ` Bart Van Assche
2024-08-24 2:29 ` Manivannan Sadhasivam
2024-08-24 2:48 ` Bart Van Assche
2024-08-24 3:03 ` Manivannan Sadhasivam
2024-08-26 6:48 ` Can Guo
2024-09-13 17:11 David Hunter
2024-09-13 20:39 ` Shuah Khan
2024-09-17 7:10 Akhil P Oommen
2024-09-17 7:24 ` Dmitry Baryshkov
2024-10-10 22:44 Re: PRIVATE
2024-10-15 22:48 Daniel Yang
2024-10-16 1:27 ` Jakub Kicinski
2024-10-17 9:09 Paulo Miguel Almeida
2024-10-17 9:12 ` Paulo Miguel Almeida
2024-11-23 1:39 the Hide
2024-11-23 7:32 ` Christoph Biedl
2024-11-25 19:23 Re: Robert Harewood
2024-11-25 20:13 Re: Robert Harewood
2025-01-08 13:59 Jiang Liu
2025-01-08 14:10 ` Christian König
2025-01-08 16:33 ` Re: Mario Limonciello
2025-01-09 5:34 ` Re: Gerry Liu
2025-01-09 17:10 ` Re: Mario Limonciello
2025-01-13 1:19 ` Re: Gerry Liu
2025-01-13 21:59 ` Re: Mario Limonciello
2025-04-18 7:46 Shung-Hsi Yu
2025-04-18 7:49 ` Shung-Hsi Yu
2025-04-23 17:30 ` Re: patchwork-bot+netdevbpf
2025-04-22 1:53 [PATCH bpf-next] bpf: Remove bpf_get_smp_processor_id_proto Alexei Starovoitov
2025-04-22 8:04 ` Feng Yang
2025-04-22 14:37 ` Alexei Starovoitov
2025-04-24 0:40 Cong Wang
2025-04-24 0:59 ` Jiayuan Chen
2025-04-24 9:19 ` Re: Jiayuan Chen
2025-05-09 17:38 Shawn Anastasio
2025-05-10 19:50 ` Trilok Soni
2025-05-14 20:21 Nicolas Pitre
2025-05-15 8:33 ` Jiri Slaby
2025-07-01 13:44 Emanuele Ghidoli
2025-07-11 2:21 ` Fabio Estevam
[not found] <CADU64hCr7mshqfBRE2Wp8zf4BHBdJoLLH=VJt2MrHeR+zHOV4w@mail.gmail.com>
2025-07-20 18:26 ` >
2025-07-20 19:30 ` David Lechner
2025-07-20 19:30 ` Re: David Lechner
2025-07-21 6:52 ` Re: Krzysztof Kozlowski
2025-07-21 6:52 ` Re: Krzysztof Kozlowski
[not found] ` <CADU64hDZeyaCpHXBmSG1rtHjpxmjejT7asK9oGBUMF55eYeh4w@mail.gmail.com>
2025-07-21 14:09 ` Re: David Lechner
2025-07-21 14:09 ` Re: David Lechner
2025-07-21 7:52 ` Re: Andy Shevchenko
2025-07-21 7:52 ` Re: Andy Shevchenko
2025-08-06 3:34 Sang-Heon Jeon
2025-08-06 3:44 ` Sang-Heon Jeon
2025-08-12 13:34 Baoquan He
2025-08-12 13:49 ` Baoquan He
2025-08-13 15:48 [PATCH 6.15 000/480] 6.15.10-rc1 review Jon Hunter
2025-08-13 17:25 ` Jon Hunter
2025-08-14 15:36 ` Greg KH
2025-08-15 16:20 ` Re: Jon Hunter
2025-08-15 16:53 ` Re: Greg KH
2025-08-18 16:08 [PATCH] documentation/arm64 : kdump fixed typo errors Jonathan Corbet
2025-09-08 9:54 ` hariconscious
2025-09-08 13:23 ` Jonathan Corbet
2025-08-20 14:33 Christian König
2025-08-20 15:23 ` David Hildenbrand
2025-08-21 8:10 ` Re: Christian König
2025-08-25 19:10 ` Re: David Hildenbrand
2025-08-26 8:38 ` Re: Christian König
2025-08-26 8:46 ` Re: David Hildenbrand
2025-08-26 9:00 ` Re: Christian König
2025-08-26 9:17 ` Re: David Hildenbrand
2025-08-26 9:56 ` Re: Christian König
2025-08-26 12:07 ` Re: David Hildenbrand
2025-08-26 16:09 ` Re: Christian König
2025-08-26 14:27 ` Re: Thomas Hellström
2025-08-26 12:37 ` Re: David Hildenbrand
2025-08-23 22:53 [RFC PATCH] time: introduce BOOT_TIME_TRACKER and minimal boot timestamp Thomas Gleixner
2025-09-01 4:05 ` Kaiwan N Billimoria
2025-09-01 5:57 ` Kaiwan N Billimoria
2025-08-27 6:48 [PATCH] net/netfilter/ipvs: Fix data-race in ip_vs_add_service / ip_vs_out_hook Julian Anastasov
2025-08-27 14:43 ` Zhang Tengfei
2025-08-27 21:37 ` Pablo Neira Ayuso
2025-08-29 2:01 xinpeng.wang
2025-08-29 2:42 ` bluez.test.bot
2025-09-15 19:52 Yury Norov (NVIDIA)
2025-09-16 14:48 ` Simon Horman
2025-09-16 15:22 ` Re: Yury Norov
2025-09-16 21:23 Jay Vosburgh
2025-09-16 21:56 ` Jay Vosburgh
2025-10-05 14:16 ssrane_b23
2025-10-05 14:16 ` syzbot
2025-10-06 10:51 [PATCH] counter: microchip-tcb-capture: Allow shared IRQ for multi-channel TCBs Dharma Balasubiramani
2025-10-08 7:06 ` Kamel Bouhara
2025-10-08 20:46 ` Bence Csókás
2025-11-04 9:22 Michael Roach
2025-11-04 10:24 ` Kristoffer Haugsbakk
2025-11-05 14:55 ` Re: Lucas Seiki Oshiro
2025-11-05 15:01 ` Re: Kristoffer Haugsbakk
2025-11-06 0:05 ` Re: Lucas Seiki Oshiro
2025-11-06 8:09 ` Re: Michael Roach
2025-11-05 3:38 niklaus.liu
2025-11-05 8:56 ` AngeloGioacchino Del Regno
2026-01-11 21:10 Wesley B
2026-01-12 13:28 ` Miguel Ojeda
2026-02-02 10:53 Anshumali Gaur
2026-02-03 0:34 ` Jacob Keller
2026-02-25 19:40 [PATCH v5 0/3] iio: frequency: ad9523: fix checkpatch warnings Bhargav Joshi
2026-02-25 19:40 ` Bhargav Joshi
2026-02-25 19:43 ` Andy Shevchenko
2026-03-13 10:11 ezcap401 needs USB_QUIRK_NO_BOS to function on 10gbs usb speed Greg KH
2026-03-13 11:01 ` Vyacheslav Vahnenko
2026-03-13 12:04 ` Greg KH
2026-03-31 10:05 [PATCH] net: ns83820: check DMA mapping errors in hard_start_xmit Paolo Abeni
2026-03-31 11:14 ` Wang Jun
2026-03-31 12:09 ` Eric Dumazet
2026-04-12 6:24 Re; Erick Lorch
2026-04-12 10:09 Re; Erick Lorch
2026-04-12 13:42 Re; Erick Lorch
2026-04-23 16:06 [PATCH] usb: cdns3: gadget: fix request skipping after clearing halt Yongchao Wu
2026-04-27 1:22 ` Peter Chen (CIX)
2026-04-27 9:01 ` Pawel Laszczak
2026-04-27 22:59 ` Peter Chen (CIX)
2026-04-27 23:59 ` Yongchao Wu
2026-04-28 9:58 ` Pawel Laszczak
2026-04-28 14:48 ` Yongchao Wu
2026-05-04 9:15 ` Pawel Laszczak
2026-04-28 18:24 Fabio M. De Francesco
2026-05-01 22:01 ` Dave Jiang
2026-05-09 18:01 Andrea Righi
2026-05-09 18:07 ` Andrea Righi