LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH] [POWERPC] powerpc: Add -mno-spe for ARCH=powerpc builds
From: Kumar Gala @ 2007-10-19  3:20 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev
In-Reply-To: <18199.59909.763111.810973@cargo.ozlabs.ibm.com>


On Oct 18, 2007, at 6:19 PM, Paul Mackerras wrote:

> Kumar Gala writes:
>
>> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
>
> Your commit message doesn't give any reason why you are doing this, or
> any explanation of what goes wrong without it.  In fact, the commit
> message is completely empty. :)  Please resubmit with a decent commit
> message.

I will, just as an FYI I based this on your commit for -mno-altivec  
(which has no rationale for the commit) :)

- k

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Linus Torvalds @ 2007-10-19  3:26 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linux Kernel Mailing List, linuxppc-dev, Thomas Gleixner, akpm,
	Ingo Molnar
In-Reply-To: <alpine.LFD.0.999.0710181941020.26902@woody.linux-foundation.org>

On Thu, 18 Oct 2007, Linus Torvalds wrote:
>
> I *think* it should work with something like
> 
> 	for (;;) {
> 		smp_rmb();
> 		if (!spin_is_locked(&desc->lock)) {
> 			smp_rmb();
> 			if (!(desc->status & IRQ_INPROGRESS)
> 				break;
> 		}
> 		cpu_relax();
> 	}

I'm starting to doubt this. 

One of the issues is that we still need the smp_mb() in front of the loop 
(because we want to serialize the loop with any writes in the caller).

The other issue is that I don't think it's enough that we saw the 
descriptor lock unlocked, and then the IRQ_INPROGRESS bit clear. It might 
have been unlocked *while* the IRQ was in progress, but the interrupt 
handler is now in its last throes, and re-takes the spinlock and clears 
the IRQ_INPROGRESS thing. But we're not actually happy until we've seen 
the IRQ_INPROGRESS bit clear and the spinlock has been released *again*.

So those two tests should actually be the other way around: we want to see 
the IRQ_INPROGRESS bit clear first.

It's all just too damn subtle and clever. Something like this should not 
need to be that subtle. 

Maybe the rigth thing to do is to not rely on *any* ordering what-so-ever, 
and just make the rule be: "if you look at the IRQ_INPROGRESS bit, you'd 
better hold the descriptor spinlock", and not have any subtle ordering 
issues at all.

But that makes us have a loop with getting/releasing the lock all the 
time, and then we get back to horrid issues with cacheline bouncing and 
unfairness of cache accesses across cores (ie look at the issues we had 
with the runqueue starvation in wait_task_inactive()).

Those were fixed by starting out with the non-locked and totally unsafe 
versions, but then having one last "check with lock held, and repeat only 
if that says things went south". 

See commit fa490cfd15d7ce0900097cc4e60cfd7a76381138 and ponder. Maybe we 
should take the same approach here, and do something like

	repeat:
		/* Optimistic, no-locking loop */
		while (desc->status & IRQ_INPROGRESS)
			cpu_relax();

		/* Ok, that indicated we're done: double-check carefully */
		spin_lock_irqsave(&desc->lock, flags);
		status = desc->status;
		spin_unlock_irqrestore(&desc->lock, flags);

		/* Oops, that failed? */
		if (status & IRQ_INPROGRESS)
			goto repeat;

Hmm?

			Linus

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Herbert Xu @ 2007-10-19  3:28 UTC (permalink / raw)
  To: Nick Piggin
  Cc: herbert, linux-kernel, linuxppc-dev, tglx, akpm, torvalds, mingo
In-Reply-To: <200710191252.18272.nickpiggin@yahoo.com.au>

Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
>> First of all let's agree on some basic assumptions:
>>
>> * A pair of spin lock/unlock subsumes the effect of a full mb.
> 
> Not unless you mean a pair of spin lock/unlock as in
> 2 spin lock/unlock pairs (4 operations).
> 
> *X = 10;
> spin_lock(&lock);
> /* *Y speculatively loaded here */
> /* store to *X leaves CPU store queue here */
> spin_unlock(&lock);
> y = *Y;

Good point.

Although in this case we're still safe because in the worst
cases:

CPU0				CPU1
irq_sync = 1
synchronize_irq
	spin lock
	load IRQ_INPROGRESS
irq_sync sync is visible
	spin unlock
				spin lock
					load irq_sync
	while (IRQ_INPROGRESS)
		wait
	return
				set IRQ_INPROGRESS
				spin unlock
				tg3_msi
					ack IRQ
					if (irq_sync)
						return
				spin lock
				clear IRQ_INPROGRESS
				spin unlock

------------------------------------------------------------

CPU0				CPU1
				spin lock
					load irq_sync
irq_sync = 1
synchronize_irq
				set IRQ_INPROGRESS
				spin unlock
	spin lock
	load IRQ_INPROGRESS
irq_sync sync is visible
	spin unlock
	while (IRQ_INPROGRESS)
		wait
				tg3_msi
					ack IRQ
					if (irq_sync)
						return
					do work
				spin lock
				clear IRQ_INPROGRESS
				spin unlock
	return

So because we're using the same lock on both sides, it does
do the right thing even without the memory barrier.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* powerpc 440 monitor program
From: Bai Shuwei @ 2007-10-19  3:32 UTC (permalink / raw)
  To: linuxppc-embedded, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 1688 bytes --]

all,
    hi, when I boot my system, there are no output on the screen.
I choice the framebuffer and my monitor card is ATI PAGE 128 PCI, qt/e GUI.
When i compile the kernel, the boot logo and aty 128fb has be compiled into
kernel patched with videoboot, and x86emu. The video parameters set:
     video=aty128fb:640x480-16@7.
    The kenel can recognize the monitor card, and the frame buffer device
has run.
The output showed below:

    videoboot: Booting PCI video card bus 0, function 0, device 6
   aty128fb: Found Intel x86 BIOS ROM Image
   aty128fb: Rage128 BIOS located
   aty128fb: Rage128 RK PCI [chip rev 0x2] 32M 64-bit SDR SGRAM (2:1)
   fb0: ATY Rage128 frame buffer device on Rage128 RK PCI

My problem is there is no output on the monitor,and cannot see the penguin
boot logo.
I also run a program to test the framebuffer, which can read data from the
fb. After running the qt/e program, which can run scussessfully, there is no
output.!
   The test program reading data are showed below;


id = ATY Rage128
depth =  16
smemlen = 33554432
line_length = 0
FB_TYPE_ = 0
type_aux = 0
FB_VISUAL_ = 3
xpanstep = 8
ypanstep = 1
ywrapstep = 0
mmio_start = bbffc0000d
mmio_len = 8192
accel = 32
xres = 640
yres = 480
xres_virtul = 640
yres_virtul = 480
xoffset = 0
yoffset = 0
bits_per_pixel = 16

         My linux kernel is DENX 2.6.19.

        What can I do to make my system output into monitor.?

best regards!

buroc
-- 

Add: Tianshui South Road 222, Lanzhou, P.R.China
Tel: +86-931-8912025
Zip Code: 730000
URL: oss.lzu.edu.cn
Email: baishuwei@gmail.com, buroc@126.com

Tel: +86-931-8912025
Zip Code: 730000
URL: oss.lzu.edu.cn
Email: baishuwei@gmail.com, buroc@126.com

[-- Attachment #2: Type: text/html, Size: 2511 bytes --]

^ permalink raw reply

* Re: powerpc 440 monitor program
From: David Gibson @ 2007-10-19  3:49 UTC (permalink / raw)
  To: Bai Shuwei; +Cc: linuxppc-dev, linuxppc-embedded
In-Reply-To: <f3566d60710182032h69e2d2bejafd96f18dae95829@mail.gmail.com>

On Fri, Oct 19, 2007 at 11:32:33AM +0800, Bai Shuwei wrote:
> all,
>     hi, when I boot my system, there are no output on the screen.
> I choice the framebuffer and my monitor card is ATI PAGE 128 PCI, qt/e GUI.
> When i compile the kernel, the boot logo and aty 128fb has be compiled into
> kernel patched with videoboot, and x86emu. The video parameters set:
>      video=aty128fb:640x480-16@7.
>     The kenel can recognize the monitor card, and the frame buffer device
> has run.
> The output showed below:
> 
>     videoboot: Booting PCI video card bus 0, function 0, device 6
>    aty128fb: Found Intel x86 BIOS ROM Image
>    aty128fb: Rage128 BIOS located
>    aty128fb: Rage128 RK PCI [chip rev 0x2] 32M 64-bit SDR SGRAM (2:1)
>    fb0: ATY Rage128 frame buffer device on Rage128 RK PCI
> 
> My problem is there is no output on the monitor,and cannot see the penguin
> boot logo.

Uh.. most 440 systems use a serial console.  It ought in theory to be
possible to use a framebuffer instead, but it wouldn't surprise me if
rather a lot of debugging was necessary.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply

* RE: [PATCH v6 4/9] add platform support for MPC837x MDS board
From: Li Yang-r58472 @ 2007-10-19  3:56 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: linuxppc-dev, paulus
In-Reply-To: <20071019104318.edfd2ad1.sfr@canb.auug.org.au>

> -----Original Message-----
> From: Stephen Rothwell [mailto:sfr@canb.auug.org.au]=20
> Sent: Friday, October 19, 2007 8:43 AM
> To: Li Yang-r58472
> Cc: galak@kernel.crashing.org; paulus@samba.org;=20
> linuxppc-dev@ozlabs.org
> Subject: Re: [PATCH v6 4/9] add platform support for MPC837x MDS board

> > +static int __init mpc837x_mds_probe(void) {
> > +        unsigned long root =3D of_get_flat_dt_root();
> > +
> > +        return of_flat_dt_is_compatible(root, "fsl,mpc837xmds");
>=20
> To call these two routines, you should include <asm/prom.h> directly.

<asm/prom.h> is included in <linux/of.h> now.  Do you think it should be
included explicitly?

- Leo

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-19  4:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Herbert Xu, Linux Kernel Mailing List, linuxppc-dev,
	Thomas Gleixner, akpm, Ingo Molnar
In-Reply-To: <alpine.LFD.0.999.0710182007170.26902@woody.linux-foundation.org>

> 
> 	repeat:
> 		/* Optimistic, no-locking loop */
> 		while (desc->status & IRQ_INPROGRESS)
> 			cpu_relax();
> 
> 		/* Ok, that indicated we're done: double-check carefully */
> 		spin_lock_irqsave(&desc->lock, flags);
> 		status = desc->status;
> 		spin_unlock_irqrestore(&desc->lock, flags);
> 
> 		/* Oops, that failed? */
> 		if (status & IRQ_INPROGRESS)
> 			goto repeat;
> 
> Hmm?

Paulus and I convinced ourselves that this would work. If we call our
variable that gets set before synchronize_irq and read in the IRQ
handler "foo", we get to:

	- writing foo can travel down at most to just before the unlock in the
code above

	- reading foo can travel up out of the IRQ handler at most just after
the lock in the code that sets IRQ_INPROGRESS.

The whole lock/set IRQ_INPROGRESS/unlock path can then only happen
before the locked section above, in which case we see and wait nicely
and all is good, or after, in which case the store to foo will be
visible to the IRQ handler as it will be ordered with the unlock in the
code above.

Pfiew !

So Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Thanks !

Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Herbert Xu @ 2007-10-19  4:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, linuxppc-dev, Thomas Gleixner, akpm,
	Ingo Molnar
In-Reply-To: <alpine.LFD.0.999.0710182007170.26902@woody.linux-foundation.org>

On Thu, Oct 18, 2007 at 08:26:45PM -0700, Linus Torvalds wrote:
> 
> 
> On Thu, 18 Oct 2007, Linus Torvalds wrote:
> >
> > I *think* it should work with something like
> > 
> > 	for (;;) {
> > 		smp_rmb();
> > 		if (!spin_is_locked(&desc->lock)) {
> > 			smp_rmb();
> > 			if (!(desc->status & IRQ_INPROGRESS)
> > 				break;
> > 		}
> > 		cpu_relax();
> > 	}
> 
> I'm starting to doubt this. 
> 
> One of the issues is that we still need the smp_mb() in front of the loop 
> (because we want to serialize the loop with any writes in the caller).

Right.  I think if we accept the definition of a spin lock/unlock
that Nick outlined earlier, then we can see that on the IRQ path
there simply isn't a memory barrier between the changing of the
status field and the execution of the action:

	spin_lock
	set IRQ_INPROGRESS
	spin_unlock

	action

	spin_lock
	clear IRQ_INPROGRESS
	spin_unlock

What may happen is that action can either float upwards to give

	spin_lock
	action
	set IRQ_INPROGRESS
	spin_unlock

	spin_lock
	clear IRQ_INPROGRESS
	spin_unlock

or it can float downwards to give

	spin_lock
	set IRQ_INPROGRESS
	spin_unlock

	spin_lock
	clear IRQ_INPROGRESS
	action
	spin_unlock

As a result we can add as many barriers as we want on the slow
(synchronize_irq) path and it just won't do any good (not unless
we add some barriers on the IRQ path == fast path).

What we do have on the right though is the fact in some ways
action behaves as if it's inside the spin lock even though it's
not.  In particular, action cannot float up past the first spin
lock nor can it float down past the last spin unlock.

> See commit fa490cfd15d7ce0900097cc4e60cfd7a76381138 and ponder. Maybe we 
> should take the same approach here, and do something like
> 
> 	repeat:
> 		/* Optimistic, no-locking loop */
> 		while (desc->status & IRQ_INPROGRESS)
> 			cpu_relax();
> 
> 		/* Ok, that indicated we're done: double-check carefully */
> 		spin_lock_irqsave(&desc->lock, flags);
> 		status = desc->status;
> 		spin_unlock_irqrestore(&desc->lock, flags);
> 
> 		/* Oops, that failed? */
> 		if (status & IRQ_INPROGRESS)
> 			goto repeat;

That's why I think this patch is in fact the only one that
solves all the races in this thread.  The case that it solves
which the lock/unlock patch does not is the one where action
flows downwards past the clearing of IRQ_INPROGRESS.  I missed
this case earlier.

In fact this bug exists elsewhere too.  For example, the network
stack does this in net/sched/sch_generic.c:

        /* Wait for outstanding qdisc_run calls. */
	while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
		yield();

This has the same problem as the current synchronize_irq code.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-19  4:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Herbert Xu, Linux Kernel Mailing List, linuxppc-dev,
	Thomas Gleixner, akpm, Ingo Molnar
In-Reply-To: <1192767073.7367.84.camel@pasglop>

> The whole lock/set IRQ_INPROGRESS/unlock path can then only happen
> before the locked section above, in which case we see and wait nicely
> and all is good, or after, in which case the store to foo will be
> visible to the IRQ handler as it will be ordered with the unlock in the
> code above.

Note that napi_synchronize needs a slightly different treatement.

Here, the situation boils down to:

one CPU does:

	foo = 1;
	while(test_bit(bar))
		barrier();

and the other:

	if (!test_and_set_bit(bar)) {
		read & use foo
		smp_mb__before_clear_bit();
		clear_bit(bar);
	}

The good thing here is that read & use foo is part of the critical
section (I hate hand-made locks ...) defined by bar which makes
things somewhat easier than the synchronize_irq() case.

I think a simple smp_mb(); here after foo = 1; is enough, which means
basically just having an smp_mp(); inside napi_synchronize(), before
the test_bit(). Or do I miss something ?

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-19  4:29 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linux Kernel Mailing List, linuxppc-dev, Thomas Gleixner, akpm,
	Linus Torvalds, Ingo Molnar
In-Reply-To: <20071019042025.GA9617@gondor.apana.org.au>


On Fri, 2007-10-19 at 12:20 +0800, Herbert Xu wrote:
> 
> That's why I think this patch is in fact the only one that
> solves all the races in this thread.  The case that it solves
> which the lock/unlock patch does not is the one where action
> flows downwards past the clearing of IRQ_INPROGRESS.  I missed
> this case earlier.
> 
> In fact this bug exists elsewhere too.  For example, the network
> stack does this in net/sched/sch_generic.c:
> 
>         /* Wait for outstanding qdisc_run calls. */
>         while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
>                 yield();
> 
> This has the same problem as the current synchronize_irq code.

The network stack always made be nervous with it's bitops use as locks.
Any loop of test_bit() alone as a synchronisation method, without locks
or barriers is, imho, inherently buggy.

Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-19  4:35 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linux Kernel Mailing List, linuxppc-dev, Thomas Gleixner, akpm,
	Linus Torvalds, Ingo Molnar
In-Reply-To: <20071019042025.GA9617@gondor.apana.org.au>

> What may happen is that action can either float upwards to give
> 
> 	spin_lock
> 	action
> 	set IRQ_INPROGRESS
> 	spin_unlock
> 
> 	spin_lock
> 	clear IRQ_INPROGRESS
> 	spin_unlock
> 
> or it can float downwards to give
> 
> 	spin_lock
> 	set IRQ_INPROGRESS
> 	spin_unlock
> 
> 	spin_lock
> 	clear IRQ_INPROGRESS
> 	action
> 	spin_unlock
> 

Well, we are generally safer here. That is, unless action is a store,
it will not pass spin_lock, at least not on powerpc afaik.

In fact, if action, as it is in our case, is something like

if (foo)
	return;

We cant execute the store inside spin_lock() without having loaded
foo, there is an implicit dependency here.

But anyway, Linus patch fixes that too if it was a problem. Now if
we grep for while (test_bit and while (!test_bit I'm sure we'll find
other similar surprises.

That's also one of the reasons why I _love_ nick patches that add a
proper lock/unlock API on bits, because at least those who are doing
those hand-made pseudo-locks with bitops to save space will be
getting a proper lock/unlock API, no more excuse.

The network stack is doing more fancy things so it's harder (though I
wonder sometimes if it's really worth the risks taken for not using
spinlocks... maybe).

Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Herbert Xu @ 2007-10-19  4:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, linuxppc-dev, Thomas Gleixner, akpm,
	Ingo Molnar
In-Reply-To: <20071019042025.GA9617@gondor.apana.org.au>

On Fri, Oct 19, 2007 at 12:20:25PM +0800, Herbert Xu wrote:
>
> That's why I think this patch is in fact the only one that
> solves all the races in this thread.  The case that it solves
> which the lock/unlock patch does not is the one where action
> flows downwards past the clearing of IRQ_INPROGRESS.  I missed
> this case earlier.

OK, here is the patch again with a changelog:

[IRQ]: Fix synchronize_irq races with IRQ handler

As it is some callers of synchronize_irq rely on memory barriers
to provide synchronisation against the IRQ handlers.  For example,
the tg3 driver does

	tp->irq_sync = 1;
	smp_mb();
	synchronize_irq();

and then in the IRQ handler:

	if (!tp->irq_sync)
		netif_rx_schedule(dev, &tp->napi);

Unfortunately memory barriers only work well when they come in
pairs.  Because we don't actually have memory barriers on the
IRQ path, the memory barrier before the synchronize_irq() doesn't
actually protect us.

In particular, synchronize_irq() may return followed by the
result of netif_rx_schedule being made visible.

This patch (mostly written by Linus) fixes this by using spin
locks instead of memory barries on the synchronize_irq() path.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 80eab7a..1f31422 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -29,12 +29,28 @@
 void synchronize_irq(unsigned int irq)
 {
 	struct irq_desc *desc = irq_desc + irq;
+	unsigned int status;
 
 	if (irq >= NR_IRQS)
 		return;
 
-	while (desc->status & IRQ_INPROGRESS)
-		cpu_relax();
+	do {
+		unsigned long flags;
+
+		/*
+		 * Wait until we're out of the critical section.  This might
+		 * give the wrong answer due to the lack of memory barriers.
+		 */
+		while (desc->status & IRQ_INPROGRESS)
+			cpu_relax();
+
+		/* Ok, that indicated we're done: double-check carefully. */
+		spin_lock_irqsave(&desc->lock, flags);
+		status = desc->status;
+		spin_unlock_irqrestore(&desc->lock, flags);
+
+		/* Oops, that failed? */
+	} while (status & IRQ_INPROGRESS);
 }
 EXPORT_SYMBOL(synchronize_irq);
 

^ permalink raw reply related

* Re: [PATCH v6 4/9] add platform support for MPC837x MDS board
From: Stephen Rothwell @ 2007-10-19  4:48 UTC (permalink / raw)
  To: Li Yang-r58472; +Cc: linuxppc-dev, paulus
In-Reply-To: <989B956029373F45A0B8AF0297081890019B5C3A@zch01exm26.fsl.freescale.net>

[-- Attachment #1: Type: text/plain, Size: 474 bytes --]

On Fri, 19 Oct 2007 11:56:34 +0800 "Li Yang-r58472" <LeoLi@freescale.com> wrote:
>
> <asm/prom.h> is included in <linux/of.h> now.  Do you think it should be
> included explicitly?

Yes.  It may not be included by linux/of.h in the future.

(My general opinion is that if you use something, you should explictly
include the file that defines/declares it.)

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Nick Piggin @ 2007-10-19  4:49 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-kernel, linuxppc-dev, tglx, akpm, torvalds, mingo
In-Reply-To: <E1IiiX6-0002V0-00@gondolin.me.apana.org.au>

On Friday 19 October 2007 13:28, Herbert Xu wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> >> First of all let's agree on some basic assumptions:
> >>
> >> * A pair of spin lock/unlock subsumes the effect of a full mb.
> >
> > Not unless you mean a pair of spin lock/unlock as in
> > 2 spin lock/unlock pairs (4 operations).
> >
> > *X = 10;
> > spin_lock(&lock);
> > /* *Y speculatively loaded here */
> > /* store to *X leaves CPU store queue here */
> > spin_unlock(&lock);
> > y = *Y;
>
> Good point.
>
> Although in this case we're still safe because in the worst
> cases:
>
> CPU0				CPU1
> irq_sync = 1
> synchronize_irq
> 	spin lock
> 	load IRQ_INPROGRESS
> irq_sync sync is visible
> 	spin unlock
> 				spin lock
> 					load irq_sync
> 	while (IRQ_INPROGRESS)
> 		wait
> 	return
> 				set IRQ_INPROGRESS
> 				spin unlock
> 				tg3_msi
> 					ack IRQ
> 					if (irq_sync)
> 						return
> 				spin lock
> 				clear IRQ_INPROGRESS
> 				spin unlock
>
> ------------------------------------------------------------
>
> CPU0				CPU1
> 				spin lock
> 					load irq_sync
> irq_sync = 1
> synchronize_irq
> 				set IRQ_INPROGRESS
> 				spin unlock
> 	spin lock
> 	load IRQ_INPROGRESS
> irq_sync sync is visible
> 	spin unlock
> 	while (IRQ_INPROGRESS)
> 		wait
> 				tg3_msi
> 					ack IRQ
> 					if (irq_sync)
> 						return
> 					do work
> 				spin lock
> 				clear IRQ_INPROGRESS
> 				spin unlock
> 	return
>
> So because we're using the same lock on both sides, it does
> do the right thing even without the memory barrier.


Yeah, if you've got the lock on both sides there, then I
agree it will be correct.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-19  4:58 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Linux Kernel Mailing List, linuxppc-dev, Thomas Gleixner, akpm,
	Linus Torvalds, Ingo Molnar
In-Reply-To: <20071019044806.GA10080@gondor.apana.org.au>


On Fri, 2007-10-19 at 12:48 +0800, Herbert Xu wrote:

> [IRQ]: Fix synchronize_irq races with IRQ handler
> 
> As it is some callers of synchronize_irq rely on memory barriers
> to provide synchronisation against the IRQ handlers.  For example,
> the tg3 driver does
> 
> 	tp->irq_sync = 1;
> 	smp_mb();
> 	synchronize_irq();
> 
> and then in the IRQ handler:
> 
> 	if (!tp->irq_sync)
> 		netif_rx_schedule(dev, &tp->napi);
> 
> Unfortunately memory barriers only work well when they come in
> pairs.  Because we don't actually have memory barriers on the
> IRQ path, the memory barrier before the synchronize_irq() doesn't
> actually protect us.
> 
> In particular, synchronize_irq() may return followed by the
> result of netif_rx_schedule being made visible.
> 
> This patch (mostly written by Linus) fixes this by using spin
> locks instead of memory barries on the synchronize_irq() path.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Good for me.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Cheers,
Ben.

^ permalink raw reply

* [PATCH] NULL terminate the pci_device_ids in pasemi_edac
From: Stephen Rothwell @ 2007-10-19  5:07 UTC (permalink / raw)
  To: Olof Johansson; +Cc: ppc-dev, Andrew Morton

Fixes:
drivers/edac/pasemi_edac: struct pci_device_id is 32 bytes.  The last of 1 is:
0x00 0x00 0x19 0x59 0x00 0x00 0xa0 0x0a 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
FATAL: drivers/edac/pasemi_edac: struct pci_device_id is not terminated with a NULL entry!

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 drivers/edac/pasemi_edac.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index e66cdd4..9007d06 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -270,6 +270,7 @@ static void __devexit pasemi_edac_remove(struct pci_dev *pdev)
 
 static const struct pci_device_id pasemi_edac_pci_tbl[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_PASEMI, 0xa00a) },
+	{ }
 };
 
 MODULE_DEVICE_TABLE(pci, pasemi_edac_pci_tbl);
-- 
1.5.3.4

^ permalink raw reply related

* Tiny login: user add command
From: pjmaiya @ 2007-10-19  5:23 UTC (permalink / raw)
  To: linuxppc-embedded

[-- Attachment #1: Type: text/plain, Size: 800 bytes --]

Hi,
 I am using montavista linux. Using TCT tool i have added package for user creation. I am having following problem
  a.. If I use tiny login package, I will be getting useradd binary but number of parameter are few like
 Usage: adduser [OPTIONS]... <USER> 
Options:

  -h <directory>    specify home directory
  -s <shell>        specify shell
  -g <gecos>        specify GECOS stringa.. If I don't use tiny login package, I will be selecting useradd package from admin menu. But I am unable to execute this command since it gives follwing error
      /usr/sbin/groupadd missing..

  a.. Actually I want useradd command similar to linux where more argument are taken, especially I wanted 'user' to be part of more than one group.
can anyone help me out..

thanx in advance,
pjmaiya

[-- Attachment #2: Type: text/html, Size: 1818 bytes --]

^ permalink raw reply

* Re: Merge dtc
From: Milton Miller @ 2007-10-19  5:34 UTC (permalink / raw)
  To: David Gibson; +Cc: ppcdev, Paul Mackerras, Sam Ravnborg
In-Reply-To: <20071019013048.GC30283@localhost.localdomain>

On Oct 18, 2007, at 8:30 PM, David Gibson wrote:

> On Thu, Oct 18, 2007 at 12:49:54PM -0500, Milton Miller wrote:
>> On Tue Oct 16 15:02:17 EST 2007, David Gibson wrote:
>>> Too big for the list, full patch at
>>> http://ozlabs.org/~dgibson/home/merge-dtc.patch+
>>
>> So split it up.   The obvious one is "here is the unique content, then
>> copy these files from dtc git" would have been better.
>
> *facepalm*
>
> The point of the small patches thing is not to glom together logically
> separate patches.  It's *not* to split patches up simply for the sake
> of making them smaller.

The point of posting on the list is to encourage review before merging. 
  Hiding them behind a download means a lot fewer people are looking at 
the code.

> You've suggested the closest thing to a logical split here, but even
> then - the dtc files are dead without the Makefile changes, and the
> Makefile changes won't build without the dtc files (so the patches
> would have to go in reversed order to what you suggest).
>
> Not to mention that the Makefile patch will be tiny and the raw import
> will still be way too big for the list.

I'm talking about split for review, not necessarily merge.

We split by function for review so that different people pay closer 
attention to their areas of intrest and speciality.  This helps prevent 
people being distracted by the bulk.

The files that are verbatim from the other project have been reviewed 
in that project (at least in theory), and therefore are not likely to 
draw comments.   Especially since they are (1) shadows from another 
project and (2) host files, there is more flexibility and less review 
required, eg relaxed coding standards, correctness already tested, and 
in this case multiple platform testing.

Sam's suggestion to split the generated files was because they require 
no review other than for damage.

In contrast, the files such as the make rules are original and unique 
to this import, and therefore draw comments during review.  If you 
weren't asking for a review, why did you post to the mailing list?

Oh, and the files being in the tree but dead is fine from a bisect 
standpoint.  The patch can be reverted if the makefile doesn't go in 
(not that a maintainer would commit them separately).

>> I finally went and looked at the url.  The Kbuild integration is 
>> wrong.
>
> Wrong?  Not how you'd prefer them perhaps...

Wrong in that its unlike every other program that Kbuild makes.

>>   It should build dtc in dtc-src and run the binary from there, and 
>> the
>> rules should be in a Kconfig as a normal host-target in that 
>> directory.
>
> Why?  This approach has the advantage that the subdirectory contains
> *only* verbatim imported files, which could well make updates simpler.

The file name Kbuild is unlikely to conflict with a file from that 
other repository.  It doesn't contain all the files in that repository 
(or even all in a subdirectory) so there is some filtering going on 
anyways.

>    (I don't have a problem with that Kbuild including Makefie.dtc).
>>
>> things like the dependancy on scripts_basic in the top level
>> Makefile.
>
> I'm not even sure what you mean by this.

I was trying to give a hint where you could find "compile programs in 
that directory so I can run them here" for you to draw upon.

^ permalink raw reply

* [NET]: Fix possible dev_deactivate race condition
From: Herbert Xu @ 2007-10-19  5:36 UTC (permalink / raw)
  To: David S. Miller
  Cc: Linux Kernel Mailing List, linuxppc-dev, Thomas Gleixner, netdev,
	akpm, Linus Torvalds, Ingo Molnar
In-Reply-To: <20071019042025.GA9617@gondor.apana.org.au>

On Fri, Oct 19, 2007 at 12:20:25PM +0800, Herbert Xu wrote:
>
> In fact this bug exists elsewhere too.  For example, the network
> stack does this in net/sched/sch_generic.c:
> 
>         /* Wait for outstanding qdisc_run calls. */
> 	while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
> 		yield();
> 
> This has the same problem as the current synchronize_irq code.

OK here is the fix for that case.

[NET]: Fix possible dev_deactivate race condition

The function dev_deactivate is supposed to only return when
all outstanding transmissions have completed.  Unfortunately
it is possible for store operations in the driver's transmit
function to only become visible after dev_deactivate returns.

This patch fixes this by taking the queue lock after we see
the end of the queue run.  This ensures that all effects of
any previous transmit calls are visible.

If however we detect that there is another queue run occuring,
then we'll warn about it because this should never happen as
we have pointed dev->qdisc to noop_qdisc within the same queue
lock earlier in the functino.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index e01d576..b3b7420 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -556,6 +556,7 @@ void dev_deactivate(struct net_device *dev)
 {
 	struct Qdisc *qdisc;
 	struct sk_buff *skb;
+	int running;
 
 	spin_lock_bh(&dev->queue_lock);
 	qdisc = dev->qdisc;
@@ -571,12 +572,31 @@ void dev_deactivate(struct net_device *dev)
 
 	dev_watchdog_down(dev);
 
-	/* Wait for outstanding dev_queue_xmit calls. */
+	/* Wait for outstanding qdisc-less dev_queue_xmit calls. */
 	synchronize_rcu();
 
 	/* Wait for outstanding qdisc_run calls. */
-	while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
-		yield();
+	do {
+		while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
+			yield();
+
+		/*
+		 * Double-check inside queue lock to ensure that all effects
+		 * of the queue run are visible when we return.
+		 */
+		spin_lock_bh(&dev->queue_lock);
+		running = test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
+		spin_unlock_bh(&dev->queue_lock);
+
+		/*
+		 * The running flag should never be set at this point because
+		 * we've already set dev->qdisc to noop_qdisc *inside* the same
+		 * pair of spin locks.  That is, if any qdisc_run starts after
+		 * our initial test it should see the noop_qdisc and then
+		 * clear the RUNNING bit before dropping the queue lock.  So
+		 * if it is set here then we've found a bug.
+		 */
+	} while (WARN_ON_ONCE(running));
 }
 
 void dev_init_scheduler(struct net_device *dev)

^ permalink raw reply related

* Re: [NET]: Fix possible dev_deactivate race condition
From: David Miller @ 2007-10-19  5:38 UTC (permalink / raw)
  To: herbert; +Cc: linux-kernel, linuxppc-dev, tglx, netdev, akpm, torvalds, mingo
In-Reply-To: <20071019053624.GA10560@gondor.apana.org.au>

From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 19 Oct 2007 13:36:24 +0800

> [NET]: Fix possible dev_deactivate race condition
> 
> The function dev_deactivate is supposed to only return when
> all outstanding transmissions have completed.  Unfortunately
> it is possible for store operations in the driver's transmit
> function to only become visible after dev_deactivate returns.
> 
> This patch fixes this by taking the queue lock after we see
> the end of the queue run.  This ensures that all effects of
> any previous transmit calls are visible.
> 
> If however we detect that there is another queue run occuring,
> then we'll warn about it because this should never happen as
> we have pointed dev->qdisc to noop_qdisc within the same queue
> lock earlier in the functino.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Applied, thanks Herbert!

^ permalink raw reply

* Re: Merge dtc
From: Milton Miller @ 2007-10-19  5:56 UTC (permalink / raw)
  To: David Gibson; +Cc: ppcdev, Sam Ravnborg, Paul Mackerras
In-Reply-To: <20071019014510.GD30283@localhost.localdomain>

On Oct 18, 2007, at 8:45 PM, David Gibson wrote:
> On Thu, Oct 18, 2007 at 09:59:26PM +0200, Sam Ravnborg wrote:
>> On Thu, Oct 18, 2007 at 12:49:54PM -0500, Milton Miller wrote:
>>> On Tue Oct 16 15:02:17 EST 2007, David Gibson wrote:
>>>
>>>> This very large patch incorporates a copy of dtc into the kernel
>>>> source, in arch/powerpc/boot/dtc-src.  This means that dtc is no
>>>> longer an external dependency to build kernels with configurations
>>>> which need a dtb file.
>
>> As Milton already pointed out you should build dtc in the
>> dtc directory (why the -src prefix??).
>
> The -src suffix is only there because I'm not building in the
> directory - we can't have both a dtc binary and a dtc directory in
> arch/powerpc/boot.

So run the dtc binary stored in the sub directory.  Thats what we do 
elsewhere.

> Ok, so how do I build in the subdirectory?  I was going to do that,
> but couldn't for the life of me figure out how.

Documentation/kbuild/makefiles.txt  6.4 boot images:

         "$(Q)$(MAKE) $(build)=<dir>" is the recommended way to invoke
         make in a subdirectory.

Section 4 Host Program Support is also relavent, and mentions $(always).


>> And the dtc specific Makefile looks like something from
>> the late 80'. Please drop all these ALLUPPERCASE variables
>> and accept a little bit of redundancy.
>
> Hrm... I'm pretty dubious about this.  Practically every Makefile in
> the universe, *except* Kbuild uses uppercase for most variables.
> Makefile.dtc is imported verbatim from the standalone dtc package, and
> is supposed to have the minimal information about what needs to be
> built to import into Makefiles that actually know how to build things.
>
>> Then mere humans may be able to read the Makefile.
>
> Says a maintainer of Kbuild, about my tiny and not-very-complex
> Makefile fragment... um, ok...

overley complex calls to override source, conditional rules based on 
shipped files?  Its not a trivial fragment.

milton

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Herbert Xu @ 2007-10-19  5:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Linux Kernel Mailing List, linuxppc-dev, Thomas Gleixner, akpm,
	Linus Torvalds, Ingo Molnar
In-Reply-To: <1192768014.7367.96.camel@pasglop>

On Fri, Oct 19, 2007 at 02:26:54PM +1000, Benjamin Herrenschmidt wrote:
>
> I think a simple smp_mb(); here after foo = 1; is enough, which means
> basically just having an smp_mp(); inside napi_synchronize(), before
> the test_bit(). Or do I miss something ?

Yes I think you're right.  In this case we do have barriers
everywhere else so this should work.

Although if you want napi_synchronize to have the property that
when it returns all NAPI processing effects are visible then
you'd need another smp_mb() after the loop.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 1/2] qemu platform, v2
From: Rob Landley @ 2007-10-19  6:28 UTC (permalink / raw)
  To: Grant Likely
  Cc: Milton Miller, linuxppc-dev, Paul Mackerras, Christoph Hellwig
In-Reply-To: <fa686aa40710181029o6852b1b0pf6259f27cabfd085@mail.gmail.com>

On Thursday 18 October 2007 12:29:08 pm Grant Likely wrote:
> On 10/18/07, Milton Miller <miltonm@bga.com> wrote:
> > If we say only some boards or ports are special and need to build then
> > I would vote for shipping asm files.  If we think we need to build any
> > random embedded platform without installing dtc then we should merge
> > dtc.
>
> I don't think we do.  It's looking like there are going to be out of
> tree users of dtc also (The are some patches floating around for
> u-boot to use the device tree for it's own initialization).  I don't
> think it's unreasonable to install dtc for embedded development.

There are out of tree users of yacc and lex, but the kernel has *.c_shipped 
files so as not to require that external tool of people who simply want to 
build the sucker without modifying it.  "make oldconfig" was modified not to 
require curses.

How about *.S_shipped files?

> I like the idea of shipping asm files to support the qemu target; at
> least until qemu gets better firmware.

I've now gotten qemu to boot a kernel I built, using Milton's 1/2 patch and a 
a self-contained build of a 4k ppc_boot.rom (which includes the dtc source 
because I don't expect anybody else to have it installed, either).  
Description and links to source, binaries, and build scripts in this message:

http://lists.gnu.org/archive/html/qemu-devel/2007-10/msg00415.html

There's a problem with qemu-cvs (detailed in the message), but qemu-system-ppc 
0.9.0 boots fine, up to a shell prompt.

> Cheers,
> g.

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.

^ permalink raw reply

* Re: [PATCH] NULL terminate the pci_device_ids in pasemi_edac
From: Olof Johansson @ 2007-10-19  6:47 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: ppc-dev, Andrew Morton
In-Reply-To: <20071019150722.8d48eb1d.sfr@canb.auug.org.au>

On Fri, Oct 19, 2007 at 03:07:22PM +1000, Stephen Rothwell wrote:
> Fixes:
> drivers/edac/pasemi_edac: struct pci_device_id is 32 bytes.  The last of 1 is:
> 0x00 0x00 0x19 0x59 0x00 0x00 0xa0 0x0a 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
> 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
> FATAL: drivers/edac/pasemi_edac: struct pci_device_id is not terminated with a NULL entry!
> 
> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

Acked-by: Olof Johansson <olof@lixom.net>

Thanks for fixing this, Stephen. ID lists without termination can really
cause weird problems at the most random times since it depends on how
the kernel is built whether they do any (immediate) harm or not.


-Olof

^ permalink raw reply

* Re: Merge dtc
From: David Gibson @ 2007-10-19  6:55 UTC (permalink / raw)
  To: Milton Miller; +Cc: ppcdev, Sam Ravnborg, Paul Mackerras
In-Reply-To: <1b5c57d93e45d94d007b543ea5e2de02@bga.com>

On Fri, Oct 19, 2007 at 12:56:41AM -0500, Milton Miller wrote:
> On Oct 18, 2007, at 8:45 PM, David Gibson wrote:
> > On Thu, Oct 18, 2007 at 09:59:26PM +0200, Sam Ravnborg wrote:
> >> On Thu, Oct 18, 2007 at 12:49:54PM -0500, Milton Miller wrote:
> >>> On Tue Oct 16 15:02:17 EST 2007, David Gibson wrote:
> >>>
> >>>> This very large patch incorporates a copy of dtc into the kernel
> >>>> source, in arch/powerpc/boot/dtc-src.  This means that dtc is no
> >>>> longer an external dependency to build kernels with configurations
> >>>> which need a dtb file.
> >
> >> As Milton already pointed out you should build dtc in the
> >> dtc directory (why the -src prefix??).
> >
> > The -src suffix is only there because I'm not building in the
> > directory - we can't have both a dtc binary and a dtc directory in
> > arch/powerpc/boot.
> 
> So run the dtc binary stored in the sub directory.  Thats what we do 
> elsewhere.

Yes, yes, which boils down to your other point of building in the
subdirectory.  It's not a separate issue.

> > Ok, so how do I build in the subdirectory?  I was going to do that,
> > but couldn't for the life of me figure out how.
> 
> Documentation/kbuild/makefiles.txt  6.4 boot images:
> 
>          "$(Q)$(MAKE) $(build)=<dir>" is the recommended way to invoke
>          make in a subdirectory.

But where does it go?  To have somewhere to put that rule, we'd still
need a target in arch/powerpc/boot itself.

> Section 4 Host Program Support is also relavent, and mentions $(always).

I know I tried using $(always), but I couldn't figure out how to make
it do something useful.

> >> And the dtc specific Makefile looks like something from
> >> the late 80'. Please drop all these ALLUPPERCASE variables
> >> and accept a little bit of redundancy.
> >
> > Hrm... I'm pretty dubious about this.  Practically every Makefile in
> > the universe, *except* Kbuild uses uppercase for most variables.
> > Makefile.dtc is imported verbatim from the standalone dtc package, and
> > is supposed to have the minimal information about what needs to be
> > built to import into Makefiles that actually know how to build things.
> >
> >> Then mere humans may be able to read the Makefile.
> >
> > Says a maintainer of Kbuild, about my tiny and not-very-complex
> > Makefile fragment... um, ok...
> 
> overley complex calls to override source, 

I'm not sure what you're referring to...  even if you mean in
Makefile.dtc or in the diff to arch/powerpc/boot/Makefile.

>conditional rules based on 
> shipped files?  

Or here, for sure.  The shipped/lex/yacc rules are just based on those
from Kconfig and genksyms.

> Its not a trivial fragment.

Compared to the behemoth that is Kbuild...

I'm happy to improve the Makefile integration, but you seem to be
rather vague on how, and the Kbuild documentation makes my brain hurt.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox