LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [microblaze-uclinux] RE: [PATCH v3] Device tree bindings for Xilinx devices
From: Michal Simek @ 2007-10-20  7:05 UTC (permalink / raw)
  To: Grant Likely
  Cc: Leonid, Arnd Bergmann, microblaze-uclinux, linuxppc-dev,
	Wolfgang Reissnegger
In-Reply-To: <fa686aa40710192247u294b9329p538bbaf715610040@mail.gmail.com>

Hi Grant,

>> Hi Steve and all,
>> >Here's a full .dts generated using an updated version of
>> >gen_mhs_devtree.py, following the proposal.
>> >It happens to be a microblaze system, but you get the idea.
>>
>> I think that is no good idea generate dts with all information.
>> Especially information about PVR - number 2 means - Full PVR and you can
>> obtain information directly from PVR. It is waste of memory space.
>>                         xilinx,pvr = <2>;
>>
>> In my opinion will be better generate only parameters which you want not all.
>> That smells with unusable parameters.
>
>That is the hard part about crafting the dts; deciding which
>parameters the OS is going to care about and which ones are
>irrelevant.  The goal is to sufficiently and uniquely describe the
>board so that the OS can easily figure out what exactly what it needs
>to do to drive the board, but not to try and describe every aspect
>which it knows about.  Anything that is pollable (ie. USB devices)
>doesn't need to be in the tree.
>
>It's also important to remember that the device tree will *never* be
>perfect.  Eventually something will come up that the device tree
>doesn't describe well(a bug/quirk, something described poorly, etc).
>In this case, as long as the device tree is specific enough to
>identify which version of the board it is running on; we can alwasy
>add platform specific fixups for that unique system.
>

yes, but I think that is the right time to specify which parameters are relevant.
I will focus on in EDK renerator.

Michal Simek

^ permalink raw reply

* Re: [BUG] powerpc does not save msi state [was Re: [PATCH 5/7] pci: Export the pci_restore_msi_state() function
From: Michael Ellerman @ 2007-10-20  6:43 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, mcarlson, linuxppc-dev, mchan, linux-pci
In-Reply-To: <20071019.175308.54212640.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 1576 bytes --]


On Fri, 2007-10-19 at 17:53 -0700, David Miller wrote:
> From: linas@austin.ibm.com (Linas Vepstas)
> Date: Fri, 19 Oct 2007 19:46:10 -0500
> 
> > FWIW, it looks like not all that many arches do this; the output
> > for grep -r address_hi * is pretty thin. Then, looking at
> > i386/kernel/io_apic.c as an example, one can see that the 
> > msi state save happens "by accident" if CONFIG_SMP is enabled;
> > and so its surely broekn on uniprocesor machines.
> 
> I don't see this, in all cases write_msi_msg() will transfer
> the given "*msg" to entry->msg by this assignment in
> drivers/pci/msi.c:
> 
> void write_msi_msg(unsigned int irq, struct msi_msg *msg)
> {
>  ...
> 	entry->msg = *msg;
> }
> 
> So as long as write_msi_msg() is invoked, it will be saved
> properly.
> 
> Platforms need not do this explicitly.

I'm short on context here, and it's Saturday, so excuse me if I'm
missing the point somewhere.

On pseries machines we don't call write_msi_msg(), because we don't
control the contents of the message, firmware does. So entry->msg will
be bogus.

That's a pity, but AFAIK it shouldn't be a problem because we don't
enable CONFIG_PM on those machines anyway. If we ever want to we'll need
to sort out with firmware how that will work WRT restoring MSI state.

cheers

-- 
Michael Ellerman
OzLabs, IBM Australia Development Lab

wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-20  6:13 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <200710200806.41562.maximlevitsky@gmail.com>


On Sat, 2007-10-20 at 08:06 +0200, Maxim Levitsky wrote:

> 	/* Disable interrupts, DMA, and rest of the chip*/
> 	saa_writel(SAA7134_IRQ1, 0);
> 	saa_writel(SAA7134_IRQ2, 0);
> 	saa_writel(SAA7134_MAIN_CTRL, 0);
	
> 	dev->insuspend = 1;
> 	synchronize_irq(pci_dev->irq);
> 
> 	/* ACK pending interrupts just in case*/
> 	saa_writel(SAA7134_IRQ_REPORT,saa_readl(SAA7134_IRQ_REPORT));
> 
> 	......
> This should be bullet-proof.

Hopefully :-)

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Maxim Levitsky @ 2007-10-20  6:06 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <1192859184.6745.8.camel@pasglop>

On Saturday 20 October 2007 07:46:24 Benjamin Herrenschmidt wrote:
> 
> > I probably need to add this synchronize_irq() logic in dmfe.c too, but I probably do it later,
> > I think I am overestimating this race, since most drivers don't do dev->insuspend checks in IRQ handler.
> > Maybe even just use free_irq() after all....
> 
> Most drivers are probably underestimating the race :-)
> 
> free_irq() would work provided that you did the masking on chip before
> (and unmask only after request_irq on the way back in). But it's a bit
> like using a 10 tons truck to crush an ant... 
Agreed.

So, I will add synchronize_irq() to both saa7134, and dmfe, the two drivers that their .suspend/.resume 
routines were written by me.

I already added a synchronize_irq()  plus few more fixes to the driver , but those patches are still in v4l tree.

I now has this:

	saa_writel(SAA7134_IRQ1, 0);
	saa_writel(SAA7134_IRQ2, 0);
	saa_writel(SAA7134_MAIN_CTRL, 0);

	synchronize_irq(pci_dev->irq);
	dev->insuspend = 1;

and I will probably need (with the synchronize_irq patch applied)


	/* Disable interrupts, DMA, and rest of the chip*/
	saa_writel(SAA7134_IRQ1, 0);
	saa_writel(SAA7134_IRQ2, 0);
	saa_writel(SAA7134_MAIN_CTRL, 0);
	
	dev->insuspend = 1;
	synchronize_irq(pci_dev->irq);

	/* ACK pending interrupts just in case*/
	saa_writel(SAA7134_IRQ_REPORT,saa_readl(SAA7134_IRQ_REPORT));

	......
This should be bullet-proof.


> 
> Ben.
> 
> 
> 
Best regards,
	Maxim Levitsky

^ permalink raw reply

* Re: [microblaze-uclinux] RE: [PATCH v3] Device tree bindings for Xilinx devices
From: Grant Likely @ 2007-10-20  5:47 UTC (permalink / raw)
  To: Michal Simek
  Cc: Leonid, Arnd Bergmann, microblaze-uclinux, linuxppc-dev,
	Wolfgang Reissnegger
In-Reply-To: <1300.2720-24480-247293449-1192847306@seznam.cz>

On 10/19/07, Michal Simek <Monstr@seznam.cz> wrote:
> Hi Steve and all,
> >Here's a full .dts generated using an updated version of
> >gen_mhs_devtree.py, following the proposal.
> >It happens to be a microblaze system, but you get the idea.
>
> I think that is no good idea generate dts with all information.
> Especially information about PVR - number 2 means - Full PVR and you can
> obtain information directly from PVR. It is waste of memory space.
>                         xilinx,pvr = <2>;
>
> In my opinion will be better generate only parameters which you want not all.
> That smells with unusable parameters.

That is the hard part about crafting the dts; deciding which
parameters the OS is going to care about and which ones are
irrelevant.  The goal is to sufficiently and uniquely describe the
board so that the OS can easily figure out what exactly what it needs
to do to drive the board, but not to try and describe every aspect
which it knows about.  Anything that is pollable (ie. USB devices)
doesn't need to be in the tree.

It's also important to remember that the device tree will *never* be
perfect.  Eventually something will come up that the device tree
doesn't describe well(a bug/quirk, something described poorly, etc).
In this case, as long as the device tree is specific enough to
identify which version of the board it is running on; we can alwasy
add platform specific fixups for that unique system.

-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
grant.likely@secretlab.ca
(403) 399-0195

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-20  5:46 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <200710200736.22129.maximlevitsky@gmail.com>



> I probably need to add this synchronize_irq() logic in dmfe.c too, but I probably do it later,
> I think I am overestimating this race, since most drivers don't do dev->insuspend checks in IRQ handler.
> Maybe even just use free_irq() after all....

Most drivers are probably underestimating the race :-)

free_irq() would work provided that you did the masking on chip before
(and unmask only after request_irq on the way back in). But it's a bit
like using a 10 tons truck to crush an ant... 

Ben.

^ permalink raw reply

* Re: [PATCH v3] Device tree bindings for Xilinx devices
From: Grant Likely @ 2007-10-20  5:38 UTC (permalink / raw)
  To: Stephen Neuendorffer
  Cc: linuxppc-dev, Leonid, Wolfgang Reissnegger, Arnd Bergmann,
	microblaze-uclinux
In-Reply-To: <20071019234347.38C1111C006B@mail3-dub.bigfish.com>

On 10/19/07, Stephen Neuendorffer <stephen.neuendorffer@xilinx.com> wrote:
>
> Here's a full .dts generated using an updated version of
> gen_mhs_devtree.py, following the proposal.
> It happens to be a microblaze system, but you get the idea.
>
> Grant: Is this pretty what you intend?

Pretty close; comments below on how we still need to change gen_mhs_devtree.py

BTW, thanks for doing this.

Cheers,
g.

>
> Steve
>
> / {
>         #address-cells = <1>;
>         #size-cells = <1>;
>         compatible = "ibm,plb4";

Not quite; the bus itself needs to be one level deeper.  Compatible
here is refering to the platform itself, not the bus and so should be
the actual board name or something similar.  Maybe something like:
"xlnx,ml403","xlnx,generic-virtex4";

All devices should be children or grandchildren of a 'plb' node.

>         model = "system.mhs";
Should be the model name of the board in the form "<mfg>,<model>".  It
might be appropriate to also have a property which describes the
version of the FPGA build or some other way to identify where this
FPGA bitstream came from.  I'll need to think about this some more.

>         Ethernet_MAC {
>                 compatible =
> "xilinx,opb-ethernet-1.04.a\0xilinx,opb-ethernet";

Yes; this is the idea; but I don't like "xilinx,opb-ethernet".  I
think we should always specify specific versions and not try to guess
what the 'generic' device compatible interface is.

Also, dtc now accepts the form "string1","string2".  the embedded '\0'
is no longer needed.

>                 device_type = "opb_ethernet";
device_type = "network";

>                 interrupt-parent = <101>;
interrupt-parent = <&opb_intc0>;   (and a label needs to be added to
the interrupt node).

>                 interrupts = < 1 0 >;
>                 reg = < 40c00000 10000 >;
>                 xilinx,cam-exist = <0>;
>                 xilinx,dev-blk-id = <0>;
>                 xilinx,dev-mir-enable = <0>;
>                 xilinx,dma-present = <1>;
>                 xilinx,include-dev-pencoder = <0>;
>                 xilinx,ipif-rdfifo-depth = <4000>;
>                 xilinx,ipif-wrfifo-depth = <4000>;
>                 xilinx,jumbo-exist = <0>;
>                 xilinx,mac-fifo-depth = <10>;
>                 xilinx,mii-exist = <1>;
>                 xilinx,opb-clk-period-ps = <2710>;
>                 xilinx,reset-present = <1>;
>                 xilinx,rx-dre-type = <0>;
>                 xilinx,rx-include-csum = <0>;
>                 xilinx,tx-dre-type = <0>;
>                 xilinx,tx-include-csum = <0>;

I got a comment from one of the ppc folks that we should use the stock
ticker abbreviation 'xlnx,' here instead of 'xilinx,'

>         } ;
>         IIC_EEPROM {

Node name needs to be unique.  Convention is to use the form
device@address.  We could either use the instance name or the IP core
name here; I'm not sure which is best.

>                 compatible = "xilinx,opb-iic-1.02.a\0xilinx,opb-iic";
>                 device_type = "opb_iic";

device_type needs to be dropped from this node because i2c is not a
standard device type.

>                 interrupt-parent = <101>;
>                 interrupts = < 2 0 >;
>                 reg = < 40800000 10000 >;
>                 xilinx,clk-freq = <5f5e100>;
>                 xilinx,iic-freq = <186a0>;
>                 xilinx,ten-bit-adr = <0>;

i2c devices can optionally be listed as sub-nodes here; but of course
gen-mhs-devtree doesn't know about these.

>         } ;
>         RS232_Uart_1 {
>                 compatible =
> "xilinx,opb-uartlite-1.00.b\0xilinx,opb-uartlite";
>                 device_type = "opb_uartlite";
device_type = serial;

>                 interrupt-parent = <101>;
>                 interrupts = < 3 0 >;
>                 reg = < 40600000 10000 >;
>                 xilinx,baudrate = <2580>;
should be 'current-speed = <2580>' because this is standard device
type 'serial'.

>                 xilinx,clk-freq = <5f5e100>;
It might be better for this to be 'clock-frequency'; one of the
standard properties in serial devices

>                 xilinx,data-bits = <8>;
>                 xilinx,odd-parity = <0>;
>                 xilinx,use-parity = <0>;
>         } ;
>         chosen {
>                 bootargs = "root=/dev/xsysace/disc0/part2";
>                 interrupt-controller = <101>;
interrupt-controller doesn't belong here.

>                 linux,platform = <600>;
What's 'linux,platform' for?

>         } ;
>         cpus {
>                 #address-cells = <1>;
>                 #cpus = <1>;
I don't think this property is necessary.

>                 #size-cells = <0>;
>                 microblaze_0,6.00. {
>                         32-bit;
>                         clock-frequency = <5f5e1000>;
dtc now accepts the form <d#value> for decimal values; might be better
for readability here.

>                         d-cache-line-size = <10>;
>                         d-cache-size = <4000>;
>                         device_type = "cpu";
>                         i-cache-line-size = <10>;
>                         i-cache-size = <4000>;
>                         linux,boot-cpu;
>                         reg = <0>;
>                         timebase-frequency = <1fca055>;
>                         xilinx,cache-byte-size = <4000>;
>                         xilinx,dcache-baseaddr = <50000000>;
>                         xilinx,dcache-byte-size = <4000>;
>                         xilinx,dcache-highaddr = <5fffffff>;
>                         xilinx,debug-enabled = <1>;
>                         xilinx,div-zero-exception = <1>;
>                         xilinx,dopb-bus-exception = <1>;
>                         xilinx,fpu-exception = <1>;
>                         xilinx,icache-baseaddr = <50000000>;
>                         xilinx,icache-highaddr = <5fffffff>;
>                         xilinx,ill-opcode-exception = <1>;
>                         xilinx,iopb-bus-exception = <1>;
>                         xilinx,number-of-pc-brk = <2>;
>                         xilinx,pvr = <2>;
>                         xilinx,unaligned-exceptions = <1>;
>                         xilinx,use-barrel = <1>;
>                         xilinx,use-dcache = <1>;
>                         xilinx,use-div = <1>;
>                         xilinx,use-fpu = <1>;
>                         xilinx,use-icache = <1>;
>                         xilinx,use-msr-instr = <1>;
>                         xilinx,use-pcmp-instr = <1>;
>                 } ;
>         } ;
>         debug_module {
>                 compatible = "xilinx,opb-mdm-2.00.a\0xilinx,opb-mdm";
>                 device_type = "opb_mdm";
>                 reg = < 41400000 10000 >;
>                 xilinx,mb-dbg-ports = <1>;
>                 xilinx,uart-width = <8>;
>                 xilinx,use-uart = <1>;
>         } ;
>         memory@50000000 {
>                 device_type = "memory";
>                 edk_name = "DDR2_SDRAM_32Mx32";
xlnx,edk_name perhaps?  Might be better to embed this in the node name.

>                 memreg:reg = < 50000000 10000000 >;
>         } ;
>         opb_hwicap_0 {
>                 compatible =
> "xilinx,opb-hwicap-1.10.a\0xilinx,opb-hwicap";
>                 device_type = "opb_hwicap";
Drop device type for this node; it's non-standard.

>                 reg = < 41300000 10000 >;
>         } ;
>         opb_intc_0 {
>                 #interrupt-cells = <2>;
>                 compatible = "xilinx,opb-intc-1.00.c\0xilinx,opb-intc";
>                 device_type = "opb_intc";
>                 interrupt-controller;
>                 linux,phandle = <101>;
You can drop this property and add the label 'opb_intc_0' to this node instead.
>                 reg = < 41200000 10000 >;
>         } ;
>         opb_timer_1 {
>                 compatible =
> "xilinx,opb-timer-1.00.b\0xilinx,opb-timer";
>                 device_type = "opb_timer";
Again, device_type needs to be dropped

>                 interrupt-parent = <101>;
>                 interrupts = < 0 0 >;
>                 reg = < 41c00000 10000 >;
>                 xilinx,count-width = <20>;
>                 xilinx,one-timer-only = <1>;
>         } ;
> } ;
>


-- 
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
grant.likely@secretlab.ca
(403) 399-0195

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Maxim Levitsky @ 2007-10-20  5:36 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <1192856675.6745.5.camel@pasglop>

On Saturday 20 October 2007 07:04:35 Benjamin Herrenschmidt wrote:
> 
> > 1) some drivers use pci_disable_device(), and pci_enable_device().
> > should I use it too?
> 
> I generally don't do the former, and I would expect the late to be done
> by pci_restore_state() for you. pci_disable_device(), last I looked,
> only cleared the bus master bit though, which might be a good idea to
> do.
> 
> People in ACPI/x86 land, are there more good reasons to do one or the
> other ?
> 
> That reminds me that I volunteered to write a documentation on how
> drivers should do all that stuff at KS and didn't get to actually doing
> it yet. shame ... I'll try to start something asap.
> 
> > 2) I accidentally did this:
> > 	pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
> > 	pci_save_state(pci_dev);
> > 
> > I somehow thought that this is correct, that I should save the pci config state
> > after the power-down, but now I know that it isn't correct.
> 
> Right, you need to do the save_state before the power down. You need to
> avoid pretty much any access to the device after the save state other
> than the pending set_power_state on resume.
> 
> > I now need to send a patch for dmfe.c network driver that has the same commands written by me.
> > (but it works perfectly anyway)
> 
> On x86 desktop... might have surprises on others.
> 
> > Is it possible to access pci configuration space in D3?
> 
> It's only really safe to access the PM register itself, though I suppose
> you should be able to walk the capability chain to do that. But I
> wouldnt recommend doing anything else.
> 
> > And lastly speaking of network drivers, one issue came to my mind:
> > most network drivars has a packet queue and in case of dmfe it is located in main memory,
> > and card does dma from it.
> 
> Note that the network stack nowadays does a fair bit of cleaning up for
> you before your suspend routine is called....
> > 
> > in .suspend I ignore that some packets may be in that queue, and I want
> > to ask, whenever there are better ways to do that.
> > 
> > 
> > this is my dmfe .suspend routine.
> > 
> > 	/* Disable upper layer interface */
> > 	netif_device_detach(dev);
> 

> 
> Looks allright on a quick glance appart from the bits we already
> discussed.


> 
> > I guess, everybody makes mistakes... :-)
> > 
> > Other network drivers has a bit more complicated .suspend/.resume routines, 
> > but I didn't see a driver waiting for output queue to finish
> 
> I think the network stack does that nowadays but we'll have to double
> check, that's based on what DaveM told me at KS.
> 
> Ben. 
> 
> 

Hi,

Thanks a lot.
I fix the order of calls in dmfe.c
and in saa7134-core.c.

I probably need to add this synchronize_irq() logic in dmfe.c too, but I probably do it later,
I think I am overestimating this race, since most drivers don't do dev->insuspend checks in IRQ handler.
Maybe even just use free_irq() after all....


Best regards,
	Maxim Levitsky

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-20  5:04 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <200710200624.58261.maximlevitsky@gmail.com>


> 1) some drivers use pci_disable_device(), and pci_enable_device().
> should I use it too?

I generally don't do the former, and I would expect the late to be done
by pci_restore_state() for you. pci_disable_device(), last I looked,
only cleared the bus master bit though, which might be a good idea to
do.

People in ACPI/x86 land, are there more good reasons to do one or the
other ?

That reminds me that I volunteered to write a documentation on how
drivers should do all that stuff at KS and didn't get to actually doing
it yet. shame ... I'll try to start something asap.

> 2) I accidentally did this:
> 	pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
> 	pci_save_state(pci_dev);
> 
> I somehow thought that this is correct, that I should save the pci config state
> after the power-down, but now I know that it isn't correct.

Right, you need to do the save_state before the power down. You need to
avoid pretty much any access to the device after the save state other
than the pending set_power_state on resume.

> I now need to send a patch for dmfe.c network driver that has the same commands written by me.
> (but it works perfectly anyway)

On x86 desktop... might have surprises on others.

> Is it possible to access pci configuration space in D3?

It's only really safe to access the PM register itself, though I suppose
you should be able to walk the capability chain to do that. But I
wouldnt recommend doing anything else.

> And lastly speaking of network drivers, one issue came to my mind:
> most network drivars has a packet queue and in case of dmfe it is located in main memory,
> and card does dma from it.

Note that the network stack nowadays does a fair bit of cleaning up for
you before your suspend routine is called....
> 
> in .suspend I ignore that some packets may be in that queue, and I want
> to ask, whenever there are better ways to do that.
> 
> 
> this is my dmfe .suspend routine.
> 
> 	/* Disable upper layer interface */
> 	netif_device_detach(dev);

The above -might- not be needed any more, I have to check. I used to do
it too on my drivers.

> 	/* Disable Tx/Rx */
> 	db->cr6_data &= ~(CR6_RXSC | CR6_TXSC);
> 	update_cr6(db->cr6_data, dev->base_addr);
> 
> 	/* Disable Interrupt */
> 	outl(0, dev->base_addr + DCR7);
> 	outl(inl (dev->base_addr + DCR5), dev->base_addr + DCR5);
> 
> 	/* Fre RX buffers */
> 	dmfe_free_rxbuffer(db);
> 
> 	/* Enable WOL */
> 	pci_read_config_dword(pci_dev, 0x40, &tmp);
> 	tmp &= ~(DMFE_WOL_LINKCHANGE|DMFE_WOL_MAGICPACKET);
> 
> 	if (db->wol_mode & WAKE_PHY)
> 		tmp |= DMFE_WOL_LINKCHANGE;
> 	if (db->wol_mode & WAKE_MAGIC)
> 		tmp |= DMFE_WOL_MAGICPACKET;
> 
> 	pci_write_config_dword(pci_dev, 0x40, tmp);
> 
> 	pci_enable_wake(pci_dev, PCI_D3hot, 1);
> 	pci_enable_wake(pci_dev, PCI_D3cold, 1);
> 
> 	/* Power down device*/
> 	pci_set_power_state(pci_dev, pci_choose_state (pci_dev,state));
> 	pci_save_state(pci_dev);
> 

Looks allright on a quick glance appart from the bits we already
discussed.

> I guess, everybody makes mistakes... :-)
> 
> Other network drivers has a bit more complicated .suspend/.resume routines, 
> but I didn't see a driver waiting for output queue to finish

I think the network stack does that nowadays but we'll have to double
check, that's based on what DaveM told me at KS.

Ben. 

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Maxim Levitsky @ 2007-10-20  4:24 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <1192852561.17235.23.camel@pasglop>

On Saturday 20 October 2007 05:56:01 Benjamin Herrenschmidt wrote:
> 
> > I have read this thread and I concluded few things:
> > 
> > 1) It is impossible to know that the card won't send more interrupts:
> > Even if I do a read from the device, the IRQ can be pending in the bus/APIC
> > It is even possible (and likely) that the IRQ line will be shared, thus the 
> > handler can be called by non-relevant device.
> > 
> > 2) the synchronize_irq(); in .suspend is useless:
> > an IRQ can happen immediately after this synchronize_irq();
> > and interrupt even the .suspend()
> > (At least theoretically)
> 
> It's not totally useless not. It guarantees that by the time your
> are out of your suspend(), a simultaneous instance of the IRQ handler
> is either finished, or it (or any subsequent occurence) have seen
> your insuspend flag set to 1.
> 
> > Thus I now understand that .suspend() should do:
> > 
> > 	saa_writel(SAA7134_IRQ1, 0);
> > 	saa_writel(SAA7134_IRQ2, 0);
> > 	saa_writel(SAA7134_MAIN_CTRL, 0);
> > 
> > 	dev->insuspend = 1;
> > 	smp_wmb();
> > 
> > 	/* at that point the _request to disable card's IRQs was issued, we don't know 
> > 	   that there will be no irqs anymore.
> > 	   the smp_mb(); guaranties that the IRQ handler will bail out in that case. */
> > 	
> > 	/* .......*/
> > 
> > 	pci_save_state(pci_dev);
> > 	pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
> > 	return 0;
> 
> The above doesn't handle the case where the IRQ handle just passed the
> if (insuspend) test. You may end up calling pci_set_power_state() while
> in the middle of some further HW accesses in the handler.
> 
> You still need synchronize_irq() for that reason. Or use a spinlock.
> 
> > and the interrupt handler:
> > 
> > 	smp_rmb();
> > 	if (dev->insuspend)
> > 		goto out;
> > 
> > Am I right?
> 
> Not quite :-)
> 
> Also not that the work we are doing with synchronize_irq, if it gets
> merged, renders it unnecessary for you to add those two memory barriers,
> synchronize_irq will pretty much do the ordering for you.
> 
> Cheers,
> Ben.
> 
> 


Fine, I Agree, and this is why I used it in first place, I just forgot.
To wait for already running IRQ handler, that tested the dev->insuspend.
(And I assumed that interrupts are disabled on the cpu that runs .suspend, but I know now
that this isn't true.)

Besides that can I ask few more .suspend/resume questions:

1) some drivers use pci_disable_device(), and pci_enable_device().
should I use it too?

2) I accidentally did this:
	pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
	pci_save_state(pci_dev);

I somehow thought that this is correct, that I should save the pci config state
after the power-down, but now I know that it isn't correct.
I now need to send a patch for dmfe.c network driver that has the same commands written by me.
(but it works perfectly anyway)
Is it possible to access pci configuration space in D3?

And lastly speaking of network drivers, one issue came to my mind:
most network drivars has a packet queue and in case of dmfe it is located in main memory,
and card does dma from it.


in .suspend I ignore that some packets may be in that queue, and I want
to ask, whenever there are better ways to do that.


this is my dmfe .suspend routine.

	/* Disable upper layer interface */
	netif_device_detach(dev);

	/* Disable Tx/Rx */
	db->cr6_data &= ~(CR6_RXSC | CR6_TXSC);
	update_cr6(db->cr6_data, dev->base_addr);

	/* Disable Interrupt */
	outl(0, dev->base_addr + DCR7);
	outl(inl (dev->base_addr + DCR5), dev->base_addr + DCR5);

	/* Fre RX buffers */
	dmfe_free_rxbuffer(db);

	/* Enable WOL */
	pci_read_config_dword(pci_dev, 0x40, &tmp);
	tmp &= ~(DMFE_WOL_LINKCHANGE|DMFE_WOL_MAGICPACKET);

	if (db->wol_mode & WAKE_PHY)
		tmp |= DMFE_WOL_LINKCHANGE;
	if (db->wol_mode & WAKE_MAGIC)
		tmp |= DMFE_WOL_MAGICPACKET;

	pci_write_config_dword(pci_dev, 0x40, tmp);

	pci_enable_wake(pci_dev, PCI_D3hot, 1);
	pci_enable_wake(pci_dev, PCI_D3cold, 1);

	/* Power down device*/
	pci_set_power_state(pci_dev, pci_choose_state (pci_dev,state));
	pci_save_state(pci_dev);


I guess, everybody makes mistakes... :-)

Other network drivers has a bit more complicated .suspend/.resume routines, 
but I didn't see a driver waiting for output queue to finish

Best regards,
	Maxim Levitsky

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-20  4:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linuxppc-dev list, akpm, Maxim Levitsky, Linux Kernel list
In-Reply-To: <1192853044.17235.32.camel@pasglop>


> >  - even when you ignore the interrupt (because the driver doesn't care, 
> >    it's suspending), you need to make sure the hardware gets shut up by 
> >    reading (or writing) the proper interrupt status register.
> >
> >    Otherwise, with a level interrupt, the interrupt will continue to be 
> >    held active ("screaming") and the CPU will get interrupted over and 
> >    over again, until the irq subsystem will just turn the irq off 
> >    entirely.
> 
> His suspend routine wrote to the IRQ mask (or equivalent) register in
> his code example, thus the HW should shut up eventually, thus that isn't
> strictly necessary, the IRQ in that case is just a "short
> interrupt" (noticed by the PIC and delivered but possibly not asserted
> anymore at this stage or about to go down).

In fact, he -must not- ack it. Because is the HW is really down (in D3),
got knows what accessing the ACK register will do. I can give you
ideas... from nothing on most x86 desktops to machine checks on most
powerpc machines, though I could imagine some cards bad enough to lock
your bus up if you try to access a register while they are in D3 (I've
seen some of those critters and it seems not all bridges timeout on
cards that just keep sending retries).

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-20  4:06 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <200710200510.01409.maximlevitsky@gmail.com>


> >  - even when you ignore the interrupt (because the driver doesn't care, 
> >    it's suspending), you need to make sure the hardware gets shut up by 
> >    reading (or writing) the proper interrupt status register.
> I agree, but while device is powered off, its registers can't be accessed
> Thus, if I ack the IRQ every time the handler is called, I will access the 
> powered off device (this is probably won't hurt a lot, but a bit incorrectly)

It will actually crash your machine on some platforms. So no, best is to
-not- ack. The masking is enough, the IRQ will go down eventually.

Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-20  4:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linuxppc-dev list, akpm, Maxim Levitsky, Linux Kernel list
In-Reply-To: <alpine.LFD.0.999.0710191924520.3794@woody.linux-foundation.org>


On Fri, 2007-10-19 at 19:25 -0700, Linus Torvalds wrote:
> 
> On Sat, 20 Oct 2007, Maxim Levitsky wrote:
> > 
> > and the interrupt handler:
> > 
> > 	smp_rmb();
> > 	if (dev->insuspend)
> > 		goto out;
> 
> Something like that can work, yes. However, you need to make sure that:
> 
>  - even when you ignore the interrupt (because the driver doesn't care, 
>    it's suspending), you need to make sure the hardware gets shut up by 
>    reading (or writing) the proper interrupt status register.
>
>    Otherwise, with a level interrupt, the interrupt will continue to be 
>    held active ("screaming") and the CPU will get interrupted over and 
>    over again, until the irq subsystem will just turn the irq off 
>    entirely.

His suspend routine wrote to the IRQ mask (or equivalent) register in
his code example, thus the HW should shut up eventually, thus that isn't
strictly necessary, the IRQ in that case is just a "short
interrupt" (noticed by the PIC and delivered but possibly not asserted
anymore at this stage or about to go down).

We have many scenario generating such short interrupts (pretty much any
time we use interrupt disable/enable on the chip, for example it's
common with NAPI).

This is one of the reasons why we used to have a "timebomb" causing an
IRQ to erroneously get disabled by the IRQ core assuming the IRQ was
stuck, just because we accumulated too many short interrupts on that
line. A recent patch by Alan fixed that to use some more sensible
algorithm checking the time since the last spurrious interrupt, though I
suspect he's being too optimistic on his timeout value and some
scenario, such as NAPI with just the wrong packet rate, can still
trigger the bug. I need  to find a piece of HW slow enough in
de-asserting the IRQ to try to make a repro-case one of these days.

>  - when you resume, make sure that you get the engine going again, with 
>    the understanding that some interrupts may have gotten ignored.

And you forgot that he still needs synchronize_irq(), or his IRQ handler
might well be in the middle of doing something on another CPU, past the
point where it tested "insuspend", which will do no good when a
simultaneous pci_set_power_state() puts the chip into D3. On some
machines, this will machine check. Either that or he needs a spinlock in
his IRQ handler that he also takes around the code that sets insuspend. 

> Also, in general, these kinds of things shouldn't always even be 
> neicessary. If you use the suspend_late()/resume_early() hooks, those will 
> be called with interrupts off, and while they can be harder to debug (and 
> may be worth avoiding for non-critical drivers), they do allow for simpler 
> models partly exactly because they don't need to worry about interrupts 
> etc.

Yes, this is a easier way to deal with devices that are on a directly
mapped bus (path to them isn't gone by the time you hit suspend_late)
and don't need to do fancy things involing waiting for a queue to drain
using interrupts etc... (at one stage of HW complexity, having to
re-implement polled versions of everything is just not worth the
benefit). In general, suspend_late should cover the need of most PCI
devices I suppose.

Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Benjamin Herrenschmidt @ 2007-10-20  3:56 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <200710200402.43106.maximlevitsky@gmail.com>


> I have read this thread and I concluded few things:
> 
> 1) It is impossible to know that the card won't send more interrupts:
> Even if I do a read from the device, the IRQ can be pending in the bus/APIC
> It is even possible (and likely) that the IRQ line will be shared, thus the 
> handler can be called by non-relevant device.
> 
> 2) the synchronize_irq(); in .suspend is useless:
> an IRQ can happen immediately after this synchronize_irq();
> and interrupt even the .suspend()
> (At least theoretically)

It's not totally useless not. It guarantees that by the time your
are out of your suspend(), a simultaneous instance of the IRQ handler
is either finished, or it (or any subsequent occurence) have seen
your insuspend flag set to 1.

> Thus I now understand that .suspend() should do:
> 
> 	saa_writel(SAA7134_IRQ1, 0);
> 	saa_writel(SAA7134_IRQ2, 0);
> 	saa_writel(SAA7134_MAIN_CTRL, 0);
> 
> 	dev->insuspend = 1;
> 	smp_wmb();
> 
> 	/* at that point the _request to disable card's IRQs was issued, we don't know 
> 	   that there will be no irqs anymore.
> 	   the smp_mb(); guaranties that the IRQ handler will bail out in that case. */
> 	
> 	/* .......*/
> 
> 	pci_save_state(pci_dev);
> 	pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
> 	return 0;

The above doesn't handle the case where the IRQ handle just passed the
if (insuspend) test. You may end up calling pci_set_power_state() while
in the middle of some further HW accesses in the handler.

You still need synchronize_irq() for that reason. Or use a spinlock.

> and the interrupt handler:
> 
> 	smp_rmb();
> 	if (dev->insuspend)
> 		goto out;
> 
> Am I right?

Not quite :-)

Also not that the work we are doing with synchronize_irq, if it gets
merged, renders it unnecessary for you to add those two memory barriers,
synchronize_irq will pretty much do the ordering for you.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Herbert Xu @ 2007-10-20  3:37 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: akpm, Linus Torvalds, Linux Kernel list, linuxppc-dev list
In-Reply-To: <200710200402.43106.maximlevitsky@gmail.com>

On Sat, Oct 20, 2007 at 02:02:42AM +0000, Maxim Levitsky wrote:
>
> Thus I now understand that .suspend() should do:
> 
> 	saa_writel(SAA7134_IRQ1, 0);
> 	saa_writel(SAA7134_IRQ2, 0);
> 	saa_writel(SAA7134_MAIN_CTRL, 0);
> 
> 	dev->insuspend = 1;
> 	smp_wmb();

If we patch synchronize_irq() as discussed here then you can
use it in place of smp_wmb() and remove the smp_rmb from the
interrupt handler (the latter is the path that matters).

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* [PATCH] fix build break in arch/ppc/syslib/m8260_setup.c
From: Olof Johansson @ 2007-10-20  3:39 UTC (permalink / raw)
  To: galak; +Cc: linuxppc-dev, paulus

Fix build break and warnings in current mainline git:

arch/ppc/syslib/m8260_setup.c: In function 'm8260_setup_arch':
arch/ppc/syslib/m8260_setup.c:63: error: implicit declaration of function 'identify_ppc_sys_by_name_and_id'
arch/ppc/syslib/m8260_setup.c:64: warning: passing argument 1 of 'in_be32' makes pointer from integer without a cast
arch/ppc/syslib/m8260_setup.c: In function 'm8260_show_cpuinfo':
arch/ppc/syslib/m8260_setup.c:158: warning: format '%08x' expects type 'unsigned int', but argument 5 has type 'long unsigned int'
arch/ppc/syslib/m8260_setup.c:158: warning: format '%d' expects type 'int', but argument 6 has type 'long unsigned int'
arch/ppc/syslib/m8260_setup.c:158: warning: format '%u' expects type 'unsigned int', but argument 7 has type 'long unsigned int'
arch/ppc/syslib/m8260_setup.c:158: warning: format '%u' expects type 'unsigned int', but argument 8 has type 'long unsigned int'
arch/ppc/syslib/m8260_setup.c:158: warning: format '%u' expects type 'unsigned int', but argument 9 has type 'long unsigned int'
make[1]: *** [arch/ppc/syslib/m8260_setup.o] Error 1
make[1]: *** Waiting for unfinished jobs....


Signed-off-by: Olof Johansson <olof@lixom.net>


diff --git a/arch/ppc/syslib/m8260_setup.c b/arch/ppc/syslib/m8260_setup.c
index 15f0d73..46588fa 100644
--- a/arch/ppc/syslib/m8260_setup.c
+++ b/arch/ppc/syslib/m8260_setup.c
@@ -25,6 +25,7 @@
 #include <asm/machdep.h>
 #include <asm/bootinfo.h>
 #include <asm/time.h>
+#include <asm/ppc_sys.h>
 
 #include "cpm2_pic.h"
 
@@ -61,7 +62,7 @@ m8260_setup_arch(void)
 #endif
 
 	identify_ppc_sys_by_name_and_id(BOARD_CHIP_NAME,
-				in_be32(CPM_MAP_ADDR + CPM_IMMR_OFFSET));
+			in_be32((void *)CPM_MAP_ADDR + CPM_IMMR_OFFSET));
 
 	m82xx_board_setup();
 }
@@ -147,12 +148,12 @@ m8260_show_cpuinfo(struct seq_file *m)
 	seq_printf(m, "vendor\t\t: %s\n"
 		   "machine\t\t: %s\n"
 		   "\n"
-		   "mem size\t\t: 0x%08x\n"
-		   "console baud\t\t: %d\n"
+		   "mem size\t\t: 0x%08lx\n"
+		   "console baud\t\t: %ld\n"
 		   "\n"
-		   "core clock\t: %u MHz\n"
-		   "CPM  clock\t: %u MHz\n"
-		   "bus  clock\t: %u MHz\n",
+		   "core clock\t: %lu MHz\n"
+		   "CPM  clock\t: %lu MHz\n"
+		   "bus  clock\t: %lu MHz\n",
 		   CPUINFO_VENDOR, CPUINFO_MACHINE, bp->bi_memsize,
 		   bp->bi_baudrate, bp->bi_intfreq / 1000000,
 		   bp->bi_cpmfreq / 1000000, bp->bi_busfreq / 1000000);

^ permalink raw reply related

* Re: [PATCH] synchronize_irq needs a barrier
From: Maxim Levitsky @ 2007-10-20  3:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: akpm, Linux Kernel list, linuxppc-dev list
In-Reply-To: <alpine.LFD.0.999.0710191924520.3794@woody.linux-foundation.org>

On Saturday 20 October 2007 04:25:34 Linus Torvalds wrote:
> 
> On Sat, 20 Oct 2007, Maxim Levitsky wrote:
> > 
> > and the interrupt handler:
> > 
> > 	smp_rmb();
> > 	if (dev->insuspend)
> > 		goto out;
> 
> Something like that can work, yes. However, you need to make sure that:
> 
>  - even when you ignore the interrupt (because the driver doesn't care, 
>    it's suspending), you need to make sure the hardware gets shut up by 
>    reading (or writing) the proper interrupt status register.
I agree, but while device is powered off, its registers can't be accessed
Thus, if I ack the IRQ every time the handler is called, I will access the 
powered off device (this is probably won't hurt a lot, but a bit incorrectly)

> 
>    Otherwise, with a level interrupt, the interrupt will continue to be 
>    held active ("screaming") and the CPU will get interrupted over and 
>    over again, until the irq subsystem will just turn the irq off 
>    entirely.
> 
>  - when you resume, make sure that you get the engine going again, with 
>    the understanding that some interrupts may have gotten ignored.
Here it isn't a problem, this is a video capture card, and I suspend it by just stopping dma
on all active buffers even if in the middle of capture, and I send those buffers to card again
to fill them from the beginning during the resume.
> 
> Also, in general, these kinds of things shouldn't always even be 
> neicessary. If you use the suspend_late()/resume_early() hooks, those will 
> be called with interrupts off, and while they can be harder to debug (and 
> may be worth avoiding for non-critical drivers), they do allow for simpler 
> models partly exactly because they don't need to worry about interrupts 
> etc.
Exactly, I am aware of suspend_late , but I don't want to abuse it.
> 
> 		Linus
> 


Best regards,
	Maxim Levitsky

^ permalink raw reply

* Re: [PATCH] Fix build break in tsi108.c
From: Jeff Garzik @ 2007-10-20  3:04 UTC (permalink / raw)
  To: Olof Johansson; +Cc: joe, netdev, linuxppc-dev
In-Reply-To: <20071020020420.GA20073@lixom.net>

Olof Johansson wrote:
> Fix build break:
> 
> drivers/net/tsi108_eth.c: In function 'tsi108_init_one':
> drivers/net/tsi108_eth.c:1633: error: expected ')' before 'dev'
> drivers/net/tsi108_eth.c:1633: warning: too few arguments for format
> make[2]: *** [drivers/net/tsi108_eth.o] Error 1
> 
> 
> Signed-off-by: Olof Johansson <olof@lixom.net>
> 
> diff --git a/drivers/net/tsi108_eth.c b/drivers/net/tsi108_eth.c
> index df10af7..35d15e8 100644
> --- a/drivers/net/tsi108_eth.c
> +++ b/drivers/net/tsi108_eth.c
> @@ -1629,7 +1629,7 @@ tsi108_init_one(struct platform_device *pdev)
>  		goto register_fail;
>  	}
>  
> -	printk(KERN_INFO "%s: Tsi108 Gigabit Ethernet, MAC: %s\n"
> +	printk(KERN_INFO "%s: Tsi108 Gigabit Ethernet, MAC: %s\n",
>  	       dev->name, print_mac(mac, dev->dev_addr));
>  #ifdef DEBUG

applied

^ permalink raw reply

* Re: [PATCH] phy/bitbang: missing MODULE_LICENSE
From: Jeff Garzik @ 2007-10-20  3:03 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: netdev, linuxppc-dev
In-Reply-To: <20071018122021.2acd128d.randy.dunlap@oracle.com>

Randy Dunlap wrote:
> From: Randy Dunlap <randy.dunlap@oracle.com>
> 
> Missing MODULE_LICENSE(), loading this module taints the kernel.
> 
> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
> ---
>  drivers/net/phy/mdio-bitbang.c |    2 ++
>  1 file changed, 2 insertions(+)

applied this and the NAPI_Howto Kconfig patch

^ permalink raw reply

* ep88xc_defconfig doesn't build
From: Olof Johansson @ 2007-10-20  2:43 UTC (permalink / raw)
  To: scott.wood; +Cc: linuxppc-dev

Hi,

Did it ever?! I get this with current mainline when building default target:

  WRAP    arch/powerpc/boot/cuImage.8xx
DTC: dts->dtb  on file "/work/work/linux/k.org/arch/powerpc/boot/dts/ep88xc.dts"
ERROR: Missing /chosen node
Input tree has errors
  WRAP    arch/powerpc/boot/zImage.ep88xc
make[1]: *** [arch/powerpc/boot/cuImage.8xx] Error 1
make[1]: *** Waiting for unfinished jobs....
DTC: dts->dtb  on file "/work/work/linux/k.org/arch/powerpc/boot/dts/ep88xc.dts"
ERROR: Missing /chosen node
Input tree has errors
make[1]: *** [arch/powerpc/boot/zImage.ep88xc] Error 1
make: *** [zImage] Error 2



-Olof

^ permalink raw reply

* Re: [microblaze-uclinux] RE: [PATCH v3] Device tree bindings for Xilinx devices
From: Michal Simek @ 2007-10-20  2:28 UTC (permalink / raw)
  To: microblaze-uclinux
  Cc: Leonid, Arnd Bergmann, linuxppc-dev, Wolfgang Reissnegger
In-Reply-To: <20071019234347.38C1111C006B@mail3-dub.bigfish.com>

Hi Steve and all,
>Here's a full .dts generated using an updated version of
>gen_mhs_devtree.py, following the proposal.
>It happens to be a microblaze system, but you get the idea.

I think that is no good idea generate dts with all information.
Especially information about PVR - number 2 means - Full PVR and you can
obtain information directly from PVR. It is waste of memory space.
			xilinx,pvr = <2>;

In my opinion will be better generate only parameters which you want not all.
That smells with unusable parameters.

Michal Simek

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Linus Torvalds @ 2007-10-20  2:25 UTC (permalink / raw)
  To: Maxim Levitsky; +Cc: akpm, Linux Kernel list, linuxppc-dev list
In-Reply-To: <200710200402.43106.maximlevitsky@gmail.com>



On Sat, 20 Oct 2007, Maxim Levitsky wrote:
> 
> and the interrupt handler:
> 
> 	smp_rmb();
> 	if (dev->insuspend)
> 		goto out;

Something like that can work, yes. However, you need to make sure that:

 - even when you ignore the interrupt (because the driver doesn't care, 
   it's suspending), you need to make sure the hardware gets shut up by 
   reading (or writing) the proper interrupt status register.

   Otherwise, with a level interrupt, the interrupt will continue to be 
   held active ("screaming") and the CPU will get interrupted over and 
   over again, until the irq subsystem will just turn the irq off 
   entirely.

 - when you resume, make sure that you get the engine going again, with 
   the understanding that some interrupts may have gotten ignored.

Also, in general, these kinds of things shouldn't always even be 
neicessary. If you use the suspend_late()/resume_early() hooks, those will 
be called with interrupts off, and while they can be harder to debug (and 
may be worth avoiding for non-critical drivers), they do allow for simpler 
models partly exactly because they don't need to worry about interrupts 
etc.

		Linus

^ permalink raw reply

* Re: [PATCH] synchronize_irq needs a barrier
From: Maxim Levitsky @ 2007-10-20  2:02 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev list, akpm, Linus Torvalds, Linux Kernel list
In-Reply-To: <1192670742.12879.5.camel@pasglop>

On Thursday 18 October 2007 03:25:42 Benjamin Herrenschmidt wrote:
> synchronize_irq needs at the very least a compiler barrier and a
> read barrier on SMP, but there are enough cases around where a
> write barrier is also needed and it's not a hot path so I prefer
> using a full smp_mb() here.
> 
> It will degrade to a compiler barrier on !SMP.
> 
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
> 
> Index: linux-work/kernel/irq/manage.c
> ===================================================================
> --- linux-work.orig/kernel/irq/manage.c	2007-10-18 11:22:16.000000000 +1000
> +++ linux-work/kernel/irq/manage.c	2007-10-18 11:22:20.000000000 +1000
> @@ -33,6 +33,7 @@ void synchronize_irq(unsigned int irq)
>  	if (irq >= NR_IRQS)
>  		return;
>  
> +	smp_mb();
>  	while (desc->status & IRQ_INPROGRESS)
>  		cpu_relax();
>  }
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

Hi,

I have read this thread and I concluded few things:

1) It is impossible to know that the card won't send more interrupts:
Even if I do a read from the device, the IRQ can be pending in the bus/APIC
It is even possible (and likely) that the IRQ line will be shared, thus the 
handler can be called by non-relevant device.

2) the synchronize_irq(); in .suspend is useless:
an IRQ can happen immediately after this synchronize_irq();
and interrupt even the .suspend()
(At least theoretically)


Thus I now understand that .suspend() should do:

	saa_writel(SAA7134_IRQ1, 0);
	saa_writel(SAA7134_IRQ2, 0);
	saa_writel(SAA7134_MAIN_CTRL, 0);

	dev->insuspend = 1;
	smp_wmb();

	/* at that point the _request to disable card's IRQs was issued, we don't know 
	   that there will be no irqs anymore.
	   the smp_mb(); guaranties that the IRQ handler will bail out in that case. */
	
	/* .......*/

	pci_save_state(pci_dev);
	pci_set_power_state(pci_dev, pci_choose_state(pci_dev, state));
	return 0;

and the interrupt handler:

	smp_rmb();
	if (dev->insuspend)
		goto out;

Am I right?
	Best regards,
		Maxim Levitsky

^ permalink raw reply

* [PATCH] Fix build break in tsi108.c
From: Olof Johansson @ 2007-10-20  2:04 UTC (permalink / raw)
  To: jgarzik; +Cc: joe, netdev, linuxppc-dev

Fix build break:

drivers/net/tsi108_eth.c: In function 'tsi108_init_one':
drivers/net/tsi108_eth.c:1633: error: expected ')' before 'dev'
drivers/net/tsi108_eth.c:1633: warning: too few arguments for format
make[2]: *** [drivers/net/tsi108_eth.o] Error 1


Signed-off-by: Olof Johansson <olof@lixom.net>

diff --git a/drivers/net/tsi108_eth.c b/drivers/net/tsi108_eth.c
index df10af7..35d15e8 100644
--- a/drivers/net/tsi108_eth.c
+++ b/drivers/net/tsi108_eth.c
@@ -1629,7 +1629,7 @@ tsi108_init_one(struct platform_device *pdev)
 		goto register_fail;
 	}
 
-	printk(KERN_INFO "%s: Tsi108 Gigabit Ethernet, MAC: %s\n"
+	printk(KERN_INFO "%s: Tsi108 Gigabit Ethernet, MAC: %s\n",
 	       dev->name, print_mac(mac, dev->dev_addr));
 #ifdef DEBUG
 	data->msg_enable = DEBUG;

^ permalink raw reply related

* Re: [BUG] powerpc does not save msi state [was Re: [PATCH 5/7] pci: Export the pci_restore_msi_state() function
From: Benjamin Herrenschmidt @ 2007-10-20  1:29 UTC (permalink / raw)
  To: Linas Vepstas
  Cc: Michael Ellerman, netdev, mcarlson, linuxppc-dev, mchan,
	linux-pci, David Miller
In-Reply-To: <20071020004610.GR29903@austin.ibm.com>


> I'm cc'ing the powerpc mailing list to point this out: 
> it looks like only cell/axon_msi.c and mpic_u3msi.c 
> bother do do anything.  I guess that there aren't any old 
> macintosh laptops that have msi on them? Because without
> this, suspend and resume breaks.

The only macs that can do any form of MSIs are the G5s using
mpic_u3msi.c

> Paul,
> On the off chance your reading this, I'll send a pseries
> patch on Monday, with luck (and some other patches too).
> I'm not touching any of the other plaforms, you and benh 
> would know those better.

Or rather Michael as he wrote the ppc MSI support. I'll check with him

Ben.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox