* RE: [PATCH] Add USB to MPC8349 PB platform support
From: Li Yang-r58472 @ 2006-07-18 7:40 UTC (permalink / raw)
To: Kumar Gala, Dan Malek; +Cc: linuxppc-dev, linux-usb-devel
In-Reply-To: <71D808D5-8227-4B0D-AF41-FADFC4B12463@kernel.crashing.org>
> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Tuesday, July 18, 2006 5:39 AM
> To: Dan Malek
> Cc: Li Yang-r58472; linuxppc-dev@ozlabs.org; linux-usb-devel@lists.sourceforge.net
> Subject: Re: [PATCH] Add USB to MPC8349 PB platform support
>
>
> On Jul 17, 2006, at 3:17 PM, Dan Malek wrote:
>
> >
> > On Jul 17, 2006, at 3:16 PM, Kumar Gala wrote:
> >
> >> I disagree. You are coming at this from a board that does
> >> everything under the sun. I'd like to avoid having this type of
> >> initialization in the kernel. There is a whole additional kitchen
> >> sink that could move into the kernel as well.
> >
> > Well, I'm going to have to disagree with your disagreement :-)
> > The kernel should not assume things are properly initialized
> > and rely on the boot rom to do such things. I have several
> > reasons for this. One is that we are always pressed to make
> > embedded systems boot more quickly, and taking time to
> > initialize things in the boot rom just makes that a totally
> > inflexible system design. We don't need to initialize things
> > we don't use, or can postpone until later. Two, it makes
> > us dependent upon a particular boot rom, or boot rom
> > behavior, that not all boards may choose to support.
> > Three, board designs may have external logic that requires
> > a certain start up sequence or control register access
> > that complicates the boot rom in its ability to share
> > code or implementation.
>
> Well, I think there is a coupling that exists between whatever your
> boot rom is and the kernel. If you are trying to optimize boot time
> I'd say one thing you would want is to avoid writing the same
> configuration registers multiple times.
>
> I don't have an issue if a fixed function board decides to do these
> things in their kernel init instead of their boot rom. I don't,
> however, want a thousand and one config options to support all the
> various ways one can configure the Freescale board.
By making use of the device tree, we won't have a thousand and one
config options, so this is not a problem.
>
> > There are more, but I think you see the trend. In my
> > years of doing this kind of development, you can't
> > assume a boot rom is going to do much more than initialize
> > memory and load the kernel. I prefer the flexibility
> > to be in the kernel, and not in the boot rom, because it
> > is so much easier to develop and control.
>
> - kumar
* PPC440GP board hangs
From: Denny @ 2006-07-18 10:38 UTC (permalink / raw)
To: linuxppc-embedded
In-Reply-To: <3DBBCC5C2604704D9351456CFE118AC10460F146@msilexch01.marvell.com>
Hi,
After successfully uncompressing the kernel to RAM, the board hangs at:
## Transferring control to Linux (at address 0000000)
Has anyone encountered this symptom on the PPC440GP? I double-checked my external UART clock and system clock in both hardware and software, and they are all correct.
The log buffer is shown below:
BDI>md 0x01ed5c4 100
001ed5c4 : 0x20322e36 540159542 2.6
001ed5c8 : 0x2e313420 774976544 .14
001ed5cc : 0x28726f6f 678588271 (roo
001ed5d0 : 0x73696f6e 1936289646 sion
001ed5d4 : 0x20322e36 540159542 2.6
001ed5d8 : 0x2e313420 774976544 .14
001ed5dc : 0x28726f6f 678588271 (roo
001ed5e0 : 0x6c646f7d 1818521469 ldo}
001ed5e4 : 0x61696e29 1634299433 ain)
001ed5e8 : 0x20286763 539518819 (gc
001ed5ec : 0x63207665 1663071845 c ve
001ed5f0 : 0x6c646f7d 1818521469 ldo}
001ed5f4 : 0x61696e29 1634299433 ain)
001ed5f8 : 0x20286763 539518819 (gc
001ed5fc : 0x63207665 1663071845 c ve
001ed600 : 0x5820054c 1478493516 X .L
001ed604 : 0x444b2034 1145774132 DK 4
001ed608 : 0x2f302034 791683124 /0 4
001ed60c : 0x2e302e30 774909488 .0.0
001ed610 : 0x5820054c 1478493516 X .L
001ed614 : 0x444b2034 1145774132 DK 4
001ed618 : 0x2f302034 791683124 /0 4
001ed61c : 0x2e302e30 774909488 .0.0
001ed620 : 0x2031383a 540096570 18:
001ed624 : 0x31383a33 825768499 18:3
001ed628 : 0x34204353 874529619 4 CS
001ed62c : 0x54203630 1411397168 T 60
001ed630 : 0x2031383a 540096570 18:
001ed634 : 0x31383a33 825768499 18:3
001ed638 : 0x34204353 874529619 4 CS
001ed63c : 0x54203630 1411397168 T 60
001ed640 : 0x6e652063 1852121187 ne c
001ed644 : 0x6865636b 1751475051 heck
001ed648 : 0x20696e20 543780384 in
001ed64c : 0x6b65726e 1801810542 kern
001ed650 : 0x6e652063 1852121187 ne c
001ed654 : 0x6865636b 1751475051 heck
001ed658 : 0x20696e20 543780384 in
001ed65c : 0x6b65726e 1801810542 kern
001ed660 : 0x656c206d 1701585005 el m
001ed664 : 0x6f64652e 1868850478 ode.
001ed668 : 0x0a3c343e 171717694 .<4>
001ed66c : 0x504c4230 1347174960 PLB0
001ed670 : 0x3a204245 975192645 : BE
001ed674 : 0x41523d30 1095908656 AR=0
001ed678 : 0x78303030 2016423984 x000
001ed67c : 0x30303030 808464432 0000
001ed680 : 0x32306563 842032483 20ec
001ed684 : 0x30303030 808464432 0000
001ed688 : 0x36204143 908083523 6 AC
001ed68c : 0x523d2020 1379737632 R=
001ed690 : 0x30783962 813185378 0x9b
001ed694 : 0x30303030 808464432 0000
001ed698 : 0x30302042 808460354 00 B
001ed69c : 0x4553523d 1163088445 ESR=
001ed6a0 : 0x20307830 540047408 0x0
001ed6a4 : 0x63303030 1664102448 c000
001ed6a8 : 0x3030300a 808464394 000.
001ed6ac : 0x3c343e50 1010056784 <4>P
001ed6b0 : 0x4f42303a 1329737786 OB0:
001ed6b4 : 0x20424541 541214017 BEA
001ed6b8 : 0x523d3078 1379741816 R=0x
001ed6bc : 0x30303030 808464432 0000
001ed6c0 : 0x30303030 808464432 0000
001ed6c4 : 0x30303030 808464432 0000
001ed6c8 : 0x30303030 808464432 0000
001ed6cc : 0x20424553 541214035 BES
001ed6d0 : 0x52303d30 1378893104 R0=0
001ed6d4 : 0x78303030 2016423984 x000
001ed6d8 : 0x30303030 808464432 0000
001ed6dc : 0x30204245 807420485 0 BE
001ed6e0 : 0x5352313d 1397895485 SR1=
001ed6e4 : 0x30783030 813183024 0x00
001ed6e8 : 0x30303030 808464432 0000
001ed6ec : 0x30300a3c 808454716 00.<
001ed6f0 : 0x343e4f50 876498768 4>OP
001ed6f4 : 0x42303a20 1110456864 B0:
001ed6f8 : 0x42454152 1111834962 BEAR
001ed6fc : 0x3d307830 1026586672 =0x0
001ed700 : 0x30303030 808464432 0000
001ed704 : 0x30303030 808464432 0000
001ed708 : 0x30303030 808464432 0000
001ed70c : 0x30303020 808464416 000
001ed710 : 0x42535441 1112757313 BSTA
001ed714 : 0x543d3078 1413296248 T=0x
001ed718 : 0x30303030 808464432 0000
001ed71c : 0x30303030 808464432 0000
001ed720 : 0x0a3c343e 171717694 .<4>
001ed724 : 0x4f6f7073 1332703347 Oops
001ed728 : 0x3a206d61 975203681 : ma
001ed72c : 0x6368696e 1667787118 chin
001ed730 : 0x65206368 1696621416 e ch
001ed734 : 0x65636b2c 1701014316 eck,
001ed738 : 0x20736967 544434535 sig
001ed73c : 0x3a203720 975189792 : 7
001ed740 : 0x5b23315d 1529033053 [#1]
001ed744 : 0x0a3c343e 171717694 .<4>
001ed748 : 0x4e49503a 1313427514 NIP:
001ed74c : 0x20433030 541274160 C00
001ed750 : 0x30443946 809777478 0D9F
BDI>
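For readers following the dump: the right-hand column is only the ASCII rendering of each big-endian word, so the kernel's own messages can be recovered by concatenating the bytes in order. A minimal sketch of that decoding in C, assuming a hypothetical helper named decode_words() (it is not part of the BDI tooling or the kernel):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decode big-endian 32-bit words, as printed by the debugger's "md"
 * command, into text. Non-printable bytes become '.', matching the
 * right-hand column of the dump. "out" must hold 4 * count + 1 bytes.
 * decode_words() is a hypothetical name used only for illustration.
 */
void decode_words(const uint32_t *words, size_t count, char *out)
{
	size_t i;
	int shift;

	for (i = 0; i < count; i++) {
		/* Big-endian: the most significant byte is the first character. */
		for (shift = 24; shift >= 0; shift -= 8) {
			unsigned char c = (unsigned char)((words[i] >> shift) & 0xff);

			*out++ = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
		}
	}
	*out = '\0';
}
```

Fed the eight words from 001ed724 to 001ed740, this yields "Oops: machine check, sig: 7 [#1]", which is the line that matters in the buffer above.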
Any pointers on this would be greatly appreciated!
- Denny
* page locking in PowerPC cores
From: Parav Pandit @ 2006-07-18 11:03 UTC (permalink / raw)
To: linuxppc-embedded
Hi,
We allocate memory for DMA operations on a PCI device using pci_alloc_consistent().
I want to know how this function boils down to the actual instructions that lock the page in RAM so that it cannot be paged out and DMA can do its work.
Can anybody tell me which instructions we use to set this page entry in the TLB and page table?
What bit gets set in the PTE for this?
Regards,
Parav Pandit
* Linux bin Commands download
From: none none @ 2006-07-18 12:35 UTC (permalink / raw)
To: linuxppc-dev
Hi
I would like to download precompiled Linux commands
and programs for PowerPC, but I cannot find any
binaries anywhere. Are there any links?
thanks in advance
sotirakopoulos andreas
* AltiVec in the kernel
From: Matt Sealey @ 2006-07-18 12:48 UTC (permalink / raw)
To: linuxppc-dev
Once upon a time we were all told this wouldn't work for some reason,
but a lot of documentation now hints that it does actually work and
for instance there is a RAID5/6 driver (for G5) which uses AltiVec
in a kernel context.
But I didn't find any definitive documentation on how one goes about
it. The largest clue I found was in Documentation/cpu_features.txt:
#ifdef CONFIG_ALTIVEC
BEGIN_FTR_SECTION
mfspr r22,SPRN_VRSAVE /* if G4, save vrsave register value */
stw r22,THREAD_VRSAVE(r23)
END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
#endif /* CONFIG_ALTIVEC */
So we can use AltiVec by implementing this kind of wrapper around
kernel functions which may use AltiVec?
In the code above is there ANY significance of r22 and r23 other
than that they are fairly high up and probably marked as "will
be trashed" by all the relevant ABIs and so on?
Just curious, as I would like to investigate writing some docs at
least on this (in article fashion) to go with PPCZone, Libfreevec
and so on. I think the problem here is simply that developers
who may be interested in doing this kind of optimized code do not
know where to start (and we are thinking from a point of view of
also teaching sessions too, like we did at FTF Frankfurt 2004, so
after we teach them what AltiVec is etc. we demonstrate application
AND kernel functionality and the quirks associated with it).
--
Matt Sealey <matt@genesi-usa.com>
Manager, Genesi, Developer Relations
* Re: [PATCH] Add USB to MPC8349 PB platform support
From: Kumar Gala @ 2006-07-18 13:52 UTC (permalink / raw)
To: Li Yang-r58472; +Cc: linuxppc-dev, linux-usb-devel
In-Reply-To: <4879B0C6C249214CBE7AB04453F84E4D0509EB@zch01exm20.fsl.freescale.net>
[snip]
>> Well, I think there is a coupling that exists between whatever your
>> boot rom is and the kernel. If you are trying to optimize boot time
>> I'd say one thing you would want is to avoid writing the same
>> configuration registers multiple times.
>>
>> I don't have an issue if a fixed function board decides to do these
>> things in their kernel init instead of their boot rom. I don't,
>> however, want a thousand and one config options to support all the
>> various ways one can configure the Freescale board.
>
> By making use of the device tree, we won't have a thousand and one
> config options, so this is not a problem.
I'm talking about opening the door to a ton of options, not that we
have them now. For example, your patch doesn't handle the USB PHYs if
they are on the MDS instead of the SYS board. It doesn't handle
setting SCCR properly for different frequency choices. I'm concerned
about where to draw the line because of all the ways a user can
configure the MDS board.
- kumar
* Re: AltiVec in the kernel
From: Kumar Gala @ 2006-07-18 13:53 UTC (permalink / raw)
To: matt; +Cc: linuxppc-dev list, Paul Mackerras
In-Reply-To: <004c01c6aa68$6f580d00$99dfdfdf@bakuhatsu.net>
On Jul 18, 2006, at 7:48 AM, Matt Sealey wrote:
>
> Once upon a time we were all told this wouldn't work for some reason,
> but a lot of documentation now hints that it does actually work and
> for instance there is a RAID5/6 driver (for G5) which uses AltiVec
> in a kernel context.
Using Altivec generally in the kernel is still something that is not
recommended. The key to using it is disabling preemption; this
ensures that when the code is done, the Altivec register state is back
to how the kernel found it.
preempt_disable();
enable_kernel_altivec();
raid6_altivec$#_gen_syndrome_real(disks, bytes, ptrs);
preempt_enable();
> But I didn't find any definitive documentation on how one goes about
> it. The largest clue I found was in Documentation/cpu_features.txt:
>
> #ifdef CONFIG_ALTIVEC
> BEGIN_FTR_SECTION
> mfspr r22,SPRN_VRSAVE /* if G4, save vrsave register value */
> stw r22,THREAD_VRSAVE(r23)
> END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
> #endif /* CONFIG_ALTIVEC */
>
> So we can use AltiVec by implementing this kind of wrapper around
> kernel functions which may use AltiVec?
>
> In the code above is there ANY significance of r22 and r23 other
> than that they are fairly high up and probably marked as "will
> be trashed" by all the relevant ABIs and so on?
I'd guess those were the registers used by the code this was snipped
from.
> Just curious, as I would like to investigate writing some docs at
> least on this (in article fashion) to go with PPCZone, Libfreevec
> and so on. I think the problem here is simply that developers
> who may be interested in doing this kind of optimized code do not
> know where to start (and we are thinking from a point of view of
> also teaching sessions too, like we did at FTF Frankfurt 2004, so
> after we teach them what AltiVec is etc. we demonstrate application
> AND kernel functionality and the quirks associated with it).
I'm pretty sure Paul looked into using AltiVec for memory operations
in the kernel and didn't see a significant benefit to it.
- kumar
* RE: AltiVec in the kernel
From: Matt Sealey @ 2006-07-18 15:10 UTC (permalink / raw)
To: 'Kumar Gala'
Cc: 'linuxppc-dev list', 'Paul Mackerras'
In-Reply-To: <E40C6B7A-F2E8-4DB8-97D3-2CFF88C1D90A@kernel.crashing.org>
> -----Original Message-----
> From: Kumar Gala [mailto:galak@kernel.crashing.org]
> Sent: Tuesday, July 18, 2006 8:53 AM
> To: matt@genesi-usa.com
> Cc: linuxppc-dev list; Paul Mackerras
> Subject: Re: AltiVec in the kernel
>
>
> On Jul 18, 2006, at 7:48 AM, Matt Sealey wrote:
>
> > for instance there is a RAID5/6 driver (for G5) which uses
> > AltiVec in a kernel context.
>
> Using Altivec generally in the kernel is still something that
> is not recommended. The key to using it is disabling
> preemption; this ensures that when the code is done, the
> Altivec register state is back to how the kernel found it.
>
> preempt_disable();
> enable_kernel_altivec();
>
> raid6_altivec$#_gen_syndrome_real(disks, bytes, ptrs);
>
> preempt_enable();
Why isn't it recommended?
For instance on FreeBSD and other operating systems they have
designed the functionality in there as it would be a feature
people would want to use. QNX uses AltiVec to perform the
context switch and message passing and keep latency down.
Restricting AltiVec to userspace code (applications..) really
means you are barely ever using it. Kernel functions and
drivers are called every second of every day.. it's about
making AltiVec really used and not having the unit sit twiddling
its thumbs until you REALLY NEED TO DECODE A JPEG VERY FAST.
There are thousands of things it could be doing. One example
could be.. in-kernel compression and encryption subroutines.
> > teach them what AltiVec is etc. we demonstrate application
> > AND kernel functionality and the quirks associated with it).
>
> I'm pretty sure Paul looked into using AltiVec for memory
> operations in the kernel and didn't see a significant benefit to it.
We had our own guy look at it and he presented some significant
performance improvements. One problem was, though, that the best
improvement in theory came from a function which needed to be
called very early in kernel boot, well before AltiVec was
enabled, and everything else is marginal at best (1.n times
improvement, but it is still 0.n more than 1.0). I am not clear
on this and cannot find my discussion on the subject in my logs
and email backups, so I will leave it for now.
There is also plenty of example code (libmotovec, Freescale
Application Notes) which improves things like TCP checksumming
and so on using AltiVec. These patches are even used in EEMBC
benchmarks to boost the scores.
There are also plenty of examples of userspace code (as before,
checksumming, encryption, compression/decompression) which has
been improved. libfreevec includes some changes to the zlib
window functions. For example the kernel includes an MD5, SHA,
zlib compression framework.. mostly ported userspace code and
standard libraries. Would these not be candidates? The development
and speed improvements are even capable of being tested in
userspace (and this is a GREAT teaching aid also; show how to
improve some userspace app. Then show the differences it needed
to go into the kernel. Benchmark both. Detail result.)
I think there are thousands of places where AltiVec could be
used - even sparingly - to provide good performance improvements.
From your reply I suspect that these would be places which do
not rely on the effects preemption has on performance (i.e.
you trade preemption for AltiVec and gain).
I don't think people investigate it too much because the first
thing they hit is lack of documentation, and then "well we don't
really recommend it". I think this makes Linux the worst OS a
developer would want to run on a G4 and G5, then? :D
--
Matt Sealey <matt@genesi-usa.com>
Manager, Genesi, Developer Relations
* Re: [PATCH] Add USB to MPC8349 PB platform support
From: Dan Malek @ 2006-07-18 15:19 UTC (permalink / raw)
To: Kumar Gala; +Cc: linuxppc-dev, linux-usb-devel
In-Reply-To: <41AD230F-91E8-442A-B9F9-9892D14BEF12@kernel.crashing.org>
On Jul 18, 2006, at 9:52 AM, Kumar Gala wrote:
> ..... I'm concerned about where to draw the line because of all
> the ways a user can configure the MDS board.
IMHO, you choose one configuration that works and
make that the board port. If someone wants to change
it later, it would be a good exercise for them to learn
how to do such things. From years of working with these
things I'll suggest that it's not worthwhile to make this
so complicated that you think it's going to be everything
for anyone. It's not the way people use these boards.
The best thing we should do is clearly document
one board configuration and have the Linux configuration
that matches. It's horribly misleading to present a board
that is infinitely configurable and implying you can
find a Linux configuration to match. They are guaranteed
to find something that doesn't work on their first
attempt, and then all of the "broken this" and "broken
that" finger pointing starts :-)
Thanks.
-- Dan
* Re: [PATCH] Add USB to MPC8349 PB platform support
From: Wolfgang Denk @ 2006-07-18 15:53 UTC (permalink / raw)
To: Kumar Gala; +Cc: linuxppc-dev, linux-usb-devel
In-Reply-To: <41AD230F-91E8-442A-B9F9-9892D14BEF12@kernel.crashing.org>
In message <41AD230F-91E8-442A-B9F9-9892D14BEF12@kernel.crashing.org> you wrote:
>
> I'm talking about opening the door to a ton of options, not that we
> have them now. For example, your patch doesn't handle the USB PHYs if
If you really assume that all this has to be handled in so many
configuration options, then it probably doesn't make much difference if
you do this in the kernel or in any boot loader - you're just
shifting effort and responsibility to somebody else. I agree with
Dan's argumentation.
Best regards,
Wolfgang Denk
--
Software Engineering: Embedded and Realtime Systems, Embedded Linux
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
Chapter 1 -- The story so far:
In the beginning the Universe was created. This has made a lot of
people very angry and been widely regarded as a bad move.
* Re: AltiVec in the kernel
From: Paul Mackerras @ 2006-07-18 17:43 UTC (permalink / raw)
To: matt; +Cc: linuxppc-dev
In-Reply-To: <004c01c6aa68$6f580d00$99dfdfdf@bakuhatsu.net>
Matt Sealey writes:
> Once upon a time we were all told this wouldn't work for some reason,
> but a lot of documentation now hints that it does actually work and
> for instance there is a RAID5/6 driver (for G5) which uses AltiVec
> in a kernel context.
It's possible, with some restrictions, basically the same restrictions
on using floating point in the kernel.
Kernel use of altivec interacts with the lazy altivec context switch
that we do on UP kernels, and the fact that the kernel context switch
doesn't save/restore the altivec state. That means that before using
altivec in the kernel you may have to save away the altivec state, and
you have to make sure you don't sleep or get preempted while using
altivec.
> But I didn't find any definitive documentation on how one goes about
> it. The largest clue I found was in Documentation/cpu_features.txt:
>
> #ifdef CONFIG_ALTIVEC
> BEGIN_FTR_SECTION
> mfspr r22,SPRN_VRSAVE /* if G4, save vrsave register value */
> stw r22,THREAD_VRSAVE(r23)
> END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
> #endif /* CONFIG_ALTIVEC */
>
> So we can use AltiVec by implementing this kind of wrapper around
> kernel functions which may use AltiVec?
No, that's irrelevant; that just has to do with the VRSAVE register,
not the altivec state. In fact VRSAVE isn't actually even part of the
altivec state.
> In the code above is there ANY significance of r22 and r23 other
> than that they are fairly high up and probably marked as "will
> be trashed" by all the relevant ABIs and so on?
I hope we do a bit better than "probably" ... :) No, there is no
particular significance to the choice of r22 and r23. If you read the
code you will see that those registers are saved at the beginning of
the context switch routine and restored (from the new process's stack)
at the end.
Paul.
* RE: AltiVec in the kernel
From: Paul Mackerras @ 2006-07-18 17:56 UTC (permalink / raw)
To: matt; +Cc: 'linuxppc-dev list'
In-Reply-To: <005701c6aa7c$632a48e0$99dfdfdf@bakuhatsu.net>
Matt Sealey writes:
> Why isn't it recommended?
Because the overhead of saving away the user altivec state and
restoring it can easily overwhelm any advantage you get from using
altivec.
> We had our own guy look at it and he presented some significant
> performance improvements. One problem was, though, that the best
> improvement in theory came from a function which needed to be
> called very early in kernel boot, well before AltiVec was
> enabled, and everything else is marginal at best (1.n times
> improvement, but it is still 0.n more than 1.0). I am not clear
> on this and cannot find my discussion on the subject in my logs
> and email backups, so I will leave it for now.
I tried using altivec for memory copies, and while I was able to show
an improvement in speed of copying stuff that was hot in the cache,
there was no overall improvement in the context of everything else the
kernel does. In other words, the things being copied were generally
not hot in the cache, and the CPU was able to saturate the memory
bandwidth using ordinary loads and stores.
> There is also plenty of example code (libmotovec, Freescale
> Application Notes) which improves things like TCP checksumming
> and so on using AltiVec. These patches are even used in EEMBC
> benchmarks to boost the scores.
TCP checksumming is simple enough that it is limited by memory
bandwidth rather than computation speed. This is another example
where you can show an improvement on a microbenchmark because the data
is hot in the cache, but the improvement doesn't translate into any
real improvement in a real application.
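To make the point concrete, the checksum in question is the 16-bit one's-complement sum described in RFC 1071. A minimal userspace sketch in C follows; inet_checksum() is a hypothetical name chosen for illustration and is not the kernel's checksum routine. Each 16-bit word costs one load and one add, which is why a loop like this tends to wait on memory rather than on the ALU:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement Internet checksum over "len" bytes (RFC 1071).
 * Hypothetical helper for illustration only. The 64-bit accumulator
 * avoids overflow for any realistic buffer size.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	/* Add the data as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* A trailing odd byte is padded with a zero low byte. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

For the RFC 1071 example bytes 00 01 f2 03 f4 f5 f6 f7, the function returns 0x220d.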
> There are also plenty of examples of userspace code (as before,
> checksumming, encryption, compression/decompression) which has
> been improved. libfreevec includes some changes to the zlib
> window functions. For example the kernel includes an MD5, SHA,
> zlib compression framework.. mostly ported userspace code and
> standard libraries. Would these not be candidates?
A lot of compression and encryption algorithms, by their very nature,
are very difficult to parallelize enough to get any significant
improvement from altivec. I looked at SHA1 for instance, and the
sequential dependencies in the computation are such that it is
practically impossible to find a way to do 4 things in parallel. The
sequential dependencies are of course a critical part of the way that
SHA1 ensures that a small change in any part of the input data results
in substantial changes in every byte of the output.
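The dependency chain is easy to see in the SHA-1 round loop itself. The following is a plain C sketch of the SHA-1 compression function as specified in FIPS 180, written for illustration and not taken from the kernel's implementation; sha1_compress() and rotl32() are names assumed here. Note how every round computes the new "a" from the "a" through "e" produced by the round before it, so round t+1 cannot start until round t has finished:

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned int n)
{
	return (x << n) | (x >> (32 - n));
}

/* Process one 64-byte block, updating the five-word SHA-1 state. */
void sha1_compress(uint32_t state[5], const uint8_t block[64])
{
	uint32_t w[80];
	uint32_t a, b, c, d, e;
	int t;

	/* Message schedule: 16 big-endian input words expanded to 80. */
	for (t = 0; t < 16; t++)
		w[t] = ((uint32_t)block[4 * t] << 24) |
		       ((uint32_t)block[4 * t + 1] << 16) |
		       ((uint32_t)block[4 * t + 2] << 8) |
		       block[4 * t + 3];
	for (t = 16; t < 80; t++)
		w[t] = rotl32(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);

	a = state[0]; b = state[1]; c = state[2]; d = state[3]; e = state[4];

	for (t = 0; t < 80; t++) {
		uint32_t f, k, tmp;

		if (t < 20) {
			f = (b & c) | (~b & d);
			k = 0x5a827999u;
		} else if (t < 40) {
			f = b ^ c ^ d;
			k = 0x6ed9eba1u;
		} else if (t < 60) {
			f = (b & c) | (b & d) | (c & d);
			k = 0x8f1bbcdcu;
		} else {
			f = b ^ c ^ d;
			k = 0xca62c1d6u;
		}
		/* The new "a" needs all five values from the previous round. */
		tmp = rotl32(a, 5) + f + e + k + w[t];
		e = d;
		d = c;
		c = rotl32(b, 30);
		b = a;
		a = tmp;
	}

	state[0] += a; state[1] += b; state[2] += c;
	state[3] += d; state[4] += e;
}
```

Running it on the single padded block for the message "abc", starting from the standard initial state, gives the digest a9993e36 4706816a ba3e2571 7850c26c 9cd0d89d.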
> I think there are thousands of places where AltiVec could be
> used - even sparingly - to provide good performance improvements.
I think that there are actually very few places in the kernel where we
are doing something which is parallelizable, sufficiently
compute-intensive, and not bound by memory bandwidth, to be worth
using altivec.
Paul.
* RE: AltiVec in the kernel
From: Benjamin Herrenschmidt @ 2006-07-18 18:39 UTC (permalink / raw)
To: matt; +Cc: 'linuxppc-dev list', 'Paul Mackerras'
In-Reply-To: <005701c6aa7c$632a48e0$99dfdfdf@bakuhatsu.net>
> I don't think people investigate it too much because the first
> thing they hit is lack of documentation, and then "well we don't
> really recommend it". I think this makes Linux the worst OS a
> developer would want to run on a G4 and G5, then? :D
It's not recommended for the same reason the FPU isn't used in the
kernel and x86 doesn't use SSE / MMX there either, except in a few
places where it does make sense, like the RAID code. It's possible that
it might be interesting to do it for some of the crypto modules as well
and we certainly welcome any patch using altivec to improve some other
aspect of the kernel provided that it does indeed... improve
performance :)
Part of the problem is the cost of enabling/disabling it and
saving/restoring the vector registers that get clobbered when using it.
Essentially, the kernel entry only saves and restores GPRs. Not FPRs,
not VRs. This is done to keep the cost of kernel entry low. Which means
that at any given point in time, the altivec and FPU units contain
whatever context was last used by userland. If the kernel wants to use one
of them for its own purposes, it needs to flush that context to the thread struct
(which also means that the unit will be disabled on the way back to
userland and re-faulted in when used again). That's what
enable_kernel_altivec() does (and the similar enable_kernel_fp()). This
cannot happen at interrupt time, though, and you shouldn't be holding
locks, so it may be a problem with some of the crypto code, as I think
it can be used in some weird code paths. It's also important that no
scheduling happens until you are done with the unit, which is why you
have to disable preemption, since otherwise, the unit could be re-used
by userland behind your back.
Another alternative which can work at interrupt time, but requires a bit
of assembly hackery, is to manually enable MSR:VEC (if not already set)
and save and restore all the altivec registers modified by the code.
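The first approach described above (flush the user context to the thread struct, use the unit, let userland fault it back in) can be modelled in a few lines of userspace C. This is only a toy model of the bookkeeping under the assumptions stated in the comments; every name in it (toy_cpu, toy_thread, toy_kernel_altivec_begin() and so on) is hypothetical, and it is not how the kernel code is actually written:

```c
#include <stddef.h>
#include <string.h>

#define TOY_NUM_VR 4	/* toy size; real AltiVec has 32 128-bit registers */

/* Stands in for the per-thread saved vector state. */
struct toy_thread {
	unsigned int saved_vr[TOY_NUM_VR];
};

struct toy_cpu {
	unsigned int vr[TOY_NUM_VR];	/* live vector registers */
	struct toy_thread *owner;	/* thread whose state is live, or NULL */
	int preempt_count;
};

/* Models preempt_disable() followed by enable_kernel_altivec(). */
void toy_kernel_altivec_begin(struct toy_cpu *cpu)
{
	cpu->preempt_count++;
	if (cpu->owner) {
		/* Flush the user context to the thread struct. */
		memcpy(cpu->owner->saved_vr, cpu->vr, sizeof(cpu->vr));
		/* Nobody owns the unit now; userland must fault it back in. */
		cpu->owner = NULL;
	}
}

/* Models preempt_enable() once the kernel is done with the unit. */
void toy_kernel_altivec_end(struct toy_cpu *cpu)
{
	cpu->preempt_count--;
}

/* Models the "unit unavailable" fault taken when userland uses it again. */
void toy_user_altivec_fault(struct toy_cpu *cpu, struct toy_thread *t)
{
	if (cpu->owner == t)
		return;
	if (cpu->owner)
		memcpy(cpu->owner->saved_vr, cpu->vr, sizeof(cpu->vr));
	memcpy(cpu->vr, t->saved_vr, sizeof(cpu->vr));
	cpu->owner = t;
}
```

In this model the kernel is free to clobber vr[] between the begin and end calls, because the user's values were flushed first and are reloaded on the next fault. The cost Paul and Ben describe is exactly those two copies.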
Ben.
* [PATCH 5/5] add MAINTAINERS entry for snd-aoa
From: Johannes Berg @ 2006-07-18 17:28 UTC (permalink / raw)
To: linuxppc-dev; +Cc: alsa-devel
In-Reply-To: <20060718172841.046446000@sipsolutions.net>
This adds me into the MAINTAINERS file for the AOA driver.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
--- linux-2.6-fetch.orig/MAINTAINERS 2006-07-18 13:39:52.543476868 +0200
+++ linux-2.6-fetch/MAINTAINERS 2006-07-18 13:57:27.843476868 +0200
@@ -292,6 +292,13 @@ L: info-linux@geode.amd.com
W: http://www.amd.com/us-en/ConnectivitySolutions/TechnicalResources/0,,50_2334_2452_11363,00.html
S: Supported
+AOA (Apple Onboard Audio) ALSA DRIVER
+P: Johannes Berg
+M: johannes@sipsolutions.net
+L: linuxppc-dev@ozlabs.org
+L: alsa-devel@alsa-project.org
+S: Maintained
+
APM DRIVER
P: Stephen Rothwell
M: sfr@canb.auug.org.au
--
* [PATCH 1/5] aoa: feature gpio layer: fix IRQ access
From: Johannes Berg @ 2006-07-18 17:28 UTC (permalink / raw)
To: linuxppc-dev; +Cc: alsa-devel
In-Reply-To: <20060718172841.046446000@sipsolutions.net>
The IRQ rework caused some hiccups here, in some cases we call
get_irq without a device node. This patch makes it catch that
case and return NO_IRQ when it happens, along with changing the
place where the irq is checked to check for NO_IRQ instead of -1.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
--- linux-2.6-fetch.orig/sound/aoa/core/snd-aoa-gpio-feature.c 2006-07-16 22:10:02.467630929 +0200
+++ linux-2.6-fetch/sound/aoa/core/snd-aoa-gpio-feature.c 2006-07-16 22:13:41.891630929 +0200
@@ -112,7 +112,10 @@ static struct device_node *get_gpio(char
 static void get_irq(struct device_node * np, int *irqptr)
 {
-	*irqptr = irq_of_parse_and_map(np, 0);
+	if (np)
+		*irqptr = irq_of_parse_and_map(np, 0);
+	else
+		*irqptr = NO_IRQ;
 }
 /* 0x4 is outenable, 0x1 is out, thus 4 or 5 */
@@ -322,7 +325,7 @@ static int ftr_set_notify(struct gpio_ru
 		return -EINVAL;
 	}
-	if (irq == -1)
+	if (irq == NO_IRQ)
 		return -ENODEV;
 	mutex_lock(&notif->mutex);
--
* [PATCH 0/5] powerpc sound, some more patches
From: Johannes Berg @ 2006-07-18 17:28 UTC (permalink / raw)
To: linuxppc-dev; +Cc: alsa-devel
Here's a new and hopefully final round of patches fixing problems with my
new snd-aoa driver. There are fixes for the PPC Mac Mini as well as for a
problem with snd-powermac that I had noticed.
* [PATCH 3/5] make snd-powermac load even when it cant bind the device
From: Johannes Berg @ 2006-07-18 17:28 UTC (permalink / raw)
To: linuxppc-dev; +Cc: alsa-devel
In-Reply-To: <20060718172841.046446000@sipsolutions.net>
This patch makes snd-powermac load when it can't bind the device right
away. That's the expected behaviour for hotplugging, but fixes an
important problem I was seeing with doing a modprobe snd-powermac with
a version that refuses loading on machines with layout-id: snd-powermac
would create a bunch of uevents and then refuse to load, the uevents
causing udev to reload it again, ad infinitum.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
--- linux-2.6-fetch.orig/sound/ppc/powermac.c 2006-07-18 18:16:53.633476868 +0200
+++ linux-2.6-fetch/sound/ppc/powermac.c 2006-07-18 18:40:39.243476868 +0200
@@ -181,21 +181,14 @@ static int __init alsa_card_pmac_init(vo
 	if ((err = platform_driver_register(&snd_pmac_driver)) < 0)
 		return err;
 	device = platform_device_register_simple(SND_PMAC_DRIVER, -1, NULL, 0);
-	if (!IS_ERR(device)) {
-		if (platform_get_drvdata(device))
-			return 0;
-		platform_device_unregister(device);
-		err = -ENODEV;
-	} else
-		err = PTR_ERR(device);
-	platform_driver_unregister(&snd_pmac_driver);
-	return err;
+	return 0;
 }
 static void __exit alsa_card_pmac_exit(void)
 {
-	platform_device_unregister(device);
+	if (!IS_ERR(device))
+		platform_device_unregister(device);
 	platform_driver_unregister(&snd_pmac_driver);
 }
--
* [PATCH 4/5] aoa: platform function gpio: ignore errors from functions that dont exist
From: Johannes Berg @ 2006-07-18 17:28 UTC (permalink / raw)
To: linuxppc-dev; +Cc: alsa-devel
In-Reply-To: <20060718172841.046446000@sipsolutions.net>
Sometimes we simply want to turn off or on everything, and when recently a
warning was added when a certain platform function can't be called, this
triggered all the time in those cases. This patch shows the warning only if
the error was different from the function not existing.
The alternative would be to not even try calling the function when it
doesn't exist by first checking which exist and then only calling those that
do, but that adds complexity that isn't necessary.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
--- linux-2.6-fetch.orig/sound/aoa/core/snd-aoa-gpio-pmf.c 2006-07-18 19:24:30.273476868 +0200
+++ linux-2.6-fetch/sound/aoa/core/snd-aoa-gpio-pmf.c 2006-07-18 19:24:55.103476868 +0200
@@ -18,7 +18,7 @@ static void pmf_gpio_set_##name(struct g
\
 	if (unlikely(!rt)) return; \
 	rc = pmf_call_function(rt->node, #name "-mute", &args); \
-	if (rc) \
+	if (rc && rc != -ENODEV) \
 		printk(KERN_WARNING "pmf_gpio_set_" #name \
 		" failed, rc: %d\n", rc); \
 	rt->implementation_private &= ~(1<<bit); \
--
^ permalink raw reply
* [PATCH 2/5] aoa: fix toonie codec
From: Johannes Berg @ 2006-07-18 17:28 UTC (permalink / raw)
To: linuxppc-dev; +Cc: alsa-devel
In-Reply-To: <20060718172841.046446000@sipsolutions.net>
This patch fixes the toonie codec to be actually usable.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
--- linux-2.6-fetch.orig/sound/aoa/codecs/snd-aoa-codec-toonie.c 2006-07-18 11:40:32.363476868 +0200
+++ linux-2.6-fetch/sound/aoa/codecs/snd-aoa-codec-toonie.c 2006-07-18 11:52:29.783476868 +0200
@@ -51,6 +51,13 @@ static struct transfer_info toonie_trans
{}
};
+static int toonie_usable(struct codec_info_item *cii,
+ struct transfer_info *ti,
+ struct transfer_info *out)
+{
+ return 1;
+}
+
#ifdef CONFIG_PM
static int toonie_suspend(struct codec_info_item *cii, pm_message_t state)
{
@@ -69,6 +76,7 @@ static struct codec_info toonie_codec_in
.sysclock_factor = 256,
.bus_factor = 64,
.owner = THIS_MODULE,
+ .usable = toonie_usable,
#ifdef CONFIG_PM
.suspend = toonie_suspend,
.resume = toonie_resume,
@@ -79,19 +87,20 @@ static int toonie_init_codec(struct aoa_
{
struct toonie *toonie = codec_to_toonie(codec);
+ /* nothing connected? what a joke! */
+ if (toonie->codec.connected != 1)
+ return -ENOTCONN;
+
if (aoa_snd_device_new(SNDRV_DEV_LOWLEVEL, toonie, &ops)) {
printk(KERN_ERR PFX "failed to create toonie snd device!\n");
return -ENODEV;
}
- /* nothing connected? what a joke! */
- if (toonie->codec.connected != 1)
- return -ENOTCONN;
-
if (toonie->codec.soundbus_dev->attach_codec(toonie->codec.soundbus_dev,
aoa_get_card(),
&toonie_codec_info, toonie)) {
printk(KERN_ERR PFX "error creating toonie pcm\n");
+ snd_device_free(aoa_get_card(), toonie);
return -ENODEV;
}
--
^ permalink raw reply
* Re: [PATCH 1/5] aoa: feature gpio layer: fix IRQ access
From: Benjamin Herrenschmidt @ 2006-07-18 18:45 UTC (permalink / raw)
To: Johannes Berg; +Cc: linuxppc-dev, alsa-devel
In-Reply-To: <20060718173012.532719000@sipsolutions.net>
On Tue, 2006-07-18 at 19:28 +0200, Johannes Berg wrote:
> plain text document attachment (aoa-get-irq-fix.patch)
> The IRQ rework caused some hiccups here, in some cases we call
> get_irq without a device node. This patch makes it catch that
> case and return NO_IRQ when it happens, along with changing the
> place where the irq is checked to check for NO_IRQ instead of -1.
>
> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
> --- linux-2.6-fetch.orig/sound/aoa/core/snd-aoa-gpio-feature.c 2006-07-16 22:10:02.467630929 +0200
> +++ linux-2.6-fetch/sound/aoa/core/snd-aoa-gpio-feature.c 2006-07-16 22:13:41.891630929 +0200
> @@ -112,7 +112,10 @@ static struct device_node *get_gpio(char
>
> static void get_irq(struct device_node * np, int *irqptr)
> {
> - *irqptr = irq_of_parse_and_map(np, 0);
> + if (np)
> + *irqptr = irq_of_parse_and_map(np, 0);
> + else
> + *irqptr = NO_IRQ;
> }
>
> /* 0x4 is outenable, 0x1 is out, thus 4 or 5 */
> @@ -322,7 +325,7 @@ static int ftr_set_notify(struct gpio_ru
> return -EINVAL;
> }
>
> - if (irq == -1)
> + if (irq == NO_IRQ)
> return -ENODEV;
>
> mutex_lock(&notif->mutex);
>
> --
^ permalink raw reply
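
For readers skimming the thread, the guard in the quoted patch can be modeled in userspace roughly as follows. This is only a sketch: struct node, lookup_irq() and SKETCH_NO_IRQ are hypothetical stand-ins for the device node, irq_of_parse_and_map() and NO_IRQ, and the value 0 for the "no interrupt" marker is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device node carrying a pre-resolved interrupt number. */
struct node {
	int irq;
};

/* Assumed "no interrupt" marker; the real NO_IRQ value is not shown in the thread. */
#define SKETCH_NO_IRQ 0

/* Stand-in for irq_of_parse_and_map(): only valid for a non-NULL node. */
static int lookup_irq(const struct node *np)
{
	return np->irq;
}

/* Mirrors the patched get_irq(): never dereference a missing node. */
static void get_irq(const struct node *np, int *irqptr)
{
	if (np)
		*irqptr = lookup_irq(np);
	else
		*irqptr = SKETCH_NO_IRQ;
}

/* Mirrors the caller's check: compare against the marker, not -1. */
static int set_notify(const struct node *np)
{
	int irq;

	get_irq(np, &irq);
	if (irq == SKETCH_NO_IRQ)
		return -ENODEV;
	return 0;
}
```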
* Re: [PATCH] panic_on_oops: remove ssleep()
From: Horms @ 2006-07-18 19:13 UTC (permalink / raw)
To: Andrew Morton
Cc: chris, tony.luck, linux-ia64, discuss, ak, linux-kernel,
linuxppc-dev, paulus, anton, rmk
In-Reply-To: <20060717172341.6d49f109.akpm@osdl.org>
On Mon, Jul 17, 2006 at 05:23:41PM -0700, Andrew Morton wrote:
> On Mon, 17 Jul 2006 19:10:59 -0400
> Horms <horms@verge.net.au> wrote:
>
> > On Tue, Jul 18, 2006 at 12:27:51AM +0200, Andi Kleen wrote:
> > > On Monday 17 July 2006 18:17, Horms wrote:
> > > ...
> > > Keeping the delay might be actually useful so that you can see the panic
> > > before system reboots when reboot on panic is enabled. I would just use a loop
> > > of mdelays(1) with touch_nmi_watchdog/touch_softirq_watchdog()s
> > > in between.
> >
> > Ok, I will look into making that happen. I agree that the pause is
> > quite useful.
>
> It's kind-of already implemented, via pause_on_oops. Perhaps doing
> something like
>
> if (panic_on_oops)
> pause_on_oops = max(pause_on_oops, 5*HZ);
>
> would be sufficient.
Thanks, that may well be sufficient. And I assume that it is nicely out
of the arch-dependent code in die(). I will poke around a bit more.
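
To make the suggestion concrete for myself, here is a small userspace model of the clamp. It is a sketch only: SKETCH_HZ and effective_pause() are hypothetical names, the tick rate is an arbitrary assumption, and the units simply follow the quoted snippet rather than anything verified against the real pause_on_oops code.

```c
/* Assumed tick rate for illustration only; the kernel's HZ is a build-time setting. */
#define SKETCH_HZ 250

/*
 * Return the pause to use: when panic_on_oops is set, never less than
 * 5 * SKETCH_HZ, otherwise whatever was configured.
 */
static int effective_pause(int panic_on_oops, int pause_on_oops)
{
	int min_pause = 5 * SKETCH_HZ;

	if (panic_on_oops && pause_on_oops < min_pause)
		return min_pause;
	return pause_on_oops;
}
```

A larger configured pause is left alone, and nothing changes when panic_on_oops is not set.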
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
^ permalink raw reply
* Re: [PATCH] panic_on_oops: remove ssleep()
From: Horms @ 2006-07-18 19:15 UTC (permalink / raw)
To: Chuck Ebbert
Cc: Andrew Morton, Chris Zankel, Tony Luck, linux-ia64, discuss,
Andi Kleen, linux-kernel, linuxppc-dev, Paul Mackerras,
Anton Blanchard, Russell King
In-Reply-To: <200607172126_MC3-1-C544-E35A@compuserve.com>
On Mon, Jul 17, 2006 at 09:22:17PM -0400, Chuck Ebbert wrote:
> In-Reply-To: <31687.FP.7244@verge.net.au>
>
> On Mon, 17 Jul 2006 12:17:20 -0400, Horms wrote:
>
> > This patch is part of an effort to unify the panic_on_oops behaviour
> > across all architectures that implement it.
> >
> > It was pointed out to me by Andi Kleen that if an oops has occurred
> > in interrupt context, then calling sleep() in the oops path will only cause
> > a panic, and that it would be really better for it not to be in the path at
> > all.
>
> i386 already checks in_interrupt() and panics immediately:
Very good point. I guess that needs to be moved to after
panic_on_oops() if the change that Andi suggests works out.
--
Horms
H: http://www.vergenet.net/~horms/
W: http://www.valinux.co.jp/en/
^ permalink raw reply
* [PATCH 0/3] powerpc: Instrument Hypervisor Calls
From: Mike Kravetz @ 2006-07-18 20:47 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
A small update from the last version. By popular demand, both
wall time (mftb) and CPU cycles (PURR) are collected for each call.
It is interesting to see these two values side by side in the
output files.
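
For orientation, the bookkeeping behind those two values can be modeled in userspace along these lines. This is a sketch, not the kernel code: record_call() is a hypothetical name, the deltas are passed in because mftb() and the PURR are not readable outside the kernel, there is a single table instead of one per CPU, and the range check is an addition for the sketch.

```c
#include <stddef.h>

#define H_ENABLE_CRQ      0x2B0                 /* highest opcode in the hvcall.h hunk */
#define MAX_HCALL_OPCODES (H_ENABLE_CRQ >> 2)   /* opcodes are multiples of 4 */

struct hcall_stats {
	unsigned long num_calls;  /* number of calls */
	unsigned long tb_total;   /* accumulated wall time (time base ticks) */
	unsigned long cpu_total;  /* accumulated CPU time (PURR ticks) */
};

/* One table only; the patch keeps one such table per CPU. */
static struct hcall_stats stats[MAX_HCALL_OPCODES + 1];

/*
 * Account one call under its opcode. Returns 0 on success and -1 for
 * an opcode beyond the table (a check the patch itself does not make).
 */
static int record_call(unsigned long opcode, unsigned long tb_delta,
		       unsigned long cpu_delta)
{
	size_t idx = opcode >> 2;

	if (idx > MAX_HCALL_OPCODES)
		return -1;
	stats[idx].tb_total += tb_delta;
	stats[idx].cpu_total += cpu_delta;
	stats[idx].num_calls++;
	return 0;
}
```

Dividing tb_total or cpu_total by num_calls for a given opcode gives the per-call averages that make the side-by-side comparison interesting.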
--
Mike
^ permalink raw reply
* [PATCH 1/3] powerpc: Instrument Hypervisor Calls: merge headers
From: Mike Kravetz @ 2006-07-18 20:48 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
In-Reply-To: <20060718204723.GA6104@w-mikek2.ibm.com>
Move all the Hypervisor call definitions to a single header file.
--
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
diff -Naupr linux-2.6.17.6/drivers/net/ibmveth.h linux-2.6.17.6.work/drivers/net/ibmveth.h
--- linux-2.6.17.6/drivers/net/ibmveth.h 2006-07-15 19:00:43.000000000 +0000
+++ linux-2.6.17.6.work/drivers/net/ibmveth.h 2006-07-18 19:33:47.000000000 +0000
@@ -41,16 +41,6 @@
#define IbmVethMcastRemoveFilter 0x2UL
#define IbmVethMcastClearFilterTable 0x3UL
-/* hcall numbers */
-#define H_VIO_SIGNAL 0x104
-#define H_REGISTER_LOGICAL_LAN 0x114
-#define H_FREE_LOGICAL_LAN 0x118
-#define H_ADD_LOGICAL_LAN_BUFFER 0x11C
-#define H_SEND_LOGICAL_LAN 0x120
-#define H_MULTICAST_CTRL 0x130
-#define H_CHANGE_LOGICAL_LAN_MAC 0x14C
-#define H_FREE_LOGICAL_LAN_BUFFER 0x1D4
-
/* hcall macros */
#define h_register_logical_lan(ua, buflst, rxq, fltlst, mac) \
plpar_hcall_norets(H_REGISTER_LOGICAL_LAN, ua, buflst, rxq, fltlst, mac)
diff -Naupr linux-2.6.17.6/include/asm-powerpc/hvcall.h linux-2.6.17.6.work/include/asm-powerpc/hvcall.h
--- linux-2.6.17.6/include/asm-powerpc/hvcall.h 2006-07-15 19:00:43.000000000 +0000
+++ linux-2.6.17.6.work/include/asm-powerpc/hvcall.h 2006-07-18 19:33:47.000000000 +0000
@@ -155,9 +155,15 @@
#define H_VIO_SIGNAL 0x104
#define H_SEND_CRQ 0x108
#define H_COPY_RDMA 0x110
+#define H_REGISTER_LOGICAL_LAN 0x114
+#define H_FREE_LOGICAL_LAN 0x118
+#define H_ADD_LOGICAL_LAN_BUFFER 0x11C
+#define H_SEND_LOGICAL_LAN 0x120
+#define H_MULTICAST_CTRL 0x130
#define H_SET_XDABR 0x134
#define H_STUFF_TCE 0x138
#define H_PUT_TCE_INDIRECT 0x13C
+#define H_CHANGE_LOGICAL_LAN_MAC 0x14C
#define H_VTERM_PARTNER_INFO 0x150
#define H_REGISTER_VTERM 0x154
#define H_FREE_VTERM 0x158
@@ -187,11 +193,14 @@
#define H_GET_HCA_INFO 0x1B8
#define H_GET_PERF_COUNT 0x1BC
#define H_MANAGE_TRACE 0x1C0
+#define H_FREE_LOGICAL_LAN_BUFFER 0x1D4
#define H_QUERY_INT_STATE 0x1E4
#define H_POLL_PENDING 0x1D8
#define H_JOIN 0x298
#define H_ENABLE_CRQ 0x2B0
+#define MAX_HCALL_OPCODES (H_ENABLE_CRQ >> 2)
+
#ifndef __ASSEMBLY__
/* plpar_hcall() -- Generic call interface using above opcodes
^ permalink raw reply
* [PATCH 2/3] powerpc: Instrument Hypervisor Calls: add wrappers
From: Mike Kravetz @ 2006-07-18 20:49 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
In-Reply-To: <20060718204723.GA6104@w-mikek2.ibm.com>
Add wrappers which perform the actual hypervisor call instrumentation.
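
The wrapper pattern itself, reduced to a userspace sketch: read a counter, call the renamed base routine, then account the elapsed ticks. Everything here is a stand-in and an assumption for illustration: read_tb() fakes mftb() with a counter that advances by 10 per read, hcall_base() plays the role of a renamed assembly routine, and only one counter is tracked.

```c
/* Hypothetical counter standing in for mftb(); advances by 10 on every read. */
static unsigned long fake_tb;

static unsigned long read_tb(void)
{
	fake_tb += 10;
	return fake_tb;
}

/* Accumulated statistics for the sketch (one call type only). */
static unsigned long calls, tb_total;

/* Stand-in for a renamed *_base routine; echoes its argument so the result is checkable. */
static long hcall_base(unsigned long opcode, unsigned long arg1)
{
	(void)opcode;
	return (long)arg1;
}

/* Instrumented wrapper taking over the original name: time, call, account. */
static long hcall(unsigned long opcode, unsigned long arg1)
{
	unsigned long before = read_tb();
	long rc = hcall_base(opcode, arg1);

	tb_total += read_tb() - before;
	calls++;
	return rc;
}
```

Callers keep using the original name and see the same return value; only the accounting is new.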
--
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
diff -Naupr linux-2.6.17.6/arch/powerpc/Kconfig.debug linux-2.6.17.6.work/arch/powerpc/Kconfig.debug
--- linux-2.6.17.6/arch/powerpc/Kconfig.debug 2006-07-15 19:00:43.000000000 +0000
+++ linux-2.6.17.6.work/arch/powerpc/Kconfig.debug 2006-07-18 19:56:20.000000000 +0000
@@ -18,6 +18,20 @@ config DEBUG_STACK_USAGE
This option will slow down process creation somewhat.
+config HCALL_STATS
+ bool "Hypervisor call instrumentation"
+ depends on PPC_PSERIES && DEBUG_FS
+ help
+ Adds code to keep track of the number of hypervisor calls made and
+ the amount of time spent in hypervisor calls: both wall time (based
+ on time base) and cpu time (based on PURR). A directory named
+ hcall_inst is added at the root of the debugfs filesystem. Within
+ the hcall_inst directory are files that contain CPU specific call
+ statistics.
+
+ This option will add a small amount of overhead to all hypervisor
+ calls.
+
config DEBUGGER
bool "Enable debugger hooks"
depends on DEBUG_KERNEL
diff -Naupr linux-2.6.17.6/arch/powerpc/platforms/pseries/Makefile linux-2.6.17.6.work/arch/powerpc/platforms/pseries/Makefile
--- linux-2.6.17.6/arch/powerpc/platforms/pseries/Makefile 2006-07-15 19:00:43.000000000 +0000
+++ linux-2.6.17.6.work/arch/powerpc/platforms/pseries/Makefile 2006-07-18 19:56:20.000000000 +0000
@@ -9,3 +9,4 @@ obj-$(CONFIG_EEH) += eeh.o eeh_cache.o e
obj-$(CONFIG_HVC_CONSOLE) += hvconsole.o
obj-$(CONFIG_HVCS) += hvcserver.o
+obj-$(CONFIG_HCALL_STATS) += hvCall_inst.o
diff -Naupr linux-2.6.17.6/arch/powerpc/platforms/pseries/hvCall.S linux-2.6.17.6.work/arch/powerpc/platforms/pseries/hvCall.S
--- linux-2.6.17.6/arch/powerpc/platforms/pseries/hvCall.S 2006-07-15 19:00:43.000000000 +0000
+++ linux-2.6.17.6.work/arch/powerpc/platforms/pseries/hvCall.S 2006-07-18 19:56:20.000000000 +0000
@@ -11,7 +11,35 @@
#include <asm/hvcall.h>
#include <asm/processor.h>
#include <asm/ppc_asm.h>
-
+
+/*
+ * If hcall statistics are desired, all routines are wrapped with code
+ * that does the statistic gathering.
+ */
+#ifndef CONFIG_HCALL_STATS
+#define PLPAR_HCALL plpar_hcall
+#define PLPAR_HCALL_NORETS plpar_hcall_norets
+#define PLPAR_HCALL_8ARG_2RET plpar_hcall_8arg_2ret
+#define PLPAR_HCALL_4OUT plpar_hcall_4out
+#define PLPAR_HCALL_7ARG_7RET plpar_hcall_7arg_7ret
+#define PLPAR_HCALL_9ARG_9RET plpar_hcall_9arg_9ret
+#else
+#define PLPAR_HCALL plpar_hcall_base
+#define PLPAR_HCALL_NORETS plpar_hcall_norets_base
+#define PLPAR_HCALL_8ARG_2RET plpar_hcall_8arg_2ret_base
+#define PLPAR_HCALL_4OUT plpar_hcall_4out_base
+#define PLPAR_HCALL_7ARG_7RET plpar_hcall_7arg_7ret_base
+#define PLPAR_HCALL_9ARG_9RET plpar_hcall_9arg_9ret_base
+
+/*
+ * A special 'indirect' call to a C based wrapper if statistics are desired.
+ * See plpar_hcall_norets_C function header for more details.
+ */
+_GLOBAL(plpar_hcall_norets)
+ b plpar_hcall_norets_C
+
+#endif
+
#define STK_PARM(i) (48 + ((i)-3)*8)
.text
@@ -25,7 +53,7 @@
unsigned long *out2, R9
unsigned long *out3); R10
*/
-_GLOBAL(plpar_hcall)
+_GLOBAL(PLPAR_HCALL)
HMT_MEDIUM
mfcr r0
@@ -52,7 +80,7 @@ _GLOBAL(plpar_hcall)
/* Simple interface with no output values (other than status) */
-_GLOBAL(plpar_hcall_norets)
+_GLOBAL(PLPAR_HCALL_NORETS)
HMT_MEDIUM
mfcr r0
@@ -76,7 +104,7 @@ _GLOBAL(plpar_hcall_norets)
unsigned long arg8, 112(R1)
unsigned long *out1); 120(R1)
*/
-_GLOBAL(plpar_hcall_8arg_2ret)
+_GLOBAL(PLPAR_HCALL_8ARG_2RET)
HMT_MEDIUM
mfcr r0
@@ -102,7 +130,7 @@ _GLOBAL(plpar_hcall_8arg_2ret)
unsigned long *out3, R10
unsigned long *out4); 112(R1)
*/
-_GLOBAL(plpar_hcall_4out)
+_GLOBAL(PLPAR_HCALL_4OUT)
HMT_MEDIUM
mfcr r0
@@ -144,7 +172,7 @@ _GLOBAL(plpar_hcall_4out)
unsigned long *out6, 102(R1)
unsigned long *out7); 100(R1)
*/
-_GLOBAL(plpar_hcall_7arg_7ret)
+_GLOBAL(PLPAR_HCALL_7ARG_7RET)
HMT_MEDIUM
mfcr r0
@@ -193,7 +221,7 @@ _GLOBAL(plpar_hcall_7arg_7ret)
unsigned long *out8, 94(R1)
unsigned long *out9, 92(R1)
*/
-_GLOBAL(plpar_hcall_9arg_9ret)
+_GLOBAL(PLPAR_HCALL_9ARG_9RET)
HMT_MEDIUM
mfcr r0
diff -Naupr linux-2.6.17.6/arch/powerpc/platforms/pseries/hvCall_inst.c linux-2.6.17.6.work/arch/powerpc/platforms/pseries/hvCall_inst.c
--- linux-2.6.17.6/arch/powerpc/platforms/pseries/hvCall_inst.c 1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.17.6.work/arch/powerpc/platforms/pseries/hvCall_inst.c 2006-07-18 19:57:44.000000000 +0000
@@ -0,0 +1,216 @@
+/*
+ * Copyright (C) 2006 Mike Kravetz IBM Corporation
+ *
+ * Hypervisor Call Instrumentation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/percpu.h>
+#include <linux/debugfs.h>
+#include <linux/seq_file.h>
+#include <linux/cpumask.h>
+#include <asm/hvcall.h>
+#include <asm/firmware.h>
+
+DEFINE_PER_CPU(struct hcall_stats[MAX_HCALL_OPCODES+1], hcall_stats);
+
+/*
+ * Common update of the per-CPU/per-hcall statistics
+ */
+static inline void update_stats(unsigned long opcode,
+ unsigned long t_tb_before,
+ unsigned long t_cpu_before)
+{
+ unsigned long op_index = opcode >> 2;
+ struct hcall_stats *hs = &__get_cpu_var(hcall_stats[op_index]);
+
+ hs->tb_total += (mftb() - t_tb_before);
+ hs->cpu_total += (mfspr(SPRN_PURR) - t_cpu_before);
+ hs->num_calls++;
+}
+
+/*
+ * plpar_hcall wrapper
+ */
+long plpar_hcall(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long *out1,
+ unsigned long *out2,
+ unsigned long *out3)
+{
+ long rc;
+ unsigned long t_tb_before, t_cpu_before;
+
+ t_tb_before = mftb();
+ t_cpu_before = mfspr(SPRN_PURR);
+ rc = plpar_hcall_base(opcode, arg1, arg2, arg3, arg4, out1, out2, out3);
+
+ update_stats(opcode, t_tb_before, t_cpu_before);
+ return rc;
+}
+
+/*
+ * A C based wrapper for plpar_hcall_norets
+ * The wrapper for plpar_hcall_norets is a special case because the function
+ * takes a variable number of arguments. It is almost impossible to write a
+ * wrapper for a function that takes a variable number of arguments in C.
+ * Therefore, there is an assembly routine in hvCall.S that simply branches
+ * to this C wrapper. This 'indirection' takes care of the variable arguments
+ * issue. This C wrapper has a fixed maximum number of arguments.
+ */
+long plpar_hcall_norets_C(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long arg5,
+ unsigned long arg6)
+{
+ long rc;
+ unsigned long t_tb_before, t_cpu_before;
+
+ t_tb_before = mftb();
+ t_cpu_before = mfspr(SPRN_PURR);
+ rc = plpar_hcall_norets_base(opcode, arg1, arg2, arg3, arg4, arg5,
+ arg6);
+
+ update_stats(opcode, t_tb_before, t_cpu_before);
+ return rc;
+}
+
+/*
+ * plpar_hcall_8arg_2ret wrapper
+ */
+long plpar_hcall_8arg_2ret(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long arg5,
+ unsigned long arg6,
+ unsigned long arg7,
+ unsigned long arg8,
+ unsigned long *out1)
+{
+ long rc;
+ unsigned long t_tb_before, t_cpu_before;
+
+ t_tb_before = mftb();
+ t_cpu_before = mfspr(SPRN_PURR);
+ rc = plpar_hcall_8arg_2ret_base(opcode, arg1, arg2, arg3, arg4, arg5,
+ arg6, arg7, arg8, out1);
+
+ update_stats(opcode, t_tb_before, t_cpu_before);
+ return rc;
+}
+
+/*
+ * plpar_hcall_4out wrapper
+ */
+long plpar_hcall_4out(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long *out1,
+ unsigned long *out2,
+ unsigned long *out3,
+ unsigned long *out4)
+{
+ long rc;
+ unsigned long t_tb_before, t_cpu_before;
+
+ t_tb_before = mftb();
+ t_cpu_before = mfspr(SPRN_PURR);
+ rc = plpar_hcall_4out_base(opcode, arg1, arg2, arg3, arg4, out1,
+ out2, out3, out4);
+
+ update_stats(opcode, t_tb_before, t_cpu_before);
+ return rc;
+}
+
+/*
+ * plpar_hcall_7arg_7ret wrapper
+ */
+long plpar_hcall_7arg_7ret(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long arg5,
+ unsigned long arg6,
+ unsigned long arg7,
+ unsigned long *out1,
+ unsigned long *out2,
+ unsigned long *out3,
+ unsigned long *out4,
+ unsigned long *out5,
+ unsigned long *out6,
+ unsigned long *out7)
+{
+ long rc;
+ unsigned long t_tb_before, t_cpu_before;
+
+ t_tb_before = mftb();
+ t_cpu_before = mfspr(SPRN_PURR);
+ rc = plpar_hcall_7arg_7ret_base(opcode, arg1, arg2, arg3, arg4, arg5,
+ arg6, arg7, out1, out2, out3, out4,
+ out5, out6, out7);
+
+ update_stats(opcode, t_tb_before, t_cpu_before);
+ return rc;
+}
+
+/*
+ * plpar_hcall_9arg_9ret wrapper
+ */
+long plpar_hcall_9arg_9ret(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long arg5,
+ unsigned long arg6,
+ unsigned long arg7,
+ unsigned long arg8,
+ unsigned long arg9,
+ unsigned long *out1,
+ unsigned long *out2,
+ unsigned long *out3,
+ unsigned long *out4,
+ unsigned long *out5,
+ unsigned long *out6,
+ unsigned long *out7,
+ unsigned long *out8,
+ unsigned long *out9)
+{
+ long rc;
+ unsigned long t_tb_before, t_cpu_before;
+
+ t_tb_before = mftb();
+ t_cpu_before = mfspr(SPRN_PURR);
+ rc = plpar_hcall_9arg_9ret_base(opcode, arg1, arg2, arg3, arg4, arg5,
+ arg6, arg7, arg8, arg9, out1, out2,
+ out3, out4, out5, out6, out7, out8,
+ out9);
+
+ update_stats(opcode, t_tb_before, t_cpu_before);
+ return rc;
+}
diff -Naupr linux-2.6.17.6/include/asm-powerpc/hvcall.h linux-2.6.17.6.work/include/asm-powerpc/hvcall.h
--- linux-2.6.17.6/include/asm-powerpc/hvcall.h 2006-07-18 19:35:00.000000000 +0000
+++ linux-2.6.17.6.work/include/asm-powerpc/hvcall.h 2006-07-18 19:56:20.000000000 +0000
@@ -292,6 +292,87 @@ long plpar_hcall_9arg_9ret(unsigned long
unsigned long *out8,
unsigned long *out9);
+
+/* For hcall instrumentation. One structure per-hcall, per-CPU */
+struct hcall_stats {
+ unsigned long num_calls; /* number of calls (on this CPU) */
+ unsigned long tb_total; /* total wall time (mftb) of calls. */
+ unsigned long cpu_total; /* total cpu time (PURR) of calls. */
+};
+
+/* If Hypervisor call instrumentation is enabled, the assembly routine
+ * names are changed from 'plpar_hcall*' to 'plpar_hcall*_base' and
+ * 'plpar_hcall*' routines become instrumented wrappers. The following
+ * are declarations for the renamed 'plpar_hcall*_base' routines.
+ */
+long plpar_hcall_base (unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long *out1,
+ unsigned long *out2,
+ unsigned long *out3);
+
+long plpar_hcall_norets_base(unsigned long opcode, ...);
+
+long plpar_hcall_8arg_2ret_base(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long arg5,
+ unsigned long arg6,
+ unsigned long arg7,
+ unsigned long arg8,
+ unsigned long *out1);
+
+long plpar_hcall_4out_base(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long *out1,
+ unsigned long *out2,
+ unsigned long *out3,
+ unsigned long *out4);
+
+long plpar_hcall_7arg_7ret_base(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long arg5,
+ unsigned long arg6,
+ unsigned long arg7,
+ unsigned long *out1,
+ unsigned long *out2,
+ unsigned long *out3,
+ unsigned long *out4,
+ unsigned long *out5,
+ unsigned long *out6,
+ unsigned long *out7);
+
+long plpar_hcall_9arg_9ret_base(unsigned long opcode,
+ unsigned long arg1,
+ unsigned long arg2,
+ unsigned long arg3,
+ unsigned long arg4,
+ unsigned long arg5,
+ unsigned long arg6,
+ unsigned long arg7,
+ unsigned long arg8,
+ unsigned long arg9,
+ unsigned long *out1,
+ unsigned long *out2,
+ unsigned long *out3,
+ unsigned long *out4,
+ unsigned long *out5,
+ unsigned long *out6,
+ unsigned long *out7,
+ unsigned long *out8,
+ unsigned long *out9);
+
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_HVCALL_H */
^ permalink raw reply