LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 0/4] Some 47x patches for the powerpc-4xx tree
From: Benjamin Herrenschmidt @ 2010-08-19  3:19 UTC (permalink / raw)
  To: Dave Kleikamp; +Cc: linuxppc-dev list
In-Reply-To: <1282149960.3679.0.camel@shaggy-w500.austin.ibm.com>

On Wed, 2010-08-18 at 11:45 -0500, Dave Kleikamp wrote:
> Sorry!  Forgot to change the subject.
> 
> Shaggy
> 
> On Wed, 2010-08-18 at 11:44 -0500, Dave Kleikamp wrote:
> > Josh,
> > Here are some bug fixes for the powerpc-4xx tree.  It'd be nice if they
> > could make it into 2.6.46.

Yeah and I'm sure they can make it into 2.6.46... if you want to wait
that long ! In the meantime, 2.6.36 will do :-)

Cheers,
Ben.

> > Thanks,
> > Shaggy
> > 
> > Dave Kleikamp (4):
> >   powerpc/47x: Make sure mcsr is cleared before enabling machine check
> >     interrupts
> >   powerpc/47x: Remove redundant line from cputable.c
> >   powerpc/4xx: Index interrupt stacks by physical cpu
> >   powerpc/47x: Add an isync before the tlbivax instruction
> > 
> >  arch/powerpc/kernel/cputable.c   |    1 -
> >  arch/powerpc/kernel/head_44x.S   |    4 ++++
> >  arch/powerpc/kernel/irq.c        |   15 ++++++++-------
> >  arch/powerpc/kernel/setup_32.c   |    9 +++++----
> >  arch/powerpc/mm/tlb_nohash_low.S |    1 +
> >  5 files changed, 18 insertions(+), 12 deletions(-)
> > 
> 

^ permalink raw reply

* Re:
From: Stephen Rothwell @ 2010-08-19  1:04 UTC (permalink / raw)
  To: Wolfgang Denk
  Cc: linuxppc-dev, Rupjyoti Sarmah, Prodyut Hazarika, Mark Miesfeld
In-Reply-To: <20100818205646.57783157D71@gemini.denx.de>

[-- Attachment #1: Type: text/plain, Size: 2597 bytes --]

Hi Wolfgang,

On Wed, 18 Aug 2010 22:56:46 +0200 Wolfgang Denk <wd@denx.de> wrote:
>
> drivers/ata/sata_dwc_460ex.c: At top level:
> drivers/ata/sata_dwc_460ex.c:1592: warning: 'struct of_device' declared inside parameter list
> drivers/ata/sata_dwc_460ex.c:1592: warning: its scope is only this definition or declaration, which is probably not what you want
> drivers/ata/sata_dwc_460ex.c: In function 'sata_dwc_probe':
> drivers/ata/sata_dwc_460ex.c:1607: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1614: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1616: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1622: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1628: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1630: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1646: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1650: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1652: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1658: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1660: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1667: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1676: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1678: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1691: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1693: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c: At top level:
> drivers/ata/sata_dwc_460ex.c:1705: warning: 'struct of_device' declared inside parameter list
> drivers/ata/sata_dwc_460ex.c: In function 'sata_dwc_remove':
> drivers/ata/sata_dwc_460ex.c:1707: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c:1720: error: dereferencing pointer to incomplete type
> drivers/ata/sata_dwc_460ex.c: At top level:
> drivers/ata/sata_dwc_460ex.c:1736: warning: initialization from incompatible pointer type
> drivers/ata/sata_dwc_460ex.c:1737: warning: initialization from incompatible pointer type

I think most of these (if not all) are fixed in Linus' tree today.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [PATCH 1/5] ptp: Added a brand new class driver for ptp clocks.
From: john stultz @ 2010-08-19  0:12 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Rodolfo Giometti, Arnd Bergmann, netdev, devicetree-discuss,
	linux-kernel, linuxppc-dev, linux-arm-kernel, Krzysztof Halasa
In-Reply-To: <20100818071942.GA4096@riccoc20.at.omicron.at>

On Wed, 2010-08-18 at 09:19 +0200, Richard Cochran wrote:
> On Tue, Aug 17, 2010 at 05:22:43PM -0700, john stultz wrote:
>>
> > So while to me, it think it would be more ideal (or maybe just less
> > different) to have a read-only interface (like the RTC), leaving PTPd to
> > manage offset calculations and use that to steer the system time. I can
> > acknowledge the need to have some way to correct the freq so the packet
> > timestamps are corrected.
> 
> The PTPd need not change the system time at all for PTP clock to be
> useful. (see below)

Right, obviously an ok-solution is often more useful then no-solution.
But that doesn't mean we shouldn't shoot for a good or even
great-solution. :)


> > I still feel a little concerned over the timer/alarm related interfaces.
> > Could you explain why the alarm interface is necessary? 
> 
> The timer/alarm stuff is "ancillary" and is not at all necessary. It
> is just a "nice to have." I will happily remove it, if it is too
> troubling for people.

If there's a compelling argument for it, I'm interested to hear. But
again, it seems like just
yet-another-way-to-get-alarm/timer-functionality, so before we add an
extra API (or widen an existing API) I'd like to understand the need.

But maybe it might simplify the discussion to pull it for now, but
keeping it in mind to possibly include later as an extension?

> > So really I think my initial negative gut reaction to this was mostly
> > out of the fact that you introduced a char dev that provides almost 100%
> > coverage of the posix-time interface. That is duplication we definitely
> > don't want. 
> 
> The reason why I modelled the char device on the posix interface was
> to make the API more familiar to application programmers. After the
> recent discussion (and having reviewed the posix clock implementation
> in Linux), I now think it would be even better to simply offer a new
> posic clock ID for PTP.
> 
> I was emulating the posix interface. Instead I should use it directly.

I'm definitely interested to see what you come up with here. I'm still
hesitant with adding a PTP clock_id, but extending the posix-clocks
interface in this way isn't unprecedented (see: CLOCK_SGI_CYCLE) I just
would like to make sure we don't end up with a clock_id namespace
littered with oddball clocks that were not well abstracted (see:
CLOCK_SGI_CYCLE :).

For instance: imagine if instead of keeping the clocksource abstraction
internal to the timekeeping core, we exposed each clocksource to
userland via a clock_id.  Every arch would have different ids, and each
arch might have multiple ids. Programming against that would be a huge
pain.

So in thinking about this, try to focus on what the new clock_id
provides that the other existing clockids do not? Are they at comparable
levels of abstraction? 15 years from now, are folks likely to still be
using it? Will it be maintainable? etc...


> > Also I think the documentation I've read about PTP (likely just due to
> > the engineering focus) has an odd inverted sense of priority, focusing
> > on keeping obscure hardware clocks on NIC cards in sync, rather then the
> > the more tangible feature of keeping the system time in sync.
> > 
> > This could be comically interpreted as trying to create a shadow-time on
> > the system that is the "real time" and "yea, maybe we'll let the system
> > know what time it is, but user-apps who want to know the score can send
> > a magic ioctl to /dev/something and get the real deal". ;)  I'm sure
> > that's not the case, but I'd like to keep any confusion in userland
> > about which time is the best time to a minimum (ie: use the system
> > time).
> 
> You are right. As John Eidson's excellent book points out, modern
> computers and operating systems provide surprisingly little support
> for programming based on absolute time. It is not PTP's fault. PTP is
> actually a step in the right direction, but it doesn't yet really fit
> in to the present computing world.

You'll have to forgive me, as I haven't had the time to check out that
book. What exactly do you mean by operating systems provide little
support for programming based on absolute time?


> Okay, here is the Big Picture.
> 
> 1. Use Case: SW timestamping
> 
>    PTP with software timestamping (ie without special hardware) can
>    acheive synchronization within a few dozen microseconds, after
>    about twenty minutes. This is sufficient for very many people. The
>    new API (whether char device or syscall) fully and simply supports
>    this use case. When the PTPd adjusts the PTP clock, it is actually
>    adjusting the system time, just like NTPd.

Again this illustrates the inversion of focus: system time is merely one
of many possible PTP clocks in the larger PTP framework.

The way I tend to see it: PTP is just one of the many ways to sync
system time.


> 2. Use Case: HW timestamping for industrial control
> 
>    PTP with hardware timestamping can acheive synchronization within
>    100 nanoseconds after one minute. If you want to do something with
>    your wonderfully synchronization PTP clock, it must have some kind
>    of special hardware, like timestamping external signals or
>    generating one-shot or periodic outputs. The new API (whether char
>    device or syscall) supports this use case via the ancillary
>    commands.
> 
>    In this case, the end user has an application that interfaces with
>    the outside world via the PTP clock API. Such a specialized
>    application (for example, motor control) uses only the PTP API,
>    since it knows that the standard posix API cannot help. It is
>    irrelevant that the system time is not synchronized, in this case.
> 
>    The PTP clock hardware may or may not provide a hardware interface
>    (interrupt) to the main CPU. In this case, it does not matter. The
>    PTP clock is useful all by itself.

These specialized applications are part of what concerns me the most. 

For example, I can see some parallels between things like audio
processing, where you have a buffer consumed by the card at a certain
rate. Now, the card has its own crystal it uses to time its consumption,
so it has its own time domain, and could drift from system time. Thus
you want to trigger buffer-refill interrupts off of the audio card's
clock, not the system time which might run the risk of being late.

But again, we don't expose the audio hardware clock to userland in the
same way we expose system time.

So, instead of a chardev or a posix clock id, would it make more sense
for the PTP adjustment interface to be more closely to the
nic/phy/whatever interface?  Much like how the audio apis do it?

Again, my knowledge in the networking stack is pretty limited. But it
would seem that having an interface that does something to the effect of
"adjust the timestamp clock on the hardware that generated it from this
packet by Xppb" would feel like the right level of abstraction. Its
closely related to SO_TIMESTAMP, option right? Would something like
using the setsockopt/getsockopt interface with
SO_TIMESTAMP_ADJUST/OFFSET/SET/etc be reasonable?


> 3. Use Case: HW timestamping with PPS to host
> 
>    This case is the same as case 2, with the exception that the PTP
>    clock can interrupt the main CPU. The PTP clock driver advertises
>    the "PPS" capability. When enabled, the PTP layer delivers events
>    via the existing Linux PPS subsystem. Programs like NTPd can use
>    these events to regulate the system time.
> 
>    This means that the system clock and the PTP clock will be at least
>    as well synchronized as when using a traditionial radio clock, GPS,
>    or IRIG-B method. In my opinion, this will be good enough for any
>    practical purpose. For example, let's say you want to run a
>    periodic task synchronized to the absolute wall clock time. Your
>    scheduling latency will be a dozen microseconds or so. Your PPS
>    synchronized system clock should be close enough to the PTP clock
>    to support this.

And yes, this seems perfectly reasonable feature to add. Its not
controversial to me, because its likely to work within the existing
interfaces and won't expose anything new to userland.


Again, sorry to be such a pain about all of this. I know its frustrating
to bring your hard work to lkml and then get a lot of non-specific
push-back to "do it differently". You're earlier comments about being
wary of adding syscalls because it requires lots of review and debate
were spot on, but *any* user-visible API really requires the same level
of care (which, of course, they don't always get). So please don't take
my comments personally. Keep pushing and I will either come around or
maybe we can find a better middle-ground.

thanks
-john

^ permalink raw reply

* (no subject)
From: Wolfgang Denk @ 2010-08-18 20:56 UTC (permalink / raw)
  To: Rupjyoti Sarmah; +Cc: linuxppc-dev, Prodyut Hazarika, Mark Miesfeld

Dear Rupjyoti,

drivers/ata/sata_dwc_460ex.c fails to build in current mainline:

...
  CC      drivers/ata/sata_dwc_460ex.o
drivers/ata/sata_dwc_460ex.c:43:1: warning: "DRV_NAME" redefined
In file included from drivers/ata/sata_dwc_460ex.c:38:
drivers/ata/libata.h:31:1: warning: this is the location of the previous definition
drivers/ata/sata_dwc_460ex.c:44:1: warning: "DRV_VERSION" redefined
drivers/ata/libata.h:32:1: warning: this is the location of the previous definition
drivers/ata/sata_dwc_460ex.c: In function 'sata_dwc_exec_command_by_tag':
drivers/ata/sata_dwc_460ex.c:1356: warning: passing argument 1 of 'ata_get_cmd_descript' makes integer from pointer without a cast
drivers/ata/sata_dwc_460ex.c: At top level:
drivers/ata/sata_dwc_460ex.c:1592: warning: 'struct of_device' declared inside parameter list
drivers/ata/sata_dwc_460ex.c:1592: warning: its scope is only this definition or declaration, which is probably not what you want
drivers/ata/sata_dwc_460ex.c: In function 'sata_dwc_probe':
drivers/ata/sata_dwc_460ex.c:1607: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1614: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1616: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1622: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1628: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1630: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1646: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1650: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1652: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1658: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1660: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1667: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1676: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1678: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1691: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1693: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c: At top level:
drivers/ata/sata_dwc_460ex.c:1705: warning: 'struct of_device' declared inside parameter list
drivers/ata/sata_dwc_460ex.c: In function 'sata_dwc_remove':
drivers/ata/sata_dwc_460ex.c:1707: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c:1720: error: dereferencing pointer to incomplete type
drivers/ata/sata_dwc_460ex.c: At top level:
drivers/ata/sata_dwc_460ex.c:1736: warning: initialization from incompatible pointer type
drivers/ata/sata_dwc_460ex.c:1737: warning: initialization from incompatible pointer type
make[2]: *** [drivers/ata/sata_dwc_460ex.o] Error 1

Do you have any hints how to fix that?

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
Few people do business well who do nothing else.
                                       -- Philip Earl of Chesterfield

^ permalink raw reply

* [PATCH] Correct rtas_data_buf locking in dlpar code
From: Nathan Fontenot @ 2010-08-18 19:58 UTC (permalink / raw)
  To: linuxppc-dev

The dlpar code can cause a deadlock to occur when making the RTAS
configure-connector call.  This occurs because we make kmalloc calls,
which can block, while parsing the rtas_data_buf and holding the
rtas_data_buf_lock.  This an cause issues if someone else attempts
to grab the rtas_data_bug_lock.

This patch alleviates this issue by copying the contents of the rtas_data_buf
to a local buffer before parsing.  This allows us to only hold the
rtas_data_buf_lock around the RTAS configure-connector calls.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>

---
 arch/powerpc/platforms/pseries/dlpar.c |   42 ++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 13 deletions(-)

Index: powerpc/arch/powerpc/platforms/pseries/dlpar.c
===================================================================
--- powerpc.orig/arch/powerpc/platforms/pseries/dlpar.c	2010-08-05 10:55:46.000000000 -0500
+++ powerpc/arch/powerpc/platforms/pseries/dlpar.c	2010-08-18 11:42:10.000000000 -0500
@@ -129,20 +129,35 @@ struct device_node *dlpar_configure_conn
 	struct property *property;
 	struct property *last_property = NULL;
 	struct cc_workarea *ccwa;
+	char *data_buf;
 	int cc_token;
-	int rc;
+	int rc = -1;
 
 	cc_token = rtas_token("ibm,configure-connector");
 	if (cc_token == RTAS_UNKNOWN_SERVICE)
 		return NULL;
 
-	spin_lock(&rtas_data_buf_lock);
-	ccwa = (struct cc_workarea *)&rtas_data_buf[0];
+	data_buf = kzalloc(RTAS_DATA_BUF_SIZE, GFP_KERNEL);
+	if (!data_buf)
+		return NULL;
+
+	ccwa = (struct cc_workarea *)&data_buf[0];
 	ccwa->drc_index = drc_index;
 	ccwa->zero = 0;
 
-	rc = rtas_call(cc_token, 2, 1, NULL, rtas_data_buf, NULL);
-	while (rc) {
+	do {
+		/* Since we release the rtas_data_buf lock between configure
+		 * connector calls we want to re-populate the rtas_data_buffer
+		 * with the contents of the previous call.
+		 */
+		spin_lock(&rtas_data_buf_lock);
+
+		memcpy(rtas_data_buf, data_buf, RTAS_DATA_BUF_SIZE);
+		rc = rtas_call(cc_token, 2, 1, NULL, rtas_data_buf, NULL);
+		memcpy(data_buf, rtas_data_buf, RTAS_DATA_BUF_SIZE);
+
+		spin_unlock(&rtas_data_buf_lock);
+
 		switch (rc) {
 		case NEXT_SIBLING:
 			dn = dlpar_parse_cc_node(ccwa);
@@ -197,18 +212,19 @@ struct device_node *dlpar_configure_conn
 			       "returned from configure-connector\n", rc);
 			goto cc_error;
 		}
+	} while (rc);
 
-		rc = rtas_call(cc_token, 2, 1, NULL, rtas_data_buf, NULL);
+cc_error:
+	kfree(data_buf);
+
+	if (rc) {
+		if (first_dn)
+			dlpar_free_cc_nodes(first_dn);
+
+		return NULL;
 	}
 
-	spin_unlock(&rtas_data_buf_lock);
 	return first_dn;
-
-cc_error:
-	if (first_dn)
-		dlpar_free_cc_nodes(first_dn);
-	spin_unlock(&rtas_data_buf_lock);
-	return NULL;
 }
 
 static struct device_node *derive_parent(const char *path)

^ permalink raw reply

* Re: [PATCH 3/3] powerpc/85xx: Cleanup QE initialization for MPC85xxMDS boards
From: Timur Tabi @ 2010-08-18 19:31 UTC (permalink / raw)
  To: Anton Vorontsov; +Cc: linuxppc-dev
In-Reply-To: <20100608195557.GC11446@oksana.dev.rtsoft.ru>

On Tue, Jun 8, 2010 at 2:55 PM, Anton Vorontsov <avorontsov@mvista.com> wrote:
> The mpc85xx_mds_setup_arch() function is incomprehensible
> and unmaintainable. Factor out all QE specific stuff into
> mpc85xx_mds_qe_init() and mpc85xx_mds_reset_ucc_phys().
>
> Also move QE stuff out of mpc85xx_mds_pic_init().
>
> The diff is unreadable, but only because the code was so. ;-)
> It should be better now, and less indented.
>
> Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
> ---

This patch introduces breaks mpc85xx_smp_defconfig:

  CC      arch/powerpc/platforms/85xx/mpc85xx_mds.o
arch/powerpc/platforms/85xx/mpc85xx_mds.c: In function 'mpc85xx_mds_setup_arch':
arch/powerpc/platforms/85xx/mpc85xx_mds.c:367: error: 'np' undeclared
(first use in this function)
arch/powerpc/platforms/85xx/mpc85xx_mds.c:367: error: (Each undeclared
identifier is reported only once
arch/powerpc/platforms/85xx/mpc85xx_mds.c:367: error: for each
function it appears in.)

Frankly, I've been seeing a lot of build breaks with
mpc85xx_[smp_]defconfig.  I had a patch that added a
p1022_ds_defconfig, but was told instead to update
mpc85xx_[smp_]defconfig.  However, it appears that no one is really
testing mpc85xx_[smp_]defconfig, since there are now *two* failures
(this one, and CONFIG_RAPIDIO) with these defconfig files.

-- 
Timur Tabi
Linux kernel developer at Freescale

^ permalink raw reply

* [PATCH] powerpc/pci: fix checking for child bridges in PCI code.
From: Grant Likely @ 2010-08-18 18:27 UTC (permalink / raw)
  To: benh, linuxppc-dev, linux-kernel; +Cc: Julia Lawall

pci_device_to_OF_node() can return null, and list_for_each_entry will
never enter the loop when dev is NULL, so it looks like this test is
a typo.

Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
---
 arch/powerpc/kernel/pci_of_scan.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
index 6ddb795..e751506 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -336,7 +336,7 @@ static void __devinit __of_scan_bus(struct device_node *node,
 		if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE ||
 		    dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
 			struct device_node *child = pci_device_to_OF_node(dev);
-			if (dev)
+			if (child)
 				of_scan_pci_bridge(child, dev);
 		}
 	}

^ permalink raw reply related

* Re: [PATCH 0/4] Some 47x patches for the powerpc-4xx tree
From: Dave Kleikamp @ 2010-08-18 16:45 UTC (permalink / raw)
  To: Josh Boyer; +Cc: linuxppc-dev list
In-Reply-To: <1282149866-4710-1-git-send-email-shaggy@linux.vnet.ibm.com>

Sorry!  Forgot to change the subject.

Shaggy

On Wed, 2010-08-18 at 11:44 -0500, Dave Kleikamp wrote:
> Josh,
> Here are some bug fixes for the powerpc-4xx tree.  It'd be nice if they
> could make it into 2.6.46.
> 
> Thanks,
> Shaggy
> 
> Dave Kleikamp (4):
>   powerpc/47x: Make sure mcsr is cleared before enabling machine check
>     interrupts
>   powerpc/47x: Remove redundant line from cputable.c
>   powerpc/4xx: Index interrupt stacks by physical cpu
>   powerpc/47x: Add an isync before the tlbivax instruction
> 
>  arch/powerpc/kernel/cputable.c   |    1 -
>  arch/powerpc/kernel/head_44x.S   |    4 ++++
>  arch/powerpc/kernel/irq.c        |   15 ++++++++-------
>  arch/powerpc/kernel/setup_32.c   |    9 +++++----
>  arch/powerpc/mm/tlb_nohash_low.S |    1 +
>  5 files changed, 18 insertions(+), 12 deletions(-)
> 

-- 
Dave Kleikamp
IBM Linux Technology Center

^ permalink raw reply

* [PATCH 4/4] powerpc/47x: Add an isync before the tlbivax instruction
From: Dave Kleikamp @ 2010-08-18 16:44 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Dave Kleikamp, linuxppc-dev list
In-Reply-To: <1282149866-4710-1-git-send-email-shaggy@linux.vnet.ibm.com>

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
 arch/powerpc/mm/tlb_nohash_low.S |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/mm/tlb_nohash_low.S b/arch/powerpc/mm/tlb_nohash_low.S
index cfa7682..b9d9fed 100644
--- a/arch/powerpc/mm/tlb_nohash_low.S
+++ b/arch/powerpc/mm/tlb_nohash_low.S
@@ -200,6 +200,7 @@ _GLOBAL(_tlbivax_bcast)
 	rlwimi	r5,r4,0,16,31
 	wrteei	0
 	mtspr	SPRN_MMUCR,r5
+	isync
 /*	tlbivax	0,r3 - use .long to avoid binutils deps */
 	.long 0x7c000624 | (r3 << 11)
 	isync
-- 
1.7.1

^ permalink raw reply related

* [PATCH 3/4] powerpc/4xx: Index interrupt stacks by physical cpu
From: Dave Kleikamp @ 2010-08-18 16:44 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Dave Kleikamp, linuxppc-dev list
In-Reply-To: <1282149866-4710-1-git-send-email-shaggy@linux.vnet.ibm.com>

The interrupt stacks need to be indexed by the physical cpu since the
critical, debug and machine check handlers use the contents of SPRN_PIR to
index the critirq_ctx, dbgirq_ctx, and mcheckirq_ctx arrays.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/irq.c      |   15 ++++++++-------
 arch/powerpc/kernel/setup_32.c |    9 +++++----
 2 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index d3ce67c..52e9c95 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -446,22 +446,23 @@ struct thread_info *mcheckirq_ctx[NR_CPUS] __read_mostly;
 void exc_lvl_ctx_init(void)
 {
 	struct thread_info *tp;
-	int i;
+	int i, hw_cpu;
 
 	for_each_possible_cpu(i) {
-		memset((void *)critirq_ctx[i], 0, THREAD_SIZE);
-		tp = critirq_ctx[i];
+		hw_cpu = get_hard_smp_processor_id(i);
+		memset((void *)critirq_ctx[hw_cpu], 0, THREAD_SIZE);
+		tp = critirq_ctx[hw_cpu];
 		tp->cpu = i;
 		tp->preempt_count = 0;
 
 #ifdef CONFIG_BOOKE
-		memset((void *)dbgirq_ctx[i], 0, THREAD_SIZE);
-		tp = dbgirq_ctx[i];
+		memset((void *)dbgirq_ctx[hw_cpu], 0, THREAD_SIZE);
+		tp = dbgirq_ctx[hw_cpu];
 		tp->cpu = i;
 		tp->preempt_count = 0;
 
-		memset((void *)mcheckirq_ctx[i], 0, THREAD_SIZE);
-		tp = mcheckirq_ctx[i];
+		memset((void *)mcheckirq_ctx[hw_cpu], 0, THREAD_SIZE);
+		tp = mcheckirq_ctx[hw_cpu];
 		tp->cpu = i;
 		tp->preempt_count = HARDIRQ_OFFSET;
 #endif
diff --git a/arch/powerpc/kernel/setup_32.c b/arch/powerpc/kernel/setup_32.c
index a10ffc8..93666f9 100644
--- a/arch/powerpc/kernel/setup_32.c
+++ b/arch/powerpc/kernel/setup_32.c
@@ -258,17 +258,18 @@ static void __init irqstack_early_init(void)
 #if defined(CONFIG_BOOKE) || defined(CONFIG_40x)
 static void __init exc_lvl_early_init(void)
 {
-	unsigned int i;
+	unsigned int i, hw_cpu;
 
 	/* interrupt stacks must be in lowmem, we get that for free on ppc32
 	 * as the memblock is limited to lowmem by MEMBLOCK_REAL_LIMIT */
 	for_each_possible_cpu(i) {
-		critirq_ctx[i] = (struct thread_info *)
+		hw_cpu = get_hard_smp_processor_id(i);
+		critirq_ctx[hw_cpu] = (struct thread_info *)
 			__va(memblock_alloc(THREAD_SIZE, THREAD_SIZE));
 #ifdef CONFIG_BOOKE
-		dbgirq_ctx[i] = (struct thread_info *)
+		dbgirq_ctx[hw_cpu] = (struct thread_info *)
 			__va(memblock_alloc(THREAD_SIZE, THREAD_SIZE));
-		mcheckirq_ctx[i] = (struct thread_info *)
+		mcheckirq_ctx[hw_cpu] = (struct thread_info *)
 			__va(memblock_alloc(THREAD_SIZE, THREAD_SIZE));
 #endif
 	}
-- 
1.7.1

^ permalink raw reply related

* [PATCH 2/4] powerpc/47x: Remove redundant line from cputable.c
From: Dave Kleikamp @ 2010-08-18 16:44 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Dave Kleikamp, linuxppc-dev list
In-Reply-To: <1282149866-4710-1-git-send-email-shaggy@linux.vnet.ibm.com>

There are two entries for .cpu_user_features in
arch/powerpc/kernel/cputable.c.  Remove the one that doesn't belong

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/cputable.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 65e2b4e..1f9123f 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -1826,7 +1826,6 @@ static struct cpu_spec __initdata cpu_specs[] = {
 		.cpu_features		= CPU_FTRS_47X,
 		.cpu_user_features	= COMMON_USER_BOOKE |
 			PPC_FEATURE_HAS_FPU,
-		.cpu_user_features	= COMMON_USER_BOOKE,
 		.mmu_features		= MMU_FTR_TYPE_47x |
 			MMU_FTR_USE_TLBIVAX_BCAST | MMU_FTR_LOCK_BCAST_INVAL,
 		.icache_bsize		= 32,
-- 
1.7.1

^ permalink raw reply related

* [PATCH 1/4] powerpc/47x: Make sure mcsr is cleared before enabling machine check interrupts
From: Dave Kleikamp @ 2010-08-18 16:44 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Dave Kleikamp, linuxppc-dev list
In-Reply-To: <1282149866-4710-1-git-send-email-shaggy@linux.vnet.ibm.com>

Clear the machine check syndrom register before enabling machine check
interrupts.  The initial state of the tlb can lead to parity errors being
flagged early after a cold boot.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/head_44x.S |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S
index 5ab484e..562305b 100644
--- a/arch/powerpc/kernel/head_44x.S
+++ b/arch/powerpc/kernel/head_44x.S
@@ -113,6 +113,10 @@ _ENTRY(_start);
 	stw	r5, 0(r4)	/* Save abatron_pteptrs at a fixed location */
 	stw	r6, 0(r5)
 
+	/* Clear the Machine Check Syndrome Register */
+	li	r0,0
+	mtspr	SPRN_MCSR,r0
+
 	/* Let's move on */
 	lis	r4,start_kernel@h
 	ori	r4,r4,start_kernel@l
-- 
1.7.1

^ permalink raw reply related

* [PATCH 0/4] *** SUBJECT HERE ***
From: Dave Kleikamp @ 2010-08-18 16:44 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Dave Kleikamp, linuxppc-dev list

Josh,
Here are some bug fixes for the powerpc-4xx tree.  It'd be nice if they
could make it into 2.6.46.

Thanks,
Shaggy

Dave Kleikamp (4):
  powerpc/47x: Make sure mcsr is cleared before enabling machine check
    interrupts
  powerpc/47x: Remove redundant line from cputable.c
  powerpc/4xx: Index interrupt stacks by physical cpu
  powerpc/47x: Add an isync before the tlbivax instruction

 arch/powerpc/kernel/cputable.c   |    1 -
 arch/powerpc/kernel/head_44x.S   |    4 ++++
 arch/powerpc/kernel/irq.c        |   15 ++++++++-------
 arch/powerpc/kernel/setup_32.c   |    9 +++++----
 arch/powerpc/mm/tlb_nohash_low.S |    1 +
 5 files changed, 18 insertions(+), 12 deletions(-)

^ permalink raw reply

* Re: [PATCH 1/5] ptp: Added a brand new class driver for ptp clocks.
From: Arnd Bergmann @ 2010-08-18 15:02 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Rodolfo Giometti, john stultz, devicetree-discuss, linux-kernel,
	netdev, linuxppc-dev, linux-arm-kernel, Krzysztof Halasa
In-Reply-To: <20100818140440.GA22655@riccoc20.at.omicron.at>

On Wednesday 18 August 2010, Richard Cochran wrote:
> On Tue, Aug 17, 2010 at 01:36:29PM +0200, Arnd Bergmann wrote:
> > On Tuesday 17 August 2010, Richard Cochran wrote:
> > > I've been looking at offering the PTP clock as a posix clock, and it
> > > is not as hard as I first thought. The PTP clock or clocks just have
> > > to be registered as one of the posix_clocks[MAX_CLOCKS] in
> > > posix-timers.c.
> > 
> > Ok sounds good.
> 
> I've been working turning the PTP stuff into syscalls, but here is a
> little issue I ran into. The PTP clock layer wants to call the PPS
> code via pps_register_source(), but the PPS can be compiled as a
> module. Since the PTP layer is now offering syscalls, it must always
> be present in the kernel. So I need to make sure that the PPS is
> either absent entirely or staticly linked in.
> 
> How can I do this?

You might want to use callbacks for these system calls that you
can register to at module load time, like it is done for the
existing syscalls.

The simpler way (e.g. for testing) is using Kconfig dependencies, like

config PTP
	bool "IEEE 1588 Precision Time Protocol"

config PPS
	tristate "Pulse per Second"
	depends on PTP || !PTP

The depends statement is a way of expressing that when PTP is enabled,
PPS cannot be a module, while it may be a module if PTP is disabled.

	Arnd

^ permalink raw reply

* Re: [PATCH 1/5] ptp: Added a brand new class driver for ptp clocks.
From: Richard Cochran @ 2010-08-18 14:04 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Rodolfo Giometti, john stultz, devicetree-discuss, linux-kernel,
	netdev, linuxppc-dev, linux-arm-kernel, Krzysztof Halasa
In-Reply-To: <201008171336.29375.arnd@arndb.de>

On Tue, Aug 17, 2010 at 01:36:29PM +0200, Arnd Bergmann wrote:
> On Tuesday 17 August 2010, Richard Cochran wrote:
> > I've been looking at offering the PTP clock as a posix clock, and it
> > is not as hard as I first thought. The PTP clock or clocks just have
> > to be registered as one of the posix_clocks[MAX_CLOCKS] in
> > posix-timers.c.
> 
> Ok sounds good.

I've been working turning the PTP stuff into syscalls, but here is a
little issue I ran into. The PTP clock layer wants to call the PPS
code via pps_register_source(), but the PPS can be compiled as a
module. Since the PTP layer is now offering syscalls, it must always
be present in the kernel. So I need to make sure that the PPS is
either absent entirely or staticly linked in.

How can I do this?

Thanks,

Richard

^ permalink raw reply

* Re: How to use mpc8xxx_gpio.c device driver
From: Ravi Gupta @ 2010-08-18 12:45 UTC (permalink / raw)
  To: Anton Vorontsov, Ira W. Snyder, MJ embd; +Cc: linuxppc-dev
In-Reply-To: <20100813131618.GA25414@oksana.dev.rtsoft.ru>

[-- Attachment #1: Type: text/plain, Size: 4111 bytes --]

Hi,

Thanks for all your replies. Now I want to enable interrupts on some GPIO
pins(GPIO pins number 224, 225, 226, 227, 228 and 229 i.e gpiochip224's pins
0 to 5). I configured these pins as input(GPIOF_IN) during gpio request. I
am able to request gpio pins successfully.

#define GPIO_CHIP0_BASE    192
#define GPIO_CHIP0_COUNT   32

#define GPIO_CHIP1_BASE    224
#define GPIO_CHIP1_COUNT   32

#define GPIO(chip, pin) (GPIO_CHIP##chip##_BASE + (pin))

/* Initial GPIO pins setting. */
static struct gpio gpio_pins[] = {
  { GPIO(1,  0), GPIOF_IN,            "Chip 1, Pin 0"  },
  { GPIO(1,  1), GPIOF_IN,            "Chip 1, Pin 1"  },
  { GPIO(1,  2), GPIOF_IN,            "Chip 1, Pin 2"  },
  { GPIO(1,  3), GPIOF_IN,            "Chip 1, Pin 3"  },
  { GPIO(1,  4), GPIOF_IN,            "Chip 1, Pin 4"  },
  { GPIO(1,  5), GPIOF_IN,            "Chip 1, Pin 5"  },
};

static __init int gpio_module_init(void)
{
  int err;
  dev_t devno = 0;

  printk(KERN_INFO "GPIO Driver Version 0.1\n");

  /* resuest the gpio pin */
  err = gpio_request_array(gpio_pins, ARRAY_SIZE(gpio_pins));
  if (err) {
      printk(KERN_WARNING "gpio: unable to request gpio pins\n");
      return -EBUSY;
  }

  /* register interrupts handlers */
  err = gpio_request_irq_array(gpio_irq_pins, ARRAY_SIZE(gpio_irq_pins));
  if (err) {
      printk(KERN_WARNING "gpio: unable to register interrupt handler.\n");
      goto interrupt_fail;
  }
.
.
.
}

Now in order to enable interrupts I have to set the bits corresponding to
these pins in gpiochp Interrupt Mask Register.

My first question is that is there any gpio API call available for doing
this? OR I have to manually memory map the gpio IMR register and set bit
manually? For the time being I have manually set the IMR bit(using memory
map).

Second, in order to register an interrupt handler on a gpio pin, I need a
IRQ number, so how am I going to get that number? From the
Documentation/gpio.txt, it looks to me that gpio_to_irq() function should
give me the irq number, but when I tried to use it, it gives me an error.

#define GPIO_CHIP0_BASE    192
#define GPIO_CHIP0_COUNT   32

#define GPIO_CHIP1_BASE    224
#define GPIO_CHIP1_COUNT   32

#define GPIO(chip, pin) (GPIO_CHIP##chip##_BASE + (pin))

static struct gpio_irq gpio_irq_pins[] = {
  { GPIO(1, 0), IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW },
  { GPIO(1, 1), IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW },
  { GPIO(1, 2), IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW },
  { GPIO(1, 3), IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW },
  { GPIO(1, 4), IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW },
  { GPIO(1, 5), IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW },
};

int gpio_request_irq_array(struct gpio_irq *array, size_t num)
{
  int i, err, irq;

  for (i = 0; i < num; i++, array++) {
    /* get the irq number corresponding to the gpio pin */
    irq = gpio_to_irq(array->gpio);
    if (irq < 0) {
      printk(KERN_ERR "gpio: Trying to get irq number for GPIO%d\n",
array->gpio);
      err = -EINVAL;
      goto err_free;
    }
    printd("gpio: irq number for GPIO%d = %d(i=%d)\n", array->gpio, irq, i);


    /* set the irq type for the gpio pin */
    err = set_irq_type(irq, array->type);
    if (err)
      goto err_free;

    /* register irq handler */
    err = request_irq(irq, gpio_irq_handler, IRQF_SHARED, "GPIO Driver",
NULL);
    if (err) {
      printk(KERN_ERR "gpio: can't get assigned irq for GPIO%d\n",
array->gpio);
      goto err_free;
    }
  }
  return 0;

err_free:
  while (i--)
    free_irq(gpio_to_irq((--array)->gpio), NULL);
  return err;
}

Given above is my driver code, I have implemented a function named
"gpio_request_irq_array" which works in same manner as that of
"gpio_request_array". I am calling in the init function, see the init code
given initially. Now when I load it on my MPC837xERDB board, it gives the
following errors.

[ 1812.776420] GPIO Driver Version 0.1
[ 1812.779985] gpio: Trying to get irq number for GPIO224
[ 1812.785126] gpio: unable to register interrupt handler.
insmod: error inserting './gpio.ko': -1 Invalid parameters

Thanks in advance
Ravi Gupta

[-- Attachment #2: Type: text/html, Size: 4701 bytes --]

^ permalink raw reply

* cpm_i2c: write OK but read fail
From: Shawn Jin @ 2010-08-18  9:04 UTC (permalink / raw)
  To: ppcdev

Hi,

My system is MPC870 based and I'm enabling the I2C driver and the
ds1339 RTC driver. When the kernel tries to probe the RTC chip (slave
address 0x68), the first write seems OK but the subsequent read times
out. I applied the I2C/SPI microcode patch, which doesn't seem
necessary though. Below is the dmesg output and all I2C debug messages
are enabled.

i2c-core: driver [rtc-ds1307] registered
i2c /dev entries driver
i2c-core: driver [dev_driver] registered
fsl-i2c-cpm fa200860.i2c: cpm_i2c_setup()
  alloc irq_desc for 21 on node 0
  alloc kstat_irqs on node 0
irq: irq 16 on host /soc@fa200000/cpm@9c0/interrupt-controller@930
mapped to virtual irq 21
fsl-i2c-cpm fa200860.i2c: i2c_ram 0xfddfa400, i2c_addr 0x0400, freq 60000
fsl-i2c-cpm fa200860.i2c: tbase 0x0340, rbase 0x0360
i2c i2c-0: adapter [i2c-cpm] registered
i2c-dev: adapter [i2c-cpm] registered as minor 0
fsl-i2c-cpm fa200860.i2c: hw routines for i2c-cpm registered.
i2c 0-0068: uevent
rtc-ds1307 0-0068: probe
i2c i2c-0: master_xfer[0] W, addr=0x68, len=1
i2c i2c-0: master_xfer[1] R, addr=0x68, len=2
i2c i2c-0: R: 0 T: 0
i2c i2c-0: cpm_i2c_write(abyte=0xd0)
i2c i2c-0: R: 0 T: 1
i2c i2c-0: cpm_i2c_read(abyte=0xd1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The above shows there were two transfers being built (one write, one read).

i2c i2c-0: test ready.
i2c i2c-0: Interrupt: 2
i2c i2c-0: ready.
i2c i2c-0: tx sc 0 0x1400
^^^^^^^^^^^^^^^^^^^^^^^
This shows the write command went through successfully.

i2c i2c-0: test ready.
i2c i2c-0: I2C transfer: timeout
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
However the read timed out. :(

i2c i2c-0: cpm_i2c_force_close()
rtc-ds1307: probe of 0-0068 failed with error -5
i2c i2c-0: client [ds1339] registered with bus id 0-0068

The Tx/Rx BDs are dumped below.
fa202340 : 14000002 07b51000 ac000003 07b53000  ..............0.
fa202350 : 9fc3f137 07b55000 9739a1d0 07b4f000  ...7..P..9......
fa202360 : 90000000 07b50000 b68fa19d 07b52000  .............. .
fa202370 : 1b85e6cc 07b54000 49df3090 07b4e000  ......@.I.0.....

What went wrong with read?

Thanks,
-Shawn.

^ permalink raw reply

* Re: [PATCH 1/5] ptp: Added a brand new class driver for ptp clocks.
From: Richard Cochran @ 2010-08-18  7:19 UTC (permalink / raw)
  To: john stultz
  Cc: Rodolfo Giometti, Arnd Bergmann, netdev, devicetree-discuss,
	linux-kernel, linuxppc-dev, linux-arm-kernel, Krzysztof Halasa
In-Reply-To: <1282090963.1734.97.camel@localhost>

On Tue, Aug 17, 2010 at 05:22:43PM -0700, john stultz wrote:
> Why would system time not be adjusted to the PTP time?
> 
> This is my main concern, that we're presenting a fractured API to
> userland. Suddenly there isn't just system time, but ptp time as well,
> and possibly multiple different ptp times.

John, it is a good thing to make thoughts about the big picture with
PTP clocks and the system time, like you are doing. However, the
situation is not as troubled as you think. Let me try to explain.

> The PTP clock is a bit of hardware (usually on the NIC) that can put
> timestamps on packets (both incoming or outgoing?).

Not only on the NIC. There are bunch of new products doing the
timestamping in the PHY or in a switch fabric attached to the host
like a PHY. The synchronization that one can achieve with PHY
timestamps is better that that with MAC timestamping.

> So while to me, it think it would be more ideal (or maybe just less
> different) to have a read-only interface (like the RTC), leaving PTPd to
> manage offset calculations and use that to steer the system time. I can
> acknowledge the need to have some way to correct the freq so the packet
> timestamps are corrected.

The PTPd need not change the system time at all for PTP clock to be
useful. (see below)

> I still feel a little concerned over the timer/alarm related interfaces.
> Could you explain why the alarm interface is necessary? 

The timer/alarm stuff is "ancillary" and is not at all necessary. It
is just a "nice to have." I will happily remove it, if it is too
troubling for people.
 
> So really I think my initial negative gut reaction to this was mostly
> out of the fact that you introduced a char dev that provides almost 100%
> coverage of the posix-time interface. That is duplication we definitely
> don't want. 

The reason why I modelled the char device on the posix interface was
to make the API more familiar to application programmers. After the
recent discussion (and having reviewed the posix clock implementation
in Linux), I now think it would be even better to simply offer a new
posic clock ID for PTP.

I was emulating the posix interface. Instead I should use it directly.

> Also I think the documentation I've read about PTP (likely just due to
> the engineering focus) has an odd inverted sense of priority, focusing
> on keeping obscure hardware clocks on NIC cards in sync, rather then the
> the more tangible feature of keeping the system time in sync.
> 
> This could be comically interpreted as trying to create a shadow-time on
> the system that is the "real time" and "yea, maybe we'll let the system
> know what time it is, but user-apps who want to know the score can send
> a magic ioctl to /dev/something and get the real deal". ;)  I'm sure
> that's not the case, but I'd like to keep any confusion in userland
> about which time is the best time to a minimum (ie: use the system
> time).

You are right. As John Eidson's excellent book points out, modern
computers and operating systems provide surprisingly little support
for programming based on absolute time. It is not PTP's fault. PTP is
actually a step in the right direction, but it doesn't yet really fit
in to the present computing world.

Okay, here is the Big Picture.

1. Use Case: SW timestamping

   PTP with software timestamping (ie without special hardware) can
   acheive synchronization within a few dozen microseconds, after
   about twenty minutes. This is sufficient for very many people. The
   new API (whether char device or syscall) fully and simply supports
   this use case. When the PTPd adjusts the PTP clock, it is actually
   adjusting the system time, just like NTPd.

2. Use Case: HW timestamping for industrial control

   PTP with hardware timestamping can acheive synchronization within
   100 nanoseconds after one minute. If you want to do something with
   your wonderfully synchronization PTP clock, it must have some kind
   of special hardware, like timestamping external signals or
   generating one-shot or periodic outputs. The new API (whether char
   device or syscall) supports this use case via the ancillary
   commands.

   In this case, the end user has an application that interfaces with
   the outside world via the PTP clock API. Such a specialized
   application (for example, motor control) uses only the PTP API,
   since it knows that the standard posix API cannot help. It is
   irrelevant that the system time is not synchronized, in this case.

   The PTP clock hardware may or may not provide a hardware interface
   (interrupt) to the main CPU. In this case, it does not matter. The
   PTP clock is useful all by itself.

3. Use Case: HW timestamping with PPS to host

   This case is the same as case 2, with the exception that the PTP
   clock can interrupt the main CPU. The PTP clock driver advertises
   the "PPS" capability. When enabled, the PTP layer delivers events
   via the existing Linux PPS subsystem. Programs like NTPd can use
   these events to regulate the system time.

   This means that the system clock and the PTP clock will be at least
   as well synchronized as when using a traditionial radio clock, GPS,
   or IRIG-B method. In my opinion, this will be good enough for any
   practical purpose. For example, let's say you want to run a
   periodic task synchronized to the absolute wall clock time. Your
   scheduling latency will be a dozen microseconds or so. Your PPS
   synchronized system clock should be close enough to the PTP clock
   to support this.

The API that I have suggested, whether offered as a char device or as
syscalls, supports all of the use cases using a single API.

Richard

^ permalink raw reply

* Re: 64-bit ppc rwsem
From: David Miller @ 2010-08-18  5:48 UTC (permalink / raw)
  To: sam; +Cc: linuxppc-dev, paulus, linux-kernel, sparclinux, akpm, torvalds
In-Reply-To: <20100818053955.GA11962@merkur.ravnborg.org>

From: Sam Ravnborg <sam@ravnborg.org>
Date: Wed, 18 Aug 2010 07:39:55 +0200

>> @@ -15,7 +15,7 @@ lib-$(CONFIG_SPARC32) += divdi3.o udivdi3.o
>>  lib-$(CONFIG_SPARC32) += copy_user.o locks.o
>>  lib-y                 += atomic_$(BITS).o
>>  lib-$(CONFIG_SPARC32) += lshrdi3.o ashldi3.o
>> -lib-y                 += rwsem_$(BITS).o
>> +lib-$(CONFIG_SPARC32) += rwsem_$(BITS).o
> 
> You could write this explicit as:
> 
>> +lib-$(CONFIG_SPARC32) += rwsem_32.o
> 
> As rwsem_64 is gone now.

Sure, I'll make that change, thanks Sam.

^ permalink raw reply

* Re: 64-bit ppc rwsem
From: Sam Ravnborg @ 2010-08-18  5:39 UTC (permalink / raw)
  To: David Miller
  Cc: linuxppc-dev, paulus, linux-kernel, sparclinux, akpm, torvalds
In-Reply-To: <20100817.222818.193699062.davem@davemloft.net>

> diff --git a/arch/sparc/lib/Makefile b/arch/sparc/lib/Makefile
> index c4b5e03..fa4c3ea 100644
> --- a/arch/sparc/lib/Makefile
> +++ b/arch/sparc/lib/Makefile
> @@ -15,7 +15,7 @@ lib-$(CONFIG_SPARC32) += divdi3.o udivdi3.o
>  lib-$(CONFIG_SPARC32) += copy_user.o locks.o
>  lib-y                 += atomic_$(BITS).o
>  lib-$(CONFIG_SPARC32) += lshrdi3.o ashldi3.o
> -lib-y                 += rwsem_$(BITS).o
> +lib-$(CONFIG_SPARC32) += rwsem_$(BITS).o

You could write this explicit as:

> +lib-$(CONFIG_SPARC32) += rwsem_32.o

As rwsem_64 is gone now.

	Sam

^ permalink raw reply

* Re: 64-bit ppc rwsem
From: David Miller @ 2010-08-18  5:28 UTC (permalink / raw)
  To: benh; +Cc: torvalds, paulus, linux-kernel, sparclinux, akpm, linuxppc-dev
In-Reply-To: <1282107803.22370.173.camel@pasglop>

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Wed, 18 Aug 2010 15:03:23 +1000

> I tried various tricks but so far they didn't work. I'll have another
> look tomorrow, but I may end up having to keep all the crap typecasts.

The casts are pretty much unavoidable.

Here's what I'm going to end up using on sparc64:

--------------------
sparc64: Make rwsems 64-bit.

Basically tip-off the powerpc code, use a 64-bit type and atomic64_t
interfaces for the implementation.

This gets us off of the by-hand asm code I wrote, which frankly I
think probably ruins I-cache hit rates.

The idea was the keep the call chains less deep, but anything taking
the rw-semaphores probably is also calling other stuff and therefore
already has allocated a stack-frame.  So no real stack frame savings
ever.

Ben H. has posted patches to make powerpc use 64-bit too and with some
abstractions we can probably use a shared header file somewhere.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 arch/sparc/include/asm/rwsem-const.h |   12 ---
 arch/sparc/include/asm/rwsem.h       |  120 +++++++++++++++++++++----
 arch/sparc/lib/Makefile              |    2 +-
 arch/sparc/lib/rwsem_64.S            |  163 ----------------------------------
 4 files changed, 104 insertions(+), 193 deletions(-)
 delete mode 100644 arch/sparc/include/asm/rwsem-const.h
 delete mode 100644 arch/sparc/lib/rwsem_64.S

diff --git a/arch/sparc/include/asm/rwsem-const.h b/arch/sparc/include/asm/rwsem-const.h
deleted file mode 100644
index e4c61a1..0000000
--- a/arch/sparc/include/asm/rwsem-const.h
+++ /dev/null
@@ -1,12 +0,0 @@
-/* rwsem-const.h: RW semaphore counter constants.  */
-#ifndef _SPARC64_RWSEM_CONST_H
-#define _SPARC64_RWSEM_CONST_H
-
-#define RWSEM_UNLOCKED_VALUE		0x00000000
-#define RWSEM_ACTIVE_BIAS		0x00000001
-#define RWSEM_ACTIVE_MASK		0x0000ffff
-#define RWSEM_WAITING_BIAS		(-0x00010000)
-#define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
-#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
-
-#endif /* _SPARC64_RWSEM_CONST_H */
diff --git a/arch/sparc/include/asm/rwsem.h b/arch/sparc/include/asm/rwsem.h
index 6e56210..a2b4302 100644
--- a/arch/sparc/include/asm/rwsem.h
+++ b/arch/sparc/include/asm/rwsem.h
@@ -15,16 +15,21 @@
 
 #include <linux/list.h>
 #include <linux/spinlock.h>
-#include <asm/rwsem-const.h>
 
 struct rwsem_waiter;
 
 struct rw_semaphore {
-	signed int count;
-	spinlock_t		wait_lock;
-	struct list_head	wait_list;
+	signed long			count;
+#define RWSEM_UNLOCKED_VALUE		0x00000000L
+#define RWSEM_ACTIVE_BIAS		0x00000001L
+#define RWSEM_ACTIVE_MASK		0xffffffffL
+#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
+#define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
+#define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+	spinlock_t			wait_lock;
+	struct list_head		wait_list;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map	dep_map;
+	struct lockdep_map		dep_map;
 #endif
 };
 
@@ -41,6 +46,11 @@ struct rw_semaphore {
 #define DECLARE_RWSEM(name) \
 	struct rw_semaphore name = __RWSEM_INITIALIZER(name)
 
+extern struct rw_semaphore *rwsem_down_read_failed(struct rw_semaphore *sem);
+extern struct rw_semaphore *rwsem_down_write_failed(struct rw_semaphore *sem);
+extern struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem);
+extern struct rw_semaphore *rwsem_downgrade_wake(struct rw_semaphore *sem);
+
 extern void __init_rwsem(struct rw_semaphore *sem, const char *name,
 			 struct lock_class_key *key);
 
@@ -51,27 +61,103 @@ do {								\
 	__init_rwsem((sem), #sem, &__key);			\
 } while (0)
 
-extern void __down_read(struct rw_semaphore *sem);
-extern int __down_read_trylock(struct rw_semaphore *sem);
-extern void __down_write(struct rw_semaphore *sem);
-extern int __down_write_trylock(struct rw_semaphore *sem);
-extern void __up_read(struct rw_semaphore *sem);
-extern void __up_write(struct rw_semaphore *sem);
-extern void __downgrade_write(struct rw_semaphore *sem);
+/*
+ * lock for reading
+ */
+static inline void __down_read(struct rw_semaphore *sem)
+{
+	if (unlikely(atomic64_inc_return((atomic64_t *)(&sem->count)) <= 0L))
+		rwsem_down_read_failed(sem);
+}
+
+static inline int __down_read_trylock(struct rw_semaphore *sem)
+{
+	long tmp;
+
+	while ((tmp = sem->count) >= 0L) {
+		if (tmp == cmpxchg(&sem->count, tmp,
+				   tmp + RWSEM_ACTIVE_READ_BIAS)) {
+			return 1;
+		}
+	}
+	return 0;
+}
 
+/*
+ * lock for writing
+ */
 static inline void __down_write_nested(struct rw_semaphore *sem, int subclass)
 {
-	__down_write(sem);
+	long tmp;
+
+	tmp = atomic64_add_return(RWSEM_ACTIVE_WRITE_BIAS,
+				  (atomic64_t *)(&sem->count));
+	if (unlikely(tmp != RWSEM_ACTIVE_WRITE_BIAS))
+		rwsem_down_write_failed(sem);
 }
 
-static inline int rwsem_atomic_update(int delta, struct rw_semaphore *sem)
+static inline void __down_write(struct rw_semaphore *sem)
 {
-	return atomic_add_return(delta, (atomic_t *)(&sem->count));
+	__down_write_nested(sem, 0);
+}
+
+static inline int __down_write_trylock(struct rw_semaphore *sem)
+{
+	long tmp;
+
+	tmp = cmpxchg(&sem->count, RWSEM_UNLOCKED_VALUE,
+		      RWSEM_ACTIVE_WRITE_BIAS);
+	return tmp == RWSEM_UNLOCKED_VALUE;
 }
 
-static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
+/*
+ * unlock after reading
+ */
+static inline void __up_read(struct rw_semaphore *sem)
+{
+	long tmp;
+
+	tmp = atomic64_dec_return((atomic64_t *)(&sem->count));
+	if (unlikely(tmp < -1L && (tmp & RWSEM_ACTIVE_MASK) == 0L))
+		rwsem_wake(sem);
+}
+
+/*
+ * unlock after writing
+ */
+static inline void __up_write(struct rw_semaphore *sem)
+{
+	if (unlikely(atomic64_sub_return(RWSEM_ACTIVE_WRITE_BIAS,
+					 (atomic64_t *)(&sem->count)) < 0L))
+		rwsem_wake(sem);
+}
+
+/*
+ * implement atomic add functionality
+ */
+static inline void rwsem_atomic_add(long delta, struct rw_semaphore *sem)
+{
+	atomic64_add(delta, (atomic64_t *)(&sem->count));
+}
+
+/*
+ * downgrade write lock to read lock
+ */
+static inline void __downgrade_write(struct rw_semaphore *sem)
+{
+	long tmp;
+
+	tmp = atomic64_add_return(-RWSEM_WAITING_BIAS, (atomic64_t *)(&sem->count));
+	if (tmp < 0L)
+		rwsem_downgrade_wake(sem);
+}
+
+/*
+ * implement exchange and add functionality
+ */
+static inline long rwsem_atomic_update(long delta, struct rw_semaphore *sem)
 {
-	atomic_add(delta, (atomic_t *)(&sem->count));
+	return atomic64_add_return(delta, (atomic64_t *)(&sem->count));
 }
 
 static inline int rwsem_is_locked(struct rw_semaphore *sem)
diff --git a/arch/sparc/lib/Makefile b/arch/sparc/lib/Makefile
index c4b5e03..fa4c3ea 100644
--- a/arch/sparc/lib/Makefile
+++ b/arch/sparc/lib/Makefile
@@ -15,7 +15,7 @@ lib-$(CONFIG_SPARC32) += divdi3.o udivdi3.o
 lib-$(CONFIG_SPARC32) += copy_user.o locks.o
 lib-y                 += atomic_$(BITS).o
 lib-$(CONFIG_SPARC32) += lshrdi3.o ashldi3.o
-lib-y                 += rwsem_$(BITS).o
+lib-$(CONFIG_SPARC32) += rwsem_$(BITS).o
 lib-$(CONFIG_SPARC32) += muldi3.o bitext.o cmpdi2.o
 
 lib-$(CONFIG_SPARC64) += copy_page.o clear_page.o bzero.o
diff --git a/arch/sparc/lib/rwsem_64.S b/arch/sparc/lib/rwsem_64.S
deleted file mode 100644
index 91a7d29..0000000
--- a/arch/sparc/lib/rwsem_64.S
+++ /dev/null
@@ -1,163 +0,0 @@
-/* rwsem.S: RW semaphore assembler.
- *
- * Written by David S. Miller (davem@redhat.com), 2001.
- * Derived from asm-i386/rwsem.h
- */
-
-#include <asm/rwsem-const.h>
-
-	.section	.sched.text, "ax"
-
-	.globl		__down_read
-__down_read:
-1:	lduw		[%o0], %g1
-	add		%g1, 1, %g7
-	cas		[%o0], %g1, %g7
-	cmp		%g1, %g7
-	bne,pn		%icc, 1b
-	 add		%g7, 1, %g7
-	cmp		%g7, 0
-	bl,pn		%icc, 3f
-	 nop
-2:
-	retl
-	 nop
-3:
-	save		%sp, -192, %sp
-	call		rwsem_down_read_failed
-	 mov		%i0, %o0
-	ret
-	 restore
-	.size		__down_read, .-__down_read
-
-	.globl		__down_read_trylock
-__down_read_trylock:
-1:	lduw		[%o0], %g1
-	add		%g1, 1, %g7
-	cmp		%g7, 0
-	bl,pn		%icc, 2f
-	 mov		0, %o1
-	cas		[%o0], %g1, %g7
-	cmp		%g1, %g7
-	bne,pn		%icc, 1b
-	 mov		1, %o1
-2:	retl
-	 mov		%o1, %o0
-	.size		__down_read_trylock, .-__down_read_trylock
-
-	.globl		__down_write
-__down_write:
-	sethi		%hi(RWSEM_ACTIVE_WRITE_BIAS), %g1
-	or		%g1, %lo(RWSEM_ACTIVE_WRITE_BIAS), %g1
-1:
-	lduw		[%o0], %g3
-	add		%g3, %g1, %g7
-	cas		[%o0], %g3, %g7
-	cmp		%g3, %g7
-	bne,pn		%icc, 1b
-	 cmp		%g7, 0
-	bne,pn		%icc, 3f
-	 nop
-2:	retl
-	 nop
-3:
-	save		%sp, -192, %sp
-	call		rwsem_down_write_failed
-	 mov		%i0, %o0
-	ret
-	 restore
-	.size		__down_write, .-__down_write
-
-	.globl		__down_write_trylock
-__down_write_trylock:
-	sethi		%hi(RWSEM_ACTIVE_WRITE_BIAS), %g1
-	or		%g1, %lo(RWSEM_ACTIVE_WRITE_BIAS), %g1
-1:
-	lduw		[%o0], %g3
-	cmp		%g3, 0
-	bne,pn		%icc, 2f
-	 mov		0, %o1
-	add		%g3, %g1, %g7
-	cas		[%o0], %g3, %g7
-	cmp		%g3, %g7
-	bne,pn		%icc, 1b
-	 mov		1, %o1
-2:	retl
-	 mov		%o1, %o0
-	.size		__down_write_trylock, .-__down_write_trylock
-
-	.globl		__up_read
-__up_read:
-1:
-	lduw		[%o0], %g1
-	sub		%g1, 1, %g7
-	cas		[%o0], %g1, %g7
-	cmp		%g1, %g7
-	bne,pn		%icc, 1b
-	 cmp		%g7, 0
-	bl,pn		%icc, 3f
-	 nop
-2:	retl
-	 nop
-3:	sethi		%hi(RWSEM_ACTIVE_MASK), %g1
-	sub		%g7, 1, %g7
-	or		%g1, %lo(RWSEM_ACTIVE_MASK), %g1
-	andcc		%g7, %g1, %g0
-	bne,pn		%icc, 2b
-	 nop
-	save		%sp, -192, %sp
-	call		rwsem_wake
-	 mov		%i0, %o0
-	ret
-	 restore
-	.size		__up_read, .-__up_read
-
-	.globl		__up_write
-__up_write:
-	sethi		%hi(RWSEM_ACTIVE_WRITE_BIAS), %g1
-	or		%g1, %lo(RWSEM_ACTIVE_WRITE_BIAS), %g1
-1:
-	lduw		[%o0], %g3
-	sub		%g3, %g1, %g7
-	cas		[%o0], %g3, %g7
-	cmp		%g3, %g7
-	bne,pn		%icc, 1b
-	 sub		%g7, %g1, %g7
-	cmp		%g7, 0
-	bl,pn		%icc, 3f
-	 nop
-2:
-	retl
-	 nop
-3:
-	save		%sp, -192, %sp
-	call		rwsem_wake
-	 mov		%i0, %o0
-	ret
-	 restore
-	.size		__up_write, .-__up_write
-
-	.globl		__downgrade_write
-__downgrade_write:
-	sethi		%hi(RWSEM_WAITING_BIAS), %g1
-	or		%g1, %lo(RWSEM_WAITING_BIAS), %g1
-1:
-	lduw		[%o0], %g3
-	sub		%g3, %g1, %g7
-	cas		[%o0], %g3, %g7
-	cmp		%g3, %g7
-	bne,pn		%icc, 1b
-	 sub		%g7, %g1, %g7
-	cmp		%g7, 0
-	bl,pn		%icc, 3f
-	 nop
-2:
-	retl
-	 nop
-3:
-	save		%sp, -192, %sp
-	call		rwsem_downgrade_wake
-	 mov		%i0, %o0
-	ret
-	 restore
-	.size		__downgrade_write, .-__downgrade_write
-- 
1.7.2.1

^ permalink raw reply related

* Re: 64-bit ppc rwsem (was: Re: [GIT] Sparc)
From: Benjamin Herrenschmidt @ 2010-08-18  5:03 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, paulus, linux-kernel, sparclinux, akpm, linuxppc-dev
In-Reply-To: <1282106338.22370.151.camel@pasglop>

On Wed, 2010-08-18 at 14:38 +1000, Benjamin Herrenschmidt wrote:
> 
> Here's an untested patch for the folks on linuxppc-dev to look at,
> I'll
> review my own stuff & test tomorrow. 

Allright, gcc's being a pain, and atomics are a struct so we can't that
easily assign.

I tried various tricks but so far they didn't work. I'll have another
look tomorrow, but I may end up having to keep all the crap typecasts.

Cheers,
Ben.

^ permalink raw reply

* 64-bit ppc rwsem (was: Re: [GIT] Sparc)
From: Benjamin Herrenschmidt @ 2010-08-18  4:38 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, paulus, linux-kernel, sparclinux, akpm, linuxppc-dev
In-Reply-To: <20100817.191424.183031381.davem@davemloft.net>

On Tue, 2010-08-17 at 19:14 -0700, David Miller wrote:

> > I merged your pull request, but you've got some fixing up to do,
> > methinks. I also really think you need to make your rwsem's use 64-bit
> > values on sparc64, because otherwise you can overflow the mmap_sem by
> > having more than 65536 threads doing page-faults (on 32-bit, having
> > more than 2**16 threads in one process is unlikely to work for other
> > reasons, like just pure stack usage, so we don't really care about the
> > 32-bit case)
> 
> I have a patch to do this already, just need to test it.
> 
> You should bug the powerpc folks too :-)

32K threads :-) you guys are nuts !

Here's an untested patch for the folks on linuxppc-dev to look at, I'll
review my own stuff & test tomorrow.

Cheers,
Ben.

powerpc: Make rwsem use "long" types on 64-bit platforms

This should avoid overflow of the mmap_sem when playing with insane
number of threads.

Not-signed-off-by-yet.

diff --git a/arch/powerpc/include/asm/rwsem.h b/arch/powerpc/include/asm/rwsem.h
index 24cd928..ca64a98 100644
--- a/arch/powerpc/include/asm/rwsem.h
+++ b/arch/powerpc/include/asm/rwsem.h
@@ -21,15 +21,20 @@
 /*
  * the semaphore definition
  */
-struct rw_semaphore {
-	/* XXX this should be able to be an atomic_t  -- paulus */
-	signed int		count;
-#define RWSEM_UNLOCKED_VALUE		0x00000000
-#define RWSEM_ACTIVE_BIAS		0x00000001
-#define RWSEM_ACTIVE_MASK		0x0000ffff
-#define RWSEM_WAITING_BIAS		(-0x00010000)
+#ifdef CONFIG_PPC64
+# define RWSEM_ACTIVE_MASK		0xffffffffL
+#else
+# define RWSEM_ACTIVE_MASK		0x0000ffffL
+#endif
+
+#define RWSEM_UNLOCKED_VALUE		0x00000000L
+#define RWSEM_ACTIVE_BIAS		0x00000001L
+#define RWSEM_WAITING_BIAS		(-RWSEM_ACTIVE_MASK-1)
 #define RWSEM_ACTIVE_READ_BIAS		RWSEM_ACTIVE_BIAS
 #define RWSEM_ACTIVE_WRITE_BIAS		(RWSEM_WAITING_BIAS + RWSEM_ACTIVE_BIAS)
+
+struct rw_semaphore {
+	atomic_long_t		count;
 	spinlock_t		wait_lock;
 	struct list_head	wait_list;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -43,9 +48,13 @@ struct rw_semaphore {
 # define __RWSEM_DEP_MAP_INIT(lockname)
 #endif
 
-#define __RWSEM_INITIALIZER(name) \
-	{ RWSEM_UNLOCKED_VALUE, __SPIN_LOCK_UNLOCKED((name).wait_lock), \
-	  LIST_HEAD_INIT((name).wait_list) __RWSEM_DEP_MAP_INIT(name) }
+#define __RWSEM_INITIALIZER(name)				\
+{								\
+	ATOMIC_LONG_INIT(RWSEM_UNLOCKED_VALUE),			\
+	__SPIN_LOCK_UNLOCKED((name).wait_lock),			\
+	LIST_HEAD_INIT((name).wait_list)			\
+	__RWSEM_DEP_MAP_INIT(name)				\
+}
 
 #define DECLARE_RWSEM(name)		\
 	struct rw_semaphore name = __RWSEM_INITIALIZER(name)
@@ -70,16 +79,16 @@ extern void __init_rwsem(struct rw_semaphore *sem, const char *name,
  */
 static inline void __down_read(struct rw_semaphore *sem)
 {
-	if (unlikely(atomic_inc_return((atomic_t *)(&sem->count)) <= 0))
+	if (unlikely(atomic_long_inc_return(&sem->count) <= 0))
 		rwsem_down_read_failed(sem);
 }
 
 static inline int __down_read_trylock(struct rw_semaphore *sem)
 {
-	int tmp;
+	long tmp;
 
-	while ((tmp = sem->count) >= 0) {
-		if (tmp == cmpxchg(&sem->count, tmp,
+	while ((tmp = atomic_long_read(&sem->count)) >= 0) {
+		if (tmp == cmpxchg((long *)&sem->count, tmp,
 				   tmp + RWSEM_ACTIVE_READ_BIAS)) {
 			return 1;
 		}
@@ -92,10 +101,10 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
  */
 static inline void __down_write_nested(struct rw_semaphore *sem, int subclass)
 {
-	int tmp;
+	long tmp;
 
-	tmp = atomic_add_return(RWSEM_ACTIVE_WRITE_BIAS,
-				(atomic_t *)(&sem->count));
+	tmp = atomic_long_add_return(RWSEM_ACTIVE_WRITE_BIAS,
+				     &sem->count);
 	if (unlikely(tmp != RWSEM_ACTIVE_WRITE_BIAS))
 		rwsem_down_write_failed(sem);
 }
@@ -107,9 +116,9 @@ static inline void __down_write(struct rw_semaphore *sem)
 
 static inline int __down_write_trylock(struct rw_semaphore *sem)
 {
-	int tmp;
+	long tmp;
 
-	tmp = cmpxchg(&sem->count, RWSEM_UNLOCKED_VALUE,
+	tmp = cmpxchg((long *)&sem->count, RWSEM_UNLOCKED_VALUE,
 		      RWSEM_ACTIVE_WRITE_BIAS);
 	return tmp == RWSEM_UNLOCKED_VALUE;
 }
@@ -119,9 +128,9 @@ static inline int __down_write_trylock(struct rw_semaphore *sem)
  */
 static inline void __up_read(struct rw_semaphore *sem)
 {
-	int tmp;
+	long tmp;
 
-	tmp = atomic_dec_return((atomic_t *)(&sem->count));
+	tmp = atomic_long_dec_return(&sem->count);
 	if (unlikely(tmp < -1 && (tmp & RWSEM_ACTIVE_MASK) == 0))
 		rwsem_wake(sem);
 }
@@ -131,17 +140,17 @@ static inline void __up_read(struct rw_semaphore *sem)
  */
 static inline void __up_write(struct rw_semaphore *sem)
 {
-	if (unlikely(atomic_sub_return(RWSEM_ACTIVE_WRITE_BIAS,
-			      (atomic_t *)(&sem->count)) < 0))
+	if (unlikely(atomic_long_sub_return(RWSEM_ACTIVE_WRITE_BIAS,
+					    &sem->count) < 0))
 		rwsem_wake(sem);
 }
 
 /*
  * implement atomic add functionality
  */
-static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
+static inline void rwsem_atomic_add(long delta, struct rw_semaphore *sem)
 {
-	atomic_add(delta, (atomic_t *)(&sem->count));
+	atomic_long_add(delta, &sem->count);
 }
 
 /*
@@ -149,9 +158,9 @@ static inline void rwsem_atomic_add(int delta, struct rw_semaphore *sem)
  */
 static inline void __downgrade_write(struct rw_semaphore *sem)
 {
-	int tmp;
+	long tmp;
 
-	tmp = atomic_add_return(-RWSEM_WAITING_BIAS, (atomic_t *)(&sem->count));
+	tmp = atomic_long_add_return(-RWSEM_WAITING_BIAS, &sem->count);
 	if (tmp < 0)
 		rwsem_downgrade_wake(sem);
 }
@@ -159,14 +168,14 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 /*
  * implement exchange and add functionality
  */
-static inline int rwsem_atomic_update(int delta, struct rw_semaphore *sem)
+static inline long rwsem_atomic_update(long delta, struct rw_semaphore *sem)
 {
-	return atomic_add_return(delta, (atomic_t *)(&sem->count));
+	return atomic_long_add_return(delta, &sem->count);
 }
 
 static inline int rwsem_is_locked(struct rw_semaphore *sem)
 {
-	return (sem->count != 0);
+	return atomic_long_read(&sem->count) != 0;
 }
 
 #endif	/* __KERNEL__ */

^ permalink raw reply related

* Re: [PATCH 1/5] ptp: Added a brand new class driver for ptp clocks.
From: john stultz @ 2010-08-18  0:40 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linuxppc-dev, Richard Cochran, linux-kernel, netdev,
	Rodolfo Giometti, devicetree-discuss, linux-arm-kernel,
	Krzysztof Halasa
In-Reply-To: <201008171336.29375.arnd@arndb.de>

On Tue, 2010-08-17 at 13:36 +0200, Arnd Bergmann wrote:
> On Tuesday 17 August 2010, Richard Cochran wrote:
> > On Tue, Aug 17, 2010 at 09:25:55AM +0000, Arnd Bergmann wrote:
> > > Another difference is that we generally use ioctl for devices that can
> > > be enumerated, while syscalls are for system services that are not tied to
> > > a specific device. This argument works both ways for PTP IMHO: On the one
> > > hand you want to have a reliable clock that you can use without knowing
> > > where it comes from, on the other you might have multiple PTP sources that
> > > you need to differentiate.
> > 
> > Yes, I agree. In normal use, there will be only one PTP clock in a
> > system. However, for research purposes, it would be nice to have more
> > than one.
> > 
> > I've been looking at offering the PTP clock as a posix clock, and it
> > is not as hard as I first thought. The PTP clock or clocks just have
> > to be registered as one of the posix_clocks[MAX_CLOCKS] in
> > posix-timers.c.
> 
> Ok sounds good.

Oh no. I'm starting to waffle here. :)

So while I'm not a fan of the duplication of the posix clocks interface
that the proposed chardev interface introduced, I'm not sure if
absorbing the PTP clocks as a clockid is much better.

Mainly due to the fact that userland apps now have to chose between two
clockids that supposedly represent the same thing (the current wall
time).

> > My suggestion would be to reserve three clock ids in time.h,
> > CLOCK_PTP0, CLOCK_PTP1, and CLOCK_PTP2. The first one would be the
> > same as CLOCK_REALTIME, for SW timestamping, and the other two would
> > allow two different PTP clocks at the same time, for the research use
> > case.

Why would you have CLOCK_PTP0 == CLOCK_REALTIME? Whats the point of
that?

> I don't think there is a point in making exactly two independent sources
> available. The clockid_t space is not really limited, so we could define
> an arbitrary range of ids for ptp sources that could be used simultaneously,
> as long as we have space more more ids with a fixed number.
> 
> Would it be reasonable to assume that on a machine with a large number
> of NICs, you'd want a separate ptp source for each of them for timestamping?
> Or would you preferably define just one source in such a setup?
> 
> I think both could be done with the use of class device attributes in
> sysfs for configuration. Maybe you can have one CLOCK_PTP value for one
> global PTP source and use sysfs to configure which device that is.
> 
> If you also need simultaneous access to the specific clocks, you could
> have run-time configured clockid numbers in a sysfs attribute for each
> ptp class device.
> 
> > Using the clock id will bring another advantage, since it will then be
> > possible for user space to specify the desired timestamp source for
> > SO_TIMESTAMPING.

So how exactly would one map CLOCK_PTPx to an eth device?

So this is a little bit further out there, but assuming PTPd can sync
the PTP clock correctly, why could the kernel itself not sync the PTP
clock to system time?

So instead of the sync path looking like:
1) packet from master arrives on NIC, is timestamped by PTP clock
2) PTPd calculates offset between PTP clock and master time
3) PTPd corrects PTP clock freq/offset
4) PTPd corrects system time freq/offset

It would look like:
1) packet from master arrives on NIC, is timestamped by PTP clock
2) PTPd calculates offset between PTP clock and master time
3) PTPd corrects system time freq/offset
4) kernel corrects PTP clock freq/offset

I'm guessing the indirect PTP clock sync would have greater error, but
it avoids the fractured sense of time.


thanks
-john

^ permalink raw reply

* Re: [PATCH 1/5] ptp: Added a brand new class driver for ptp clocks.
From: john stultz @ 2010-08-18  0:22 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Rodolfo Giometti, Arnd Bergmann, netdev, devicetree-discuss,
	linux-kernel, linuxppc-dev, linux-arm-kernel, Krzysztof Halasa
In-Reply-To: <20100817085324.GB3330@riccoc20.at.omicron.at>

On Tue, 2010-08-17 at 10:53 +0200, Richard Cochran wrote:
> On Mon, Aug 16, 2010 at 12:24:48PM -0700, john stultz wrote:
> > 3) I'm not sure I see the benefit of being able to have multiple
> > frequency corrected time domains.  In other words, what benefit would
> > you get from adjusting a PTP clock's frequency instead of just
> > adjusting the system's time freq? Having the PTP time as a reference
> > to correct the system time seems reasonable, but I'm not sure I see
> > why userland would want to adjust the PTP clock's freq.
> 
> For PTP enabled hardware, the timestamp on the network packet comes
> from from the PTP clock, not from the system time.
> 
> Of course, you can always just leave the PTP clock alone, figure the
> needed correction, and apply it whenever needed. But this has some
> disadvantages. First of all, the (one and only) open source PTPd does
> not do it that way. Also, only one program (the PTPd or equivalent)
> will know the correct time. Other programs will not be able to ask the
> operating system for time services. Instead, they would need to use
> IPC to the PTPd.

Why would system time not be adjusted to the PTP time?

This is my main concern, that we're presenting a fractured API to
userland. Suddenly there isn't just system time, but ptp time as well,
and possibly multiple different ptp times.

Forgive me, as I suspect I'm missing something key here.

> The PTP protocol (and some PTP hardware) offers a "one step" method,
> where the timestamps are inserted by the hardware on the fly. Here you
> really do need the PTP clock to be correctly adjusted.
> 
> All of the PTP hardware that I am familiar with provides a clock
> adjustment method, so it simpler and cleaner just to use this facility
> to tune the PTP clock.

Hmm. So trying to recap here to see if I'm understanding properly:

The PTP clock is a bit of hardware (usually on the NIC) that can put
timestamps on packets (both incoming or outgoing?).

The need to offset/freq correct the PTP clock is important so that the
timestamps (incoming and outgoing) are correct, which makes future
offset calculations simpler.

Hmm. So I'm actually starting to come around to the chardev interface.

In a way, it seems it has some similarities to how the RTC device is
used in NTPd. Granted, NTPd doesn't correct the RTC (the kernel does
when we're synced, but its not a perfect parallel), but it can be used
as an input the steer the clock.

So while to me, it think it would be more ideal (or maybe just less
different) to have a read-only interface (like the RTC), leaving PTPd to
manage offset calculations and use that to steer the system time. I can
acknowledge the need to have some way to correct the freq so the packet
timestamps are corrected.

I still feel a little concerned over the timer/alarm related interfaces.
Could you explain why the alarm interface is necessary? 

So really I think my initial negative gut reaction to this was mostly
out of the fact that you introduced a char dev that provides almost 100%
coverage of the posix-time interface. That is duplication we definitely
don't want. 

Also I think the documentation I've read about PTP (likely just due to
the engineering focus) has an odd inverted sense of priority, focusing
on keeping obscure hardware clocks on NIC cards in sync, rather then the
the more tangible feature of keeping the system time in sync.

This could be comically interpreted as trying to create a shadow-time on
the system that is the "real time" and "yea, maybe we'll let the system
know what time it is, but user-apps who want to know the score can send
a magic ioctl to /dev/something and get the real deal". ;)  I'm sure
that's not the case, but I'd like to keep any confusion in userland
about which time is the best time to a minimum (ie: use the system
time).

thanks
-john

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox