LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 7/9] RapidIO: Add handling for PW message from a lost device
From: Alexandre Bounine @ 2010-08-13 15:18 UTC (permalink / raw)
  To: akpm, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine
In-Reply-To: <1281712686-31308-1-git-send-email-alexandre.bounine@idt.com>

Add check if PW message source device is accessible and change PW message
handler to recover if PW message source device is not available anymore (power
down or link disconnect).
To avoid possible loss of notification, the PW message handler scans the route
back from the source device to identify end of the broken link.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reviewed-by: Thomas Moll <thomas.moll@sysgo.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
---
 drivers/rapidio/rio.c |  111 ++++++++++++++++++++++++++++++++++++++++++++++--
 drivers/rapidio/rio.h |    2 +
 2 files changed, 108 insertions(+), 5 deletions(-)

diff --git a/drivers/rapidio/rio.c b/drivers/rapidio/rio.c
index f58df11..22f7847 100644
--- a/drivers/rapidio/rio.c
+++ b/drivers/rapidio/rio.c
@@ -495,6 +495,90 @@ int rio_set_port_lockout(struct rio_dev *rdev, u32 pnum, int lock)
 }
 
 /**
+ * rio_chk_dev_route - Validate route to the specified device.
+ * @rdev: Pointer to RIO device control structure
+ * @nrdev: Pointer to last active device on the route to rdev
+ * @npnum: nrdev port number on the route to rdev
+ *
+ * Follows a route to the specified RIO device to determine the last available
+ * device (and corresponding RIO port) on the route.
+ */
+static int
+rio_chk_dev_route(struct rio_dev *rdev, struct rio_dev **nrdev, int *npnum)
+{
+	u32 result;
+	int p_port, rc = -EIO;
+	struct rio_dev *prev = NULL;
+
+	while (rdev->prev && (rdev->prev->pef & RIO_PEF_SWITCH)) {
+		if (rio_read_config_32(rdev->prev, RIO_DEV_ID_CAR, &result)) {
+			rdev = rdev->prev;
+			continue;
+		}
+
+		prev = rdev->prev;
+		for (p_port = 0; p_port < prev->rswitch->nports; p_port++)
+			if (prev->rswitch->nextdev[p_port] == rdev)
+				break;
+
+		if (p_port < prev->rswitch->nports) {
+			pr_debug("RIO: link failed on [%s]-P%d\n",
+				 rio_name(prev), p_port);
+			*nrdev = prev;
+			*npnum = p_port;
+			rc = 0;
+		} else {
+			pr_debug("RIO: failed to trace route to %s\n",
+				 rio_name(prev));
+		}
+
+		break;
+	}
+
+	return rc;
+}
+
+/**
+ * rio_mport_chk_dev_access - Validate access to the specified device.
+ * @mport: Master port to send transactions
+ * @destid: Device destination ID in network
+ * @hopcount: Number of hops into the network
+ */
+static int
+rio_mport_chk_dev_access(struct rio_mport *mport, u16 destid, u8 hopcount)
+{
+	int i = 0;
+	u32 tmp;
+
+	while (rio_mport_read_config_32(mport, destid, hopcount,
+					RIO_DEV_ID_CAR, &tmp)) {
+		i++;
+		if (i == RIO_MAX_CHK_RETRY)
+			return -EIO;
+		mdelay(1);
+	}
+
+	return 0;
+}
+
+/**
+ * rio_chk_dev_access - Validate access to the specified device.
+ * @rdev: Pointer to RIO device control structure
+ */
+static int rio_chk_dev_access(struct rio_dev *rdev)
+{
+	u8 hopcount = 0xff;
+	u16 destid = rdev->destid;
+
+	if (rdev->rswitch) {
+		destid = rdev->rswitch->destid;
+		hopcount = rdev->rswitch->hopcount;
+	}
+
+	return rio_mport_chk_dev_access(rdev->net->hport, destid, hopcount);
+}
+
+/**
  * rio_clr_err_stopped - Clears port Error-stopped states.
  * @rdev: Pointer to RIO device control structure
  * @pnum: Switch port number to clear errors
@@ -627,8 +711,8 @@ int rio_inb_pwrite_handler(union rio_pw_msg *pw_msg)
 
 	rdev = rio_get_comptag(pw_msg->em.comptag, NULL);
 	if (rdev == NULL) {
-		/* Someting bad here (probably enumeration error) */
-		pr_err("RIO: %s No matching device for CTag 0x%08x\n",
+		/* Device removed or enumeration error */
+		pr_debug("RIO: %s No matching device for CTag 0x%08x\n",
 			__func__, pw_msg->em.comptag);
 		return -EIO;
 	}
@@ -659,6 +743,26 @@ int rio_inb_pwrite_handler(union rio_pw_msg *pw_msg)
 			return 0;
 	}
 
+	portnum = pw_msg->em.is_port & 0xFF;
+
+	/* Check if device and route to it are functional:
+	 * Sometimes devices may send PW message(s) just before being
+	 * powered down (or link being lost).
+	 */
+	if (rio_chk_dev_access(rdev)) {
+		pr_debug("RIO: device access failed - get link partner\n");
+		/* Scan route to the device and identify failed link.
+		 * This will replace device and port reported in PW message.
+		 * PW message should not be used after this point.
+		 */
+		if (rio_chk_dev_route(rdev, &rdev, &portnum)) {
+			pr_err("RIO: Route trace for %s failed\n",
+				rio_name(rdev));
+			return -EIO;
+		}
+		pw_msg = NULL;
+	}
+
 	/* For End-point devices processing stops here */
 	if (!(rdev->pef & RIO_PEF_SWITCH))
 		return 0;
@@ -676,9 +780,6 @@ int rio_inb_pwrite_handler(union rio_pw_msg *pw_msg)
 	/*
 	 * Process the port-write notification from switch
 	 */
-
-	portnum = pw_msg->em.is_port & 0xFF;
-
 	if (rdev->rswitch->em_handle)
 		rdev->rswitch->em_handle(rdev, portnum);
 
diff --git a/drivers/rapidio/rio.h b/drivers/rapidio/rio.h
index f27b7a9..bc71ba1 100644
--- a/drivers/rapidio/rio.h
+++ b/drivers/rapidio/rio.h
@@ -14,6 +14,8 @@
 #include <linux/list.h>
 #include <linux/rio.h>
 
+#define RIO_MAX_CHK_RETRY	3
+
 /* Functions internal to the RIO core code */
 
 extern u32 rio_mport_get_feature(struct rio_mport *mport, int local, u16 destid,
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH 6/9] RapidIO: Add switch-specific sysfs initialization callback
From: Alexandre Bounine @ 2010-08-13 15:18 UTC (permalink / raw)
  To: akpm, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine
In-Reply-To: <1281712686-31308-1-git-send-email-alexandre.bounine@idt.com>

Add callback that allows to create/remove switch-specific sysfs attributes.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reviewed-by: Thomas Moll <thomas.moll@sysgo.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
---
 drivers/rapidio/rio-sysfs.c |   26 +++++++++++++++++++-------
 include/linux/rio.h         |    2 ++
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/rapidio/rio-sysfs.c b/drivers/rapidio/rio-sysfs.c
index 00b4756..bfc483b 100644
--- a/drivers/rapidio/rio-sysfs.c
+++ b/drivers/rapidio/rio-sysfs.c
@@ -40,9 +40,6 @@ static ssize_t routes_show(struct device *dev, struct device_attribute *attr, ch
 	char *str = buf;
 	int i;
 
-	if (!rdev->rswitch)
-		goto out;
-
 	for (i = 0; i < RIO_MAX_ROUTE_ENTRIES(rdev->net->hport->sys_size);
 			i++) {
 		if (rdev->rswitch->route_table[i] == RIO_INVALID_ROUTE)
@@ -52,7 +49,6 @@ static ssize_t routes_show(struct device *dev, struct device_attribute *attr, ch
 			    rdev->rswitch->route_table[i]);
 	}
 
-      out:
 	return (str - buf);
 }
 
@@ -63,10 +59,11 @@ struct device_attribute rio_dev_attrs[] = {
 	__ATTR_RO(asm_did),
 	__ATTR_RO(asm_vid),
 	__ATTR_RO(asm_rev),
-	__ATTR_RO(routes),
 	__ATTR_NULL,
 };
 
+static DEVICE_ATTR(routes, S_IRUGO, routes_show, NULL);
+
 static ssize_t
 rio_read_config(struct file *filp, struct kobject *kobj,
 		struct bin_attribute *bin_attr,
@@ -218,7 +215,17 @@ int rio_create_sysfs_dev_files(struct rio_dev *rdev)
 {
 	int err = 0;
 
-	err = sysfs_create_bin_file(&rdev->dev.kobj, &rio_config_attr);
+	err = device_create_bin_file(&rdev->dev, &rio_config_attr);
+
+	if (!err && rdev->rswitch) {
+		err = device_create_file(&rdev->dev, &dev_attr_routes);
+		if (!err && rdev->rswitch->sw_sysfs)
+			err = rdev->rswitch->sw_sysfs(rdev, 1);
+	}
+
+	if (err)
+		pr_warning("RIO: Failed to create attribute file(s) for %s\n",
+			   rio_name(rdev));
 
 	return err;
 }
@@ -231,5 +238,10 @@ int rio_create_sysfs_dev_files(struct rio_dev *rdev)
  */
 void rio_remove_sysfs_dev_files(struct rio_dev *rdev)
 {
-	sysfs_remove_bin_file(&rdev->dev.kobj, &rio_config_attr);
+	device_remove_bin_file(&rdev->dev, &rio_config_attr);
+	if (rdev->rswitch) {
+		device_remove_file(&rdev->dev, &dev_attr_routes);
+		if (rdev->rswitch->sw_sysfs)
+			rdev->rswitch->sw_sysfs(rdev, 0);
+	}
 }
diff --git a/include/linux/rio.h b/include/linux/rio.h
index 754895c..8f19fb2 100644
--- a/include/linux/rio.h
+++ b/include/linux/rio.h
@@ -233,6 +233,7 @@ struct rio_net {
  * @get_domain: Callback for switch-specific domain get function
  * @em_init: Callback for switch-specific error management initialization function
  * @em_handle: Callback for switch-specific error management handler function
+ * @sw_sysfs: Callback that initializes switch-specific sysfs attributes
  * @nextdev: Array of per-port pointers to the next attached device
  */
 struct rio_switch {
@@ -256,6 +257,7 @@ struct rio_switch {
 			   u8 *sw_domain);
 	int (*em_init) (struct rio_dev *dev);
 	int (*em_handle) (struct rio_dev *dev, u8 swport);
+	int (*sw_sysfs) (struct rio_dev *dev, int create);
 	struct rio_dev *nextdev[0];
 };
 
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH 5/9] RapidIO: Add default handler for error_stopped state
From: Alexandre Bounine @ 2010-08-13 15:18 UTC (permalink / raw)
  To: akpm, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine
In-Reply-To: <1281712686-31308-1-git-send-email-alexandre.bounine@idt.com>

The default error-stopped state handler provides recovery mechanism as defined
by RIO specification.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reviewed-by: Thomas Moll <thomas.moll@sysgo.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
---
 drivers/rapidio/rio.c             |  191 +++++++++++++++++++++++++++++++------
 drivers/rapidio/switches/idtcps.c |   10 ++
 drivers/rapidio/switches/tsi57x.c |    4 +
 include/linux/rio_regs.h          |    8 +-
 4 files changed, 179 insertions(+), 34 deletions(-)

diff --git a/drivers/rapidio/rio.c b/drivers/rapidio/rio.c
index 74e9d22..f58df11 100644
--- a/drivers/rapidio/rio.c
+++ b/drivers/rapidio/rio.c
@@ -495,6 +495,121 @@ int rio_set_port_lockout(struct rio_dev *rdev, u32 pnum, int lock)
 }
 
 /**
+ * rio_clr_err_stopped - Clears port Error-stopped states.
+ * @rdev: Pointer to RIO device control structure
+ * @pnum: Switch port number to clear errors
+ * @err_status: port error status (if 0 reads register from device)
+ */
+static int rio_clr_err_stopped(struct rio_dev *rdev, u32 pnum, u32 err_status)
+{
+	struct rio_mport *mport = rdev->net->hport;
+	u16 destid = rdev->rswitch->destid;
+	u8 hopcount = rdev->rswitch->hopcount;
+	struct rio_dev *nextdev = rdev->rswitch->nextdev[pnum];
+	u32 regval;
+	u32 far_ackid, far_linkstat, near_ackid;
+	int checkcount;
+
+	if (err_status == 0)
+		rio_mport_read_config_32(mport, destid, hopcount,
+			rdev->phys_efptr + RIO_PORT_N_ERR_STS_CSR(pnum),
+			&err_status);
+
+	if (err_status & RIO_PORT_N_ERR_STS_PW_OUT_ES) {
+		pr_debug("RIO_EM: servicing Output Error-Stopped state\n");
+		/*
+		 * Send a Link-Request/Input-Status control symbol
+		 */
+
+		/* Read from link maintenance response register
+		 * to clear valid bit */
+		rio_mport_read_config_32(mport, destid, hopcount,
+			rdev->phys_efptr + RIO_PORT_N_MNT_RSP_CSR(pnum),
+			&regval);
+		udelay(50);
+
+		/* Issue Input-status command */
+		rio_mport_write_config_32(mport, destid, hopcount,
+			rdev->phys_efptr + RIO_PORT_N_MNT_REQ_CSR(pnum),
+			RIO_MNT_REQ_CMD_IS);
+
+		checkcount = 3;
+		while (checkcount--) {
+			udelay(50);
+			rio_mport_read_config_32(mport, destid, hopcount,
+				rdev->phys_efptr + RIO_PORT_N_MNT_RSP_CSR(pnum),
+				&regval);
+			if (regval & RIO_PORT_N_MNT_RSP_RVAL)
+				break;
+		}
+
+		pr_debug("RIO_EM: SP%d_MNT_RSP_CSR=0x%08x\n", pnum, regval);
+		if (regval & RIO_PORT_N_MNT_RSP_RVAL) {
+			far_ackid = (regval & RIO_PORT_N_MNT_RSP_ASTAT) >> 5;
+			far_linkstat = regval & RIO_PORT_N_MNT_RSP_LSTAT;
+			rio_mport_read_config_32(mport, destid, hopcount,
+				rdev->phys_efptr + RIO_PORT_N_ACK_STS_CSR(pnum),
+				&regval);
+			pr_debug("RIO_EM: SP%d_ACK_STS_CSR=0x%08x\n",
+				pnum, regval);
+			near_ackid = (regval & RIO_PORT_N_ACK_INBOUND) >> 24;
+			pr_debug("RIO_EM: SP%d far_ackID=0x%02x far_linkstat=0x%02x near_ackID=0x%02x\n",
+				pnum, far_ackid, far_linkstat, near_ackid);
+
+			/*
+			 * If required, synchronize ackIDs of near and
+			 * far sides.
+			 */
+			if ((far_ackid != ((regval & RIO_PORT_N_ACK_OUTSTAND) >> 8)) ||
+			    (far_ackid != (regval & RIO_PORT_N_ACK_OUTBOUND))) {
+				/* Align near outstanding/outbound ackIDs with
+				 * far inbound.
+				 */
+				rio_mport_write_config_32(mport, destid,
+					hopcount, rdev->phys_efptr +
+						RIO_PORT_N_ACK_STS_CSR(pnum),
+					(near_ackid << 24) |
+						(far_ackid << 8) | far_ackid);
+				/* Align far outstanding/outbound ackIDs with
+				 * near inbound.
+				 */
+				far_ackid++;
+				if (nextdev)
+					rio_write_config_32(nextdev,
+						nextdev->phys_efptr +
+						RIO_PORT_N_ACK_STS_CSR(nextdev->rswitch->inport),
+						(far_ackid << 24) |
+						(near_ackid << 8) | near_ackid);
+				else
+					pr_debug("RIO_EM: Invalid nextdev pointer (NULL)\n");
+			}
+		}
+
+		rio_mport_read_config_32(mport, destid, hopcount,
+			rdev->phys_efptr + RIO_PORT_N_ERR_STS_CSR(pnum),
+			&err_status);
+		pr_debug("RIO_EM: SP%d_ERR_STS_CSR=0x%08x\n", pnum, err_status);
+	}
+
+	if ((err_status & RIO_PORT_N_ERR_STS_PW_INP_ES) && nextdev) {
+		pr_debug("RIO_EM: servicing Input Error-Stopped state\n");
+		rio_write_config_32(nextdev,
+			nextdev->phys_efptr +
+			RIO_PORT_N_MNT_REQ_CSR(nextdev->rswitch->inport),
+			RIO_MNT_REQ_CMD_IS);
+		udelay(50);
+
+		rio_mport_read_config_32(mport, destid, hopcount,
+			rdev->phys_efptr + RIO_PORT_N_ERR_STS_CSR(pnum),
+			&err_status);
+		pr_debug("RIO_EM: SP%d_ERR_STS_CSR=0x%08x\n", pnum, err_status);
+	}
+
+	return (err_status & (RIO_PORT_N_ERR_STS_PW_OUT_ES |
+			      RIO_PORT_N_ERR_STS_PW_INP_ES)) ? 1 : 0;
+}
+
+/**
  * rio_inb_pwrite_handler - process inbound port-write message
  * @pw_msg: pointer to inbound port-write message
  *
@@ -507,7 +622,7 @@ int rio_inb_pwrite_handler(union rio_pw_msg *pw_msg)
 	struct rio_mport *mport;
 	u8 hopcount;
 	u16 destid;
-	u32 err_status;
+	u32 err_status, em_perrdet, em_ltlerrdet;
 	int rc, portnum;
 
 	rdev = rio_get_comptag(pw_msg->em.comptag, NULL);
@@ -524,12 +639,11 @@ int rio_inb_pwrite_handler(union rio_pw_msg *pw_msg)
 	{
 	u32 i;
 	for (i = 0; i < RIO_PW_MSG_SIZE/sizeof(u32);) {
-			pr_debug("0x%02x: %08x %08x %08x %08x",
+			pr_debug("0x%02x: %08x %08x %08x %08x\n",
 				 i*4, pw_msg->raw[i], pw_msg->raw[i + 1],
 				 pw_msg->raw[i + 2], pw_msg->raw[i + 3]);
 			i += 4;
 	}
-	pr_debug("\n");
 	}
 #endif
 
@@ -573,29 +687,28 @@ int rio_inb_pwrite_handler(union rio_pw_msg *pw_msg)
 			&err_status);
 	pr_debug("RIO_PW: SP%d_ERR_STS_CSR=0x%08x\n", portnum, err_status);
 
-	if (pw_msg->em.errdetect) {
-		pr_debug("RIO_PW: RIO_EM_P%d_ERR_DETECT=0x%08x\n",
-			 portnum, pw_msg->em.errdetect);
-		/* Clear EM Port N Error Detect CSR */
-		rio_mport_write_config_32(mport, destid, hopcount,
-			rdev->em_efptr + RIO_EM_PN_ERR_DETECT(portnum), 0);
-	}
+	if (err_status & RIO_PORT_N_ERR_STS_PORT_OK) {
 
-	if (pw_msg->em.ltlerrdet) {
-		pr_debug("RIO_PW: RIO_EM_LTL_ERR_DETECT=0x%08x\n",
-			 pw_msg->em.ltlerrdet);
-		/* Clear EM L/T Layer Error Detect CSR */
-		rio_mport_write_config_32(mport, destid, hopcount,
-			rdev->em_efptr + RIO_EM_LTL_ERR_DETECT, 0);
-	}
+		if (!(rdev->rswitch->port_ok & (1 << portnum))) {
+			rdev->rswitch->port_ok |= (1 << portnum);
+			rio_set_port_lockout(rdev, portnum, 0);
+			/* Schedule Insertion Service */
+			pr_debug("RIO_PW: Device Insertion on [%s]-P%d\n",
+			       rio_name(rdev), portnum);
+		}
 
-	/* Clear Port Errors */
-	rio_mport_write_config_32(mport, destid, hopcount,
-			rdev->phys_efptr + RIO_PORT_N_ERR_STS_CSR(portnum),
-			err_status & RIO_PORT_N_ERR_STS_CLR_MASK);
+		/* Clear error-stopped states (if reported).
+		 * Depending on the link partner state, two attempts
+		 * may be needed for successful recovery.
+		 */
+		if (err_status & (RIO_PORT_N_ERR_STS_PW_OUT_ES |
+				  RIO_PORT_N_ERR_STS_PW_INP_ES)) {
+			if (rio_clr_err_stopped(rdev, portnum, err_status))
+				rio_clr_err_stopped(rdev, portnum, 0);
+		}
+	}  else { /* if (err_status & RIO_PORT_N_ERR_STS_PORT_UNINIT) */
 
-	if (rdev->rswitch->port_ok & (1 << portnum)) {
-		if (err_status & RIO_PORT_N_ERR_STS_PORT_UNINIT) {
+		if (rdev->rswitch->port_ok & (1 << portnum)) {
 			rdev->rswitch->port_ok &= ~(1 << portnum);
 			rio_set_port_lockout(rdev, portnum, 1);
 
@@ -608,17 +721,33 @@ int rio_inb_pwrite_handler(union rio_pw_msg *pw_msg)
 			pr_debug("RIO_PW: Device Extraction on [%s]-P%d\n",
 			       rio_name(rdev), portnum);
 		}
-	} else {
-		if (err_status & RIO_PORT_N_ERR_STS_PORT_OK) {
-			rdev->rswitch->port_ok |= (1 << portnum);
-			rio_set_port_lockout(rdev, portnum, 0);
+	}
 
-			/* Schedule Insertion Service */
-			pr_debug("RIO_PW: Device Insertion on [%s]-P%d\n",
-			       rio_name(rdev), portnum);
-		}
+	rio_mport_read_config_32(mport, destid, hopcount,
+		rdev->em_efptr + RIO_EM_PN_ERR_DETECT(portnum), &em_perrdet);
+	if (em_perrdet) {
+		pr_debug("RIO_PW: RIO_EM_P%d_ERR_DETECT=0x%08x\n",
+			 portnum, em_perrdet);
+		/* Clear EM Port N Error Detect CSR */
+		rio_mport_write_config_32(mport, destid, hopcount,
+			rdev->em_efptr + RIO_EM_PN_ERR_DETECT(portnum), 0);
+	}
+
+	rio_mport_read_config_32(mport, destid, hopcount,
+		rdev->em_efptr + RIO_EM_LTL_ERR_DETECT, &em_ltlerrdet);
+	if (em_ltlerrdet) {
+		pr_debug("RIO_PW: RIO_EM_LTL_ERR_DETECT=0x%08x\n",
+			 em_ltlerrdet);
+		/* Clear EM L/T Layer Error Detect CSR */
+		rio_mport_write_config_32(mport, destid, hopcount,
+			rdev->em_efptr + RIO_EM_LTL_ERR_DETECT, 0);
 	}
 
+	/* Clear remaining error bits */
+	rio_mport_write_config_32(mport, destid, hopcount,
+			rdev->phys_efptr + RIO_PORT_N_ERR_STS_CSR(portnum),
+			err_status & RIO_PORT_N_ERR_STS_CLR_MASK);
+
 	/* Clear Port-Write Pending bit */
 	rio_mport_write_config_32(mport, destid, hopcount,
 			rdev->phys_efptr + RIO_PORT_N_ERR_STS_CSR(portnum),
diff --git a/drivers/rapidio/switches/idtcps.c b/drivers/rapidio/switches/idtcps.c
index 2c790c1..fc9f637 100644
--- a/drivers/rapidio/switches/idtcps.c
+++ b/drivers/rapidio/switches/idtcps.c
@@ -117,6 +117,10 @@ idtcps_get_domain(struct rio_mport *mport, u16 destid, u8 hopcount,
 
 static int idtcps_switch_init(struct rio_dev *rdev, int do_enum)
 {
+	struct rio_mport *mport = rdev->net->hport;
+	u16 destid = rdev->rswitch->destid;
+	u8 hopcount = rdev->rswitch->hopcount;
+
 	pr_debug("RIO: %s for %s\n", __func__, rio_name(rdev));
 	rdev->rswitch->add_entry = idtcps_route_add_entry;
 	rdev->rswitch->get_entry = idtcps_route_get_entry;
@@ -126,6 +130,12 @@ static int idtcps_switch_init(struct rio_dev *rdev, int do_enum)
 	rdev->rswitch->em_init = NULL;
 	rdev->rswitch->em_handle = NULL;
 
+	if (do_enum) {
+		/* set TVAL = ~50us */
+		rio_mport_write_config_32(mport, destid, hopcount,
+			rdev->phys_efptr + RIO_PORT_LINKTO_CTL_CSR, 0x8e << 8);
+	}
+
 	return 0;
 }
 
diff --git a/drivers/rapidio/switches/tsi57x.c b/drivers/rapidio/switches/tsi57x.c
index d34df72..d9e9492 100644
--- a/drivers/rapidio/switches/tsi57x.c
+++ b/drivers/rapidio/switches/tsi57x.c
@@ -205,6 +205,10 @@ tsi57x_em_init(struct rio_dev *rdev)
 			portnum++;
 	}
 
+	/* set TVAL = ~50us */
+	rio_mport_write_config_32(mport, destid, hopcount,
+		rdev->phys_efptr + RIO_PORT_LINKTO_CTL_CSR, 0x9a << 8);
+
 	return 0;
 }
 
diff --git a/include/linux/rio_regs.h b/include/linux/rio_regs.h
index aedee04..49a4dc7 100644
--- a/include/linux/rio_regs.h
+++ b/include/linux/rio_regs.h
@@ -222,15 +222,17 @@
 #define  RIO_PORT_GEN_MASTER		0x40000000
 #define  RIO_PORT_GEN_DISCOVERED	0x20000000
 #define RIO_PORT_N_MNT_REQ_CSR(x)	(0x0040 + x*0x20)	/* 0x0002 */
+#define  RIO_MNT_REQ_CMD_RD		0x03	/* Reset-device command */
+#define  RIO_MNT_REQ_CMD_IS		0x04	/* Input-status command */
 #define RIO_PORT_N_MNT_RSP_CSR(x)	(0x0044 + x*0x20)	/* 0x0002 */
 #define  RIO_PORT_N_MNT_RSP_RVAL	0x80000000 /* Response Valid */
 #define  RIO_PORT_N_MNT_RSP_ASTAT	0x000003e0 /* ackID Status */
 #define  RIO_PORT_N_MNT_RSP_LSTAT	0x0000001f /* Link Status */
 #define RIO_PORT_N_ACK_STS_CSR(x)	(0x0048 + x*0x20)	/* 0x0002 */
 #define  RIO_PORT_N_ACK_CLEAR		0x80000000
-#define  RIO_PORT_N_ACK_INBOUND		0x1f000000
-#define  RIO_PORT_N_ACK_OUTSTAND	0x00001f00
-#define  RIO_PORT_N_ACK_OUTBOUND	0x0000001f
+#define  RIO_PORT_N_ACK_INBOUND		0x3f000000
+#define  RIO_PORT_N_ACK_OUTSTAND	0x00003f00
+#define  RIO_PORT_N_ACK_OUTBOUND	0x0000003f
 #define RIO_PORT_N_ERR_STS_CSR(x)	(0x0058 + x*0x20)
 #define  RIO_PORT_N_ERR_STS_PW_OUT_ES	0x00010000 /* Output Error-stopped */
 #define  RIO_PORT_N_ERR_STS_PW_INP_ES	0x00000100 /* Input Error-stopped */
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH 4/9] RapidIO: Add relation links between RIO device structures
From: Alexandre Bounine @ 2010-08-13 15:18 UTC (permalink / raw)
  To: akpm, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine
In-Reply-To: <1281712686-31308-1-git-send-email-alexandre.bounine@idt.com>

Create back and forward links between RIO devices. These links are intended for
use by error management and hot-plug extensions.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reviewed-by: Thomas Moll <thomas.moll@sysgo.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
---
 drivers/rapidio/rio-scan.c |   55 +++++++++++++++++--------------------------
 include/linux/rio.h        |    6 ++++
 2 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/drivers/rapidio/rio-scan.c b/drivers/rapidio/rio-scan.c
index efe3519..5dc33d1 100644
--- a/drivers/rapidio/rio-scan.c
+++ b/drivers/rapidio/rio-scan.c
@@ -442,7 +442,10 @@ static struct rio_dev __devinit *rio_setup_device(struct rio_net *net,
 	if (rio_is_switch(rdev)) {
 		rio_mport_read_config_32(port, destid, hopcount,
 					 RIO_SWP_INFO_CAR, &swpinfo);
-		rswitch = kzalloc(sizeof(struct rio_switch), GFP_KERNEL);
+		rswitch = kzalloc(sizeof(*rswitch) +
+				  RIO_GET_TOTAL_PORTS(swpinfo) *
+				  sizeof(rswitch->nextdev[0]),
+				  GFP_KERNEL);
 		if (!rswitch)
 			goto cleanup;
 		rswitch->switchid = next_switchid;
@@ -450,6 +453,7 @@ static struct rio_dev __devinit *rio_setup_device(struct rio_net *net,
 		rswitch->destid = destid;
 		rswitch->port_ok = 0;
 		rswitch->inport = (u8)(swpinfo & RIO_SWP_INFO_PORT_NUM_MASK);
+		rswitch->nports = (u8)RIO_GET_TOTAL_PORTS(swpinfo);
 		rswitch->route_table = kzalloc(sizeof(u8)*
 					RIO_MAX_ROUTE_ENTRIES(port->sys_size),
 					GFP_KERNEL);
@@ -721,25 +725,6 @@ static u16 rio_get_host_deviceid_lock(struct rio_mport *port, u8 hopcount)
 }
 
 /**
- * rio_get_swpinfo_tports- Gets total number of ports on the switch
- * @mport: Master port to send transaction
- * @destid: Destination ID associated with the switch
- * @hopcount: Number of hops to the device
- *
- * Returns total numbers of ports implemented by the switch device.
- */
-static u8 rio_get_swpinfo_tports(struct rio_mport *mport, u16 destid,
-				 u8 hopcount)
-{
-	u32 result;
-
-	rio_mport_read_config_32(mport, destid, hopcount, RIO_SWP_INFO_CAR,
-				 &result);
-
-	return RIO_GET_TOTAL_PORTS(result);
-}
-
-/**
  * rio_net_add_mport- Add a master port to a RIO network
  * @net: RIO network
  * @port: Master port to add
@@ -759,15 +744,16 @@ static void rio_net_add_mport(struct rio_net *net, struct rio_mport *port)
  * @net: RIO network being enumerated
  * @port: Master port to send transactions
  * @hopcount: Number of hops into the network
+ * @prev: Previous RIO device connected to the enumerated one
+ * @prev_port: Port on previous RIO device
  *
  * Recursively enumerates a RIO network.  Transactions are sent via the
  * master port passed in @port.
  */
 static int __devinit rio_enum_peer(struct rio_net *net, struct rio_mport *port,
-			 u8 hopcount)
+			 u8 hopcount, struct rio_dev *prev, int prev_port)
 {
 	int port_num;
-	int num_ports;
 	int cur_destid;
 	int sw_destid;
 	int sw_inport;
@@ -812,6 +798,9 @@ static int __devinit rio_enum_peer(struct rio_net *net, struct rio_mport *port,
 	if (rdev) {
 		/* Add device to the global and bus/net specific list. */
 		list_add_tail(&rdev->net_list, &net->devices);
+		rdev->prev = prev;
+		if (prev && rio_is_switch(prev))
+			prev->rswitch->nextdev[prev_port] = rdev;
 	} else
 		return -1;
 
@@ -830,14 +819,13 @@ static int __devinit rio_enum_peer(struct rio_net *net, struct rio_mport *port,
 			rdev->rswitch->route_table[destid] = sw_inport;
 		}
 
-		num_ports =
-		    rio_get_swpinfo_tports(port, RIO_ANY_DESTID(port->sys_size),
-						hopcount);
 		pr_debug(
 		    "RIO: found %s (vid %4.4x did %4.4x) with %d ports\n",
-		    rio_name(rdev), rdev->vid, rdev->did, num_ports);
+		    rio_name(rdev), rdev->vid, rdev->did,
+		    rdev->rswitch->nports);
 		sw_destid = next_destid;
-		for (port_num = 0; port_num < num_ports; port_num++) {
+		for (port_num = 0;
+		     port_num < rdev->rswitch->nports; port_num++) {
 			/*Enable Input Output Port (transmitter reviever)*/
 			rio_enable_rx_tx_port(port, 0,
 					      RIO_ANY_DESTID(port->sys_size),
@@ -862,7 +850,8 @@ static int __devinit rio_enum_peer(struct rio_net *net, struct rio_mport *port,
 						RIO_ANY_DESTID(port->sys_size),
 						port_num, 0);
 
-				if (rio_enum_peer(net, port, hopcount + 1) < 0)
+				if (rio_enum_peer(net, port, hopcount + 1,
+						  rdev, port_num) < 0)
 					return -1;
 
 				/* Update routing tables */
@@ -949,7 +938,6 @@ rio_disc_peer(struct rio_net *net, struct rio_mport *port, u16 destid,
 	      u8 hopcount)
 {
 	u8 port_num, route_port;
-	int num_ports;
 	struct rio_dev *rdev;
 	u16 ndestid;
 
@@ -966,11 +954,12 @@ rio_disc_peer(struct rio_net *net, struct rio_mport *port, u16 destid,
 		/* Associated destid is how we accessed this switch */
 		rdev->rswitch->destid = destid;
 
-		num_ports = rio_get_swpinfo_tports(port, destid, hopcount);
 		pr_debug(
 		    "RIO: found %s (vid %4.4x did %4.4x) with %d ports\n",
-		    rio_name(rdev), rdev->vid, rdev->did, num_ports);
-		for (port_num = 0; port_num < num_ports; port_num++) {
+		    rio_name(rdev), rdev->vid, rdev->did,
+		    rdev->rswitch->nports);
+		for (port_num = 0;
+		     port_num < rdev->rswitch->nports; port_num++) {
 			if (rdev->rswitch->inport == port_num)
 				continue;
 
@@ -1163,7 +1152,7 @@ int __devinit rio_enum_mport(struct rio_mport *mport)
 		/* Enable Input Output Port (transmitter reviever) */
 		rio_enable_rx_tx_port(mport, 1, 0, 0, 0);
 
-		if (rio_enum_peer(net, mport, 0) < 0) {
+		if (rio_enum_peer(net, mport, 0, NULL, 0) < 0) {
 			/* A higher priority host won enumeration, bail. */
 			printk(KERN_INFO
 			       "RIO: master port %d device has lost enumeration to a remote host\n",
diff --git a/include/linux/rio.h b/include/linux/rio.h
index 718075a..754895c 100644
--- a/include/linux/rio.h
+++ b/include/linux/rio.h
@@ -98,6 +98,7 @@ union rio_pw_msg;
  * @riores: RIO resources this device owns
  * @pwcback: port-write callback function for this device
  * @destid: Network destination ID
+ * @prev: Previous RIO device connected to the current one
  */
 struct rio_dev {
 	struct list_head global_list;	/* node in list of all RIO devices */
@@ -123,6 +124,7 @@ struct rio_dev {
 	struct resource riores[RIO_MAX_DEV_RESOURCES];
 	int (*pwcback) (struct rio_dev *rdev, union rio_pw_msg *msg, int step);
 	u16 destid;
+	struct rio_dev *prev;
 };
 
 #define rio_dev_g(n) list_entry(n, struct rio_dev, global_list)
@@ -221,6 +223,7 @@ struct rio_net {
  * @hopcount: Hopcount to this switch
  * @destid: Associated destid in the path
  * @inport: Switch ingress port number
+ * @nports: Total number of ports in the switch
  * @route_table: Copy of switch routing table
  * @port_ok: Status of each port (one bit per port) - OK=1 or UNINIT=0
  * @add_entry: Callback for switch-specific route add function
@@ -230,6 +233,7 @@ struct rio_net {
  * @get_domain: Callback for switch-specific domain get function
  * @em_init: Callback for switch-specific error management initialization function
  * @em_handle: Callback for switch-specific error management handler function
+ * @nextdev: Array of per-port pointers to the next attached device
  */
 struct rio_switch {
 	struct list_head node;
@@ -237,6 +241,7 @@ struct rio_switch {
 	u16 hopcount;
 	u16 destid;
 	u8  inport;
+	u8  nports;
 	u8 *route_table;
 	u32 port_ok;
 	int (*add_entry) (struct rio_mport * mport, u16 destid, u8 hopcount,
@@ -251,6 +256,7 @@ struct rio_switch {
 			   u8 *sw_domain);
 	int (*em_init) (struct rio_dev *dev);
 	int (*em_handle) (struct rio_dev *dev, u8 swport);
+	struct rio_dev *nextdev[0];
 };
 
 /* Low-level architecture-dependent routines */
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH 3/9] RapidIO: Add the ingress port number into the RIO switch data structure
From: Alexandre Bounine @ 2010-08-13 15:18 UTC (permalink / raw)
  To: akpm, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine
In-Reply-To: <1281712686-31308-1-git-send-email-alexandre.bounine@idt.com>

A switch ingress port number has to be saved for software assisted error
recovery from the error-stopped state. Saving this information also allows
to remove several register reads from the RIO enumeration process.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reviewed-by: Thomas Moll <thomas.moll@sysgo.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
---
 drivers/rapidio/rio-scan.c |   38 ++++++++------------------------------
 include/linux/rio.h        |    4 ++--
 2 files changed, 10 insertions(+), 32 deletions(-)

diff --git a/drivers/rapidio/rio-scan.c b/drivers/rapidio/rio-scan.c
index 1123be8..efe3519 100644
--- a/drivers/rapidio/rio-scan.c
+++ b/drivers/rapidio/rio-scan.c
@@ -389,6 +389,7 @@ static struct rio_dev __devinit *rio_setup_device(struct rio_net *net,
 	int ret = 0;
 	struct rio_dev *rdev;
 	struct rio_switch *rswitch = NULL;
+	u32 swpinfo;
 	int result, rdid;
 
 	rdev = kzalloc(sizeof(struct rio_dev), GFP_KERNEL);
@@ -440,7 +441,7 @@ static struct rio_dev __devinit *rio_setup_device(struct rio_net *net,
 	/* If a PE has both switch and other functions, show it as a switch */
 	if (rio_is_switch(rdev)) {
 		rio_mport_read_config_32(port, destid, hopcount,
-					 RIO_SWP_INFO_CAR, &rdev->swpinfo);
+					 RIO_SWP_INFO_CAR, &swpinfo);
 		rswitch = kzalloc(sizeof(struct rio_switch), GFP_KERNEL);
 		if (!rswitch)
 			goto cleanup;
@@ -448,6 +449,7 @@ static struct rio_dev __devinit *rio_setup_device(struct rio_net *net,
 		rswitch->hopcount = hopcount;
 		rswitch->destid = destid;
 		rswitch->port_ok = 0;
+		rswitch->inport = (u8)(swpinfo & RIO_SWP_INFO_PORT_NUM_MASK);
 		rswitch->route_table = kzalloc(sizeof(u8)*
 					RIO_MAX_ROUTE_ENTRIES(port->sys_size),
 					GFP_KERNEL);
@@ -719,25 +721,6 @@ static u16 rio_get_host_deviceid_lock(struct rio_mport *port, u8 hopcount)
 }
 
 /**
- * rio_get_swpinfo_inport- Gets the ingress port number
- * @mport: Master port to send transaction
- * @destid: Destination ID associated with the switch
- * @hopcount: Number of hops to the device
- *
- * Returns port number being used to access the switch device.
- */
-static u8
-rio_get_swpinfo_inport(struct rio_mport *mport, u16 destid, u8 hopcount)
-{
-	u32 result;
-
-	rio_mport_read_config_32(mport, destid, hopcount, RIO_SWP_INFO_CAR,
-				 &result);
-
-	return (u8) (result & 0xff);
-}
-
-/**
  * rio_get_swpinfo_tports- Gets total number of ports on the switch
  * @mport: Master port to send transaction
  * @destid: Destination ID associated with the switch
@@ -834,8 +817,7 @@ static int __devinit rio_enum_peer(struct rio_net *net, struct rio_mport *port,
 
 	if (rio_is_switch(rdev)) {
 		next_switchid++;
-		sw_inport = rio_get_swpinfo_inport(port,
-				RIO_ANY_DESTID(port->sys_size), hopcount);
+		sw_inport = rdev->rswitch->inport;
 		rio_route_add_entry(port, rdev->rswitch, RIO_GLOBAL_TABLE,
 				    port->host_deviceid, sw_inport, 0);
 		rdev->rswitch->route_table[port->host_deviceid] = sw_inport;
@@ -989,8 +971,7 @@ rio_disc_peer(struct rio_net *net, struct rio_mport *port, u16 destid,
 		    "RIO: found %s (vid %4.4x did %4.4x) with %d ports\n",
 		    rio_name(rdev), rdev->vid, rdev->did, num_ports);
 		for (port_num = 0; port_num < num_ports; port_num++) {
-			if (rio_get_swpinfo_inport(port, destid, hopcount) ==
-			    port_num)
+			if (rdev->rswitch->inport == port_num)
 				continue;
 
 			if (rio_sport_is_active
@@ -1092,7 +1073,6 @@ static void rio_update_route_tables(struct rio_mport *port)
 {
 	struct rio_dev *rdev;
 	struct rio_switch *rswitch;
-	u8 sport;
 	u16 destid;
 
 	list_for_each_entry(rdev, &rio_devices, global_list) {
@@ -1109,14 +1089,12 @@ static void rio_update_route_tables(struct rio_mport *port)
 				if (rswitch->destid == destid)
 					continue;
 
-				sport = rio_get_swpinfo_inport(port,
-						rswitch->destid, rswitch->hopcount);
-
 				if (rswitch->add_entry)	{
 					rio_route_add_entry(port, rswitch,
 						RIO_GLOBAL_TABLE, destid,
-						sport, 0);
-					rswitch->route_table[destid] = sport;
+						rswitch->inport, 0);
+					rswitch->route_table[destid] =
+								rswitch->inport;
 				}
 			}
 		}
diff --git a/include/linux/rio.h b/include/linux/rio.h
index 84c9f8c..718075a 100644
--- a/include/linux/rio.h
+++ b/include/linux/rio.h
@@ -86,7 +86,6 @@ union rio_pw_msg;
  * @asm_rev: Assembly revision
  * @efptr: Extended feature pointer
  * @pef: Processing element features
- * @swpinfo: Switch port info
  * @src_ops: Source operation capabilities
  * @dst_ops: Destination operation capabilities
  * @comp_tag: RIO component tag
@@ -112,7 +111,6 @@ struct rio_dev {
 	u16 asm_rev;
 	u16 efptr;
 	u32 pef;
-	u32 swpinfo;		/* Only used for switches */
 	u32 src_ops;
 	u32 dst_ops;
 	u32 comp_tag;
@@ -222,6 +220,7 @@ struct rio_net {
  * @switchid: Switch ID that is unique across a network
  * @hopcount: Hopcount to this switch
  * @destid: Associated destid in the path
+ * @inport: Switch ingress port number
  * @route_table: Copy of switch routing table
  * @port_ok: Status of each port (one bit per port) - OK=1 or UNINIT=0
  * @add_entry: Callback for switch-specific route add function
@@ -237,6 +236,7 @@ struct rio_switch {
 	u16 switchid;
 	u16 hopcount;
 	u16 destid;
+	u8  inport;
 	u8 *route_table;
 	u32 port_ok;
 	int (*add_entry) (struct rio_mport * mport, u16 destid, u8 hopcount,
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH 2/9] RapidIO, powerpc/85xx: modify RIO port-write interrupt handler
From: Alexandre Bounine @ 2010-08-13 15:17 UTC (permalink / raw)
  To: akpm, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine
In-Reply-To: <1281712686-31308-1-git-send-email-alexandre.bounine@idt.com>

- Rearranged RIO port-write interrupt handling to perform message buffering
as soon as possible.
- Modified to disable port-write controller when clearing Transaction Error (TE)
bit.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reviewed-by: Thomas Moll <thomas.moll@sysgo.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
---
 arch/powerpc/sysdev/fsl_rio.c |   67 ++++++++++++++++++++++------------------
 1 files changed, 37 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/sysdev/fsl_rio.c b/arch/powerpc/sysdev/fsl_rio.c
index cd71dc1..708d94e 100644
--- a/arch/powerpc/sysdev/fsl_rio.c
+++ b/arch/powerpc/sysdev/fsl_rio.c
@@ -1065,18 +1065,12 @@ fsl_rio_port_write_handler(int irq, void *dev_instance)
 	struct rio_priv *priv = port->priv;
 	u32 epwisr, tmp;
 
-	ipwmr = in_be32(&priv->msg_regs->pwmr);
-	ipwsr = in_be32(&priv->msg_regs->pwsr);
-
 	epwisr = in_be32(priv->regs_win + RIO_EPWISR);
-	if (epwisr & 0x80000000) {
-		tmp = in_be32(priv->regs_win + RIO_LTLEDCSR);
-		pr_info("RIO_LTLEDCSR = 0x%x\n", tmp);
-		out_be32(priv->regs_win + RIO_LTLEDCSR, 0);
-	}
-
 	if (!(epwisr & 0x00000001))
-		return IRQ_HANDLED;
+		goto pw_done;
+
+	ipwmr = in_be32(&priv->msg_regs->pwmr);
+	ipwsr = in_be32(&priv->msg_regs->pwsr);
 
 #ifdef DEBUG_PW
 	pr_debug("PW Int->IPWMR: 0x%08x IPWSR: 0x%08x (", ipwmr, ipwsr);
@@ -1092,22 +1086,8 @@ fsl_rio_port_write_handler(int irq, void *dev_instance)
 		pr_debug(" PWB");
 	pr_debug(" )\n");
 #endif
-	out_be32(&priv->msg_regs->pwsr,
-		 ipwsr & (RIO_IPWSR_TE | RIO_IPWSR_QFI | RIO_IPWSR_PWD));
-
-	if ((ipwmr & RIO_IPWMR_EIE) && (ipwsr & RIO_IPWSR_TE)) {
-		priv->port_write_msg.err_count++;
-		pr_info("RIO: Port-Write Transaction Err (%d)\n",
-			 priv->port_write_msg.err_count);
-	}
-	if (ipwsr & RIO_IPWSR_PWD) {
-		priv->port_write_msg.discard_count++;
-		pr_info("RIO: Port Discarded Port-Write Msg(s) (%d)\n",
-			 priv->port_write_msg.discard_count);
-	}
-
 	/* Schedule deferred processing if PW was received */
-	if (ipwsr & RIO_IPWSR_QFI) {
+	if ((ipwmr & RIO_IPWMR_QFIE) && (ipwsr & RIO_IPWSR_QFI)) {
 		/* Save PW message (if there is room in FIFO),
 		 * otherwise discard it.
 		 */
@@ -1117,16 +1097,43 @@ fsl_rio_port_write_handler(int irq, void *dev_instance)
 				 RIO_PW_MSG_SIZE);
 		} else {
 			priv->port_write_msg.discard_count++;
-			pr_info("RIO: ISR Discarded Port-Write Msg(s) (%d)\n",
+			pr_debug("RIO: ISR Discarded Port-Write Msg(s) (%d)\n",
 				 priv->port_write_msg.discard_count);
 		}
+		/* Clear interrupt and issue Clear Queue command. This allows
+		 * another port-write to be received.
+		 */
+		out_be32(&priv->msg_regs->pwsr,	RIO_IPWSR_QFI);
+		out_be32(&priv->msg_regs->pwmr, ipwmr | RIO_IPWMR_CQ);
+
 		schedule_work(&priv->pw_work);
 	}
 
-	/* Issue Clear Queue command. This allows another
-	 * port-write to be received.
-	 */
-	out_be32(&priv->msg_regs->pwmr, ipwmr | RIO_IPWMR_CQ);
+	if ((ipwmr & RIO_IPWMR_EIE) && (ipwsr & RIO_IPWSR_TE)) {
+		priv->port_write_msg.err_count++;
+		pr_debug("RIO: Port-Write Transaction Err (%d)\n",
+			 priv->port_write_msg.err_count);
+		/* Clear Transaction Error: port-write controller should be
+		 * disabled when clearing this error
+		 */
+		out_be32(&priv->msg_regs->pwmr, ipwmr & ~RIO_IPWMR_PWE);
+		out_be32(&priv->msg_regs->pwsr,	RIO_IPWSR_TE);
+		out_be32(&priv->msg_regs->pwmr, ipwmr);
+	}
+
+	if (ipwsr & RIO_IPWSR_PWD) {
+		priv->port_write_msg.discard_count++;
+		pr_debug("RIO: Port Discarded Port-Write Msg(s) (%d)\n",
+			 priv->port_write_msg.discard_count);
+		out_be32(&priv->msg_regs->pwsr, RIO_IPWSR_PWD);
+	}
+
+pw_done:
+	if (epwisr & 0x80000000) {
+		tmp = in_be32(priv->regs_win + RIO_LTLEDCSR);
+		pr_debug("RIO_LTLEDCSR = 0x%x\n", tmp);
+		out_be32(priv->regs_win + RIO_LTLEDCSR, 0);
+	}
 
 	return IRQ_HANDLED;
 }
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH 1/9] RapidIO: fix RapidIO sysfs hierarchy
From: Alexandre Bounine @ 2010-08-13 15:17 UTC (permalink / raw)
  To: akpm, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine
In-Reply-To: <1281712686-31308-1-git-send-email-alexandre.bounine@idt.com>

Makes RapidIO devices appear in /sys/devices/rapidio directory instead of top
of /sys/devices directory.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reviewed-by: Thomas Moll <thomas.moll@sysgo.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
---
 drivers/rapidio/rio-driver.c |    2 +-
 drivers/rapidio/rio-scan.c   |    1 +
 include/linux/rio.h          |    1 +
 3 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/rapidio/rio-driver.c b/drivers/rapidio/rio-driver.c
index 3222fa3..0f4a53b 100644
--- a/drivers/rapidio/rio-driver.c
+++ b/drivers/rapidio/rio-driver.c
@@ -192,7 +192,7 @@ static int rio_match_bus(struct device *dev, struct device_driver *drv)
       out:return 0;
 }
 
-static struct device rio_bus = {
+struct device rio_bus = {
 	.init_name = "rapidio",
 };
 
diff --git a/drivers/rapidio/rio-scan.c b/drivers/rapidio/rio-scan.c
index 8070e07..1123be8 100644
--- a/drivers/rapidio/rio-scan.c
+++ b/drivers/rapidio/rio-scan.c
@@ -478,6 +478,7 @@ static struct rio_dev __devinit *rio_setup_device(struct rio_net *net,
 	}
 
 	rdev->dev.bus = &rio_bus_type;
+	rdev->dev.parent = &rio_bus;
 
 	device_initialize(&rdev->dev);
 	rdev->dev.release = rio_release_dev;
diff --git a/include/linux/rio.h b/include/linux/rio.h
index bd6eb0e..84c9f8c 100644
--- a/include/linux/rio.h
+++ b/include/linux/rio.h
@@ -67,6 +67,7 @@
 #define RIO_PW_MSG_SIZE		64
 
 extern struct bus_type rio_bus_type;
+extern struct device rio_bus;
 extern struct list_head rio_devices;	/* list of all devices */
 
 struct rio_mport;
-- 
1.7.0.5

^ permalink raw reply related

* [PATCH 0/9] RapidIO: Set of patches to add Gen2 switches
From: Alexandre Bounine @ 2010-08-13 15:17 UTC (permalink / raw)
  To: akpm, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine, Thomas Moll

This set of RapidIO patches adds support for new IDT Gen2 sRIO switch
devices - CPS-1848 and CPS-1616.
Adding these sRIO switches required to implement standard error recovery
mechanism defined by the RapidIO specification.

^ permalink raw reply

* Re: How to use mpc8xxx_gpio.c device driver
From: Anton Vorontsov @ 2010-08-13 13:16 UTC (permalink / raw)
  To: Ravi Gupta; +Cc: linuxppc-dev, MJ embd, linuxppc-dev, Ira W. Snyder
In-Reply-To: <AANLkTi=H1sKaGL8cHm9LAZm1tzm88dvkReXk12hmkWLj@mail.gmail.com>

Hi,

On Fri, Aug 13, 2010 at 03:29:11PM +0530, Ravi Gupta wrote:
[...]
> Thanks for the reply.
> I had added the entries for gpio pin 9 for both controllers(I was not sure
> with controller's pin is connected to LED, but now I know it is pin no. 233
> i.e 224+9) in the mpc8377_rdb.dts file. Below is a portion of my dts file, I
> have attached the complete dts file as attachment.
> 
> immr@e0000000 {
[...]
>    * led@0 {
>         compatible = "gpio-leds";
>         label = "hdd";
>         gpios = <&gpio1 9 0>;
>     };

What kernel version you look at? Please see the latest kernel,
it has MCU GPIO LED nodes already, and you can just add some
additional nodes.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/powerpc/boot/dts/mpc8377_rdb.dts#l490

[...]
> Also I have enabled drivers/leds/leds-gpio.c in my kernel. To test whether
> the leds entires in dts file get attached to leds-gpio driver, I added
> printks in the probe function of the driver.
> 
> static int gpio_led_probe(struct platform_device *pdev)
> {
>   struct gpio_led_platform_data *pdata = pdev->dev.platform_data;
>   struct gpio_led *cur_led;
>   struct gpio_led_data *leds_data, *led_dat;
>   int i, ret = 0;
> 
>   *printk(KERN_INFO "led: inside gpio_led_probe.\n");*

You have put the printk into the wrong function. It should
have been of_gpio_leds_probe():

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/leds/leds-gpio.c#l227

If you don't have that function then you use too old kernel.

Thanks,

-- 
Anton Vorontsov
email: cbouatmailru@gmail.com
irc://irc.freenode.net/bd2

^ permalink raw reply

* Re: How to use mpc8xxx_gpio.c device driver
From: Ravi Gupta @ 2010-08-13 10:01 UTC (permalink / raw)
  To: Ira W. Snyder; +Cc: linuxppc-dev, linuxppc-dev
In-Reply-To: <20100812153618.GA7498@ovro.caltech.edu>

[-- Attachment #1: Type: text/plain, Size: 1160 bytes --]

> Looking at the device tree for this board, it appears U-Boot remaps the
> IMMR registers to 0xe0000000. They are no longer accessible at
> 0xff400000.
>
> I would recommend studying arch/powerpc/boot/dts/mpc8377_rdb.dts in the
> Linux source code. That describes the device layout on your board after
> U-Boot has run.
>
> A wonderful tool for testing devices from userspace is "busybox devmem".
> It allows you to poke any physical address with any value. The output of
> "busybox devmem --help" should get you started. As a quick example,
> "busybox devmem 0xe0000c00 w 0x1" will write the 32-bit value 0x1 to
> address 0xe0000c00.
>
> I would also recommend using the built-in Linux GPIO API. It works, you
> just need to figure out how to use it. It will be much easier to get
> your code upstream if you use the provided APIs.
>
> The Documentation/gpio.txt file should help you in understanding the
> in-kernel Linux GPIO API. I'm afraid I don't have much experience other
> than accessing it via sysfs from userspace.
>
> Ira
>

Hi Ira,

Thanks for another great reply. Now I can also access gpio memory map
registers.

Thanks and Regards,
Ravi Gupta

[-- Attachment #2: Type: text/html, Size: 1476 bytes --]

^ permalink raw reply

* Re: How to use mpc8xxx_gpio.c device driver
From: Ravi Gupta @ 2010-08-13  9:59 UTC (permalink / raw)
  To: Anton Vorontsov; +Cc: linuxppc-dev, MJ embd, linuxppc-dev, Ira W. Snyder
In-Reply-To: <20100811164527.GA19793@oksana.dev.rtsoft.ru>


[-- Attachment #1.1: Type: text/plain, Size: 5563 bytes --]

On Wed, Aug 11, 2010 at 10:15 PM, Anton Vorontsov <cbouatmailru@gmail.com>wrote:

> Hi,
>
> On Wed, Aug 11, 2010 at 06:57:16PM +0530, Ravi Gupta wrote:
> > I am new to device driver development. I am trying to access the GPIO of
> > MPC837xERDB eval board. I have upgraded its kernel to linux-2.6.28.9 and
> > enable support for mpc8xxx_gpio.c. On boot up, it successfully detect two
> > gpio controllers. Now my question is how I am going to use it to
> communicate
> > with the gpio pins? Do I have to modify the code in mpc8xxx_gpio.c file
> to
> > do whatever I want to do with gpios or I can use the standard gpio API
> > provided in kernel ( gpio_request()/gpio_free() ). I also tries the
> standard
> > kernel API, but it fails. Here is my code :
>
> No, you don't have to modify anything, and yes, you can
> use standard kernel GPIO API.
>
> > #include <linux/module.h>
> > #include <linux/errno.h>  /* error codes */
> > #include <linux/gpio.h>
> >
> > static __init int sample_module_init(void)
> > {
> >   int ret;
> >
> >   int i;
> >   for (i=1; i<32; i++) {
> >     ret = gpio_request(i, "Sample Driver");
>
> Before issing gpio_request() you must get GPIO number from the
> of_get_gpio() or of_get_gpio_flags() calls (the _flags variant
> will also retreive flags such as 'active-low').
>
> The calls assume that you have gpio = <> specifier in the
> device tree, see arch/powerpc/boot/dts/mpc8377_rdb.dts's
> "leds" node as an example.
>
> As you want GPIO LEDs, you can use drivers/leds/leds-gpio.c
> (see of_gpio_leds_probe() call, it gets gpio numbers via
> of_get_gpio_flags() and then requests them via gpio_request()).
>
> Also see
>
> Documentation/powerpc/dts-bindings/gpio/gpio.txt
> Documentation/powerpc/dts-bindings/gpio/led.txt
>
> >     if (ret) {
> >       printk(KERN_WARNING "sample_driver: unable to request GPIO_PG%d\n",
> > i);
> >       //return ret;
> >     }
> >   }
> >
> >   return 0;
> > }
>
>
> On Wed, Aug 11, 2010 at 07:49:40PM +0530, Ravi Gupta wrote:
> > Also, when I try to export a gpio in sysfs
> >
> > echo 9 > /sys/class/gpio/export
>
> The gpio numbers are global, i.e. GPIO controller base + GPIO
> number within the controller.
>
> [...]
> > drwxr-xr-x    3 root     root            0 Jan  1 00:00 gpiochip192
>
> So, if you want GPIO9 within gpiochip192, you should issue
> "echo 201 > export".
>
> Thanks,
>
> --
> Anton Vorontsov
> email: cbouatmailru@gmail.com
> irc://irc.freenode.net/bd2
>

Hi Anton,

Thanks for the reply.
I had added the entries for gpio pin 9 for both controllers(I was not sure
with controller's pin is connected to LED, but now I know it is pin no. 233
i.e 224+9) in the mpc8377_rdb.dts file. Below is a portion of my dts file, I
have attached the complete dts file as attachment.

immr@e0000000 {
    #address-cells = <1>;
    #size-cells = <1>;
    device_type = "soc";
    compatible = "simple-bus";
    ranges = <0x0 0xe0000000 0x00100000>;
    reg = <0xe0000000 0x00000200>;
    bus-frequency = <0>;

    wdt@200 {
        device_type = "watchdog";
        compatible = "mpc83xx_wdt";
        reg = <0x200 0x100>;
    };

    gpio1: gpio-controller@c00 {
        #gpio-cells = <2>;
        compatible = "fsl,mpc8377-gpio", "fsl,mpc8349-gpio";
        reg = <0xc00 0x100>;
        interrupts = <74 0x8>;
        interrupt-parent = <&ipic>;
        gpio-controller;
    };

    gpio2: gpio-controller@d00 {
        #gpio-cells = <2>;
        compatible = "fsl,mpc8377-gpio", "fsl,mpc8349-gpio";
        reg = <0xd00 0x100>;
        interrupts = <75 0x8>;
        interrupt-parent = <&ipic>;
        gpio-controller;
    };

   * led@0 {
        compatible = "gpio-leds";
        label = "hdd";
        gpios = <&gpio1 9 0>;
    };

    led@1 {
        compatible = "gpio-leds";
        label = "rom";
        gpios = <&gpio2 9 0>;
    };*
    .
    .
    .
    .
};


Also I have enabled drivers/leds/leds-gpio.c in my kernel. To test whether
the leds entires in dts file get attached to leds-gpio driver, I added
printks in the probe function of the driver.

static int gpio_led_probe(struct platform_device *pdev)
{
  struct gpio_led_platform_data *pdata = pdev->dev.platform_data;
  struct gpio_led *cur_led;
  struct gpio_led_data *leds_data, *led_dat;
  int i, ret = 0;

  *printk(KERN_INFO "led: inside gpio_led_probe.\n");*

  ....
}

But I couldn't see "led: inside gpio_led_probe." message in dmesg. I checked
sysfs and I can see entries of leds. Here is contents of sysfs on my
machine.

# ls /sys/devices/ -la
drwxr-xr-x    7 root     root            0 Jan  1 01:58 .
drwxr-xr-x   12 root     root            0 Jan  1 00:00 ..
drwxr-xr-x   22 root     root            0 Jan  1 01:58 *e0000000.immr*
drwxr-xr-x    5 root     root            0 Jan  1 01:58 e0005000.localbus
drwxr-xr-x    4 root     root            0 Jan  1 01:58 pci0000:00
drwxr-xr-x    9 root     root            0 Jan  1 01:58 platform
drwxr-xr-x    8 root     root            0 Jan  1 01:58 system


# ls /sys/devices/e0000000.immr/
bus                           e0004600.serial       *led1.0*
devspec                    e0007000.spi           *led2.1*
e0000200.wdt            e00082a8.dma         modalias
e0000700.interrupt-    e0018000.sata         name
e0000b00.power        e0019000.sata         power
e0000c00.gpio-contr  e0023000.usb          sleep-nexus.2
e0000d00.gpio-contr  e0024000.ethernet    subsystem
e0003100.i2c            e0025000.ethernet    uevent
e0004500.serial        e0030000.crypto

Is there something that I am doing wrong? Please suggest.

Thanks in Advance
Ravi Gupta

[-- Attachment #1.2: Type: text/html, Size: 7119 bytes --]

[-- Attachment #2: mpc8377_rdb.dts --]
[-- Type: application/octet-stream, Size: 11597 bytes --]

/*
 * MPC8377E RDB Device Tree Source
 *
 * Copyright 2007, 2008 Freescale Semiconductor Inc.
 *
 * This program is free software; you can redistribute  it and/or modify it
 * under  the terms of  the GNU General  Public License as published by the
 * Free Software Foundation;  either version 2 of the  License, or (at your
 * option) any later version.
 */

/dts-v1/;

/ {
	compatible = "fsl,mpc8377rdb";
	#address-cells = <1>;
	#size-cells = <1>;

	aliases {
		ethernet0 = &enet0;
		ethernet1 = &enet1;
		serial0 = &serial0;
		serial1 = &serial1;
		pci0 = &pci0;
		pci1 = &pci1;
		pci2 = &pci2;
	};

	cpus {
		#address-cells = <1>;
		#size-cells = <0>;

		PowerPC,8377@0 {
			device_type = "cpu";
			reg = <0x0>;
			d-cache-line-size = <32>;
			i-cache-line-size = <32>;
			d-cache-size = <32768>;
			i-cache-size = <32768>;
			timebase-frequency = <0>;
			bus-frequency = <0>;
			clock-frequency = <0>;
		};
	};

	memory {
		device_type = "memory";
		reg = <0x00000000 0x10000000>;	// 256MB at 0
	};

	localbus@e0005000 {
		#address-cells = <2>;
		#size-cells = <1>;
		compatible = "fsl,mpc8377-elbc", "fsl,elbc", "simple-bus";
		reg = <0xe0005000 0x1000>;
		interrupts = <77 0x8>;
		interrupt-parent = <&ipic>;

		// CS0 and CS1 are swapped when
		// booting from nand, but the
		// addresses are the same.
		ranges = <0x0 0x0 0xfe000000 0x00800000
		          0x1 0x0 0xe0600000 0x00008000
		          0x2 0x0 0xf0000000 0x00020000
		          0x3 0x0 0xfa000000 0x00008000>;

		flash@0,0 {
			#address-cells = <1>;
			#size-cells = <1>;
			compatible = "cfi-flash";
			reg = <0x0 0x0 0x800000>;
			bank-width = <2>;
			device-width = <1>;
		};

		nand@1,0 {
			#address-cells = <1>;
			#size-cells = <1>;
			compatible = "fsl,mpc8377-fcm-nand",
			             "fsl,elbc-fcm-nand";
			reg = <0x1 0x0 0x8000>;

			u-boot@0 {
				reg = <0x0 0x100000>;
				read-only;
			};

			kernel@100000 {
				reg = <0x100000 0x300000>;
			};
			fs@400000 {
				reg = <0x400000 0x1c00000>;
			};
		};
	};

	immr@e0000000 {
		#address-cells = <1>;
		#size-cells = <1>;
		device_type = "soc";
		compatible = "simple-bus";
		ranges = <0x0 0xe0000000 0x00100000>;
		reg = <0xe0000000 0x00000200>;
		bus-frequency = <0>;

		wdt@200 {
			device_type = "watchdog";
			compatible = "mpc83xx_wdt";
			reg = <0x200 0x100>;
		};

		gpio1: gpio-controller@c00 {
			#gpio-cells = <2>;
			compatible = "fsl,mpc8377-gpio", "fsl,mpc8349-gpio";
			reg = <0xc00 0x100>;
			interrupts = <74 0x8>;
			interrupt-parent = <&ipic>;
			gpio-controller;
			
		
		};

		gpio2: gpio-controller@d00 {
			#gpio-cells = <2>;
			compatible = "fsl,mpc8377-gpio", "fsl,mpc8349-gpio";
			reg = <0xd00 0x100>;
			interrupts = <75 0x8>;
			interrupt-parent = <&ipic>;
			gpio-controller;
		};

		led@0 {
			compatible = "gpio-leds";
			label = "hdd";
			gpios = <&gpio1 9 0>; 
		};

		led@1 {
			compatible = "gpio-leds";
			label = "rom";
			gpios = <&gpio2 9 0>; 
		};

		sleep-nexus {
			#address-cells = <1>;
			#size-cells = <1>;
			compatible = "simple-bus";
			sleep = <&pmc 0x0c000000>;
			ranges;

			i2c@3000 {
				#address-cells = <1>;
				#size-cells = <0>;
				cell-index = <0>;
				compatible = "fsl-i2c";
				reg = <0x3000 0x100>;
				interrupts = <14 0x8>;
				interrupt-parent = <&ipic>;
				dfsrr;

				dtt@48 {
					compatible = "national,lm75";
					reg = <0x48>;
				};

				at24@50 {
					compatible = "at24,24c256";
					reg = <0x50>;
				};

				rtc@68 {
					compatible = "dallas,ds1339";
					reg = <0x68>;
				};

				mcu_pio: mcu@a {
					#gpio-cells = <2>;
					compatible = "fsl,mc9s08qg8-mpc8377erdb",
						     "fsl,mcu-mpc8349emitx";
					reg = <0x0a>;
					gpio-controller;
				};
			};

			sdhci@2e000 {
				compatible = "fsl,mpc8377-esdhc", "fsl,esdhc";
				reg = <0x2e000 0x1000>;
				interrupts = <42 0x8>;
				interrupt-parent = <&ipic>;
				sdhci,wp-inverted;
				/* Filled in by U-Boot */
				clock-frequency = <111111111>;
			};
		};

		i2c@3100 {
			#address-cells = <1>;
			#size-cells = <0>;
			cell-index = <1>;
			compatible = "fsl-i2c";
			reg = <0x3100 0x100>;
			interrupts = <15 0x8>;
			interrupt-parent = <&ipic>;
			dfsrr;
		};

		spi@7000 {
			cell-index = <0>;
			compatible = "fsl,spi";
			reg = <0x7000 0x1000>;
			interrupts = <16 0x8>;
			interrupt-parent = <&ipic>;
			mode = "cpu";
		};

		dma@82a8 {
			#address-cells = <1>;
			#size-cells = <1>;
			compatible = "fsl,mpc8377-dma", "fsl,elo-dma";
			reg = <0x82a8 4>;
			ranges = <0 0x8100 0x1a8>;
			interrupt-parent = <&ipic>;
			interrupts = <71 8>;
			cell-index = <0>;
			dma-channel@0 {
				compatible = "fsl,mpc8377-dma-channel", "fsl,elo-dma-channel";
				reg = <0 0x80>;
				cell-index = <0>;
				interrupt-parent = <&ipic>;
				interrupts = <71 8>;
			};
			dma-channel@80 {
				compatible = "fsl,mpc8377-dma-channel", "fsl,elo-dma-channel";
				reg = <0x80 0x80>;
				cell-index = <1>;
				interrupt-parent = <&ipic>;
				interrupts = <71 8>;
			};
			dma-channel@100 {
				compatible = "fsl,mpc8377-dma-channel", "fsl,elo-dma-channel";
				reg = <0x100 0x80>;
				cell-index = <2>;
				interrupt-parent = <&ipic>;
				interrupts = <71 8>;
			};
			dma-channel@180 {
				compatible = "fsl,mpc8377-dma-channel", "fsl,elo-dma-channel";
				reg = <0x180 0x28>;
				cell-index = <3>;
				interrupt-parent = <&ipic>;
				interrupts = <71 8>;
			};
		};

		usb@23000 {
			compatible = "fsl-usb2-dr";
			reg = <0x23000 0x1000>;
			#address-cells = <1>;
			#size-cells = <0>;
			interrupt-parent = <&ipic>;
			interrupts = <38 0x8>;
			phy_type = "ulpi";
			sleep = <&pmc 0x00c00000>;
		};

		enet0: ethernet@24000 {
			#address-cells = <1>;
			#size-cells = <1>;
			cell-index = <0>;
			device_type = "network";
			model = "eTSEC";
			compatible = "gianfar";
			reg = <0x24000 0x1000>;
			ranges = <0x0 0x24000 0x1000>;
			local-mac-address = [ 00 00 00 00 00 00 ];
			interrupts = <32 0x8 33 0x8 34 0x8>;
			phy-connection-type = "mii";
			interrupt-parent = <&ipic>;
			tbi-handle = <&tbi0>;
			phy-handle = <&phy2>;
			sleep = <&pmc 0xc0000000>;
			fsl,magic-packet;

			mdio@520 {
				#address-cells = <1>;
				#size-cells = <0>;
				compatible = "fsl,gianfar-mdio";
				reg = <0x520 0x20>;

				phy2: ethernet-phy@2 {
					interrupt-parent = <&ipic>;
					interrupts = <17 0x8>;
					reg = <0x2>;
					device_type = "ethernet-phy";
				};

				tbi0: tbi-phy@11 {
					reg = <0x11>;
					device_type = "tbi-phy";
				};
			};
		};

		enet1: ethernet@25000 {
			#address-cells = <1>;
			#size-cells = <1>;
			cell-index = <1>;
			device_type = "network";
			model = "eTSEC";
			compatible = "gianfar";
			reg = <0x25000 0x1000>;
			ranges = <0x0 0x25000 0x1000>;
			local-mac-address = [ 00 00 00 00 00 00 ];
			interrupts = <35 0x8 36 0x8 37 0x8>;
			phy-connection-type = "mii";
			interrupt-parent = <&ipic>;
			fixed-link = <1 1 1000 0 0>;
			tbi-handle = <&tbi1>;
			sleep = <&pmc 0x30000000>;
			fsl,magic-packet;

			mdio@520 {
				#address-cells = <1>;
				#size-cells = <0>;
				compatible = "fsl,gianfar-tbi";
				reg = <0x520 0x20>;

				tbi1: tbi-phy@11 {
					reg = <0x11>;
					device_type = "tbi-phy";
				};
			};
		};

		serial0: serial@4500 {
			cell-index = <0>;
			device_type = "serial";
			compatible = "ns16550";
			reg = <0x4500 0x100>;
			clock-frequency = <0>;
			interrupts = <9 0x8>;
			interrupt-parent = <&ipic>;
		};

		serial1: serial@4600 {
			cell-index = <1>;
			device_type = "serial";
			compatible = "ns16550";
			reg = <0x4600 0x100>;
			clock-frequency = <0>;
			interrupts = <10 0x8>;
			interrupt-parent = <&ipic>;
		};

		crypto@30000 {
			compatible = "fsl,sec3.0", "fsl,sec2.4", "fsl,sec2.2",
				     "fsl,sec2.1", "fsl,sec2.0";
			reg = <0x30000 0x10000>;
			interrupts = <11 0x8>;
			interrupt-parent = <&ipic>;
			fsl,num-channels = <4>;
			fsl,channel-fifo-len = <24>;
			fsl,exec-units-mask = <0x9fe>;
			fsl,descriptor-types-mask = <0x3ab0ebf>;
			sleep = <&pmc 0x03000000>;
		};

		sata@18000 {
			compatible = "fsl,mpc8377-sata", "fsl,pq-sata";
			reg = <0x18000 0x1000>;
			interrupts = <44 0x8>;
			interrupt-parent = <&ipic>;
			sleep = <&pmc 0x000000c0>;
		};

		sata@19000 {
			compatible = "fsl,mpc8377-sata", "fsl,pq-sata";
			reg = <0x19000 0x1000>;
			interrupts = <45 0x8>;
			interrupt-parent = <&ipic>;
			sleep = <&pmc 0x00000030>;
		};

		/* IPIC
		 * interrupts cell = <intr #, sense>
		 * sense values match linux IORESOURCE_IRQ_* defines:
		 * sense == 8: Level, low assertion
		 * sense == 2: Edge, high-to-low change
		 */
		ipic: interrupt-controller@700 {
			compatible = "fsl,ipic";
			interrupt-controller;
			#address-cells = <0>;
			#interrupt-cells = <2>;
			reg = <0x700 0x100>;
		};

		pmc: power@b00 {
			compatible = "fsl,mpc8377-pmc", "fsl,mpc8349-pmc";
			reg = <0xb00 0x100 0xa00 0x100>;
			interrupts = <80 0x8>;
			interrupt-parent = <&ipic>;
		};
	};

	pci0: pci@e0008500 {
		interrupt-map-mask = <0xf800 0 0 7>;
		interrupt-map = <
				/* IRQ5 = 21 = 0x15, IRQ6 = 0x16, IRQ7 = 23 = 0x17 */

				/* IDSEL AD14 IRQ6 inta */
				 0x7000 0x0 0x0 0x1 &ipic 22 0x8

				/* IDSEL AD15 IRQ5 inta, IRQ6 intb, IRQ7 intd */
				 0x7800 0x0 0x0 0x1 &ipic 21 0x8
				 0x7800 0x0 0x0 0x2 &ipic 22 0x8
				 0x7800 0x0 0x0 0x4 &ipic 23 0x8

				/* IDSEL AD28 IRQ7 inta, IRQ5 intb IRQ6 intc*/
				 0xE000 0x0 0x0 0x1 &ipic 23 0x8
				 0xE000 0x0 0x0 0x2 &ipic 21 0x8
				 0xE000 0x0 0x0 0x3 &ipic 22 0x8>;
		interrupt-parent = <&ipic>;
		interrupts = <66 0x8>;
		bus-range = <0 0>;
		ranges = <0x02000000 0x0 0x90000000 0x90000000 0x0 0x10000000
		          0x42000000 0x0 0x80000000 0x80000000 0x0 0x10000000
		          0x01000000 0x0 0x00000000 0xe0300000 0x0 0x00100000>;
		sleep = <&pmc 0x00010000>;
		clock-frequency = <66666666>;
		#interrupt-cells = <1>;
		#size-cells = <2>;
		#address-cells = <3>;
		reg = <0xe0008500 0x100		/* internal registers */
		       0xe0008300 0x8>;		/* config space access registers */
		compatible = "fsl,mpc8349-pci";
		device_type = "pci";
	};

	pci1: pcie@e0009000 {
		#address-cells = <3>;
		#size-cells = <2>;
		#interrupt-cells = <1>;
		device_type = "pci";
		compatible = "fsl,mpc8377-pcie", "fsl,mpc8314-pcie";
		reg = <0xe0009000 0x00001000>;
		ranges = <0x02000000 0 0xa8000000 0xa8000000 0 0x10000000
		          0x01000000 0 0x00000000 0xb8000000 0 0x00800000>;
		bus-range = <0 255>;
		interrupt-map-mask = <0xf800 0 0 7>;
		interrupt-map = <0 0 0 1 &ipic 1 8
				 0 0 0 2 &ipic 1 8
				 0 0 0 3 &ipic 1 8
				 0 0 0 4 &ipic 1 8>;
		sleep = <&pmc 0x00300000>;
		clock-frequency = <0>;

		pcie@0 {
			#address-cells = <3>;
			#size-cells = <2>;
			device_type = "pci";
			reg = <0 0 0 0 0>;
			ranges = <0x02000000 0 0xa8000000
				  0x02000000 0 0xa8000000
				  0 0x10000000
				  0x01000000 0 0x00000000
				  0x01000000 0 0x00000000
				  0 0x00800000>;
		};
	};

	pci2: pcie@e000a000 {
		#address-cells = <3>;
		#size-cells = <2>;
		#interrupt-cells = <1>;
		device_type = "pci";
		compatible = "fsl,mpc8377-pcie", "fsl,mpc8314-pcie";
		reg = <0xe000a000 0x00001000>;
		ranges = <0x02000000 0 0xc8000000 0xc8000000 0 0x10000000
			  0x01000000 0 0x00000000 0xd8000000 0 0x00800000>;
		bus-range = <0 255>;
		interrupt-map-mask = <0xf800 0 0 7>;
		interrupt-map = <0 0 0 1 &ipic 2 8
				 0 0 0 2 &ipic 2 8
				 0 0 0 3 &ipic 2 8
				 0 0 0 4 &ipic 2 8>;
		sleep = <&pmc 0x000c0000>;
		clock-frequency = <0>;

		pcie@0 {
			#address-cells = <3>;
			#size-cells = <2>;
			device_type = "pci";
			reg = <0 0 0 0 0>;
			ranges = <0x02000000 0 0xc8000000
				  0x02000000 0 0xc8000000
				  0 0x10000000
				  0x01000000 0 0x00000000
				  0x01000000 0 0x00000000
				  0 0x00800000>;
		};
	};
};

^ permalink raw reply

* Re: [PATCH 08/12] ptp: Added a brand new class driver for ptp clocks.
From: Richard Cochran @ 2010-08-13  9:34 UTC (permalink / raw)
  To: Grant Likely
  Cc: netdev, devicetree-discuss, Thomas Gleixner, linuxppc-dev,
	linux-arm-kernel, Krzysztof Halasa
In-Reply-To: <AANLkTinYjzfChsN0AB5W6kOh6xJYPJtoEq941pG8WbmS@mail.gmail.com>

On Tue, Jun 15, 2010 at 01:11:30PM -0600, Grant Likely wrote:
> On Tue, Jun 15, 2010 at 10:09 AM, Richard Cochran
> > +static DEFINE_SPINLOCK(clocks_lock); /* protects 'clocks' */
> 
> Doesn't appear that clocks is manipulated at atomic context.  Mutex instead?
...
> If the spinlock is changed to a mutex that is held for the entire
> function call, then the logic here can be simpler.

Grant,

I am working on another go at this patch series. Stupid question:

The caller of ptp_clock_register(), which takes the clocks_lock, is
always a module_init() function. Is this always a safe context in
which to call mutex_lock?

Thanks,

Richard

^ permalink raw reply

* Re: 2.6.35-stable/ppc64/p7: suspicious rcu_dereference_check() usage detected during 2.6.35-stable boot
From: Subrata Modak @ 2010-08-13  8:37 UTC (permalink / raw)
  To: paulmck, Paul Menage, containers
  Cc: sachinp, Peter Zijlstra, Li Zefan, linux-kernel, Linuxppc-dev,
	DIVYA PRAKASH
In-Reply-To: <20100809161200.GC3026@linux.vnet.ibm.com>

Adding CONTROL GROUP Maintainers/Mailing list..

Regards--
Subrata

On Mon, 2010-08-09 at 09:12 -0700, Paul E. McKenney wrote:
> On Mon, Aug 02, 2010 at 02:22:12PM +0530, Subrata Modak wrote:
> > Hi,
> > 
> > The following suspicious rcu_dereference_check() usage is detected
> > during 2.6.35-stable boot on my ppc64/p7 machine:
> > 
> > ==================================================
> > [ INFO: suspicious rcu_dereference_check() usage. ]
> > ---------------------------------------------------
> > kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> > other info that might help us debug this:
> > 
> > rcu_scheduler_active = 1, debug_locks = 0
> > 1 lock held by swapper/1:
> >  #0:  (&rq->lock){-.....}, at: [<c0000000007ca2f8>] .init_idle+0x78/0x4a8
> > stack backtrace:
> > Call Trace:
> > [c000000f392bf990] [c000000000014f04] .show_stack+0xb0/0x1a0 (unreliable)
> > [c000000f392bfa50] [c0000000007c87b4] .dump_stack+0x28/0x3c
> > [c000000f392bfad0] [c000000000103e1c] .lockdep_rcu_dereference+0xbc/0xe4
> > [c000000f392bfb70] [c0000000007ca434] .init_idle+0x1b4/0x4a8
> > [c000000f392bfc30] [c0000000007cad04] .fork_idle+0xa4/0xd0
> > [c000000f392bfe30] [c000000000aefaac] .smp_prepare_cpus+0x23c/0x2f4
> > [c000000f392bfed0] [c000000000ae1424] .kernel_init+0xec/0x32c
> > [c000000f392bff90] [c000000000033f40] .kernel_thread+0x54/0x70
> > ==================================================
> > 
> > Please note that this was reported earlier on 2.6.34-rc6:
> > http://marc.info/?l=linux-kernel&m=127313031922395&w=2,
> > The issue was fixed with:
> > 	commit 1ce7e4ff24fe338438bc7837e02780f202bf202b
> > 	Author: Li Zefan <lizf@cn.fujitsu.com>
> > 	Date:   Fri Apr 23 10:35:52 2010 +0800
> > 	cgroup: Check task_lock in task_subsys_state()
> > 
> > According to:
> > 	http://lkml.org/lkml/2010/7/1/883,
> > 	commit dc61b1d65e353d638b2445f71fb8e5b5630f2415
> > 	Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > 	Date:   Tue Jun 8 11:40:42 2010 +0200
> > 	sched: Fix PROVE_RCU vs cpu_cgroup
> > should have fixed this. But this is reproducible on 2.6.35-stable.
> > 
> > Please also see the config file attached.
> 
> Hello, Subrata,
> 
> Thank you for locating this one!  This looks like the same issue that
> Ilia Mirkin located.  Please see below for my analysis -- no fix yet,
> as I need confirmation from cgroups experts.  I can easily create a
> patch that suppresses the warning, but I don't yet know whether this is
> the right thing to do.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> On Thu, Aug 05, 2010 at 01:31:10PM -0400, Ilia Mirkin wrote:
> > On Thu, Jul 1, 2010 at 6:18 PM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > > On Thu, Jul 01, 2010 at 08:21:43AM -0400, Miles Lane wrote:
> > >> [ INFO: suspicious rcu_dereference_check() usage. ]
> > >> ---------------------------------------------------
> > >> kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> > >>
> > >> other info that might help us debug this:
> > >>
> > >> rcu_scheduler_active = 1, debug_locks = 1
> > >> 3 locks held by swapper/1:
> > >>   #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81042914>]
> > >> cpu_maps_update_begin+0x12/0x14
> > >>   #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8104294f>]
> > >> cpu_hotplug_begin+0x27/0x4e
> > >>   #2:  (&rq->lock){-.-...}, at: [<ffffffff812f8502>] init_idle+0x2b/0x114
> > >
> > > Hello, Miles!
> > >
> > > I believe that this one is fixed by commit dc61b1d6 in -tip.
> > 
> > Hi Paul,
> > 
> > Looks like that commit made it into 2.6.35:
> > 
> > git tag -l --contains dc61b1d65e353d638b2445f71fb8e5b5630f2415 v2.6.35*
> > v2.6.35
> > v2.6.35-rc4
> > v2.6.35-rc5
> > v2.6.35-rc6
> > 
> > However I still get:
> > 
> > [    0.051203] CPU0: AMD QEMU Virtual CPU version 0.12.4 stepping 03
> > [    0.052999] lockdep: fixing up alternatives.
> > [    0.054105]
> > [    0.054106] ===================================================
> > [    0.054999] [ INFO: suspicious rcu_dereference_check() usage. ]
> > [    0.054999] ---------------------------------------------------
> > [    0.054999] kernel/sched.c:616 invoked rcu_dereference_check()
> > without protection
> > !
> > [    0.054999]
> > [    0.054999] other info that might help us debug this:
> > [    0.054999]
> > [    0.054999]
> > [    0.054999] rcu_scheduler_active = 1, debug_locks = 1
> > [    0.054999] 3 locks held by swapper/1:
> > [    0.054999]  #0:  (cpu_add_remove_lock){+.+.+.}, at:
> > [<ffffffff814be933>] cpu_up+
> > 0x42/0x6a
> > [    0.054999]  #1:  (cpu_hotplug.lock){+.+.+.}, at:
> > [<ffffffff810400d8>] cpu_hotplu
> > g_begin+0x2a/0x51
> > [    0.054999]  #2:  (&rq->lock){-.-...}, at: [<ffffffff814be2f7>]
> > init_idle+0x2f/0x
> > 113
> > [    0.054999]
> > [    0.054999] stack backtrace:
> > [    0.054999] Pid: 1, comm: swapper Not tainted 2.6.35 #1
> > [    0.054999] Call Trace:
> > [    0.054999]  [<ffffffff81068054>] lockdep_rcu_dereference+0x9b/0xa3
> > [    0.054999]  [<ffffffff810325c3>] task_group+0x7b/0x8a
> > [    0.054999]  [<ffffffff810325e5>] set_task_rq+0x13/0x40
> > [    0.054999]  [<ffffffff814be39a>] init_idle+0xd2/0x113
> > [    0.054999]  [<ffffffff814be78a>] fork_idle+0xb8/0xc7
> > [    0.054999]  [<ffffffff81068717>] ? mark_held_locks+0x4d/0x6b
> > [    0.054999]  [<ffffffff814bcebd>] do_fork_idle+0x17/0x2b
> > [    0.054999]  [<ffffffff814bc89b>] native_cpu_up+0x1c1/0x724
> > [    0.054999]  [<ffffffff814bcea6>] ? do_fork_idle+0x0/0x2b
> > [    0.054999]  [<ffffffff814be876>] _cpu_up+0xac/0x127
> > [    0.054999]  [<ffffffff814be946>] cpu_up+0x55/0x6a
> > [    0.054999]  [<ffffffff81ab562a>] kernel_init+0xe1/0x1ff
> > [    0.054999]  [<ffffffff81003854>] kernel_thread_helper+0x4/0x10
> > [    0.054999]  [<ffffffff814c353c>] ? restore_args+0x0/0x30
> > [    0.054999]  [<ffffffff81ab5549>] ? kernel_init+0x0/0x1ff
> > [    0.054999]  [<ffffffff81003850>] ? kernel_thread_helper+0x0/0x10
> > [    0.056074] Booting Node   0, Processors  #1lockdep: fixing up alternatives.
> > [    0.130045]  #2lockdep: fixing up alternatives.
> > [    0.203089]  #3 Ok.
> > [    0.275286] Brought up 4 CPUs
> > [    0.276005] Total of 4 processors activated (16017.17 BogoMIPS).
> 
> This does look like a new one, thank you for reporting it!
> 
> Here is my analysis, which should at least provide some humor value to
> those who understand the code better than I do.  ;-)
> 
> So the corresponding rcu_dereference_check() is in
> task_subsys_state_check(), and is fetching the cpu_cgroup_subsys_id
> element of the newly created task's task->cgroups->subsys[] array.
> The "git grep" command finds only three uses of cpu_cgroup_subsys_id,
> but no definition.
> 
> Now, fork_idle() invokes copy_process(), which invokes cgroup_fork(),
> which sets the child process's ->cgroups pointer to that of the parent,
> also invoking get_css_set(), which increments the corresponding reference
> count, doing both operations under task_lock() protection (->alloc_lock).
> Because fork_idle() does not specify any of CLONE_NEWNS, CLONE_NEWUTS,
> CLONE_NEWIPC, CLONE_NEWPID, or CLONE_NEWNET, copy_namespaces() should
> not create a new namespace, and so there should be no ns_cgroup_clone().
> We should thus retain the parent's ->cgroups pointer.  And copy_process()
> installs the new task in the various lists, so that the task is externally
> accessible upon return.
> 
> After a non-error return from copy_process(), fork_init() invokes
> init_idle_pid(), which does not appear to affect the task's cgroup
> state.  Next fork_init() invokes init_idle(), which in turn invokes
> __set_task_cpu(), which invokes set_task_rq(), which calls task_group()
> several times, which calls task_subsys_state_check(), which calls the
> rcu_dereference_check() that complained above.
> 
> However, the result returns by rcu_dereference_check() is stored into
> the task structure:
> 
> 	p->se.cfs_rq = task_group(p)->cfs_rq[cpu];
> 	p->se.parent = task_group(p)->se[cpu];
> 
> This means that the corresponding structure must have been tied down with
> a reference count or some such.  If such a reference has been taken, then
> this complaint is a false positive, and could be suppressed by putting
> rcu_read_lock() and rcu_read_unlock() around the call to init_idle()
> from fork_idle().  However, although, reference to the enclosing ->cgroups
> struct css_set is held, it is not clear to me that this reference applies
> to the structures pointed to by the ->subsys[] array, especially given
> that the cgroup_subsys_state structures referenced by this array have
> their own reference count, which does not appear to me to be acquired
> by this code path.
> 
> Or are the cgroup_subsys_state structures referenced by idle tasks
> never freed or some such?
> 
> 							Thanx, Paul

^ permalink raw reply

* Re: 2.6.35-stable/ppc64/p7: suspicious rcu_dereference_check() usage detected during 2.6.35-stable boot
From: Dhaval Giani @ 2010-08-13  7:42 UTC (permalink / raw)
  To: subrata, Paul Menage, containers
  Cc: sachinp, Peter Zijlstra, Li Zefan, linux-kernel, DIVYA PRAKASH,
	Linuxppc-dev, Ingo Molnar, Ingo Molnar, paulmck, balbir
In-Reply-To: <1281682514.5976.2.camel@subratamodak.linux.ibm.com>

On Fri, Aug 13, 2010 at 8:55 AM, Subrata Modak
<subrata@linux.vnet.ibm.com> wrote:
> Hi Paul,
>
> Is there any specific person(s) whom we whom we should direct this mail
> to ? We have not received any response from CGROUP developers on this.
> Kindly let me know whom to contact for this. I am adding few more people
> i know :-)
>

It would help if you would add in the cgroup maintainers and the
containers mailing list to the cc as well.

> Regards--
> Subrata
>
> On Mon, 2010-08-09 at 09:12 -0700, Paul E. McKenney wrote:
>> On Mon, Aug 02, 2010 at 02:22:12PM +0530, Subrata Modak wrote:
>> > Hi,
>> >
>> > The following suspicious rcu_dereference_check() usage is detected
>> > during 2.6.35-stable boot on my ppc64/p7 machine:
>> >
>> > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D
>> > [ INFO: suspicious rcu_dereference_check() usage. ]
>> > ---------------------------------------------------
>> > kernel/sched.c:616 invoked rcu_dereference_check() without protection!
>> > other info that might help us debug this:
>> >
>> > rcu_scheduler_active =3D 1, debug_locks =3D 0
>> > 1 lock held by swapper/1:
>> > =A0#0: =A0(&rq->lock){-.....}, at: [<c0000000007ca2f8>] .init_idle+0x7=
8/0x4a8
>> > stack backtrace:
>> > Call Trace:
>> > [c000000f392bf990] [c000000000014f04] .show_stack+0xb0/0x1a0 (unreliab=
le)
>> > [c000000f392bfa50] [c0000000007c87b4] .dump_stack+0x28/0x3c
>> > [c000000f392bfad0] [c000000000103e1c] .lockdep_rcu_dereference+0xbc/0x=
e4
>> > [c000000f392bfb70] [c0000000007ca434] .init_idle+0x1b4/0x4a8
>> > [c000000f392bfc30] [c0000000007cad04] .fork_idle+0xa4/0xd0
>> > [c000000f392bfe30] [c000000000aefaac] .smp_prepare_cpus+0x23c/0x2f4
>> > [c000000f392bfed0] [c000000000ae1424] .kernel_init+0xec/0x32c
>> > [c000000f392bff90] [c000000000033f40] .kernel_thread+0x54/0x70
>> > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D
>> >
>> > Please note that this was reported earlier on 2.6.34-rc6:
>> > http://marc.info/?l=3Dlinux-kernel&m=3D127313031922395&w=3D2,
>> > The issue was fixed with:
>> > =A0 =A0 commit 1ce7e4ff24fe338438bc7837e02780f202bf202b
>> > =A0 =A0 Author: Li Zefan <lizf@cn.fujitsu.com>
>> > =A0 =A0 Date: =A0 Fri Apr 23 10:35:52 2010 +0800
>> > =A0 =A0 cgroup: Check task_lock in task_subsys_state()
>> >
>> > According to:
>> > =A0 =A0 http://lkml.org/lkml/2010/7/1/883,
>> > =A0 =A0 commit dc61b1d65e353d638b2445f71fb8e5b5630f2415
>> > =A0 =A0 Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> > =A0 =A0 Date: =A0 Tue Jun 8 11:40:42 2010 +0200
>> > =A0 =A0 sched: Fix PROVE_RCU vs cpu_cgroup
>> > should have fixed this. But this is reproducible on 2.6.35-stable.
>> >
>> > Please also see the config file attached.
>>
>> Hello, Subrata,
>>
>> Thank you for locating this one! =A0This looks like the same issue that
>> Ilia Mirkin located. =A0Please see below for my analysis -- no fix yet,
>> as I need confirmation from cgroups experts. =A0I can easily create a
>> patch that suppresses the warning, but I don't yet know whether this is
>> the right thing to do.
>>
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 Thanx, Paul
>>
>> ------------------------------------------------------------------------
>>
>> On Thu, Aug 05, 2010 at 01:31:10PM -0400, Ilia Mirkin wrote:
>> > On Thu, Jul 1, 2010 at 6:18 PM, Paul E. McKenney
>> > <paulmck@linux.vnet.ibm.com> wrote:
>> > > On Thu, Jul 01, 2010 at 08:21:43AM -0400, Miles Lane wrote:
>> > >> [ INFO: suspicious rcu_dereference_check() usage. ]
>> > >> ---------------------------------------------------
>> > >> kernel/sched.c:616 invoked rcu_dereference_check() without protecti=
on!
>> > >>
>> > >> other info that might help us debug this:
>> > >>
>> > >> rcu_scheduler_active =3D 1, debug_locks =3D 1
>> > >> 3 locks held by swapper/1:
>> > >> =A0 #0: =A0(cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81042914>]
>> > >> cpu_maps_update_begin+0x12/0x14
>> > >> =A0 #1: =A0(cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8104294f>]
>> > >> cpu_hotplug_begin+0x27/0x4e
>> > >> =A0 #2: =A0(&rq->lock){-.-...}, at: [<ffffffff812f8502>] init_idle+=
0x2b/0x114
>> > >
>> > > Hello, Miles!
>> > >
>> > > I believe that this one is fixed by commit dc61b1d6 in -tip.
>> >
>> > Hi Paul,
>> >
>> > Looks like that commit made it into 2.6.35:
>> >
>> > git tag -l --contains dc61b1d65e353d638b2445f71fb8e5b5630f2415 v2.6.35=
*
>> > v2.6.35
>> > v2.6.35-rc4
>> > v2.6.35-rc5
>> > v2.6.35-rc6
>> >
>> > However I still get:
>> >
>> > [ =A0 =A00.051203] CPU0: AMD QEMU Virtual CPU version 0.12.4 stepping =
03
>> > [ =A0 =A00.052999] lockdep: fixing up alternatives.
>> > [ =A0 =A00.054105]
>> > [ =A0 =A00.054106] =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D
>> > [ =A0 =A00.054999] [ INFO: suspicious rcu_dereference_check() usage. ]
>> > [ =A0 =A00.054999] ---------------------------------------------------
>> > [ =A0 =A00.054999] kernel/sched.c:616 invoked rcu_dereference_check()
>> > without protection
>> > !
>> > [ =A0 =A00.054999]
>> > [ =A0 =A00.054999] other info that might help us debug this:
>> > [ =A0 =A00.054999]
>> > [ =A0 =A00.054999]
>> > [ =A0 =A00.054999] rcu_scheduler_active =3D 1, debug_locks =3D 1
>> > [ =A0 =A00.054999] 3 locks held by swapper/1:
>> > [ =A0 =A00.054999] =A0#0: =A0(cpu_add_remove_lock){+.+.+.}, at:
>> > [<ffffffff814be933>] cpu_up+
>> > 0x42/0x6a
>> > [ =A0 =A00.054999] =A0#1: =A0(cpu_hotplug.lock){+.+.+.}, at:
>> > [<ffffffff810400d8>] cpu_hotplu
>> > g_begin+0x2a/0x51
>> > [ =A0 =A00.054999] =A0#2: =A0(&rq->lock){-.-...}, at: [<ffffffff814be2=
f7>]
>> > init_idle+0x2f/0x
>> > 113
>> > [ =A0 =A00.054999]
>> > [ =A0 =A00.054999] stack backtrace:
>> > [ =A0 =A00.054999] Pid: 1, comm: swapper Not tainted 2.6.35 #1
>> > [ =A0 =A00.054999] Call Trace:
>> > [ =A0 =A00.054999] =A0[<ffffffff81068054>] lockdep_rcu_dereference+0x9=
b/0xa3
>> > [ =A0 =A00.054999] =A0[<ffffffff810325c3>] task_group+0x7b/0x8a
>> > [ =A0 =A00.054999] =A0[<ffffffff810325e5>] set_task_rq+0x13/0x40
>> > [ =A0 =A00.054999] =A0[<ffffffff814be39a>] init_idle+0xd2/0x113
>> > [ =A0 =A00.054999] =A0[<ffffffff814be78a>] fork_idle+0xb8/0xc7
>> > [ =A0 =A00.054999] =A0[<ffffffff81068717>] ? mark_held_locks+0x4d/0x6b
>> > [ =A0 =A00.054999] =A0[<ffffffff814bcebd>] do_fork_idle+0x17/0x2b
>> > [ =A0 =A00.054999] =A0[<ffffffff814bc89b>] native_cpu_up+0x1c1/0x724
>> > [ =A0 =A00.054999] =A0[<ffffffff814bcea6>] ? do_fork_idle+0x0/0x2b
>> > [ =A0 =A00.054999] =A0[<ffffffff814be876>] _cpu_up+0xac/0x127
>> > [ =A0 =A00.054999] =A0[<ffffffff814be946>] cpu_up+0x55/0x6a
>> > [ =A0 =A00.054999] =A0[<ffffffff81ab562a>] kernel_init+0xe1/0x1ff
>> > [ =A0 =A00.054999] =A0[<ffffffff81003854>] kernel_thread_helper+0x4/0x=
10
>> > [ =A0 =A00.054999] =A0[<ffffffff814c353c>] ? restore_args+0x0/0x30
>> > [ =A0 =A00.054999] =A0[<ffffffff81ab5549>] ? kernel_init+0x0/0x1ff
>> > [ =A0 =A00.054999] =A0[<ffffffff81003850>] ? kernel_thread_helper+0x0/=
0x10
>> > [ =A0 =A00.056074] Booting Node =A0 0, Processors =A0#1lockdep: fixing=
 up alternatives.
>> > [ =A0 =A00.130045] =A0#2lockdep: fixing up alternatives.
>> > [ =A0 =A00.203089] =A0#3 Ok.
>> > [ =A0 =A00.275286] Brought up 4 CPUs
>> > [ =A0 =A00.276005] Total of 4 processors activated (16017.17 BogoMIPS)=
.
>>
>> This does look like a new one, thank you for reporting it!
>>
>> Here is my analysis, which should at least provide some humor value to
>> those who understand the code better than I do. =A0;-)
>>
>> So the corresponding rcu_dereference_check() is in
>> task_subsys_state_check(), and is fetching the cpu_cgroup_subsys_id
>> element of the newly created task's task->cgroups->subsys[] array.
>> The "git grep" command finds only three uses of cpu_cgroup_subsys_id,
>> but no definition.
>>
>> Now, fork_idle() invokes copy_process(), which invokes cgroup_fork(),
>> which sets the child process's ->cgroups pointer to that of the parent,
>> also invoking get_css_set(), which increments the corresponding referenc=
e
>> count, doing both operations under task_lock() protection (->alloc_lock)=
.
>> Because fork_idle() does not specify any of CLONE_NEWNS, CLONE_NEWUTS,
>> CLONE_NEWIPC, CLONE_NEWPID, or CLONE_NEWNET, copy_namespaces() should
>> not create a new namespace, and so there should be no ns_cgroup_clone().
>> We should thus retain the parent's ->cgroups pointer. =A0And copy_proces=
s()
>> installs the new task in the various lists, so that the task is external=
ly
>> accessible upon return.
>>
>> After a non-error return from copy_process(), fork_init() invokes
>> init_idle_pid(), which does not appear to affect the task's cgroup
>> state. =A0Next fork_init() invokes init_idle(), which in turn invokes
>> __set_task_cpu(), which invokes set_task_rq(), which calls task_group()
>> several times, which calls task_subsys_state_check(), which calls the
>> rcu_dereference_check() that complained above.
>>
>> However, the result returns by rcu_dereference_check() is stored into
>> the task structure:
>>
>> =A0 =A0 =A0 p->se.cfs_rq =3D task_group(p)->cfs_rq[cpu];
>> =A0 =A0 =A0 p->se.parent =3D task_group(p)->se[cpu];
>>
>> This means that the corresponding structure must have been tied down wit=
h
>> a reference count or some such. =A0If such a reference has been taken, t=
hen
>> this complaint is a false positive, and could be suppressed by putting
>> rcu_read_lock() and rcu_read_unlock() around the call to init_idle()
>> from fork_idle(). =A0However, although, reference to the enclosing ->cgr=
oups
>> struct css_set is held, it is not clear to me that this reference applie=
s
>> to the structures pointed to by the ->subsys[] array, especially given
>> that the cgroup_subsys_state structures referenced by this array have
>> their own reference count, which does not appear to me to be acquired
>> by this code path.
>>
>> Or are the cgroup_subsys_state structures referenced by idle tasks
>> never freed or some such?
>>
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 Thanx, Paul
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" i=
n
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at =A0http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at =A0http://www.tux.org/lkml/
>

^ permalink raw reply

* [PATCH] powerpc: Initialise paca->kstack before early_setup_secondary
From: Matt Evans @ 2010-08-13  6:58 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: mjwolf

As early setup calls down to slb_initialize(), we must have kstack
initialised before checking "should we add a bolted SLB entry for our kstack?"

Failing to do so means stack access requires an SLB miss exception to refill
an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text
& static data).  It's not always allowable to take such a miss, and
intermittent crashes will result.

Primary CPUs don't have this issue; an SLB entry is not bolted for their
stack anyway (as that lives within SLB(0)).  This patch therefore only
affects the init of secondaries.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Cc: stable <stable@kernel.org>
---
 arch/powerpc/kernel/head_64.S |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 844a44b..4d6681d 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -572,9 +572,6 @@ __secondary_start:
 	/* Set thread priority to MEDIUM */
 	HMT_MEDIUM
 
-	/* Do early setup for that CPU (stab, slb, hash table pointer) */
-	bl	.early_setup_secondary
-
 	/* Initialize the kernel stack.  Just a repeat for iSeries.	 */
 	LOAD_REG_ADDR(r3, current_set)
 	sldi	r28,r24,3		/* get current_set[cpu#]	 */
@@ -582,6 +579,9 @@ __secondary_start:
 	addi	r1,r1,THREAD_SIZE-STACK_FRAME_OVERHEAD
 	std	r1,PACAKSAVE(r13)
 
+	/* Do early setup for that CPU (stab, slb, hash table pointer) */
+	bl	.early_setup_secondary
+
 	/* Clear backchain so we get nice backtraces */
 	li	r7,0
 	mtlr	r7
-- 
1.7.0.4

^ permalink raw reply related

* Re: 2.6.35-stable/ppc64/p7: suspicious rcu_dereference_check() usage detected during 2.6.35-stable boot
From: Subrata Modak @ 2010-08-13  6:55 UTC (permalink / raw)
  To: paulmck, Ingo Molnar, balbir
  Cc: sachinp, Peter Zijlstra, Li Zefan, linux-kernel, Linuxppc-dev,
	Ingo Molnar, DIVYA PRAKASH
In-Reply-To: <20100809161200.GC3026@linux.vnet.ibm.com>

Hi Paul,

Is there any specific person(s) whom we whom we should direct this mail
to ? We have not received any response from CGROUP developers on this.
Kindly let me know whom to contact for this. I am adding few more people
i know :-)

Regards--
Subrata

On Mon, 2010-08-09 at 09:12 -0700, Paul E. McKenney wrote:
> On Mon, Aug 02, 2010 at 02:22:12PM +0530, Subrata Modak wrote:
> > Hi,
> > 
> > The following suspicious rcu_dereference_check() usage is detected
> > during 2.6.35-stable boot on my ppc64/p7 machine:
> > 
> > ==================================================
> > [ INFO: suspicious rcu_dereference_check() usage. ]
> > ---------------------------------------------------
> > kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> > other info that might help us debug this:
> > 
> > rcu_scheduler_active = 1, debug_locks = 0
> > 1 lock held by swapper/1:
> >  #0:  (&rq->lock){-.....}, at: [<c0000000007ca2f8>] .init_idle+0x78/0x4a8
> > stack backtrace:
> > Call Trace:
> > [c000000f392bf990] [c000000000014f04] .show_stack+0xb0/0x1a0 (unreliable)
> > [c000000f392bfa50] [c0000000007c87b4] .dump_stack+0x28/0x3c
> > [c000000f392bfad0] [c000000000103e1c] .lockdep_rcu_dereference+0xbc/0xe4
> > [c000000f392bfb70] [c0000000007ca434] .init_idle+0x1b4/0x4a8
> > [c000000f392bfc30] [c0000000007cad04] .fork_idle+0xa4/0xd0
> > [c000000f392bfe30] [c000000000aefaac] .smp_prepare_cpus+0x23c/0x2f4
> > [c000000f392bfed0] [c000000000ae1424] .kernel_init+0xec/0x32c
> > [c000000f392bff90] [c000000000033f40] .kernel_thread+0x54/0x70
> > ==================================================
> > 
> > Please note that this was reported earlier on 2.6.34-rc6:
> > http://marc.info/?l=linux-kernel&m=127313031922395&w=2,
> > The issue was fixed with:
> > 	commit 1ce7e4ff24fe338438bc7837e02780f202bf202b
> > 	Author: Li Zefan <lizf@cn.fujitsu.com>
> > 	Date:   Fri Apr 23 10:35:52 2010 +0800
> > 	cgroup: Check task_lock in task_subsys_state()
> > 
> > According to:
> > 	http://lkml.org/lkml/2010/7/1/883,
> > 	commit dc61b1d65e353d638b2445f71fb8e5b5630f2415
> > 	Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > 	Date:   Tue Jun 8 11:40:42 2010 +0200
> > 	sched: Fix PROVE_RCU vs cpu_cgroup
> > should have fixed this. But this is reproducible on 2.6.35-stable.
> > 
> > Please also see the config file attached.
> 
> Hello, Subrata,
> 
> Thank you for locating this one!  This looks like the same issue that
> Ilia Mirkin located.  Please see below for my analysis -- no fix yet,
> as I need confirmation from cgroups experts.  I can easily create a
> patch that suppresses the warning, but I don't yet know whether this is
> the right thing to do.
> 
> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
> On Thu, Aug 05, 2010 at 01:31:10PM -0400, Ilia Mirkin wrote:
> > On Thu, Jul 1, 2010 at 6:18 PM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > > On Thu, Jul 01, 2010 at 08:21:43AM -0400, Miles Lane wrote:
> > >> [ INFO: suspicious rcu_dereference_check() usage. ]
> > >> ---------------------------------------------------
> > >> kernel/sched.c:616 invoked rcu_dereference_check() without protection!
> > >>
> > >> other info that might help us debug this:
> > >>
> > >> rcu_scheduler_active = 1, debug_locks = 1
> > >> 3 locks held by swapper/1:
> > >>   #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81042914>]
> > >> cpu_maps_update_begin+0x12/0x14
> > >>   #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8104294f>]
> > >> cpu_hotplug_begin+0x27/0x4e
> > >>   #2:  (&rq->lock){-.-...}, at: [<ffffffff812f8502>] init_idle+0x2b/0x114
> > >
> > > Hello, Miles!
> > >
> > > I believe that this one is fixed by commit dc61b1d6 in -tip.
> > 
> > Hi Paul,
> > 
> > Looks like that commit made it into 2.6.35:
> > 
> > git tag -l --contains dc61b1d65e353d638b2445f71fb8e5b5630f2415 v2.6.35*
> > v2.6.35
> > v2.6.35-rc4
> > v2.6.35-rc5
> > v2.6.35-rc6
> > 
> > However I still get:
> > 
> > [    0.051203] CPU0: AMD QEMU Virtual CPU version 0.12.4 stepping 03
> > [    0.052999] lockdep: fixing up alternatives.
> > [    0.054105]
> > [    0.054106] ===================================================
> > [    0.054999] [ INFO: suspicious rcu_dereference_check() usage. ]
> > [    0.054999] ---------------------------------------------------
> > [    0.054999] kernel/sched.c:616 invoked rcu_dereference_check()
> > without protection
> > !
> > [    0.054999]
> > [    0.054999] other info that might help us debug this:
> > [    0.054999]
> > [    0.054999]
> > [    0.054999] rcu_scheduler_active = 1, debug_locks = 1
> > [    0.054999] 3 locks held by swapper/1:
> > [    0.054999]  #0:  (cpu_add_remove_lock){+.+.+.}, at:
> > [<ffffffff814be933>] cpu_up+
> > 0x42/0x6a
> > [    0.054999]  #1:  (cpu_hotplug.lock){+.+.+.}, at:
> > [<ffffffff810400d8>] cpu_hotplu
> > g_begin+0x2a/0x51
> > [    0.054999]  #2:  (&rq->lock){-.-...}, at: [<ffffffff814be2f7>]
> > init_idle+0x2f/0x
> > 113
> > [    0.054999]
> > [    0.054999] stack backtrace:
> > [    0.054999] Pid: 1, comm: swapper Not tainted 2.6.35 #1
> > [    0.054999] Call Trace:
> > [    0.054999]  [<ffffffff81068054>] lockdep_rcu_dereference+0x9b/0xa3
> > [    0.054999]  [<ffffffff810325c3>] task_group+0x7b/0x8a
> > [    0.054999]  [<ffffffff810325e5>] set_task_rq+0x13/0x40
> > [    0.054999]  [<ffffffff814be39a>] init_idle+0xd2/0x113
> > [    0.054999]  [<ffffffff814be78a>] fork_idle+0xb8/0xc7
> > [    0.054999]  [<ffffffff81068717>] ? mark_held_locks+0x4d/0x6b
> > [    0.054999]  [<ffffffff814bcebd>] do_fork_idle+0x17/0x2b
> > [    0.054999]  [<ffffffff814bc89b>] native_cpu_up+0x1c1/0x724
> > [    0.054999]  [<ffffffff814bcea6>] ? do_fork_idle+0x0/0x2b
> > [    0.054999]  [<ffffffff814be876>] _cpu_up+0xac/0x127
> > [    0.054999]  [<ffffffff814be946>] cpu_up+0x55/0x6a
> > [    0.054999]  [<ffffffff81ab562a>] kernel_init+0xe1/0x1ff
> > [    0.054999]  [<ffffffff81003854>] kernel_thread_helper+0x4/0x10
> > [    0.054999]  [<ffffffff814c353c>] ? restore_args+0x0/0x30
> > [    0.054999]  [<ffffffff81ab5549>] ? kernel_init+0x0/0x1ff
> > [    0.054999]  [<ffffffff81003850>] ? kernel_thread_helper+0x0/0x10
> > [    0.056074] Booting Node   0, Processors  #1lockdep: fixing up alternatives.
> > [    0.130045]  #2lockdep: fixing up alternatives.
> > [    0.203089]  #3 Ok.
> > [    0.275286] Brought up 4 CPUs
> > [    0.276005] Total of 4 processors activated (16017.17 BogoMIPS).
> 
> This does look like a new one, thank you for reporting it!
> 
> Here is my analysis, which should at least provide some humor value to
> those who understand the code better than I do.  ;-)
> 
> So the corresponding rcu_dereference_check() is in
> task_subsys_state_check(), and is fetching the cpu_cgroup_subsys_id
> element of the newly created task's task->cgroups->subsys[] array.
> The "git grep" command finds only three uses of cpu_cgroup_subsys_id,
> but no definition.
> 
> Now, fork_idle() invokes copy_process(), which invokes cgroup_fork(),
> which sets the child process's ->cgroups pointer to that of the parent,
> also invoking get_css_set(), which increments the corresponding reference
> count, doing both operations under task_lock() protection (->alloc_lock).
> Because fork_idle() does not specify any of CLONE_NEWNS, CLONE_NEWUTS,
> CLONE_NEWIPC, CLONE_NEWPID, or CLONE_NEWNET, copy_namespaces() should
> not create a new namespace, and so there should be no ns_cgroup_clone().
> We should thus retain the parent's ->cgroups pointer.  And copy_process()
> installs the new task in the various lists, so that the task is externally
> accessible upon return.
> 
> After a non-error return from copy_process(), fork_init() invokes
> init_idle_pid(), which does not appear to affect the task's cgroup
> state.  Next fork_init() invokes init_idle(), which in turn invokes
> __set_task_cpu(), which invokes set_task_rq(), which calls task_group()
> several times, which calls task_subsys_state_check(), which calls the
> rcu_dereference_check() that complained above.
> 
> However, the result returns by rcu_dereference_check() is stored into
> the task structure:
> 
> 	p->se.cfs_rq = task_group(p)->cfs_rq[cpu];
> 	p->se.parent = task_group(p)->se[cpu];
> 
> This means that the corresponding structure must have been tied down with
> a reference count or some such.  If such a reference has been taken, then
> this complaint is a false positive, and could be suppressed by putting
> rcu_read_lock() and rcu_read_unlock() around the call to init_idle()
> from fork_idle().  However, although, reference to the enclosing ->cgroups
> struct css_set is held, it is not clear to me that this reference applies
> to the structures pointed to by the ->subsys[] array, especially given
> that the cgroup_subsys_state structures referenced by this array have
> their own reference count, which does not appear to me to be acquired
> by this code path.
> 
> Or are the cgroup_subsys_state structures referenced by idle tasks
> never freed or some such?
> 
> 							Thanx, Paul

^ permalink raw reply

* [PATCH 2/2] powerpc: Dynamically allocate most lppaca structs
From: Paul Mackerras @ 2010-08-13  6:18 UTC (permalink / raw)
  To: linuxppc-dev
In-Reply-To: <20100813061815.GA30234@drongo>

This arranges for the lppaca structs for most cpus to be dynamically
allocated in the same manner as the paca structs.  If we don't include
support for legacy iSeries, only the first lppaca is statically
allocated; the rest are dynamically allocated.  If we include legacy
iSeries support, then we statically allocate the first 64 lppaca
structs, since the iSeries hypervisor requires that the lppaca
structs be present in the data section of the kernel image, but
legacy iSeries supports at most 64 cpus.

With CONFIG_NR_CPUS, the kernel image size for a typical pSeries config
went from:

   text    data     bss     dec     hex filename
9524478 4734564 8469944 22728986        15ad11a ../test-1024/vmlinux

to:

   text    data     bss     dec     hex filename
9524482 3751508 8469944 21745934        14bd10e ../test-1024/vmlinux

a reduction of 983052 bytes overall.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/lppaca.h |    2 +-
 arch/powerpc/kernel/paca.c        |   70 +++++++++++++++++++++++++++++++++++-
 2 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 6b73554..6d02624 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -153,7 +153,7 @@ struct lppaca {
 
 extern struct lppaca lppaca[];
 
-#define lppaca_of(cpu)	(lppaca[cpu])
+#define lppaca_of(cpu)	(*paca[cpu].lppaca_ptr)
 
 /*
  * SLB shadow buffer structure as defined in the PAPR.  The save_area
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index d0a26f1..1e068a4 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -27,6 +27,20 @@ extern unsigned long __toc_start;
 #ifdef CONFIG_PPC_BOOK3S
 
 /*
+ * We only have to have statically allocated lppaca structs on
+ * legacy iSeries, which supports at most 64 cpus.
+ */
+#ifdef CONFIG_PPC_ISERIES
+#if NR_CPUS < 64
+#define NR_LPPACAS	NR_CPUS
+#else
+#define NR_LPPACAS	64
+#endif
+#else /* not iSeries */
+#define NR_LPPACAS	1
+#endif
+
+/*
  * The structure which the hypervisor knows about - this structure
  * should not cross a page boundary.  The vpa_init/register_vpa call
  * is now known to fail if the lppaca structure crosses a page
@@ -36,7 +50,7 @@ extern unsigned long __toc_start;
  * will suffice to ensure that it doesn't cross a page boundary.
  */
 struct lppaca lppaca[] = {
-	[0 ... (NR_CPUS-1)] = {
+	[0 ... (NR_LPPACAS-1)] = {
 		.desc = 0xd397d781,	/* "LpPa" */
 		.size = sizeof(struct lppaca),
 		.dyn_proc_status = 2,
@@ -49,6 +63,54 @@ struct lppaca lppaca[] = {
 	},
 };
 
+static struct lppaca *extra_lppacas;
+static long __initdata lppaca_size;
+
+static void allocate_lppacas(int nr_cpus, unsigned long limit)
+{
+	if (nr_cpus <= NR_LPPACAS)
+		return;
+
+	lppaca_size = PAGE_ALIGN(sizeof(struct lppaca) *
+				 (nr_cpus - NR_LPPACAS));
+	extra_lppacas = __va(memblock_alloc_base(lppaca_size,
+						 PAGE_SIZE, limit));
+}
+
+static struct lppaca *new_lppaca(int cpu)
+{
+	struct lppaca *lp;
+
+	if (cpu < NR_LPPACAS)
+		return &lppaca[cpu];
+
+	lp = extra_lppacas + (cpu - NR_LPPACAS);
+	*lp = lppaca[0];
+
+	return lp;
+}
+
+static void free_lppacas(void)
+{
+	long new_size = 0, nr;
+
+	if (!lppaca_size)
+		return;
+	nr = num_possible_cpus() - NR_LPPACAS;
+	if (nr > 0)
+		new_size = PAGE_ALIGN(nr * sizeof(struct lppaca));
+	if (new_size >= lppaca_size)
+		return;
+
+	memblock_free(__pa(extra_lppacas) + new_size, lppaca_size - new_size);
+	lppaca_size = new_size;
+}
+
+#else
+
+static inline void allocate_lppacas(int, unsigned long) { }
+static inline void free_lppacas(void) { }
+
 #endif /* CONFIG_PPC_BOOK3S */
 
 #ifdef CONFIG_PPC_STD_MMU_64
@@ -88,7 +150,7 @@ void __init initialise_paca(struct paca_struct *new_paca, int cpu)
 	unsigned long kernel_toc = (unsigned long)(&__toc_start) + 0x8000UL;
 
 #ifdef CONFIG_PPC_BOOK3S
-	new_paca->lppaca_ptr = &lppaca[cpu];
+	new_paca->lppaca_ptr = new_lppaca(cpu);
 #else
 	new_paca->kernel_pgd = swapper_pg_dir;
 #endif
@@ -144,6 +206,8 @@ void __init allocate_pacas(void)
 	printk(KERN_DEBUG "Allocated %u bytes for %d pacas at %p\n",
 		paca_size, nr_cpus, paca);
 
+	allocate_lppacas(nr_cpus, limit);
+
 	/* Can't use for_each_*_cpu, as they aren't functional yet */
 	for (cpu = 0; cpu < nr_cpus; cpu++)
 		initialise_paca(&paca[cpu], cpu);
@@ -164,4 +228,6 @@ void __init free_unused_pacas(void)
 		paca_size - new_size);
 
 	paca_size = new_size;
+
+	free_lppacas();
 }
-- 
1.7.1

^ permalink raw reply related

* [PATCH 1/2] powerpc: Abstract indexing of lppaca structs
From: Paul Mackerras @ 2010-08-13  6:18 UTC (permalink / raw)
  To: linuxppc-dev

Currently we have the lppaca structs as a simple array of NR_CPUS
entries, taking up space in the data section of the kernel image.
In future we would like to allocate them dynamically, so this
abstracts out the accesses to the array, making it easier to
change how we locate the lppaca for a given cpu in future.
Specifically, lppaca[cpu] changes to lppaca_of(cpu).

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/lppaca.h     |    2 ++
 arch/powerpc/kernel/lparcfg.c         |   14 +++++++-------
 arch/powerpc/lib/locks.c              |    4 ++--
 arch/powerpc/platforms/iseries/dt.c   |    4 ++--
 arch/powerpc/platforms/iseries/smp.c  |    2 +-
 arch/powerpc/platforms/pseries/dtl.c  |    8 ++++----
 arch/powerpc/platforms/pseries/lpar.c |    4 ++--
 7 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 14b592d..6b73554 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -153,6 +153,8 @@ struct lppaca {
 
 extern struct lppaca lppaca[];
 
+#define lppaca_of(cpu)	(lppaca[cpu])
+
 /*
  * SLB shadow buffer structure as defined in the PAPR.  The save_area
  * contains adjacent ESID and VSID pairs for each shadowed SLB.  The
diff --git a/arch/powerpc/kernel/lparcfg.c b/arch/powerpc/kernel/lparcfg.c
index 50362b6..8d9e3b9 100644
--- a/arch/powerpc/kernel/lparcfg.c
+++ b/arch/powerpc/kernel/lparcfg.c
@@ -56,7 +56,7 @@ static unsigned long get_purr(void)
 
 	for_each_possible_cpu(cpu) {
 		if (firmware_has_feature(FW_FEATURE_ISERIES))
-			sum_purr += lppaca[cpu].emulated_time_base;
+			sum_purr += lppaca_of(cpu).emulated_time_base;
 		else {
 			struct cpu_usage *cu;
 
@@ -263,7 +263,7 @@ static void parse_ppp_data(struct seq_file *m)
 	           ppp_data.active_system_procs);
 
 	/* pool related entries are apropriate for shared configs */
-	if (lppaca[0].shared_proc) {
+	if (lppaca_of(0).shared_proc) {
 		unsigned long pool_idle_time, pool_procs;
 
 		seq_printf(m, "pool=%d\n", ppp_data.pool_num);
@@ -460,8 +460,8 @@ static void pseries_cmo_data(struct seq_file *m)
 		return;
 
 	for_each_possible_cpu(cpu) {
-		cmo_faults += lppaca[cpu].cmo_faults;
-		cmo_fault_time += lppaca[cpu].cmo_fault_time;
+		cmo_faults += lppaca_of(cpu).cmo_faults;
+		cmo_fault_time += lppaca_of(cpu).cmo_fault_time;
 	}
 
 	seq_printf(m, "cmo_faults=%lu\n", cmo_faults);
@@ -479,8 +479,8 @@ static void splpar_dispatch_data(struct seq_file *m)
 	unsigned long dispatch_dispersions = 0;
 
 	for_each_possible_cpu(cpu) {
-		dispatches += lppaca[cpu].yield_count;
-		dispatch_dispersions += lppaca[cpu].dispersion_count;
+		dispatches += lppaca_of(cpu).yield_count;
+		dispatch_dispersions += lppaca_of(cpu).dispersion_count;
 	}
 
 	seq_printf(m, "dispatches=%lu\n", dispatches);
@@ -545,7 +545,7 @@ static int pseries_lparcfg_data(struct seq_file *m, void *v)
 	seq_printf(m, "partition_potential_processors=%d\n",
 		   partition_potential_processors);
 
-	seq_printf(m, "shared_processor_mode=%d\n", lppaca[0].shared_proc);
+	seq_printf(m, "shared_processor_mode=%d\n", lppaca_of(0).shared_proc);
 
 	seq_printf(m, "slb_size=%d\n", mmu_slb_size);
 
diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c
index 58e14fb..9b8182e 100644
--- a/arch/powerpc/lib/locks.c
+++ b/arch/powerpc/lib/locks.c
@@ -34,7 +34,7 @@ void __spin_yield(arch_spinlock_t *lock)
 		return;
 	holder_cpu = lock_value & 0xffff;
 	BUG_ON(holder_cpu >= NR_CPUS);
-	yield_count = lppaca[holder_cpu].yield_count;
+	yield_count = lppaca_of(holder_cpu).yield_count;
 	if ((yield_count & 1) == 0)
 		return;		/* virtual cpu is currently running */
 	rmb();
@@ -65,7 +65,7 @@ void __rw_yield(arch_rwlock_t *rw)
 		return;		/* no write lock at present */
 	holder_cpu = lock_value & 0xffff;
 	BUG_ON(holder_cpu >= NR_CPUS);
-	yield_count = lppaca[holder_cpu].yield_count;
+	yield_count = lppaca_of(holder_cpu).yield_count;
 	if ((yield_count & 1) == 0)
 		return;		/* virtual cpu is currently running */
 	rmb();
diff --git a/arch/powerpc/platforms/iseries/dt.c b/arch/powerpc/platforms/iseries/dt.c
index 7f45a51..fdb7384 100644
--- a/arch/powerpc/platforms/iseries/dt.c
+++ b/arch/powerpc/platforms/iseries/dt.c
@@ -243,7 +243,7 @@ static void __init dt_cpus(struct iseries_flat_dt *dt)
 	pft_size[1] = __ilog2(HvCallHpt_getHptPages() * HW_PAGE_SIZE);
 
 	for (i = 0; i < NR_CPUS; i++) {
-		if (lppaca[i].dyn_proc_status >= 2)
+		if (lppaca_of(i).dyn_proc_status >= 2)
 			continue;
 
 		snprintf(p, 32 - (p - buf), "@%d", i);
@@ -251,7 +251,7 @@ static void __init dt_cpus(struct iseries_flat_dt *dt)
 
 		dt_prop_str(dt, "device_type", device_type_cpu);
 
-		index = lppaca[i].dyn_hv_phys_proc_index;
+		index = lppaca_of(i).dyn_hv_phys_proc_index;
 		d = &xIoHriProcessorVpd[index];
 
 		dt_prop_u32(dt, "i-cache-size", d->xInstCacheSize * 1024);
diff --git a/arch/powerpc/platforms/iseries/smp.c b/arch/powerpc/platforms/iseries/smp.c
index 6590850..6c60299 100644
--- a/arch/powerpc/platforms/iseries/smp.c
+++ b/arch/powerpc/platforms/iseries/smp.c
@@ -91,7 +91,7 @@ static void smp_iSeries_kick_cpu(int nr)
 	BUG_ON((nr < 0) || (nr >= NR_CPUS));
 
 	/* Verify that our partition has a processor nr */
-	if (lppaca[nr].dyn_proc_status >= 2)
+	if (lppaca_of(nr).dyn_proc_status >= 2)
 		return;
 
 	/* The processor is currently spinning, waiting
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index a00addb..adfd544 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -107,14 +107,14 @@ static int dtl_enable(struct dtl *dtl)
 	}
 
 	/* set our initial buffer indices */
-	dtl->last_idx = lppaca[dtl->cpu].dtl_idx = 0;
+	dtl->last_idx = lppaca_of(dtl->cpu).dtl_idx = 0;
 
 	/* ensure that our updates to the lppaca fields have occurred before
 	 * we actually enable the logging */
 	smp_wmb();
 
 	/* enable event logging */
-	lppaca[dtl->cpu].dtl_enable_mask = dtl_event_mask;
+	lppaca_of(dtl->cpu).dtl_enable_mask = dtl_event_mask;
 
 	return 0;
 }
@@ -123,7 +123,7 @@ static void dtl_disable(struct dtl *dtl)
 {
 	int hwcpu = get_hard_smp_processor_id(dtl->cpu);
 
-	lppaca[dtl->cpu].dtl_enable_mask = 0x0;
+	lppaca_of(dtl->cpu).dtl_enable_mask = 0x0;
 
 	unregister_dtl(hwcpu, __pa(dtl->buf));
 
@@ -171,7 +171,7 @@ static ssize_t dtl_file_read(struct file *filp, char __user *buf, size_t len,
 	/* actual number of entries read */
 	n_read = 0;
 
-	cur_idx = lppaca[dtl->cpu].dtl_idx;
+	cur_idx = lppaca_of(dtl->cpu).dtl_idx;
 	last_idx = dtl->last_idx;
 
 	if (cur_idx - last_idx > dtl->buf_entries) {
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index cf79b46..a17fe4a 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -250,9 +250,9 @@ void vpa_init(int cpu)
 	long ret;
 
 	if (cpu_has_feature(CPU_FTR_ALTIVEC))
-		lppaca[cpu].vmxregs_in_use = 1;
+		lppaca_of(cpu).vmxregs_in_use = 1;
 
-	addr = __pa(&lppaca[cpu]);
+	addr = __pa(&lppaca_of(cpu));
 	ret = register_vpa(hwcpu, addr);
 
 	if (ret) {
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH] powerpc: Add support for popcnt instructions
From: Anton Blanchard @ 2010-08-13  5:38 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1281673744.2987.362.camel@pasglop>

 
Hi,

> Especially from modules it will suck big time. If kept out of line they
> should probably be linked-in with each module, but I'd rather have them
> inlined.

Inlining would be good, but this is as far as I can take this for now.
If someone else is interested go for it :)

Anton

^ permalink raw reply

* Re: [PATCH] powerpc: Add support for popcnt instructions
From: Benjamin Herrenschmidt @ 2010-08-13  4:29 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: linuxppc-dev
In-Reply-To: <20100813022809.GY29316@kryten>

On Fri, 2010-08-13 at 12:28 +1000, Anton Blanchard wrote:
> POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
> this patch does all the work out of line, but it would be nice to implement
> them as inlines with an out of line fallback.
> 
> The performance issue with hweight was noticed when disabling SMT on a large
> (192 thread) POWER7 box. The patch improves that testcase by about 8%.

Especially from modules it will suck big time. If kept out of line they
should probably be linked-in with each module, but I'd rather have them
inlined.

Cheers,
Ben.

> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
> 
> Index: powerpc.git/arch/powerpc/include/asm/cputable.h
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/include/asm/cputable.h	2010-08-13 11:19:42.691991439 +1000
> +++ powerpc.git/arch/powerpc/include/asm/cputable.h	2010-08-13 11:24:55.510741618 +1000
> @@ -199,6 +199,8 @@ extern const char *powerpc_base_platform
>  #define CPU_FTR_UNALIGNED_LD_STD	LONG_ASM_CONST(0x0080000000000000)
>  #define CPU_FTR_ASYM_SMT		LONG_ASM_CONST(0x0100000000000000)
>  #define CPU_FTR_STCX_CHECKS_ADDRESS	LONG_ASM_CONST(0x0200000000000000)
> +#define CPU_FTR_POPCNTB			LONG_ASM_CONST(0x0400000000000000)
> +#define CPU_FTR_POPCNTD			LONG_ASM_CONST(0x0800000000000000)
>  
>  #ifndef __ASSEMBLY__
>  
> @@ -403,21 +405,22 @@ extern const char *powerpc_base_platform
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
>  	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
> -	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS)
> +	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS | \
> +	    CPU_FTR_POPCNTB)
>  #define CPU_FTRS_POWER6 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
>  	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
>  	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
>  	    CPU_FTR_DSCR | CPU_FTR_UNALIGNED_LD_STD | \
> -	    CPU_FTR_STCX_CHECKS_ADDRESS)
> +	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB)
>  #define CPU_FTRS_POWER7 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
>  	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
>  	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
>  	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
> -	    CPU_FTR_STCX_CHECKS_ADDRESS)
> +	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD)
>  #define CPU_FTRS_CELL	(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
>  	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>  	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
> Index: powerpc.git/arch/powerpc/lib/Makefile
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/lib/Makefile	2010-08-13 11:19:43.653241065 +1000
> +++ powerpc.git/arch/powerpc/lib/Makefile	2010-08-13 11:19:45.930743841 +1000
> @@ -18,7 +18,7 @@ obj-$(CONFIG_HAS_IOMEM)	+= devres.o
>  
>  obj-$(CONFIG_PPC64)	+= copypage_64.o copyuser_64.o \
>  			   memcpy_64.o usercopy_64.o mem_64.o string.o \
> -			   checksum_wrappers_64.o
> +			   checksum_wrappers_64.o hweight_64.o
>  obj-$(CONFIG_XMON)	+= sstep.o ldstfp.o
>  obj-$(CONFIG_KPROBES)	+= sstep.o ldstfp.o
>  obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= sstep.o ldstfp.o
> Index: powerpc.git/arch/powerpc/lib/hweight_64.S
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ powerpc.git/arch/powerpc/lib/hweight_64.S	2010-08-13 11:19:45.940741462 +1000
> @@ -0,0 +1,110 @@
> +/*
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> + *
> + * Copyright (C) IBM Corporation, 2010
> + *
> + * Author: Anton Blanchard <anton@au.ibm.com>
> + */
> +#include <asm/processor.h>
> +#include <asm/ppc_asm.h>
> +
> +/* Note: This code relies on -mminimal-toc */
> +
> +_GLOBAL(__arch_hweight8)
> +BEGIN_FTR_SECTION
> +	b .__sw_hweight8
> +	nop
> +	nop
> +FTR_SECTION_ELSE
> +	popcntb	r3,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
> +
> +_GLOBAL(__arch_hweight16)
> +BEGIN_FTR_SECTION
> +	b .__sw_hweight16
> +	nop
> +	nop
> +	nop
> +	nop
> +FTR_SECTION_ELSE
> +  BEGIN_FTR_SECTION_NESTED(50)
> +	popcntb r3,r3
> +	srdi	r4,r3,8
> +	add	r3,r4,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  FTR_SECTION_ELSE_NESTED(50)
> +	clrlwi  r3,r3,16
> +	popcntw	r3,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 50)
> +ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
> +
> +_GLOBAL(__arch_hweight32)
> +BEGIN_FTR_SECTION
> +	b .__sw_hweight32
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +FTR_SECTION_ELSE
> +  BEGIN_FTR_SECTION_NESTED(51)
> +	popcntb r3,r3
> +	srdi	r4,r3,16
> +	add	r3,r4,r3
> +	srdi	r4,r3,8
> +	add	r3,r4,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  FTR_SECTION_ELSE_NESTED(51)
> +	popcntw	r3,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 51)
> +ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
> +
> +_GLOBAL(__arch_hweight64)
> +BEGIN_FTR_SECTION
> +	b .__sw_hweight64
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +	nop
> +FTR_SECTION_ELSE
> +  BEGIN_FTR_SECTION_NESTED(52)
> +	popcntb r3,r3
> +	srdi	r4,r3,32
> +	add	r3,r4,r3
> +	srdi	r4,r3,16
> +	add	r3,r4,r3
> +	srdi	r4,r3,8
> +	add	r3,r4,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  FTR_SECTION_ELSE_NESTED(52)
> +	popcntd	r3,r3
> +	clrldi	r3,r3,64-8
> +	blr
> +  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 52)
> +ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
> Index: powerpc.git/arch/powerpc/include/asm/bitops.h
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/include/asm/bitops.h	2010-08-13 11:06:20.991992998 +1000
> +++ powerpc.git/arch/powerpc/include/asm/bitops.h	2010-08-13 11:19:45.940741462 +1000
> @@ -267,7 +267,16 @@ static __inline__ int fls64(__u64 x)
>  #include <asm-generic/bitops/fls64.h>
>  #endif /* __powerpc64__ */
>  
> +#ifdef CONFIG_PPC64
> +unsigned int __arch_hweight8(unsigned int w);
> +unsigned int __arch_hweight16(unsigned int w);
> +unsigned int __arch_hweight32(unsigned int w);
> +unsigned long __arch_hweight64(__u64 w);
> +#include <asm-generic/bitops/const_hweight.h>
> +#else
>  #include <asm-generic/bitops/hweight.h>
> +#endif
> +
>  #include <asm-generic/bitops/find.h>
>  
>  /* Little-endian versions */
> Index: powerpc.git/arch/powerpc/kernel/ppc_ksyms.c
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:06:21.011991745 +1000
> +++ powerpc.git/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:19:45.940741462 +1000
> @@ -186,3 +186,10 @@ EXPORT_SYMBOL(__mtdcr);
>  EXPORT_SYMBOL(__mfdcr);
>  #endif
>  EXPORT_SYMBOL(empty_zero_page);
> +
> +#ifdef CONFIG_PPC64
> +EXPORT_SYMBOL(__arch_hweight8);
> +EXPORT_SYMBOL(__arch_hweight16);
> +EXPORT_SYMBOL(__arch_hweight32);
> +EXPORT_SYMBOL(__arch_hweight64);
> +#endif

^ permalink raw reply

* [PATCH] powerpc: Add support for popcnt instructions
From: Anton Blanchard @ 2010-08-13  2:28 UTC (permalink / raw)
  To: benh; +Cc: linuxppc-dev


POWER5 added popcntb, and POWER7 added popcntw and popcntd. As a first step
this patch does all the work out of line, but it would be nice to implement
them as inlines with an out of line fallback.

The performance issue with hweight was noticed when disabling SMT on a large
(192 thread) POWER7 box. The patch improves that testcase by about 8%.

Signed-off-by: Anton Blanchard <anton@samba.org>
---

Index: powerpc.git/arch/powerpc/include/asm/cputable.h
===================================================================
--- powerpc.git.orig/arch/powerpc/include/asm/cputable.h	2010-08-13 11:19:42.691991439 +1000
+++ powerpc.git/arch/powerpc/include/asm/cputable.h	2010-08-13 11:24:55.510741618 +1000
@@ -199,6 +199,8 @@ extern const char *powerpc_base_platform
 #define CPU_FTR_UNALIGNED_LD_STD	LONG_ASM_CONST(0x0080000000000000)
 #define CPU_FTR_ASYM_SMT		LONG_ASM_CONST(0x0100000000000000)
 #define CPU_FTR_STCX_CHECKS_ADDRESS	LONG_ASM_CONST(0x0200000000000000)
+#define CPU_FTR_POPCNTB			LONG_ASM_CONST(0x0400000000000000)
+#define CPU_FTR_POPCNTD			LONG_ASM_CONST(0x0800000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -403,21 +405,22 @@ extern const char *powerpc_base_platform
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
-	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_PURR | CPU_FTR_STCX_CHECKS_ADDRESS | \
+	    CPU_FTR_POPCNTB)
 #define CPU_FTRS_POWER6 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
 	    CPU_FTR_DSCR | CPU_FTR_UNALIGNED_LD_STD | \
-	    CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB)
 #define CPU_FTRS_POWER7 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_LOCKLESS_TLBIE | \
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
 	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
-	    CPU_FTR_STCX_CHECKS_ADDRESS)
+	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD)
 #define CPU_FTRS_CELL	(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
Index: powerpc.git/arch/powerpc/lib/Makefile
===================================================================
--- powerpc.git.orig/arch/powerpc/lib/Makefile	2010-08-13 11:19:43.653241065 +1000
+++ powerpc.git/arch/powerpc/lib/Makefile	2010-08-13 11:19:45.930743841 +1000
@@ -18,7 +18,7 @@ obj-$(CONFIG_HAS_IOMEM)	+= devres.o
 
 obj-$(CONFIG_PPC64)	+= copypage_64.o copyuser_64.o \
 			   memcpy_64.o usercopy_64.o mem_64.o string.o \
-			   checksum_wrappers_64.o
+			   checksum_wrappers_64.o hweight_64.o
 obj-$(CONFIG_XMON)	+= sstep.o ldstfp.o
 obj-$(CONFIG_KPROBES)	+= sstep.o ldstfp.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= sstep.o ldstfp.o
Index: powerpc.git/arch/powerpc/lib/hweight_64.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ powerpc.git/arch/powerpc/lib/hweight_64.S	2010-08-13 11:19:45.940741462 +1000
@@ -0,0 +1,110 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) IBM Corporation, 2010
+ *
+ * Author: Anton Blanchard <anton@au.ibm.com>
+ */
+#include <asm/processor.h>
+#include <asm/ppc_asm.h>
+
+/* Note: This code relies on -mminimal-toc */
+
+_GLOBAL(__arch_hweight8)
+BEGIN_FTR_SECTION
+	b .__sw_hweight8
+	nop
+	nop
+FTR_SECTION_ELSE
+	popcntb	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight16)
+BEGIN_FTR_SECTION
+	b .__sw_hweight16
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(50)
+	popcntb r3,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(50)
+	clrlwi  r3,r3,16
+	popcntw	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 50)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight32)
+BEGIN_FTR_SECTION
+	b .__sw_hweight32
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(51)
+	popcntb r3,r3
+	srdi	r4,r3,16
+	add	r3,r4,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(51)
+	popcntw	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 51)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
+
+_GLOBAL(__arch_hweight64)
+BEGIN_FTR_SECTION
+	b .__sw_hweight64
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+FTR_SECTION_ELSE
+  BEGIN_FTR_SECTION_NESTED(52)
+	popcntb r3,r3
+	srdi	r4,r3,32
+	add	r3,r4,r3
+	srdi	r4,r3,16
+	add	r3,r4,r3
+	srdi	r4,r3,8
+	add	r3,r4,r3
+	clrldi	r3,r3,64-8
+	blr
+  FTR_SECTION_ELSE_NESTED(52)
+	popcntd	r3,r3
+	clrldi	r3,r3,64-8
+	blr
+  ALT_FTR_SECTION_END_NESTED_IFCLR(CPU_FTR_POPCNTD, 52)
+ALT_FTR_SECTION_END_IFCLR(CPU_FTR_POPCNTB)
Index: powerpc.git/arch/powerpc/include/asm/bitops.h
===================================================================
--- powerpc.git.orig/arch/powerpc/include/asm/bitops.h	2010-08-13 11:06:20.991992998 +1000
+++ powerpc.git/arch/powerpc/include/asm/bitops.h	2010-08-13 11:19:45.940741462 +1000
@@ -267,7 +267,16 @@ static __inline__ int fls64(__u64 x)
 #include <asm-generic/bitops/fls64.h>
 #endif /* __powerpc64__ */
 
+#ifdef CONFIG_PPC64
+unsigned int __arch_hweight8(unsigned int w);
+unsigned int __arch_hweight16(unsigned int w);
+unsigned int __arch_hweight32(unsigned int w);
+unsigned long __arch_hweight64(__u64 w);
+#include <asm-generic/bitops/const_hweight.h>
+#else
 #include <asm-generic/bitops/hweight.h>
+#endif
+
 #include <asm-generic/bitops/find.h>
 
 /* Little-endian versions */
Index: powerpc.git/arch/powerpc/kernel/ppc_ksyms.c
===================================================================
--- powerpc.git.orig/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:06:21.011991745 +1000
+++ powerpc.git/arch/powerpc/kernel/ppc_ksyms.c	2010-08-13 11:19:45.940741462 +1000
@@ -186,3 +186,10 @@ EXPORT_SYMBOL(__mtdcr);
 EXPORT_SYMBOL(__mfdcr);
 #endif
 EXPORT_SYMBOL(empty_zero_page);
+
+#ifdef CONFIG_PPC64
+EXPORT_SYMBOL(__arch_hweight8);
+EXPORT_SYMBOL(__arch_hweight16);
+EXPORT_SYMBOL(__arch_hweight32);
+EXPORT_SYMBOL(__arch_hweight64);
+#endif

^ permalink raw reply

* Re: Query regarding 2.6.335 RT[Ingo's] and Non-RT performance
From: Xianghua Xiao @ 2010-08-13  2:18 UTC (permalink / raw)
  To: Jeff Angielski; +Cc: linuxppc-dev
In-Reply-To: <4C64352F.4090005@theptrgroup.com>

On Thu, Aug 12, 2010 at 12:53 PM, Jeff Angielski <jeff@theptrgroup.com> wro=
te:
> On 08/11/2010 06:18 PM, Manikandan Ramachandran wrote:
>>
>> Hello All,
>> =C2=A0 =C2=A0 I created a very simple program which has higher priority =
than
>> normal tasks and runs a tight loop. Under same test environment I ran
>> this program on both non-rt and rt 2.6.33.5 kernel. =C2=A0To my suprise =
I see
>> that performance of non-RT kernel is better than RT. non-RT kernel took
>> 3 sec and 366156 usec while RT kernel took about 3 sec and 418011
>> usec.Can someone please explain why the performance of non-rt kernel is
>> better than rt kernel? From the face of the test result, I feel RT has
>> more overhead,Is there any configuration that I could do to bring down
>> the overhead?
>
> Your "surprise" is due to your definition of "performance".
>
> The purpose of the -rt kernels is to reduce the kernel latency. =C2=A0Thi=
s is
> important for servicing hardware. =C2=A0Normal users find the -rt useful =
for
> audio/video applications. =C2=A0Engineering and scientific users find the=
 -rt
> beneficially for servicing hardware like sensors or control systems.
>
> If you are just trying to run calculations as fast as you can in user spa=
ce,
> you'd be better off using the non-rt variants.
>
>
> --
> Jeff Angielski
> The PTR Group
> www.theptrgroup.com
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>

true, in most cases non-rt will have better performance/throughput,
while rt's major goal is to have better latency for high priority
tasks. also true is that, rt kernel will have more overhead.

xianghua

^ permalink raw reply

* Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
From: Dave Hansen @ 2010-08-12 20:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linuxppc-dev, Greg KH, linux-kernel, linux-mm, KAMEZAWA Hiroyuki
In-Reply-To: <20100812120816.e97d8b9e.akpm@linux-foundation.org>

On Thu, 2010-08-12 at 12:08 -0700, Andrew Morton wrote:
> > This set of patches allows for each directory created in sysfs
> > to cover more than one memory section.  The default behavior for
> > sysfs directory creation is the same, in that each directory
> > represents a single memory section.  A new file 'end_phys_index'
> > in each directory contains the physical_id of the last memory
> > section covered by the directory so that users can easily
> > determine the memory section range of a directory.
> 
> What you're proposing appears to be a non-back-compatible
> userspace-visible change.  This is a big issue! 

Nathan, one thought to get around this at the moment would be to bump up
the size that we export in /sys/devices/system/memory/block_size_bytes.
I think you have already done most of the hard work to accomplish
this.  

You can still add the end_phys_index stuff.  But, for now, it would
always be equal to start_phys_index.

-- Dave

^ permalink raw reply

* Question about dma_direct_ops in PowerPC.
From: Fushen Chen @ 2010-08-12 19:28 UTC (permalink / raw)
  To: linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 2011 bytes --]

We have a board with PCI device driver that calls for
pci_dma_sync_single_for_device.
This driver used to work for Linux kernel 2.6.25.

We ported to the driver to Linux kernel 2.6.32. The PCI device driver
doesn't work anymore.
The following call trace shows why the PCI driver won't work in kernel
2.6.32.
1. In pci_include/asm-generic/pci-dma-compat.h
    pci_dma_sync_single_for_device calls for dma_sync_single_for_cpu
2. In include/asm-generic/dma-mapping-common.h
    dma_sync_single_for_cpu calls for ops->sync_single_for_cpu
3. In arch/powerpc/kernel/dma.c
struct dma_map_ops dma_direct_ops = {
        .alloc_coherent = dma_direct_alloc_coherent,
        .free_coherent  = dma_direct_free_coherent,
        .map_sg         = dma_direct_map_sg,
        .unmap_sg       = dma_direct_unmap_sg,
        .dma_supported  = dma_direct_dma_supported,
        .map_page       = dma_direct_map_page,
        .unmap_page     = dma_direct_unmap_page,
#ifdef CONFIG_NOT_COHERENT_CACHE
        .sync_single_range_for_cpu      = dma_direct_sync_single_range,
        .sync_single_range_for_device   = dma_direct_sync_single_range,
        .sync_sg_for_cpu                = dma_direct_sync_sg,
        .sync_sg_for_device             = dma_direct_sync_sg,
#endif
};
There is no ops defined for sync_single_for_cpu.
The pci_dma_sync_single_for_device is a no-op.

However Linux kernel 2.6.35.1 from kernel.org has the  .sync_single_for_cpu
for dma_direct_ops.
in arch/powerpc/kernel/dma.c
#ifdef CONFIG_NOT_COHERENT_CACHE
        .sync_single_for_cpu            = dma_direct_sync_single,
        .sync_single_for_device         = dma_direct_sync_single,
        .sync_sg_for_cpu                = dma_direct_sync_sg,
        .sync_sg_for_device             = dma_direct_sync_sg,
#endif


We won't move to Linux kernel 2.6.35 anytime soon.
My questions:
1. Is there any side effect for adding .sync_single_for_cpu to
dma_direct_ops in 2.6.32?
2. What will be the future development here?


Best regards & Thanks,
Fushen

[-- Attachment #2: Type: text/html, Size: 2235 bytes --]

^ permalink raw reply

* Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections
From: Andrew Morton @ 2010-08-12 19:08 UTC (permalink / raw)
  To: Nathan Fontenot
  Cc: linuxppc-dev, Greg KH, linux-kernel, Dave Hansen, linux-mm,
	KAMEZAWA Hiroyuki
In-Reply-To: <4C60407C.2080608@austin.ibm.com>

On Mon, 09 Aug 2010 12:53:00 -0500
Nathan Fontenot <nfont@austin.ibm.com> wrote:

> This set of patches de-couples the idea that there is a single
> directory in sysfs for each memory section.  The intent of the
> patches is to reduce the number of sysfs directories created to
> resolve a boot-time performance issue.  On very large systems
> boot time are getting very long (as seen on powerpc hardware)
> due to the enormous number of sysfs directories being created.
> On a system with 1 TB of memory we create ~63,000 directories.
> For even larger systems boot times are being measured in hours.

And those "hours" are mainly due to this problem, I assume.

> This set of patches allows for each directory created in sysfs
> to cover more than one memory section.  The default behavior for
> sysfs directory creation is the same, in that each directory
> represents a single memory section.  A new file 'end_phys_index'
> in each directory contains the physical_id of the last memory
> section covered by the directory so that users can easily
> determine the memory section range of a directory.

What you're proposing appears to be a non-back-compatible
userspace-visible change.  This is a big issue!

It's not an unresolvable issue, as this is a must-fix problem.  But you
should tell us what your proposal is to prevent breakage of existing
installations.  A Kconfig option would be good, but a boot-time kernel
command line option which selects the new format would be much better.

However you didn't mention this issue at all, and it's the most
important one.


> Updates for version 5 of the patchset include the following:
> 
> Patch 4/8 Add mutex for add/remove of memory blocks
> - Define the mutex using DEFINE_MUTEX macro.
> 
> Patch 8/8 Update memory-hotplug documentation
> - Add information concerning memory holes in phys_index..end_phys_index.

And you forgot to tell us how long those machines boot with the
patchset applied, which is the entire point of the patchset!

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox