* [PATCH] [V2] net: add Xilinx emac lite device driver
From: John Linn @ 2009-08-19 12:29 UTC
To: netdev, linuxppc-dev, jgarzik, davem
Cc: Sadanand M, Michal Simek, John Linn, John Williams
This patch adds support for the Xilinx Ethernet Lite device. The
soft logic core from Xilinx is typically used on Virtex and Spartan
designs attached to either a PowerPC or a Microblaze processor.
CC: Grant Likely <grant.likely@secretlab.ca>
CC: Josh Boyer <jwboyer@linux.vnet.ibm.com>
CC: John Williams <john.williams@petalogix.com>
CC: Michal Simek <michal.simek@petalogix.com>
Signed-off-by: Sadanand M <sadanan@xilinx.com>
Signed-off-by: John Linn <john.linn@xilinx.com>
---
V2 - cleanup based on review; added PPC32 and MICROBLAZE dependencies in Kconfig
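
Reviewer aid, not part of the patch: xemaclite_recv_data() infers the number
of bytes to copy from the frame's type/length field rather than from a length
register. The sketch below restates that rule in user-space C over a plain
byte buffer. The names rx_frame_length() and be16_at() are hypothetical, and
the constants are copied from <linux/if_ether.h> only so that the sketch
builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror <linux/if_ether.h>; redefined for a user-space build. */
#define ETH_HLEN	14	/* Ethernet header length */
#define ETH_FCS_LEN	4	/* Frame check sequence length */
#define ETH_FRAME_LEN	1514	/* Max frame length without FCS */
#define ETH_P_IP	0x0800
#define ETH_P_ARP	0x0806
#define ARP_PAYLOAD_LEN	28	/* Same value as XEL_ARP_PACKET_SIZE */

/* Read a big-endian 16-bit field at byte offset 'off'. */
static uint16_t be16_at(const uint8_t *frame, size_t off)
{
	return (uint16_t)((frame[off] << 8) | frame[off + 1]);
}

/* Hypothetical helper: bytes to copy for a received frame, FCS included. */
static uint16_t rx_frame_length(const uint8_t *frame)
{
	uint16_t type_len;

	assert(frame != NULL);
	type_len = be16_at(frame, 12);	/* XEL_HEADER_OFFSET */

	/* Small values are a raw length, not an EtherType */
	if (type_len <= ETH_FRAME_LEN + ETH_FCS_LEN)
		return type_len + ETH_HLEN + ETH_FCS_LEN;

	/* IP: total length sits at byte 16 (XEL_HEADER_IP_LENGTH_OFFSET) */
	if (type_len == ETH_P_IP)
		return be16_at(frame, 16) + ETH_HLEN + ETH_FCS_LEN;

	if (type_len == ETH_P_ARP)
		return ARP_PAYLOAD_LEN + ETH_HLEN + ETH_FCS_LEN;

	/* Unknown type: copy a maximum-size frame and let the stack parse it */
	return ETH_FRAME_LEN + ETH_FCS_LEN;
}
```

Like the patch, the sketch does not bound the IP total length against the
size of the receive buffer.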
---
drivers/net/Kconfig | 6 +
drivers/net/Makefile | 1 +
drivers/net/xilinx_emaclite.c | 1040 +++++++++++++++++++++++++++++++++++++++++
3 files changed, 1047 insertions(+), 0 deletions(-)
create mode 100755 drivers/net/xilinx_emaclite.c
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 5f6509a..ec77b69 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -1926,6 +1926,12 @@ config ATL2
To compile this driver as a module, choose M here. The module
will be called atl2.
+config XILINX_EMACLITE
+ tristate "Xilinx 10/100 Ethernet Lite support"
+ depends on PPC32 || MICROBLAZE
+ help
+ This driver supports the 10/100 Ethernet Lite from Xilinx.
+
source "drivers/net/fs_enet/Kconfig"
endif # NET_ETHERNET
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index ead8cab..99ae6d7 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -142,6 +142,7 @@ obj-$(CONFIG_TSI108_ETH) += tsi108_eth.o
obj-$(CONFIG_MV643XX_ETH) += mv643xx_eth.o
ll_temac-objs := ll_temac_main.o ll_temac_mdio.o
obj-$(CONFIG_XILINX_LL_TEMAC) += ll_temac.o
+obj-$(CONFIG_XILINX_EMACLITE) += xilinx_emaclite.o
obj-$(CONFIG_QLA3XXX) += qla3xxx.o
obj-$(CONFIG_QLGE) += qlge/
diff --git a/drivers/net/xilinx_emaclite.c b/drivers/net/xilinx_emaclite.c
new file mode 100755
index 0000000..3716e20
--- /dev/null
+++ b/drivers/net/xilinx_emaclite.c
@@ -0,0 +1,1040 @@
+/*
+ * Xilinx EmacLite Linux driver for the Xilinx Ethernet MAC Lite device.
+ *
+ * This is a new flat driver which is based on the original emac_lite
+ * driver from John Williams <john.williams@petalogix.com>.
+ *
+ * 2007-2009 (c) Xilinx, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/uaccess.h>
+#include <linux/init.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/skbuff.h>
+#include <linux/io.h>
+
+#include <linux/of_device.h>
+#include <linux/of_platform.h>
+
+#define DRIVER_NAME "xilinx_emaclite"
+
+/* Register offsets for the EmacLite Core */
+#define XEL_TXBUFF_OFFSET 0x0 /* Transmit Buffer */
+#define XEL_GIER_OFFSET 0x07F8 /* GIE Register */
+#define XEL_TSR_OFFSET 0x07FC /* Tx status */
+#define XEL_TPLR_OFFSET 0x07F4 /* Tx packet length */
+
+#define XEL_RXBUFF_OFFSET 0x1000 /* Receive Buffer */
+#define XEL_RPLR_OFFSET 0x100C /* Rx packet length */
+#define XEL_RSR_OFFSET 0x17FC /* Rx status */
+
+#define XEL_BUFFER_OFFSET 0x0800 /* Next Tx/Rx buffer's offset */
+
+/* Global Interrupt Enable Register (GIER) Bit Masks */
+#define XEL_GIER_GIE_MASK 0x80000000 /* Global Enable */
+
+/* Transmit Status Register (TSR) Bit Masks */
+#define XEL_TSR_XMIT_BUSY_MASK 0x00000001 /* Tx busy, clears on completion */
+#define XEL_TSR_PROGRAM_MASK 0x00000002 /* Program the MAC address */
+#define XEL_TSR_XMIT_IE_MASK 0x00000008 /* Tx interrupt enable bit */
+#define XEL_TSR_XMIT_ACTIVE_MASK 0x80000000 /* Buffer is active, SW bit
+ * only. This is not documented
+ * in the HW spec */
+
+/* Define for programming the MAC address into the EmacLite */
+#define XEL_TSR_PROG_MAC_ADDR (XEL_TSR_XMIT_BUSY_MASK | XEL_TSR_PROGRAM_MASK)
+
+/* Receive Status Register (RSR) */
+#define XEL_RSR_RECV_DONE_MASK 0x00000001 /* Rx complete */
+#define XEL_RSR_RECV_IE_MASK 0x00000008 /* Rx interrupt enable bit */
+
+/* Transmit Packet Length Register (TPLR) */
+#define XEL_TPLR_LENGTH_MASK 0x0000FFFF /* Tx packet length */
+
+/* Receive Packet Length Register (RPLR) */
+#define XEL_RPLR_LENGTH_MASK 0x0000FFFF /* Rx packet length */
+
+#define XEL_HEADER_OFFSET 12 /* Offset to length field */
+#define XEL_HEADER_SHIFT 16 /* Shift value for length */
+
+/* General Ethernet Definitions */
+#define XEL_ARP_PACKET_SIZE 28 /* Max ARP packet size */
+#define XEL_HEADER_IP_LENGTH_OFFSET 16 /* IP Length Offset */
+
+
+
+#define TX_TIMEOUT (60*HZ) /* Tx timeout is 60 seconds. */
+#define ALIGNMENT 4
+
+/* BUFFER_ALIGN(adr) calculates the number of bytes to the next alignment. */
+#define BUFFER_ALIGN(adr) ((ALIGNMENT - ((u32) (adr))) % ALIGNMENT)
+
+/**
+ * struct net_local - Our private per device data
+ * @ndev: instance of the network device
+ * @tx_ping_pong: indicates whether Tx Pong buffer is configured in HW
+ * @rx_ping_pong: indicates whether Rx Pong buffer is configured in HW
+ * @next_tx_buf_to_use: next Tx buffer to write to
+ * @next_rx_buf_to_use: next Rx buffer to read from
+ * @base_addr: base address of the Emaclite device
+ * @reset_lock: lock used for synchronization
+ * @deferred_skb: holds an skb (for transmission at a later time) when the
+ * Tx buffer is not free
+ */
+struct net_local {
+
+ struct net_device *ndev;
+
+ bool tx_ping_pong;
+ bool rx_ping_pong;
+ u32 next_tx_buf_to_use;
+ u32 next_rx_buf_to_use;
+ void __iomem *base_addr;
+
+ spinlock_t reset_lock;
+ struct sk_buff *deferred_skb;
+};
+
+
+/*************************/
+/* EmacLite driver calls */
+/*************************/
+
+/**
+ * xemaclite_enable_interrupts - Enable the interrupts for the EmacLite device
+ * @drvdata: Pointer to the Emaclite device private data
+ *
+ * This function enables the Tx and Rx interrupts for the Emaclite device along
+ * with the Global Interrupt Enable.
+ */
+static void xemaclite_enable_interrupts(struct net_local *drvdata)
+{
+ u32 reg_data;
+
+ /* Enable the Tx interrupts for the first Buffer */
+ reg_data = in_be32(drvdata->base_addr + XEL_TSR_OFFSET);
+ out_be32(drvdata->base_addr + XEL_TSR_OFFSET,
+ reg_data | XEL_TSR_XMIT_IE_MASK);
+
+ /* Enable the Tx interrupts for the second Buffer if
+ * configured in HW */
+ if (drvdata->tx_ping_pong != 0) {
+ reg_data = in_be32(drvdata->base_addr +
+ XEL_BUFFER_OFFSET + XEL_TSR_OFFSET);
+ out_be32(drvdata->base_addr + XEL_BUFFER_OFFSET +
+ XEL_TSR_OFFSET,
+ reg_data | XEL_TSR_XMIT_IE_MASK);
+ }
+
+ /* Enable the Rx interrupts for the first buffer */
+ reg_data = in_be32(drvdata->base_addr + XEL_RSR_OFFSET);
+ out_be32(drvdata->base_addr + XEL_RSR_OFFSET,
+ reg_data | XEL_RSR_RECV_IE_MASK);
+
+ /* Enable the Rx interrupts for the second Buffer if
+ * configured in HW */
+ if (drvdata->rx_ping_pong != 0) {
+ reg_data = in_be32(drvdata->base_addr + XEL_BUFFER_OFFSET +
+ XEL_RSR_OFFSET);
+ out_be32(drvdata->base_addr + XEL_BUFFER_OFFSET +
+ XEL_RSR_OFFSET,
+ reg_data | XEL_RSR_RECV_IE_MASK);
+ }
+
+ /* Enable the Global Interrupt Enable */
+ out_be32(drvdata->base_addr + XEL_GIER_OFFSET, XEL_GIER_GIE_MASK);
+}
+
+/**
+ * xemaclite_disable_interrupts - Disable the interrupts for the EmacLite device
+ * @drvdata: Pointer to the Emaclite device private data
+ *
+ * This function disables the Tx and Rx interrupts for the Emaclite device,
+ * along with the Global Interrupt Enable.
+ */
+static void xemaclite_disable_interrupts(struct net_local *drvdata)
+{
+ u32 reg_data;
+
+ /* Disable the Global Interrupt Enable by clearing the GIE bit */
+ out_be32(drvdata->base_addr + XEL_GIER_OFFSET, 0);
+
+ /* Disable the Tx interrupts for the first buffer */
+ reg_data = in_be32(drvdata->base_addr + XEL_TSR_OFFSET);
+ out_be32(drvdata->base_addr + XEL_TSR_OFFSET,
+ reg_data & (~XEL_TSR_XMIT_IE_MASK));
+
+ /* Disable the Tx interrupts for the second Buffer
+ * if configured in HW */
+ if (drvdata->tx_ping_pong != 0) {
+ reg_data = in_be32(drvdata->base_addr + XEL_BUFFER_OFFSET +
+ XEL_TSR_OFFSET);
+ out_be32(drvdata->base_addr + XEL_BUFFER_OFFSET +
+ XEL_TSR_OFFSET,
+ reg_data & (~XEL_TSR_XMIT_IE_MASK));
+ }
+
+ /* Disable the Rx interrupts for the first buffer */
+ reg_data = in_be32(drvdata->base_addr + XEL_RSR_OFFSET);
+ out_be32(drvdata->base_addr + XEL_RSR_OFFSET,
+ reg_data & (~XEL_RSR_RECV_IE_MASK));
+
+ /* Disable the Rx interrupts for the second buffer
+ * if configured in HW */
+ if (drvdata->rx_ping_pong != 0) {
+
+ reg_data = in_be32(drvdata->base_addr + XEL_BUFFER_OFFSET +
+ XEL_RSR_OFFSET);
+ out_be32(drvdata->base_addr + XEL_BUFFER_OFFSET +
+ XEL_RSR_OFFSET,
+ reg_data & (~XEL_RSR_RECV_IE_MASK));
+ }
+}
+
+/**
+ * xemaclite_aligned_write - Write from 16-bit aligned to 32-bit aligned address
+ * @src_ptr: Void pointer to the 16-bit aligned source address
+ * @dest_ptr: Pointer to the 32-bit aligned destination address
+ * @length: Number of bytes to write from source to destination
+ *
+ * This function writes data from a 16-bit aligned buffer to a 32-bit aligned
+ * address in the EmacLite device.
+ */
+static void xemaclite_aligned_write(void *src_ptr, u32 *dest_ptr,
+ unsigned length)
+{
+ u32 align_buffer;
+ u32 *to_u32_ptr;
+ u16 *from_u16_ptr, *to_u16_ptr;
+
+ to_u32_ptr = dest_ptr;
+ from_u16_ptr = (u16 *) src_ptr;
+ align_buffer = 0;
+
+ for (; length > 3; length -= 4) {
+ to_u16_ptr = (u16 *) ((void *) &align_buffer);
+ *to_u16_ptr++ = *from_u16_ptr++;
+ *to_u16_ptr++ = *from_u16_ptr++;
+
+ /* Output a word */
+ *to_u32_ptr++ = align_buffer;
+ }
+ if (length) {
+ u8 *from_u8_ptr, *to_u8_ptr;
+
+ /* Set up to output the remaining data */
+ align_buffer = 0;
+ to_u8_ptr = (u8 *) &align_buffer;
+ from_u8_ptr = (u8 *) from_u16_ptr;
+
+ /* Output the remaining data */
+ for (; length > 0; length--)
+ *to_u8_ptr++ = *from_u8_ptr++;
+
+ *to_u32_ptr = align_buffer;
+ }
+}
+
+/**
+ * xemaclite_aligned_read - Read from 32-bit aligned to 16-bit aligned buffer
+ * @src_ptr: Pointer to the 32-bit aligned source address
+ * @dest_ptr: Pointer to the 16-bit aligned destination address
+ * @length: Number of bytes to read from source to destination
+ *
+ * This function reads data from a 32-bit aligned address in the EmacLite device
+ * to a 16-bit aligned buffer.
+ */
+static void xemaclite_aligned_read(u32 *src_ptr, u8 *dest_ptr,
+ unsigned length)
+{
+ u16 *to_u16_ptr, *from_u16_ptr;
+ u32 *from_u32_ptr;
+ u32 align_buffer;
+
+ from_u32_ptr = src_ptr;
+ to_u16_ptr = (u16 *) dest_ptr;
+
+ for (; length > 3; length -= 4) {
+ /* Copy each word into the temporary buffer */
+ align_buffer = *from_u32_ptr++;
+ from_u16_ptr = (u16 *)&align_buffer;
+
+ /* Read data from source */
+ *to_u16_ptr++ = *from_u16_ptr++;
+ *to_u16_ptr++ = *from_u16_ptr++;
+ }
+
+ if (length) {
+ u8 *to_u8_ptr, *from_u8_ptr;
+
+ /* Set up to read the remaining data */
+ to_u8_ptr = (u8 *) to_u16_ptr;
+ align_buffer = *from_u32_ptr++;
+ from_u8_ptr = (u8 *) &align_buffer;
+
+ /* Read the remaining data */
+ for (; length > 0; length--)
+ *to_u8_ptr++ = *from_u8_ptr++;
+ }
+}
+
+/**
+ * xemaclite_send_data - Send an Ethernet frame
+ * @drvdata: Pointer to the Emaclite device private data
+ * @data: Pointer to the data to be sent
+ * @byte_count: Total frame size, including header
+ *
+ * This function checks if the Tx buffer of the Emaclite device is free to send
+ * data. If so, it fills the Tx buffer with data for transmission. Otherwise, it
+ * returns an error.
+ *
+ * Return: 0 upon success or -1 if the buffer(s) are full.
+ *
+ * Note: The maximum Tx packet size cannot exceed the Ethernet header
+ * (14 bytes) + maximum MTU (1500 bytes), excluding the FCS.
+ */
+static int xemaclite_send_data(struct net_local *drvdata, u8 *data,
+ unsigned int byte_count)
+{
+ u32 reg_data;
+ void __iomem *addr;
+
+ /* Determine the expected Tx buffer address */
+ addr = drvdata->base_addr + drvdata->next_tx_buf_to_use;
+
+ /* If the length is too large, truncate it */
+ if (byte_count > ETH_FRAME_LEN)
+ byte_count = ETH_FRAME_LEN;
+
+ /* Check if the expected buffer is available */
+ reg_data = in_be32(addr + XEL_TSR_OFFSET);
+ if ((reg_data & (XEL_TSR_XMIT_BUSY_MASK |
+ XEL_TSR_XMIT_ACTIVE_MASK)) == 0) {
+
+ /* Switch to next buffer if configured */
+ if (drvdata->tx_ping_pong != 0)
+ drvdata->next_tx_buf_to_use ^= XEL_BUFFER_OFFSET;
+ } else if (drvdata->tx_ping_pong != 0) {
+ /* If the expected buffer is full, try the other buffer,
+ * if it is configured in HW */
+
+ addr = (void __iomem __force *)((u32 __force)addr ^
+ XEL_BUFFER_OFFSET);
+ reg_data = in_be32(addr + XEL_TSR_OFFSET);
+
+ if ((reg_data & (XEL_TSR_XMIT_BUSY_MASK |
+ XEL_TSR_XMIT_ACTIVE_MASK)) != 0)
+ return -1; /* Buffers were full, return failure */
+ } else
+ return -1; /* Buffer was full, return failure */
+
+ /* Write the frame to the buffer */
+ xemaclite_aligned_write(data, (u32 __force *) addr, byte_count);
+
+ out_be32(addr + XEL_TPLR_OFFSET, (byte_count & XEL_TPLR_LENGTH_MASK));
+
+ /* Update the Tx Status Register to indicate that there is a
+ * frame to send. Set the XEL_TSR_XMIT_ACTIVE_MASK flag which
+ * is used by the interrupt handler to check whether a frame
+ * has been transmitted */
+ reg_data = in_be32(addr + XEL_TSR_OFFSET);
+ reg_data |= (XEL_TSR_XMIT_BUSY_MASK | XEL_TSR_XMIT_ACTIVE_MASK);
+ out_be32(addr + XEL_TSR_OFFSET, reg_data);
+
+ return 0;
+}
+
+/**
+ * xemaclite_recv_data - Receive a frame
+ * @drvdata: Pointer to the Emaclite device private data
+ * @data: Address where the data is to be received
+ *
+ * This function is intended to be called from the interrupt context or
+ * with a wrapper which waits for the receive frame to be available.
+ *
+ * Return: Total number of bytes received
+ */
+static u16 xemaclite_recv_data(struct net_local *drvdata, u8 *data)
+{
+ void __iomem *addr;
+ u16 length, proto_type;
+ u32 reg_data;
+
+ /* Determine the expected buffer address */
+ addr = (drvdata->base_addr + drvdata->next_rx_buf_to_use);
+
+ /* Verify which buffer has valid data */
+ reg_data = in_be32(addr + XEL_RSR_OFFSET);
+
+ if ((reg_data & XEL_RSR_RECV_DONE_MASK) == XEL_RSR_RECV_DONE_MASK) {
+ if (drvdata->rx_ping_pong != 0)
+ drvdata->next_rx_buf_to_use ^= XEL_BUFFER_OFFSET;
+ } else {
+ /* The instance is out of sync, try other buffer if other
+ * buffer is configured, return 0 otherwise. If the instance is
+ * out of sync, do not update the 'next_rx_buf_to_use' since it
+ * will correct on subsequent calls */
+ if (drvdata->rx_ping_pong != 0)
+ addr = (void __iomem __force *)((u32 __force)addr ^
+ XEL_BUFFER_OFFSET);
+ else
+ return 0; /* No data was available */
+
+ /* Verify that buffer has valid data */
+ reg_data = in_be32(addr + XEL_RSR_OFFSET);
+ if ((reg_data & XEL_RSR_RECV_DONE_MASK) !=
+ XEL_RSR_RECV_DONE_MASK)
+ return 0; /* No data was available */
+ }
+
+ /* Get the protocol type of the ethernet frame that arrived */
+ proto_type = ((in_be32(addr + XEL_HEADER_OFFSET +
+ XEL_RXBUFF_OFFSET) >> XEL_HEADER_SHIFT) &
+ XEL_RPLR_LENGTH_MASK);
+
+ /* Check if received ethernet frame is a raw ethernet frame
+ * or an IP packet or an ARP packet */
+ if (proto_type > (ETH_FRAME_LEN + ETH_FCS_LEN)) {
+
+ if (proto_type == ETH_P_IP) {
+ length = ((in_be32(addr +
+ XEL_HEADER_IP_LENGTH_OFFSET +
+ XEL_RXBUFF_OFFSET) >>
+ XEL_HEADER_SHIFT) &
+ XEL_RPLR_LENGTH_MASK);
+ length += ETH_HLEN + ETH_FCS_LEN;
+
+ } else if (proto_type == ETH_P_ARP)
+ length = XEL_ARP_PACKET_SIZE + ETH_HLEN + ETH_FCS_LEN;
+ else
+ /* Field contains type other than IP or ARP, use max
+ * frame size and let user parse it */
+ length = ETH_FRAME_LEN + ETH_FCS_LEN;
+ } else
+ /* Use the length in the frame, plus the header and trailer */
+ length = proto_type + ETH_HLEN + ETH_FCS_LEN;
+
+ /* Read from the EmacLite device */
+ xemaclite_aligned_read((u32 __force *) (addr + XEL_RXBUFF_OFFSET),
+ data, length);
+
+ /* Acknowledge the frame */
+ reg_data = in_be32(addr + XEL_RSR_OFFSET);
+ reg_data &= ~XEL_RSR_RECV_DONE_MASK;
+ out_be32(addr + XEL_RSR_OFFSET, reg_data);
+
+ return length;
+}
+
+/**
+ * xemaclite_set_mac_address - Set the MAC address for this device
+ * @drvdata: Pointer to the Emaclite device private data
+ * @address_ptr:Pointer to the MAC address (MAC address is a 48-bit value)
+ *
+ * Tx must be idle and Rx should be idle for deterministic results.
+ * It is recommended that this function should be called after the
+ * initialization and before transmission of any packets from the device.
+ * The MAC address can be programmed using either of the two transmit
+ * buffers (if configured).
+ */
+static void xemaclite_set_mac_address(struct net_local *drvdata,
+ u8 *address_ptr)
+{
+ void __iomem *addr;
+ u32 reg_data;
+
+ /* Determine the expected Tx buffer address */
+ addr = drvdata->base_addr + drvdata->next_tx_buf_to_use;
+
+ xemaclite_aligned_write(address_ptr, (u32 __force *) addr, ETH_ALEN);
+
+ out_be32(addr + XEL_TPLR_OFFSET, ETH_ALEN);
+
+ /* Update the MAC address in the EmacLite */
+ reg_data = in_be32(addr + XEL_TSR_OFFSET);
+ out_be32(addr + XEL_TSR_OFFSET, reg_data | XEL_TSR_PROG_MAC_ADDR);
+
+ /* Wait for EmacLite to finish with the MAC address update */
+ while ((in_be32(addr + XEL_TSR_OFFSET) &
+ XEL_TSR_PROG_MAC_ADDR) != 0)
+ ;
+}
+
+/**
+ * xemaclite_tx_timeout - Callback for Tx Timeout
+ * @dev: Pointer to the network device
+ *
+ * This function is called when a Tx timeout occurs for the Emaclite device.
+ */
+static void xemaclite_tx_timeout(struct net_device *dev)
+{
+ struct net_local *lp = (struct net_local *) netdev_priv(dev);
+ unsigned long flags;
+
+ dev_err(&lp->ndev->dev, "Exceeded transmit timeout of %lu ms\n",
+ TX_TIMEOUT * 1000UL / HZ);
+
+ dev->stats.tx_errors++;
+
+ /* Reset the device */
+ spin_lock_irqsave(&lp->reset_lock, flags);
+
+ /* Shouldn't really be necessary, but shouldn't hurt */
+ netif_stop_queue(dev);
+
+ xemaclite_disable_interrupts(lp);
+ xemaclite_enable_interrupts(lp);
+
+ if (lp->deferred_skb) {
+ dev_kfree_skb(lp->deferred_skb);
+ lp->deferred_skb = NULL;
+ dev->stats.tx_errors++;
+ }
+
+ /* To exclude tx timeout */
+ dev->trans_start = 0xffffffff - TX_TIMEOUT - TX_TIMEOUT;
+
+ /* We're all ready to go. Start the queue */
+ netif_wake_queue(dev);
+ spin_unlock_irqrestore(&lp->reset_lock, flags);
+}
+
+/**********************/
+/* Interrupt Handlers */
+/**********************/
+
+/**
+ * xemaclite_tx_handler - Interrupt handler for frames sent
+ * @dev: Pointer to the network device
+ *
+ * This function updates the number of packets transmitted and handles the
+ * deferred skb, if there is one.
+ */
+static void xemaclite_tx_handler(struct net_device *dev)
+{
+ struct net_local *lp = (struct net_local *) netdev_priv(dev);
+
+ dev->stats.tx_packets++;
+ if (lp->deferred_skb) {
+ if (xemaclite_send_data(lp,
+ (u8 *) lp->deferred_skb->data,
+ lp->deferred_skb->len) != 0)
+ return;
+ else {
+ dev->stats.tx_bytes += lp->deferred_skb->len;
+ dev_kfree_skb_irq(lp->deferred_skb);
+ lp->deferred_skb = NULL;
+ dev->trans_start = jiffies;
+ netif_wake_queue(dev);
+ }
+ }
+}
+
+/**
+ * xemaclite_rx_handler - Interrupt handler for frames received
+ * @dev: Pointer to the network device
+ *
+ * This function allocates memory for a socket buffer, fills it with data
+ * received and hands it over to the TCP/IP stack.
+ */
+static void xemaclite_rx_handler(struct net_device *dev)
+{
+ struct net_local *lp = (struct net_local *) netdev_priv(dev);
+ struct sk_buff *skb;
+ unsigned int align;
+ u32 len;
+
+ len = ETH_FRAME_LEN + ETH_FCS_LEN;
+ skb = dev_alloc_skb(len + ALIGNMENT);
+ if (!skb) {
+ /* Couldn't get memory. */
+ dev->stats.rx_dropped++;
+ dev_err(&lp->ndev->dev, "Could not allocate receive buffer\n");
+ return;
+ }
+
+ /*
+ * A new skb should have the data halfword aligned, but this code is
+ * here just in case that isn't true. Calculate how many
+ * bytes we should reserve to get the data to start on a word
+ * boundary */
+ align = BUFFER_ALIGN(skb->data);
+ if (align)
+ skb_reserve(skb, align);
+
+ skb_reserve(skb, 2);
+
+ len = xemaclite_recv_data(lp, (u8 *) skb->data);
+
+ if (!len) {
+ dev->stats.rx_errors++;
+ dev_kfree_skb_irq(skb);
+ return;
+ }
+
+ skb_put(skb, len); /* Tell the skb how much data we got */
+ skb->dev = dev; /* Fill out required meta-data */
+
+ skb->protocol = eth_type_trans(skb, dev);
+ skb->ip_summed = CHECKSUM_NONE;
+
+ dev->stats.rx_packets++;
+ dev->stats.rx_bytes += len;
+ dev->last_rx = jiffies;
+
+ netif_rx(skb); /* Send the packet upstream */
+}
+
+/**
+ * xemaclite_interrupt - Interrupt handler for this driver
+ * @irq: Irq of the Emaclite device
+ * @dev_id: Void pointer to the network device instance used as callback
+ * reference
+ *
+ * This function handles the Tx and Rx interrupts of the EmacLite device.
+ */
+static irqreturn_t xemaclite_interrupt(int irq, void *dev_id)
+{
+ bool tx_complete = 0;
+ struct net_device *dev = dev_id;
+ struct net_local *lp = (struct net_local *) netdev_priv(dev);
+ void __iomem *base_addr = lp->base_addr;
+ u32 tx_status;
+
+ /* Check if there is Rx Data available */
+ if ((in_be32(base_addr + XEL_RSR_OFFSET) & XEL_RSR_RECV_DONE_MASK) ||
+ (in_be32(base_addr + XEL_BUFFER_OFFSET + XEL_RSR_OFFSET)
+ & XEL_RSR_RECV_DONE_MASK))
+
+ xemaclite_rx_handler(dev);
+
+ /* Check if the Transmission for the first buffer is completed */
+ tx_status = in_be32(base_addr + XEL_TSR_OFFSET);
+ if (((tx_status & XEL_TSR_XMIT_BUSY_MASK) == 0) &&
+ (tx_status & XEL_TSR_XMIT_ACTIVE_MASK) != 0) {
+
+ tx_status &= ~XEL_TSR_XMIT_ACTIVE_MASK;
+ out_be32(base_addr + XEL_TSR_OFFSET, tx_status);
+
+ tx_complete = 1;
+ }
+
+ /* Check if the Transmission for the second buffer is completed */
+ tx_status = in_be32(base_addr + XEL_BUFFER_OFFSET + XEL_TSR_OFFSET);
+ if (((tx_status & XEL_TSR_XMIT_BUSY_MASK) == 0) &&
+ (tx_status & XEL_TSR_XMIT_ACTIVE_MASK) != 0) {
+
+ tx_status &= ~XEL_TSR_XMIT_ACTIVE_MASK;
+ out_be32(base_addr + XEL_BUFFER_OFFSET + XEL_TSR_OFFSET,
+ tx_status);
+
+ tx_complete = 1;
+ }
+
+ /* If there was a Tx interrupt, call the Tx Handler */
+ if (tx_complete != 0)
+ xemaclite_tx_handler(dev);
+
+ return IRQ_HANDLED;
+}
+
+/**
+ * xemaclite_open - Open the network device
+ * @dev: Pointer to the network device
+ *
+ * This function sets the MAC address, requests an IRQ and enables interrupts
+ * for the Emaclite device and starts the Tx queue.
+ */
+static int xemaclite_open(struct net_device *dev)
+{
+ struct net_local *lp = (struct net_local *) netdev_priv(dev);
+ int retval;
+
+ /* Just to be safe, stop the device first */
+ xemaclite_disable_interrupts(lp);
+
+ /* Set the MAC address each time opened */
+ xemaclite_set_mac_address(lp, dev->dev_addr);
+
+ /* Grab the IRQ */
+ retval = request_irq(dev->irq, &xemaclite_interrupt, 0, dev->name, dev);
+ if (retval) {
+ dev_err(&lp->ndev->dev, "Could not allocate interrupt %d\n",
+ dev->irq);
+ return retval;
+ }
+
+ /* Enable Interrupts */
+ xemaclite_enable_interrupts(lp);
+
+ /* We're ready to go */
+ netif_start_queue(dev);
+
+ return 0;
+}
+
+/**
+ * xemaclite_close - Close the network device
+ * @dev: Pointer to the network device
+ *
+ * This function stops the Tx queue, disables interrupts and frees the IRQ for
+ * the Emaclite device.
+ */
+static int xemaclite_close(struct net_device *dev)
+{
+ struct net_local *lp = (struct net_local *) netdev_priv(dev);
+
+ netif_stop_queue(dev);
+ xemaclite_disable_interrupts(lp);
+ free_irq(dev->irq, dev);
+
+ return 0;
+}
+
+/**
+ * xemaclite_get_stats - Get the stats for the net_device
+ * @dev: Pointer to the network device
+ *
+ * This function returns the address of the 'net_device_stats' structure for the
+ * given network device. This structure holds usage statistics for the network
+ * device.
+ *
+ * Return: Pointer to the net_device_stats structure.
+ */
+static struct net_device_stats *xemaclite_get_stats(struct net_device *dev)
+{
+ return &dev->stats;
+}
+
+/**
+ * xemaclite_send - Transmit a frame
+ * @orig_skb: Pointer to the socket buffer to be transmitted
+ * @dev: Pointer to the network device
+ *
+ * This function checks if the Tx buffer of the Emaclite device is free to send
+ * data. If so, it fills the Tx buffer with data from socket buffer data,
+ * updates the stats and frees the socket buffer. The Tx completion is signaled
+ * by an interrupt. If the Tx buffer isn't free, then the socket buffer is
+ * deferred and the Tx queue is stopped so that the deferred socket buffer can
+ * be transmitted when the Emaclite device is free to transmit data.
+ *
+ * Return: 0, always.
+ */
+static int xemaclite_send(struct sk_buff *orig_skb, struct net_device *dev)
+{
+ struct net_local *lp = (struct net_local *) netdev_priv(dev);
+ struct sk_buff *new_skb;
+ unsigned int len;
+ unsigned long flags;
+
+ len = orig_skb->len;
+
+ new_skb = orig_skb;
+
+ spin_lock_irqsave(&lp->reset_lock, flags);
+ if (xemaclite_send_data(lp, (u8 *) new_skb->data, len) != 0) {
+ /* If the Emaclite Tx buffer is busy, stop the Tx queue and
+ * defer the skb for transmission at a later point when the
+ * current transmission is complete */
+ netif_stop_queue(dev);
+ lp->deferred_skb = new_skb;
+ spin_unlock_irqrestore(&lp->reset_lock, flags);
+ return 0;
+ }
+ spin_unlock_irqrestore(&lp->reset_lock, flags);
+
+ dev->stats.tx_bytes += len;
+ dev_kfree_skb(new_skb);
+ dev->trans_start = jiffies;
+
+ return 0;
+}
+
+/**
+ * xemaclite_ioctl - Perform IO Control operations on the network device
+ * @dev: Pointer to the network device
+ * @rq: Pointer to the interface request structure
+ * @cmd: IOCTL command
+ *
+ * The only IOCTL operation supported by this function is setting the MAC
+ * address. An error is reported if any other operations are requested.
+ *
+ * Return: 0 to indicate success, or a negative error for failure.
+ */
+static int xemaclite_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+ struct net_local *lp = (struct net_local *) netdev_priv(dev);
+ struct hw_addr_data *hw_addr = (struct hw_addr_data *) &rq->ifr_hwaddr;
+
+ switch (cmd) {
+ case SIOCETHTOOL:
+ return -EIO;
+
+ case SIOCSIFHWADDR:
+ dev_err(&lp->ndev->dev, "SIOCSIFHWADDR\n");
+
+ /* Copy MAC address in from user space */
+ copy_from_user((void __force *) dev->dev_addr,
+ (void __user __force *) hw_addr,
+ IFHWADDRLEN);
+ xemaclite_set_mac_address(lp, dev->dev_addr);
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ return 0;
+}
+
+/**
+ * xemaclite_remove_ndev - Free the network device
+ * @ndev: Pointer to the network device to be freed
+ *
+ * This function unmaps the IO region of the Emaclite device and frees the net
+ * device.
+ */
+static void xemaclite_remove_ndev(struct net_device *ndev)
+{
+ if (ndev) {
+ struct net_local *lp = (struct net_local *) netdev_priv(ndev);
+
+ if (lp->base_addr)
+ iounmap((void __iomem __force *) (lp->base_addr));
+ free_netdev(ndev);
+ }
+}
+
+/**
+ * get_bool - Get a parameter from the OF device
+ * @ofdev: Pointer to OF device structure
+ * @s: Property to be retrieved
+ *
+ * This function looks for a property in the device node and returns the value
+ * of the property if it is found or 0 if the property is not found.
+ *
+ * Return: Value of the parameter if the parameter is found, or 0 otherwise
+ */
+static bool get_bool(struct of_device *ofdev, const char *s)
+{
+ u32 *p = (u32 *)of_get_property(ofdev->node, s, NULL);
+
+ if (p) {
+ return (bool)*p;
+ } else {
+ dev_warn(&ofdev->dev, "Parameter %s not found, "
+ "defaulting to false\n", s);
+ return 0;
+ }
+}
+
+static struct net_device_ops xemaclite_netdev_ops;
+
+/**
+ * xemaclite_of_probe - Probe method for the Emaclite device.
+ * @ofdev: Pointer to OF device structure
+ * @match: Pointer to the structure used for matching a device
+ *
+ * This function probes for the Emaclite device in the device tree.
+ * It initializes the driver data structure and the hardware, sets the MAC
+ * address and registers the network device.
+ *
+ * Return: 0, if the driver is bound to the Emaclite device, or
+ * a negative error if there is failure.
+ */
+static int __devinit xemaclite_of_probe(struct of_device *ofdev,
+ const struct of_device_id *match)
+{
+ struct resource r_irq; /* Interrupt resources */
+ struct resource r_mem; /* IO mem resources */
+ struct net_device *ndev = NULL;
+ struct net_local *lp = NULL;
+ struct device *dev = &ofdev->dev;
+ const void *mac_address;
+
+ int rc = 0;
+
+ dev_info(dev, "Device Tree Probing\n");
+
+ /* Get iospace for the device */
+ rc = of_address_to_resource(ofdev->node, 0, &r_mem);
+ if (rc) {
+ dev_err(dev, "invalid address\n");
+ return rc;
+ }
+
+ /* Get IRQ for the device */
+ rc = of_irq_to_resource(ofdev->node, 0, &r_irq);
+ if (rc == NO_IRQ) {
+ dev_err(dev, "no IRQ found\n");
+ return -ENODEV;
+ }
+
+ /* Create an ethernet device instance */
+ ndev = alloc_etherdev(sizeof(struct net_local));
+ if (!ndev) {
+ dev_err(dev, "Could not allocate network device\n");
+ return -ENOMEM;
+ }
+
+ dev_set_drvdata(dev, ndev);
+
+ ndev->irq = r_irq.start;
+ ndev->mem_start = r_mem.start;
+ ndev->mem_end = r_mem.end;
+
+ lp = netdev_priv(ndev);
+ lp->ndev = ndev;
+
+ if (!request_mem_region(ndev->mem_start,
+ ndev->mem_end - ndev->mem_start + 1,
+ DRIVER_NAME)) {
+ dev_err(dev, "Couldn't lock memory region at %p\n",
+ (void *)ndev->mem_start);
+ rc = -EBUSY;
+ goto error2;
+ }
+
+ /* Get the virtual base address for the device */
+ lp->base_addr = ioremap(r_mem.start, r_mem.end - r_mem.start + 1);
+ if (NULL == lp->base_addr) {
+ dev_err(dev, "EmacLite: Could not allocate iomem\n");
+ rc = -EIO;
+ goto error1;
+ }
+
+ lp->next_tx_buf_to_use = 0x0;
+ lp->next_rx_buf_to_use = 0x0;
+ lp->tx_ping_pong = get_bool(ofdev, "xlnx,tx-ping-pong");
+ lp->rx_ping_pong = get_bool(ofdev, "xlnx,rx-ping-pong");
+ mac_address = of_get_mac_address(ofdev->node);
+
+ if (mac_address)
+ /* Set the MAC address. */
+ memcpy(ndev->dev_addr, mac_address, 6);
+ else
+ dev_warn(dev, "No MAC address found\n");
+
+ /* Clear the Tx CSR's in case this is a restart */
+ out_be32(lp->base_addr + XEL_TSR_OFFSET, 0);
+ out_be32(lp->base_addr + XEL_BUFFER_OFFSET + XEL_TSR_OFFSET, 0);
+
+ /* Set the MAC address in the EmacLite device */
+ xemaclite_set_mac_address(lp, ndev->dev_addr);
+
+ dev_info(dev,
+ "MAC address is now %02x:%02x:%02x:%02x:%02x:%02x\n",
+ ndev->dev_addr[0], ndev->dev_addr[1],
+ ndev->dev_addr[2], ndev->dev_addr[3],
+ ndev->dev_addr[4], ndev->dev_addr[5]);
+
+ ndev->netdev_ops = &xemaclite_netdev_ops;
+ ndev->flags &= ~IFF_MULTICAST;
+ ndev->watchdog_timeo = TX_TIMEOUT;
+
+ /* Finally, register the device */
+ rc = register_netdev(ndev);
+ if (rc) {
+ dev_err(dev,
+ "Cannot register network device, aborting\n");
+ goto error1;
+ }
+
+ dev_info(dev,
+ "Xilinx EmacLite at 0x%08X mapped to 0x%08X, irq=%d\n",
+ (unsigned int __force)ndev->mem_start,
+ (unsigned int __force)lp->base_addr, ndev->irq);
+ return 0;
+
+error1:
+ release_mem_region(ndev->mem_start, r_mem.end - r_mem.start + 1);
+
+error2:
+ xemaclite_remove_ndev(ndev);
+ return rc;
+}
+
+/**
+ * xemaclite_of_remove - Unbind the driver from the Emaclite device.
+ * @of_dev: Pointer to OF device structure
+ *
+ * This function is called if a device is physically removed from the system or
+ * if the driver module is being unloaded. It frees any resources allocated to
+ * the device.
+ *
+ * Return: 0, always.
+ */
+static int __devexit xemaclite_of_remove(struct of_device *of_dev)
+{
+ struct device *dev = &of_dev->dev;
+ struct net_device *ndev = dev_get_drvdata(dev);
+
+ unregister_netdev(ndev);
+
+ release_mem_region(ndev->mem_start, ndev->mem_end-ndev->mem_start + 1);
+
+ xemaclite_remove_ndev(ndev);
+
+ dev_set_drvdata(dev, NULL);
+
+ return 0;
+}
+
+static struct net_device_ops xemaclite_netdev_ops = {
+ .ndo_open = xemaclite_open,
+ .ndo_stop = xemaclite_close,
+ .ndo_start_xmit = xemaclite_send,
+ .ndo_do_ioctl = xemaclite_ioctl,
+ .ndo_tx_timeout = xemaclite_tx_timeout,
+ .ndo_get_stats = xemaclite_get_stats,
+};
+
+/* Match table for OF platform binding */
+static struct of_device_id xemaclite_of_match[] __devinitdata = {
+ { .compatible = "xlnx,opb-ethernetlite-1.01.a", },
+ { .compatible = "xlnx,opb-ethernetlite-1.01.b", },
+ { .compatible = "xlnx,xps-ethernetlite-1.00.a", },
+ { .compatible = "xlnx,xps-ethernetlite-2.00.a", },
+ { .compatible = "xlnx,xps-ethernetlite-2.01.a", },
+ { /* end of list */ },
+};
+MODULE_DEVICE_TABLE(of, xemaclite_of_match);
+
+static struct of_platform_driver xemaclite_of_driver = {
+ .name = DRIVER_NAME,
+ .match_table = xemaclite_of_match,
+ .probe = xemaclite_of_probe,
+ .remove = __devexit_p(xemaclite_of_remove),
+};
+
+/**
+ * xemaclite_init - Initial driver registration call
+ *
+ * Return: 0 upon success, or a negative error upon failure.
+ */
+static int __init xemaclite_init(void)
+{
+ /* No kernel boot options used, we just need to register the driver */
+ return of_register_platform_driver(&xemaclite_of_driver);
+}
+
+/**
+ * xemaclite_cleanup - Driver un-registration call
+ */
+static void __exit xemaclite_cleanup(void)
+{
+ of_unregister_platform_driver(&xemaclite_of_driver);
+}
+
+module_init(xemaclite_init);
+module_exit(xemaclite_cleanup);
+
+MODULE_AUTHOR("Xilinx, Inc.");
+MODULE_DESCRIPTION("Xilinx Ethernet MAC Lite driver");
+MODULE_LICENSE("GPL");
--
1.6.2.1
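A note on the MAC address log line in the probe routine: each byte gets its own hex field, so the zero flag in the conversion matters. The userspace sketch below shows the zero-padded form. format_mac() is a hypothetical helper, not part of the driver, and it assumes a plain 6-byte address array like ndev->dev_addr.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a 6-byte MAC address as "aa:bb:cc:dd:ee:ff".
 *
 * "%02x" zero-pads each byte to two digits. A bare "%2x" pads with a
 * space instead, so the byte 0x0a would come out as " a" rather than "0a".
 *
 * format_mac() is a hypothetical helper for illustration only.
 * Returns the number of characters written (17 on success).
 */
static int format_mac(char *buf, size_t len, const unsigned char *addr)
{
	return snprintf(buf, len, "%02x:%02x:%02x:%02x:%02x:%02x",
			addr[0], addr[1], addr[2],
			addr[3], addr[4], addr[5]);
}
```

Kernel code could probably avoid spelling out the six fields altogether by using the printk %pM extension, but that is a suggestion rather than something this patch does.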
^ permalink raw reply related
* Re: powerpc/405ex: Support cuImage for PPC405EX
From: Josh Boyer @ 2009-08-19 11:45 UTC (permalink / raw)
To: tiejun.chen; +Cc: linuxppc-dev
In-Reply-To: <4A8B5929.2030000@windriver.com>
On Wed, Aug 19, 2009 at 09:45:13AM +0800, tiejun.chen wrote:
>Josh Boyer wrote:
>> On Tue, Aug 18, 2009 at 10:28:02AM +0800, Tiejun Chen wrote:
>>> Summary: powerpc/405ex: Support cuImage for PPC405EX
>>> Reviewers: Benjamin and linux-ppc
>>> ----------------------------------------------------
>>> This patch series is used to support cuImage on the kilauea board based on PPC405ex.
>>>
>>> Tested on the amcc kilauea board:
>>
>> Hm. The U-Boot version that ships on the AMCC Kilauea board is FDT aware, so
>> cuImage shouldn't be needed at all. I'm slightly confused why we need this.
>
>It is true, as you said, that the newer U-Boot on Kilauea is FDT aware. But I
>think we need to consider some other requirements for 405EX, such as using
>only one image for convenience on embedded systems, and cases where U-Boot doesn't
>pass a dtb to the kernel since it will be isolated by a customer software layer.
>
>And actually, supporting compatibility with old U-Boot is the original *goal* of
>implementing cuImage in the kernel, as we know. Right?
Yes. Supporting _old_ U-Boot compatibility. Not FDT aware U-Boot :).
>As described in Documentation/powerpc/bootwrapper.txt, I think the kernel should
>support more image variants to satisfy different requirements. Sometimes that is
>also beyond what we can discuss here.
>
>> Are you using it for some other board derived from 405EX that doesn't have
>> an FDT aware U-Boot?
>>
>
>Yes, this is really another factor. Anyway, thanks for your comments.
That's fine then. I have no real problems with having a cuImage that works
on Kilauea. I just wouldn't want to promote it as the default because the
cuImage wrapper is a bit of a hack.
Anyway, I'll review the patches soon. Thanks for the explanation.
josh
^ permalink raw reply
* Re: spin_is_locked() broken for uniprocessor?
From: Leon Woestenberg @ 2009-08-19 11:16 UTC (permalink / raw)
To: Alan Cox
Cc: Peter Zijlstra, Steven Rostedt, Linux-Kernel List,
linuxppc-dev list, Thomas Gleixner, Ingo Molnar
In-Reply-To: <20090819115301.563be6da@lxorguk.ukuu.org.uk>
Hello,
On Wed, Aug 19, 2009 at 12:53 PM, Alan Cox<alan@lxorguk.ukuu.org.uk> wrote:
> On Wed, 19 Aug 2009 10:38:06 +0100
>
> in drivers because there is driver code that uses spin_is_locked() in
> fairly sensible fashion when dealing with locking.
>
One use is to measure lock contention hits on a particular spin lock.
However I wonder if there are tracing capabilities to measure lock
contention on a particular lock?
Currently I have inserted code much like this to get a feeling on the
contention:
this_cpu = get_cpu();
put_cpu();
contention = spin_is_locked(&lock);
spin_lock*(&lock);
if (contention) {
/* spin lock was contended, prev_cpu, this_cpu */
/* no hard guarantee, as we had a possible race in between
is_locked() and lock(), but works for driver/irq spin lock */
}
/* critical section */
prev_cpu = this_cpu;
spin_unlock*(&lock);
Regards,
--
Leon
^ permalink raw reply
* Re: spin_is_locked() broken for uniprocessor?
From: Peter Zijlstra @ 2009-08-19 11:23 UTC (permalink / raw)
To: Leon Woestenberg
Cc: Steven Rostedt, Linux-Kernel List, linuxppc-dev list,
Thomas Gleixner, Ingo Molnar, Alan Cox
In-Reply-To: <c384c5ea0908190416r4e0b5508x4765381888516484@mail.gmail.com>
On Wed, 2009-08-19 at 13:16 +0200, Leon Woestenberg wrote:
> Hello,
>
> On Wed, Aug 19, 2009 at 12:53 PM, Alan Cox<alan@lxorguk.ukuu.org.uk> wrote:
> > On Wed, 19 Aug 2009 10:38:06 +0100
> >
> > in drivers because there is driver code that uses spin_is_locked() in
> > fairly sensible fashion when dealing with locking.
> >
> One use is to measure lock contention hits on a particular spin lock.
>
>
> However I wonder if there are tracing capabilities to measure lock
> contention on a particular lock?
lock_stat no good for you?
^ permalink raw reply
* Re: spin_is_locked() broken for uniprocessor?
From: Alan Cox @ 2009-08-19 10:53 UTC (permalink / raw)
To: David Howells
Cc: Peter Zijlstra, Steven Rostedt, Linux-Kernel List,
linuxppc-dev list, Thomas Gleixner, Ingo Molnar
In-Reply-To: <7099.1250674686@redhat.com>
On Wed, 19 Aug 2009 10:38:06 +0100
David Howells <dhowells@redhat.com> wrote:
> Thomas Gleixner <tglx@linutronix.de> wrote:
>
> > > which implies to me that spin_is_locked() will always return false. Is this
> > > expected behavior.
> >
> > That's wrong. spin_is_locked should always return true on UP.
>
> Surely it's not that simple? Maybe spin_is_locked() should be undefined on UP.
That would lead to a lot of
#ifdef CONFIG_SMP
#endif
in drivers because there is driver code that uses spin_is_locked() in
fairly sensible fashion when dealing with locking.
^ permalink raw reply
* Regarding 64-bit implementation of mtd driver
From: Thirumalai @ 2009-08-19 10:18 UTC (permalink / raw)
To: linuxppc-dev
Hi all,
On my custom board the flash memory has a 64-bit data width, meaning two 32-bit
flash chips are combined to form a 64-bit wide bus. Is there
any patch to use this flash memory in the MTD layer on the linux-2.6 kernel?
I ask because the kernel has no support for 64-bit read/write in the low-level
CFI driver (command set 002).
Regards,
T.
^ permalink raw reply
* Re: spin_is_locked() broken for uniprocessor?
From: Peter Zijlstra @ 2009-08-19 9:41 UTC (permalink / raw)
To: David Howells
Cc: Steven Rostedt, Linux-Kernel List, linuxppc-dev list,
Thomas Gleixner, Ingo Molnar
In-Reply-To: <7099.1250674686@redhat.com>
On Wed, 2009-08-19 at 10:38 +0100, David Howells wrote:
> Thomas Gleixner <tglx@linutronix.de> wrote:
>
> > > which implies to me that spin_is_locked() will always return false. Is this
> > > expected behavior.
> >
> > That's wrong. spin_is_locked should always return true on UP.
>
> Surely it's not that simple? Maybe spin_is_locked() should be undefined on UP.
#define spin_is_locked(l) panic()
should sort things out quickly ;-)
^ permalink raw reply
* Re: [PATCH] spinlock: __raw_spin_is_locked() should return true for UP
From: Olivier Galibert @ 2009-08-19 9:31 UTC (permalink / raw)
To: Linus Torvalds
Cc: peterz, Steven Rostedt, linux-kernel, linuxppc-dev, mingo, tglx
In-Reply-To: <alpine.LFD.2.01.0908181937400.3158@localhost.localdomain>
On Tue, Aug 18, 2009 at 07:40:16PM -0700, Linus Torvalds wrote:
>
>
> On Tue, 18 Aug 2009, Kumar Gala wrote:
> >
> > I agree it's a little too easy to abuse spin_is_locked. However we should be
> > consistent for spin_is_locked on UP with and without
> > CONFIG_DEBUG_SPINLOCK enabled.
>
> No we shouldn't.
>
> With CONFIG_DEBUG_SPINLOCK, you have an actual lock variable for debugging
> purposes, so spin_is_locked() can clearly return a _valid_ answer, and
> should do so.
>
> Without DEBUG_SPINLOCK, there isn't any answer to return.
>
> So there's no way we can or should be consistent. In one case an answer
> exists, in another one the answer is meaningless and doesn't exist.
I always thought behaviour should be consistent between code with
debugging on and code without. Otherwise you may end up with cases of
"it starts working when I turn on debugging" which are a pain to fix.
Has something changed?
Or in other words, do you think lockdep should try solving deadlocks
instead of just reporting them for instance?
OG.
^ permalink raw reply
* Re: spin_is_locked() broken for uniprocessor?
From: David Howells @ 2009-08-19 9:38 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Peter Zijlstra, Steven Rostedt, Linux-Kernel List,
linuxppc-dev list, Ingo Molnar
In-Reply-To: <alpine.LFD.2.00.0908190022570.3361@localhost.localdomain>
Thomas Gleixner <tglx@linutronix.de> wrote:
> > which implies to me that spin_is_locked() will always return false. Is this
> > expected behavior.
>
> That's wrong. spin_is_locked should always return true on UP.
Surely it's not that simple? Maybe spin_is_locked() should be undefined on UP.
David
^ permalink raw reply
* Re: [PATCH] spinlock: __raw_spin_is_locked() should return true for UP
From: Peter Zijlstra @ 2009-08-19 9:38 UTC (permalink / raw)
To: Olivier Galibert
Cc: Steven Rostedt, linux-kernel, linuxppc-dev, tglx, Linus Torvalds,
mingo
In-Reply-To: <20090819093145.GA53298@dspnet.fr.eu.org>
On Wed, 2009-08-19 at 11:31 +0200, Olivier Galibert wrote:
> On Tue, Aug 18, 2009 at 07:40:16PM -0700, Linus Torvalds wrote:
> >
> >
> > On Tue, 18 Aug 2009, Kumar Gala wrote:
> > >
> > > I agree it's a little too easy to abuse spin_is_locked. However we should be
> > > consistent for spin_is_locked on UP with and without
> > > CONFIG_DEBUG_SPINLOCK enabled.
> >
> > No we shouldn't.
> >
> > With CONFIG_DEBUG_SPINLOCK, you have an actual lock variable for debugging
> > purposes, so spin_is_locked() can clearly return a _valid_ answer, and
> > should do so.
> >
> > Without DEBUG_SPINLOCK, there isn't any answer to return.
> >
> > So there's no way we can or should be consistent. In one case an answer
> > exists, in another one the answer is meaningless and doesn't exist.
>
> I always thought behaviour should be consistent between code with
> debugging on and code without. Otherwise you may end up with cases of
> "it starts working when I turn on debugging" which are a pain to fix.
> Has something changed?
>
> Or in other words, do you think lockdep should try solving deadlocks
> instead of just reporting them for instance?
The point is spin_is_locked() is a broken interface in that respect. It's
plain impossible to give the right answer.
Suppose there's code doing:
/*
* Ensure we don't have foo lock taken, because that would cause
* lock inversion under bar lock.
*/
BUG_ON(spin_is_locked(&foo));
spin_lock(&bar);
and other code doing:
/*
* Ensure we've got foo locked because it protects bar
*/
BUG_ON(!spin_is_locked(&foo));
bar = fancy;
What value should you return when locks don't exist (which is the case
for UP)?
There simply is no right answer other than: don't use spin_is_locked().
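The two callers above can be put side by side in a small userspace sketch. It assumes a UP build where the lock carries no state, so the "is locked" query can only ever return a constant. Every name here (up_spinlock, is_locked_const, check_held, check_not_held) is a hypothetical stand-in, not kernel API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a UP spinlock: there is no lock word to inspect. */
struct up_spinlock {
	char unused;
};

/* With no state to look at, the query can only return a fixed answer. */
static bool is_locked_const(const struct up_spinlock *lock, bool answer)
{
	(void)lock;
	return answer;
}

/*
 * Caller A: wants foo NOT to be held (lock-ordering check).
 * Returns true when the check passes, i.e. BUG_ON() would not fire.
 */
static bool check_not_held(const struct up_spinlock *foo, bool answer)
{
	return !is_locked_const(foo, answer);
}

/*
 * Caller B: wants foo to BE held (it protects bar).
 * Returns true when the check passes, i.e. BUG_ON() would not fire.
 */
static bool check_held(const struct up_spinlock *foo, bool answer)
{
	return is_locked_const(foo, answer);
}
```

Whichever constant is chosen, exactly one of the two callers trips its assertion: "always true" breaks caller A, "always false" breaks caller B.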
^ permalink raw reply
* RE: [PATCH] net: add Xilinx emac lite device driver
From: John Linn @ 2009-08-19 7:20 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Michal Simek, netdev, Sadanand Mutyala, linuxppc-dev, jgarzik,
davem, John Williams
In-Reply-To: <20090818140418.347cb8ac@nehalam>
> -----Original Message-----
> From: Stephen Hemminger [mailto:shemminger@vyatta.com]
> Sent: Tuesday, August 18, 2009 10:04 PM
> To: John Linn
> Cc: netdev@vger.kernel.org; linuxppc-dev@ozlabs.org; jgarzik@pobox.com; davem@davemloft.net; John
> Linn; grant.likely@secretlab.ca; Josh Boyer; John Williams; Michal Simek; Sadanand Mutyala
> Subject: Re: [PATCH] net: add Xilinx emac lite device driver
>
> On Tue, 18 Aug 2009 09:30:41 -0600
> John Linn <john.linn@xilinx.com> wrote:
>
> > +/**
> > + * xemaclite_enable_interrupts - Enable the interrupts for the EmacLite device
> > + * @drvdata: Pointer to the Emaclite device private data
> > + *
> > + * This function enables the Tx and Rx interrupts for the Emaclite device along
> > + * with the Global Interrupt Enable.
> > + */
> > +static void xemaclite_enable_interrupts(struct net_local *drvdata)
>
> Docbook format is really not necessary on local functions that
> are only used in the driver. It is fine if you want to use it, as
> long as the file isn't processed by kernel make docs but
> the docbook is intended for automatic generation of kernel API manuals.
Thanks, we realize that and have been consistent with our drivers in
that area. We aren't adding them to the make for the docs so it doesn't
seem like it should be a problem.
>
> --
^ permalink raw reply
* powerpc-next rebased
From: Benjamin Herrenschmidt @ 2009-08-19 7:33 UTC (permalink / raw)
To: linuxppc-dev list
All right, it's done, let's hope I didn't screw up :-)
So I rebased the whole thing on top of latest upstream, fixing along the
way the bug that Becky found in tlb.h and fixing up a commit name from
Kumar that was referencing the wrong board. I also applied to -next the
remaining things that were in -test, except for my _PAGE_EXEC rework.
I've added Kumar's fix for assert_pte_locked() and merged Paulus' perf
counter callchain support while at it.
I also pushed out an updated -test with the new variant of the
_PAGE_EXEC rework and a couple more things hanging off patchwork, I'll
probably add a bit more such as the pending iommu bits tomorrow.
Cheers,
Ben.
^ permalink raw reply
* Re: [PATCH] powerpc: Fix __flush_icache_range on 44x
From: Benjamin Herrenschmidt @ 2009-08-19 7:29 UTC (permalink / raw)
To: Josh Boyer; +Cc: linuxppc-dev
In-Reply-To: <20090817134136.GB8710@zod.rchland.ibm.com>
On Mon, 2009-08-17 at 09:41 -0400, Josh Boyer wrote:
> The ptrace POKETEXT interface allows a process to modify the text pages of
> a child process being ptraced, usually to insert breakpoints via trap
> instructions. The kernel eventually calls copy_to_user_page, which in turn
> calls __flush_icache_range to invalidate the icache lines for the child
> process.
>
> However, this function does not work on 44x due to the icache being virtually
> indexed. This was noticed by a breakpoint being triggered after it had been
> cleared by ltrace on a 440EPx board. The convenient solution is to do a
> flash invalidate of the icache in the __flush_icache_range function.
>
> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
I've put it in -test for now, but I'd like you to respin when you get a
chance with:
- removing the icbi loop in the 440 case
- adding a nice fat comment explaining why next to the iccci
instruction itself so we aren't tempted to remove it.
Cheers,
Ben.
> ---
>
> diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
> index 15f28e0..c9805a4 100644
> --- a/arch/powerpc/kernel/misc_32.S
> +++ b/arch/powerpc/kernel/misc_32.S
> @@ -346,6 +346,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_ICACHE)
> 2: icbi 0,r6
> addi r6,r6,L1_CACHE_BYTES
> bdnz 2b
> +#ifdef CONFIG_44x
> + iccci r0, r0
> +#endif
> sync /* additional sync needed on g4 */
> isync
> blr
^ permalink raw reply
* Re: [PATCH 1/5] powerpc/mm: Add MMU features for TLB reservation & Paired MAS registers
From: Benjamin Herrenschmidt @ 2009-08-19 7:25 UTC (permalink / raw)
To: Kumar Gala; +Cc: linuxppc-dev
In-Reply-To: <1250658513-13009-1-git-send-email-galak@kernel.crashing.org>
On Wed, 2009-08-19 at 00:08 -0500, Kumar Gala wrote:
> Support for TLB reservation (or TLB Write Conditional) and Paired MAS
> registers are optional for a processor implementation so we handle
> them via MMU feature sections.
>
> We currently only use paired MAS registers to access the full RPN + perm
> bits that are kept in MAS7||MAS3. We assume that if an implementation has
> a hardware page table at this time it also implements TLB reservations.
You also need to be careful with this code:
virt_page_table_tlb_miss_done:
/* We have overriden MAS2:EPN but currently our primary TLB miss
* handler will always restore it so that should not be an issue,
* if we ever optimize the primary handler to not write MAS2 on
* some cases, we'll have to restore MAS2:EPN here based on the
* original fault's DEAR. If we do that we have to modify the
* ITLB miss handler to also store SRR0 in the exception frame
* as DEAR.
*
* However, one nasty thing we did is we cleared the reservation
* (well, potentially we did). We do a trick here thus if we
* are not a level 0 exception (we interrupted the TLB miss) we
* offset the return address by -4 in order to replay the tlbsrx
* instruction there
*/
subf r10,r13,r12
cmpldi cr0,r10,PACA_EXTLB+EX_TLB_SIZE
bne- 1f
ld r11,PACA_EXTLB+EX_TLB_SIZE+EX_TLB_SRR0(r13)
addi r10,r11,-4
std r10,PACA_EXTLB+EX_TLB_SIZE+EX_TLB_SRR0(r13)
You may want to make the 3 last lines conditional on having tlbsrx.
Right now, in the no-tlbsrx. case, what happens is that it will go back
to the previous instruction, an or, which hopefully should be harmless
-but- this code is nasty enough you really don't want to take those
sorts of chances.
Feel free to add a fat comment next to the ld in the tlbsrx case itself
explaining why those two instructions must be kept together and any
change here must be reflected in the second level handler.
Cheers,
Ben.
^ permalink raw reply
* RE: [PATCH] net: add Xilinx emac lite device driver
From: John Linn @ 2009-08-19 7:13 UTC (permalink / raw)
To: David Miller
Cc: michal.simek, netdev, Sadanand Mutyala, linuxppc-dev, jgarzik,
john.williams
In-Reply-To: <20090818.232815.73515404.davem@davemloft.net>
Thanks David, I may have got over zealous in my clean up to get it ready
for mainline.
I'll take care of that.
-- John
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Wednesday, August 19, 2009 7:28 AM
> To: John Linn
> Cc: netdev@vger.kernel.org; linuxppc-dev@ozlabs.org; jgarzik@pobox.com; grant.likely@secretlab.ca;
> jwboyer@linux.vnet.ibm.com; john.williams@petalogix.com; michal.simek@petalogix.com; Sadanand Mutyala
> Subject: Re: [PATCH] net: add Xilinx emac lite device driver
>
> From: John Linn <john.linn@xilinx.com>
> Date: Tue, 18 Aug 2009 09:30:41 -0600
>
> > @@ -1926,6 +1926,11 @@ config ATL2
> > To compile this driver as a module, choose M here. The module
> > will be called atl2.
> >
> > +config XILINX_EMACLITE
> > + tristate "Xilinx 10/100 Ethernet Lite support"
> > + help
> > + This driver supports the 10/100 Ethernet Lite from Xilinx.
> > +
> > source "drivers/net/fs_enet/Kconfig"
> >
> > endif # NET_ETHERNET
>
> You can't unconditionally enable this driver everywhere.
> It depends upon things like openfirmware support, etc.
> so the driver will fail to link if the architecture doesn't
> have those support routines.
>
> Please fix this up, thanks.
^ permalink raw reply
* Re: [PATCH] net: add Xilinx emac lite device driver
From: David Miller @ 2009-08-19 6:28 UTC (permalink / raw)
To: john.linn
Cc: michal.simek, netdev, sadanan, linuxppc-dev, jgarzik,
john.williams
In-Reply-To: <20090818153047.48E42ED8051@mail66-dub.bigfish.com>
From: John Linn <john.linn@xilinx.com>
Date: Tue, 18 Aug 2009 09:30:41 -0600
> @@ -1926,6 +1926,11 @@ config ATL2
> To compile this driver as a module, choose M here. The module
> will be called atl2.
>
> +config XILINX_EMACLITE
> + tristate "Xilinx 10/100 Ethernet Lite support"
> + help
> + This driver supports the 10/100 Ethernet Lite from Xilinx.
> +
> source "drivers/net/fs_enet/Kconfig"
>
> endif # NET_ETHERNET
You can't unconditionally enable this driver everywhere.
It depends upon things like openfirmware support, etc.
so the driver will fail to link if the architecture doesn't
have those support routines.
Please fix this up, thanks.
^ permalink raw reply
* [PATCH 5/5] powerpc/book3e-64: Add support to initial_tlb_book3e for non-HES TLB
From: Kumar Gala @ 2009-08-19 5:08 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1250658513-13009-4-git-send-email-galak@kernel.crashing.org>
We now search through TLBnCFG looking for the first array that has IPROT
support (we assume that there is only one). If that TLB has hardware
entry select (HES) support we use the existing code with the proper
TLB select (the HES code still needs to clean up bolted entries from
firmware). The non-HES code is pretty similar to the 32-bit FSL Book-E
code but does make some new assumptions (like that we have tlbilx) and
simplifies things down a bit.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
---
Ben,
One concern I had with this patch was the fact that I'm using
r5..r8 w/o saving them off in head_64.S.
- k
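For readers following the assembly, the array selection at the top of initial_tlb_book3e can be sketched in C roughly as below. This is only an illustration: the bit values given for TLBnCFG_IPROT and TLBnCFG_HES are assumed here and should be checked against the real register definitions, and struct tlb_choice and pick_iprot_tlb() are hypothetical names.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define TLBnCFG_IPROT	0x00008000u	/* array supports IPROT */
#define TLBnCFG_HES	0x00002000u	/* array supports hardware entry select */

struct tlb_choice {
	int array;	/* index of the chosen TLB array, 0..3 */
	int has_hes;	/* non-zero if that array reports HES support */
};

/*
 * cfg[i] holds the value read from TLBiCFG.
 * Arrays 0..2 are tested for IPROT in order. If none matches, array 3
 * is used without testing its IPROT bit, like the fall-through path in
 * the assembly.
 */
static struct tlb_choice pick_iprot_tlb(const uint32_t cfg[4])
{
	struct tlb_choice c = { 3, 0 };
	int i;

	for (i = 0; i < 3; i++) {
		if (cfg[i] & TLBnCFG_IPROT) {
			c.array = i;
			break;
		}
	}
	c.has_hes = (cfg[c.array] & TLBnCFG_HES) != 0;
	return c;
}
```

The has_hes flag corresponds to the branch to have_hes; when it is zero the non-HES sequence (steps 1 to 8) runs instead.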
arch/powerpc/include/asm/reg_booke.h | 2 +
arch/powerpc/kernel/exceptions-64e.S | 204 +++++++++++++++++++++++++++++++++-
2 files changed, 202 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/reg_booke.h b/arch/powerpc/include/asm/reg_booke.h
index 2c9c706..e204de6 100644
--- a/arch/powerpc/include/asm/reg_booke.h
+++ b/arch/powerpc/include/asm/reg_booke.h
@@ -108,6 +108,8 @@
#define SPRN_PID2 0x27A /* Process ID Register 2 */
#define SPRN_TLB0CFG 0x2B0 /* TLB 0 Config Register */
#define SPRN_TLB1CFG 0x2B1 /* TLB 1 Config Register */
+#define SPRN_TLB2CFG 0x2B2 /* TLB 2 Config Register */
+#define SPRN_TLB3CFG 0x2B3 /* TLB 3 Config Register */
#define SPRN_EPR 0x2BE /* External Proxy Register */
#define SPRN_CCR1 0x378 /* Core Configuration Register 1 */
#define SPRN_ZPR 0x3B0 /* Zone Protection Register (40x) */
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 662236c..9048f96 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -616,18 +616,214 @@ bad_stack_book3e:
* Setup the initial TLB for a core. This current implementation
* assume that whatever we are running off will not conflict with
* the new mapping at PAGE_OFFSET.
- * We also make various assumptions about the processor we run on,
- * this might have to be made more flexible based on the content
- * of MMUCFG and friends.
*/
_GLOBAL(initial_tlb_book3e)
+ /* Look for the first TLB with IPROT set */
+ mfspr r4,SPRN_TLB0CFG
+ andi. r3,r4,TLBnCFG_IPROT
+ lis r3,MAS0_TLBSEL(0)@h
+ bne found_iprot
+
+ mfspr r4,SPRN_TLB1CFG
+ andi. r3,r4,TLBnCFG_IPROT
+ lis r3,MAS0_TLBSEL(1)@h
+ bne found_iprot
+
+ mfspr r4,SPRN_TLB2CFG
+ andi. r3,r4,TLBnCFG_IPROT
+ lis r3,MAS0_TLBSEL(2)@h
+ bne found_iprot
+
+ lis r3,MAS0_TLBSEL(3)@h
+ mfspr r4,SPRN_TLB3CFG
+ /* fall through */
+
+found_iprot:
+ andi. r5,r4,TLBnCFG_HES
+ bne have_hes
+
+ mflr r8 /* save LR */
+/* 1. Find the index of the entry we're executing in
+ *
+ * r3 = MAS0_TLBSEL (for the iprot array)
+ * r4 = SPRN_TLBnCFG
+ */
+ bl invstr /* Find our address */
+invstr: mflr r6 /* Make it accessible */
+ mfmsr r7
+ rlwinm r5,r7,27,31,31 /* extract MSR[IS] */
+ mfspr r7,SPRN_PID
+ slwi r7,r7,16
+ or r7,r7,r5
+ mtspr SPRN_MAS6,r7
+ tlbsx 0,r6 /* search MSR[IS], SPID=PID */
+
+ mfspr r3,SPRN_MAS0
+ rlwinm r5,r3,16,20,31 /* Extract MAS0(Entry) */
+
+ mfspr r7,SPRN_MAS1 /* Ensure IPROT set */
+ oris r7,r7,MAS1_IPROT@h
+ mtspr SPRN_MAS1,r7
+ tlbwe
+
+/* 2. Invalidate all entries except the entry we're executing in
+ *
+ * r3 = MAS0 w/TLBSEL & ESEL for the entry we are running in
+ * r4 = SPRN_TLBnCFG
+ * r5 = ESEL of entry we are running in
+ */
+ andi. r4,r4,TLBnCFG_N_ENTRY /* Extract # entries */
+ li r6,0 /* Set Entry counter to 0 */
+1: mr r7,r3 /* Set MAS0(TLBSEL) */
+ rlwimi r7,r6,16,4,15 /* Setup MAS0 = TLBSEL | ESEL(r6) */
+ mtspr SPRN_MAS0,r7
+ tlbre
+ mfspr r7,SPRN_MAS1
+ rlwinm r7,r7,0,2,31 /* Clear MAS1 Valid and IPROT */
+ cmpw r5,r6
+ beq skpinv /* Don't update the current execution TLB */
+ mtspr SPRN_MAS1,r7
+ tlbwe
+ isync
+skpinv: addi r6,r6,1 /* Increment */
+ cmpw r6,r4 /* Are we done? */
+ bne 1b /* If not, repeat */
+
+ /* Invalidate all TLBs */
+ PPC_TLBILX_ALL(0,0)
+ sync
+ isync
+
+/* 3. Setup a temp mapping and jump to it
+ *
+ * r3 = MAS0 w/TLBSEL & ESEL for the entry we are running in
+ * r5 = ESEL of entry we are running in
+ */
+ andi. r7,r5,0x1 /* Find an entry not used and is non-zero */
+ addi r7,r7,0x1
+ mr r4,r3 /* Set MAS0(TLBSEL) from r3 */
+ mtspr SPRN_MAS0,r4
+ tlbre
+
+ rlwimi r4,r7,16,4,15 /* Setup MAS0 = TLBSEL | ESEL(r7) */
+ mtspr SPRN_MAS0,r4
+
+ mfspr r7,SPRN_MAS1
+ xori r6,r7,MAS1_TS /* Setup TMP mapping in the other Address space */
+ mtspr SPRN_MAS1,r6
+
+ tlbwe
+
+ mfmsr r6
+ xori r6,r6,MSR_IS
+ mtspr SPRN_SRR1,r6
+ bl 1f /* Find our address */
+1: mflr r6
+ addi r6,r6,(2f - 1b)
+ mtspr SPRN_SRR0,r6
+ rfi
+2:
+
+/* 4. Clear out PIDs & Search info
+ *
+ * r3 = MAS0 w/TLBSEL & ESEL for the entry we started in
+ * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping
+ * r5 = MAS3
+ */
+ li r6,0
+ mtspr SPRN_MAS6,r6
+ mtspr SPRN_PID,r6
+
+/* 5. Invalidate mapping we started in
+ *
+ * r3 = MAS0 w/TLBSEL & ESEL for the entry we started in
+ * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping
+ * r5 = MAS3
+ */
+ mtspr SPRN_MAS0,r3
+ tlbre
+ mfspr r6,SPRN_MAS1
+ rlwinm r6,r6,0,2,0 /* clear IPROT */
+ mtspr SPRN_MAS1,r6
+ tlbwe
+
+ /* Invalidate TLB1 */
+ PPC_TLBILX_ALL(0,0)
+ sync
+ isync
+
+/* The mapping only needs to be cache-coherent on SMP */
+#ifdef CONFIG_SMP
+#define M_IF_SMP MAS2_M
+#else
+#define M_IF_SMP 0
+#endif
+
+/* 6. Setup KERNELBASE mapping in TLB[0]
+ *
+ * r3 = MAS0 w/TLBSEL & ESEL for the entry we started in
+ * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping
+ * r5 = MAS3
+ */
+ rlwinm r3,r3,0,16,3 /* clear ESEL */
+ mtspr SPRN_MAS0,r3
+ lis r6,(MAS1_VALID|MAS1_IPROT)@h
+ ori r6,r6,(MAS1_TSIZE(BOOK3E_PAGESZ_1GB))@l
+ mtspr SPRN_MAS1,r6
+
+ LOAD_REG_IMMEDIATE(r6, PAGE_OFFSET | M_IF_SMP)
+ mtspr SPRN_MAS2,r6
+
+ rlwinm r5,r5,0,0,25
+ ori r5,r5,MAS3_SR | MAS3_SW | MAS3_SX
+ mtspr SPRN_MAS3,r5
+ li r5,-1
+ rlwinm r5,r5,0,0,25
+
+ tlbwe
+
+/* 7. Jump to KERNELBASE mapping
+ *
+ * r4 = MAS0 w/TLBSEL & ESEL for the temp mapping
+ */
+ /* Now we branch to the new virtual address mapped by this entry */
+ LOAD_REG_IMMEDIATE(r6,2f)
+ lis r7,MSR_KERNEL@h
+ ori r7,r7,MSR_KERNEL@l
+ mtspr SPRN_SRR0,r6
+ mtspr SPRN_SRR1,r7
+ rfi /* start execution out of TLB1[0] entry */
+2:
+
+/* 8. Clear out the temp mapping
+ *
+ * r4 = MAS0 w/TLBSEL & ESEL for the entry we are running in
+ */
+ mtspr SPRN_MAS0,r4
+ tlbre
+ mfspr r5,SPRN_MAS1
+ rlwinm r5,r5,0,2,0 /* clear IPROT */
+ mtspr SPRN_MAS1,r5
+ tlbwe
+
+ /* Invalidate TLB1 */
+ PPC_TLBILX_ALL(0,0)
+ sync
+ isync
+
+ /* We translate LR and return */
+ tovirt(r8,r8)
+ mtlr r8
+ blr
+
+have_hes:
/* Setup MAS 0,1,2,3 and 7 for tlbwe of a 1G entry that maps the
* kernel linear mapping. We also set MAS8 once for all here though
* that will have to be made dependent on whether we are running under
* a hypervisor I suppose.
*/
- li r3,MAS0_HES | MAS0_WQ_ALLWAYS
+ ori r3,r3,MAS0_HES | MAS0_WQ_ALLWAYS
mtspr SPRN_MAS0,r3
lis r3,(MAS1_VALID | MAS1_IPROT)@h
ori r3,r3,BOOK3E_PAGESZ_1GB << MAS1_TSIZE_SHIFT
--
1.6.0.6
^ permalink raw reply related
* [PATCH 4/5] powerpc/book3e-64: Add helper function to setup IVORs
From: Kumar Gala @ 2009-08-19 5:08 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1250658513-13009-3-git-send-email-galak@kernel.crashing.org>
Not all 64-bit Book-3E parts will have fixed IVORs so add a function that
cpusetup code can call to setup the base IVORs (0..15) to match the fixed
offsets. We need to 'or' part of interrupt_base_book3e into the IVORs
since on parts that have them the IVPR doesn't extend as far down.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
---
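As a rough illustration of what SET_IVOR computes, here is a C sketch of the value written to each IVOR. It assumes the assembler's @l operator selects the low 16 bits of a symbol or constant, and that the vector offsets stay below 0x8000 so the sign extension done by li does not matter. lo16() and ivor_value() are hypothetical names, and the base addresses used with them are made up.

```c
#include <stdint.h>

/* Low 16 bits of a value, standing in for the assembler's @l operator. */
static uint32_t lo16(uint64_t v)
{
	return (uint32_t)(v & 0xffffu);
}

/*
 * Value loaded into IVORn: the fixed vector offset OR'ed with the low
 * 16 bits of the interrupt base address. The upper bits of the handler
 * address are expected to come from IVPR.
 */
static uint32_t ivor_value(uint64_t interrupt_base, uint32_t vector_offset)
{
	return lo16(vector_offset) | lo16(interrupt_base);
}
```

Note that the OR only equals base + offset when the low bits of the base do not overlap the offset bits, which presumably holds because of how interrupt_base_book3e is aligned.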
arch/powerpc/include/asm/exception-64e.h | 4 ++++
arch/powerpc/kernel/exceptions-64e.S | 19 +++++++++++++++++++
2 files changed, 23 insertions(+), 0 deletions(-)
diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h
index 94cb3d7..6d53f31 100644
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -196,6 +196,10 @@ exc_##label##_book3e:
#define TLB_MISS_STATS_SAVE_INFO
#endif
+#define SET_IVOR(vector_number, vector_offset) \
+ li r3,vector_offset@l; \
+ ori r3,r3,interrupt_base_book3e@l; \
+ mtspr SPRN_IVOR##vector_number,r3;
#endif /* _ASM_POWERPC_EXCEPTION_64E_H */
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 3611b0e..662236c 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -782,5 +782,24 @@ _STATIC(init_thread_book3e)
blr
+_GLOBAL(__setup_base_ivors)
+ SET_IVOR(0, 0x020) /* Critical Input */
+ SET_IVOR(1, 0x000) /* Machine Check */
+ SET_IVOR(2, 0x060) /* Data Storage */
+ SET_IVOR(3, 0x080) /* Instruction Storage */
+ SET_IVOR(4, 0x0a0) /* External Input */
+ SET_IVOR(5, 0x0c0) /* Alignment */
+ SET_IVOR(6, 0x0e0) /* Program */
+ SET_IVOR(7, 0x100) /* FP Unavailable */
+ SET_IVOR(8, 0x120) /* System Call */
+ SET_IVOR(9, 0x140) /* Auxiliary Processor Unavailable */
+ SET_IVOR(10, 0x160) /* Decrementer */
+ SET_IVOR(11, 0x180) /* Fixed Interval Timer */
+ SET_IVOR(12, 0x1a0) /* Watchdog Timer */
+ SET_IVOR(13, 0x1c0) /* Data TLB Error */
+ SET_IVOR(14, 0x1e0) /* Instruction TLB Error */
+ SET_IVOR(15, 0x040) /* Debug */
+ sync
+ blr
--
1.6.0.6
^ permalink raw reply related
* [PATCH 2/5] powerpc/book3e-64: Move the default cpu table entry
From: Kumar Gala @ 2009-08-19 5:08 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1250658513-13009-1-git-send-email-galak@kernel.crashing.org>
Move the default cpu table entry for CONFIG_PPC_BOOK3E_64 to the
very end since we will probably want to support both 32-bit and
64-bit kernels for some processors that are higher up in the list.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
---
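Why the position of the default entry matters can be sketched as follows, assuming the table is scanned in order and the first entry whose masked PVR matches wins (the lookup code itself is not part of this patch, so that is an assumption). struct cpu_entry and match_pvr() are hypothetical simplifications of the real table, and any PVR values used with them are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical version of a cpu table entry. */
struct cpu_entry {
	uint32_t pvr_mask;
	uint32_t pvr_value;
	const char *name;
};

/*
 * Return the first entry matching the given PVR, or NULL if none does.
 * An entry with pvr_mask == 0 and pvr_value == 0 matches every PVR, so
 * anything listed after it can never be reached.
 */
static const struct cpu_entry *match_pvr(const struct cpu_entry *tbl,
					 size_t n, uint32_t pvr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if ((pvr & tbl[i].pvr_mask) == tbl[i].pvr_value)
			return &tbl[i];
	}
	return NULL;
}
```

With the catch-all entry last, a specific entry earlier in the table gets the first chance to match and the default only picks up what is left.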
arch/powerpc/kernel/cputable.c | 49 ++++++++++++++++++++++------------------
1 files changed, 27 insertions(+), 22 deletions(-)
diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index 9f38ecb..0b9c913 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -89,8 +89,12 @@ extern void __restore_cpu_power7(void);
#define COMMON_USER_PA6T (COMMON_USER_PPC64 | PPC_FEATURE_PA6T |\
PPC_FEATURE_TRUE_LE | \
PPC_FEATURE_HAS_ALTIVEC_COMP)
+#ifdef CONFIG_PPC_BOOK3E_64
+#define COMMON_USER_BOOKE (COMMON_USER_PPC64 | PPC_FEATURE_BOOKE)
+#else
#define COMMON_USER_BOOKE (PPC_FEATURE_32 | PPC_FEATURE_HAS_MMU | \
PPC_FEATURE_BOOKE)
+#endif
static struct cpu_spec __initdata cpu_specs[] = {
#ifdef CONFIG_PPC_BOOK3S_64
@@ -509,28 +513,6 @@ static struct cpu_spec __initdata cpu_specs[] = {
.platform = "power4",
}
#endif /* CONFIG_PPC_BOOK3S_64 */
-#ifdef CONFIG_PPC_BOOK3E_64
- { /* This is a default entry to get going, to be replaced by
- * a real one at some stage
- */
-#define CPU_FTRS_BASE_BOOK3E (CPU_FTR_USE_TB | \
- CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_SMT | \
- CPU_FTR_NODSISRALIGN | CPU_FTR_NOEXECUTE)
- .pvr_mask = 0x00000000,
- .pvr_value = 0x00000000,
- .cpu_name = "Book3E",
- .cpu_features = CPU_FTRS_BASE_BOOK3E,
- .cpu_user_features = COMMON_USER_PPC64,
- .mmu_features = MMU_FTR_TYPE_3E | MMU_FTR_USE_TLBILX |
- MMU_FTR_USE_TLBIVAX_BCAST |
- MMU_FTR_LOCK_BCAST_INVAL,
- .icache_bsize = 64,
- .dcache_bsize = 64,
- .num_pmcs = 0,
- .machine_check = machine_check_generic,
- .platform = "power6",
- },
-#endif
#ifdef CONFIG_PPC32
#if CLASSIC_PPC
@@ -1846,6 +1828,29 @@ static struct cpu_spec __initdata cpu_specs[] = {
}
#endif /* CONFIG_E500 */
#endif /* CONFIG_PPC32 */
+
+#ifdef CONFIG_PPC_BOOK3E_64
+ { /* This is a default entry to get going, to be replaced by
+ * a real one at some stage
+ */
+#define CPU_FTRS_BASE_BOOK3E (CPU_FTR_USE_TB | \
+ CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_SMT | \
+ CPU_FTR_NODSISRALIGN | CPU_FTR_NOEXECUTE)
+ .pvr_mask = 0x00000000,
+ .pvr_value = 0x00000000,
+ .cpu_name = "Book3E",
+ .cpu_features = CPU_FTRS_BASE_BOOK3E,
+ .cpu_user_features = COMMON_USER_PPC64,
+ .mmu_features = MMU_FTR_TYPE_3E | MMU_FTR_USE_TLBILX |
+ MMU_FTR_USE_TLBIVAX_BCAST |
+ MMU_FTR_LOCK_BCAST_INVAL,
+ .icache_bsize = 64,
+ .dcache_bsize = 64,
+ .num_pmcs = 0,
+ .machine_check = machine_check_generic,
+ .platform = "power6",
+ },
+#endif
};
static struct cpu_spec the_cpu_spec;
--
1.6.0.6
* [PATCH 3/5] powerpc/book3e-64: Wait til generic_calibrate_decr to enable decrementer
From: Kumar Gala @ 2009-08-19 5:08 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
In-Reply-To: <1250658513-13009-2-git-send-email-galak@kernel.crashing.org>
Match what we do on 32-bit Book-E processors and enable the decrementer
in generic_calibrate_decr. We need to make sure we disable the
decrementer early in boot since we currently use lazy (soft) interrupt
disabling on 64-bit Book-E and could otherwise get a decrementer
exception before we are ready for it.
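The effect of the new sequence can be modelled in C roughly as follows. This is a sketch, not kernel code: the struct and helper names are hypothetical, the bit values are meant to mirror TCR[DIE] and TSR[DIS], and it assumes the Book-E behaviour that TSR bits are cleared by writing 1 to them. Under that assumption, reading TSR and writing the same value back clears every pending status bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TCR_DIE 0x04000000u	/* decrementer interrupt enable */
#define TSR_DIS 0x08000000u	/* decrementer interrupt status */

/* Hypothetical software model of the two timer SPRs. */
struct timer_regs {
	uint32_t tcr;	/* timer control */
	uint32_t tsr;	/* timer status, write-1-to-clear */
};

/* Model of mtspr SPRN_TSR: every 1 bit written clears that status bit. */
static void write_tsr(struct timer_regs *t, uint32_t val)
{
	t->tsr &= ~val;
}

/* Model of the patched tail of init_thread_book3e. */
static void quiesce_timers(struct timer_regs *t)
{
	assert(t != NULL);
	t->tcr = 0;		/* li r3,0 ; mtspr SPRN_TCR,r3 */
	write_tsr(t, t->tsr);	/* mfspr r3,SPRN_TSR ; mtspr SPRN_TSR,r3 */
}
```

After quiesce_timers() no timer interrupt is enabled and no stale status is left, so nothing fires until generic_calibrate_decr turns the decrementer on.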
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
---
arch/powerpc/kernel/exceptions-64e.S | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 695d484..3611b0e 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -774,9 +774,11 @@ _STATIC(init_thread_book3e)
/* Make sure interrupts are off */
wrteei 0
- /* disable watchdog and FIT and enable DEC interrupts */
- lis r3,TCR_DIE@h
+ /* disable all timers and clear out status */
+ li r3,0
mtspr SPRN_TCR,r3
+ mfspr r3,SPRN_TSR
+ mtspr SPRN_TSR,r3
blr
--
1.6.0.6
* [PATCH 1/5] powerpc/mm: Add MMU features for TLB reservation & Paired MAS registers
From: Kumar Gala @ 2009-08-19 5:08 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
Support for TLB reservation (or TLB Write Conditional) and Paired MAS
registers is optional for a processor implementation, so we handle
both via MMU feature sections.
We currently only use paired MAS registers to access the full RPN + perm
bits that are kept in MAS7||MAS3. We assume that if an implementation has
a hardware page table at this time it also implements TLB reservations.
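For readers unfamiliar with the MAS7||MAS3 pairing, the two ways of writing the value can be sketched in C as below. The struct and function names are hypothetical; the sketch only assumes that MAS3 holds the low 32 bits (low RPN bits plus permissions) and MAS7 the high 32 bits of the 64-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the MAS3 and MAS7 SPRs (32 bits each). */
struct mas_regs {
	uint32_t mas3;	/* low RPN bits + permission bits */
	uint32_t mas7;	/* high RPN bits */
};

/* Paired write: one 64-bit move fills both, MAS7 from the upper half. */
static void write_mas7_mas3(struct mas_regs *m, uint64_t val)
{
	assert(m != NULL);
	m->mas3 = (uint32_t)val;
	m->mas7 = (uint32_t)(val >> 32);
}

/* Split write for implementations without paired MAS registers. */
static void write_mas3_then_mas7(struct mas_regs *m, uint64_t val)
{
	assert(m != NULL);
	uint64_t hi = val >> 32;	/* srdi r16,r15,32 */
	m->mas3 = (uint32_t)val;	/* mtspr SPRN_MAS3,r15 */
	m->mas7 = (uint32_t)hi;		/* mtspr SPRN_MAS7,r16 */
}
```

Both paths leave the registers in the same state; the feature section only selects which instruction sequence is patched in at boot.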
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
---
arch/powerpc/include/asm/mmu.h | 9 +++++++++
arch/powerpc/mm/tlb_low_64e.S | 36 +++++++++++++++++++++++++++++++++++-
2 files changed, 44 insertions(+), 1 deletions(-)
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 2fcfefc..7ffbb65 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -58,6 +58,15 @@
*/
#define MMU_FTR_TLBIE_206 ASM_CONST(0x00400000)
+/* Enable use of TLB reservation. Processor should support tlbsrx.
+ * instruction and MAS0[WQ].
+ */
+#define MMU_FTR_USE_TLBRSRV ASM_CONST(0x00800000)
+
+/* Use paired MAS registers (MAS7||MAS3, etc.)
+ */
+#define MMU_FTR_USE_PAIRED_MAS ASM_CONST(0x01000000)
+
#ifndef __ASSEMBLY__
#include <asm/cputable.h>
diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index 10d524d..5b8e274 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -189,12 +189,16 @@ normal_tlb_miss:
clrrdi r14,r14,3
or r10,r15,r14
+BEGIN_MMU_FTR_SECTION
/* Set the TLB reservation and seach for existing entry. Then load
* the entry.
*/
PPC_TLBSRX_DOT(0,r16)
ld r14,0(r10)
beq normal_tlb_miss_done
+MMU_FTR_SECTION_ELSE
+ ld r14,0(r10)
+ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_USE_TLBRSRV)
finish_normal_tlb_miss:
/* Check if required permissions are met */
@@ -241,7 +245,14 @@ finish_normal_tlb_miss:
bne 1f
li r11,MAS3_SW|MAS3_UW
andc r15,r15,r11
-1: mtspr SPRN_MAS7_MAS3,r15
+1:
+BEGIN_MMU_FTR_SECTION
+ srdi r16,r15,32
+ mtspr SPRN_MAS3,r15
+ mtspr SPRN_MAS7,r16
+MMU_FTR_SECTION_ELSE
+ mtspr SPRN_MAS7_MAS3,r15
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_USE_PAIRED_MAS)
tlbwe
@@ -311,11 +322,13 @@ virt_page_table_tlb_miss:
rlwinm r10,r10,0,16,1 /* Clear TID */
mtspr SPRN_MAS1,r10
1:
+BEGIN_MMU_FTR_SECTION
/* Search if we already have a TLB entry for that virtual address, and
* if we do, bail out.
*/
PPC_TLBSRX_DOT(0,r16)
beq virt_page_table_tlb_miss_done
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_USE_TLBRSRV)
/* Now, we need to walk the page tables. First check if we are in
* range.
@@ -367,7 +380,14 @@ virt_page_table_tlb_miss:
*/
clrldi r11,r15,4 /* remove region ID from RPN */
ori r10,r11,1 /* Or-in SR */
+
+BEGIN_MMU_FTR_SECTION
+ srdi r16,r10,32
+ mtspr SPRN_MAS3,r10
+ mtspr SPRN_MAS7,r16
+MMU_FTR_SECTION_ELSE
mtspr SPRN_MAS7_MAS3,r10
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_USE_PAIRED_MAS)
tlbwe
@@ -618,7 +638,14 @@ htw_tlb_miss:
#else
ori r10,r15,(BOOK3E_PAGESZ_4K << MAS3_SPSIZE_SHIFT)
#endif
+
+BEGIN_MMU_FTR_SECTION
+ srdi r16,r10,32
+ mtspr SPRN_MAS3,r10
+ mtspr SPRN_MAS7,r16
+MMU_FTR_SECTION_ELSE
mtspr SPRN_MAS7_MAS3,r10
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_USE_PAIRED_MAS)
tlbwe
@@ -700,7 +727,14 @@ tlb_load_linear:
clrrdi r10,r16,30 /* 1G page index */
clrldi r10,r10,4 /* clear region bits */
ori r10,r10,MAS3_SR|MAS3_SW|MAS3_SX
+
+BEGIN_MMU_FTR_SECTION
+ srdi r16,r10,32
+ mtspr SPRN_MAS3,r10
+ mtspr SPRN_MAS7,r16
+MMU_FTR_SECTION_ELSE
mtspr SPRN_MAS7_MAS3,r10
+ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_USE_PAIRED_MAS)
tlbwe
--
1.6.0.6
* [PATCH] powerpc/mm: Cleanup handling of execute permission v2
From: Benjamin Herrenschmidt @ 2009-08-19 5:00 UTC (permalink / raw)
To: linuxppc-dev list
This is an attempt at cleaning up a bit the way we handle execute
permission on powerpc. _PAGE_HWEXEC is gone, _PAGE_EXEC is now only
defined by CPUs that can do something with it, and the myriad of
#ifdef's in the I$/D$ coherency code is reduced to 2 cases that
hopefully should cover everything.
The logic on BookE is a little bit different than what it was though
not by much. Since _PAGE_EXEC will now be set by the generic code
for executable pages, we need to filter it out when the page is not
cache clean and recover it later on an exec fault. However, I don't
expect the code to be more bloated than it already was in that area
due to that change.
I could boast that this brings proper enforcing of per-page execute
permissions to all BookE and 40x but in fact, we've had that now for
some time as a side effect of my previous rework in that area (and
I didn't even know it :-) We would only enable execute permission if
the page was cache clean and we would only cache clean it if we took
an exec fault. Since we now enforce that the latter only works if
VM_EXEC is part of the VMA flags, we de facto already enforce per-page
execute permissions... Unless I missed something
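The filter-and-recover idea for embedded MMUs with hardware exec support can be modelled in plain C as below. This is a simplified sketch under stated assumptions, not the code from the patch: the struct, the function names and the PTE_EXEC value are hypothetical, the "clean" flag stands in for PG_arch_1, and the checks for special, uncached, kernel or reserved pages are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PTE_EXEC 0x4ul	/* stand-in for _PAGE_EXEC */

/* Hypothetical page model: 'clean' plays the role of PG_arch_1. */
struct page_model {
	bool clean;
	int flushes;	/* counts simulated I$/D$ flushes */
};

/* Filter applied when a PTE is installed (cf. set_pte_filter()). */
static unsigned long filter_on_set(unsigned long pte, struct page_model *pg,
				   bool exec_fault)
{
	assert(pg != NULL);
	if (!(pte & PTE_EXEC) || pg->clean)
		return pte;		/* nothing to filter */
	if (exec_fault) {
		pg->flushes++;		/* make I$/D$ coherent now */
		pg->clean = true;
		return pte;
	}
	return pte & ~PTE_EXEC;		/* not clean: withhold exec for now */
}

/* Recovery on a later exec fault (cf. set_access_flags_filter()). */
static unsigned long recover_on_exec_fault(unsigned long pte,
					   struct page_model *pg)
{
	assert(pg != NULL);
	if (pte & PTE_EXEC)
		return pte;		/* permission already present */
	if (!pg->clean) {
		pg->flushes++;
		pg->clean = true;
	}
	return pte | PTE_EXEC;		/* hand the permission back */
}
```

In this model a page that is not yet cache clean is first mapped without exec permission, and the flush happens only once, on the first attempt to execute from it.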
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
This version restores the missing (oops) cache flush in
ptep_set_access_flags() which broke BookE.
diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/pgtable-ppc32.h
index c9ff9d7..f2c52e2 100644
--- a/arch/powerpc/include/asm/pgtable-ppc32.h
+++ b/arch/powerpc/include/asm/pgtable-ppc32.h
@@ -186,7 +186,7 @@ static inline unsigned long pte_update(pte_t *p,
#endif /* !PTE_ATOMIC_UPDATES */
#ifdef CONFIG_44x
- if ((old & _PAGE_USER) && (old & _PAGE_HWEXEC))
+ if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
icache_44x_need_flush = 1;
#endif
return old;
@@ -217,7 +217,7 @@ static inline unsigned long long pte_update(pte_t *p,
#endif /* !PTE_ATOMIC_UPDATES */
#ifdef CONFIG_44x
- if ((old & _PAGE_USER) && (old & _PAGE_HWEXEC))
+ if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
icache_44x_need_flush = 1;
#endif
return old;
@@ -267,8 +267,7 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
{
unsigned long bits = pte_val(entry) &
- (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW |
- _PAGE_HWEXEC | _PAGE_EXEC);
+ (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
pte_update(ptep, 0, bits);
}
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 200ec2d..806abe7 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -313,8 +313,7 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
{
unsigned long bits = pte_val(entry) &
- (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW |
- _PAGE_EXEC | _PAGE_HWEXEC);
+ (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
#ifdef PTE_ATOMIC_UPDATES
unsigned long old, tmp;
diff --git a/arch/powerpc/include/asm/pte-40x.h b/arch/powerpc/include/asm/pte-40x.h
index 07630fa..6c3e1f4 100644
--- a/arch/powerpc/include/asm/pte-40x.h
+++ b/arch/powerpc/include/asm/pte-40x.h
@@ -46,7 +46,7 @@
#define _PAGE_RW 0x040 /* software: Writes permitted */
#define _PAGE_DIRTY 0x080 /* software: dirty page */
#define _PAGE_HWWRITE 0x100 /* hardware: Dirty & RW, set in exception */
-#define _PAGE_HWEXEC 0x200 /* hardware: EX permission */
+#define _PAGE_EXEC 0x200 /* hardware: EX permission */
#define _PAGE_ACCESSED 0x400 /* software: R: page referenced */
#define _PMD_PRESENT 0x400 /* PMD points to page of PTEs */
diff --git a/arch/powerpc/include/asm/pte-44x.h b/arch/powerpc/include/asm/pte-44x.h
index 37e98bc..4192b9b 100644
--- a/arch/powerpc/include/asm/pte-44x.h
+++ b/arch/powerpc/include/asm/pte-44x.h
@@ -78,7 +78,7 @@
#define _PAGE_PRESENT 0x00000001 /* S: PTE valid */
#define _PAGE_RW 0x00000002 /* S: Write permission */
#define _PAGE_FILE 0x00000004 /* S: nonlinear file mapping */
-#define _PAGE_HWEXEC 0x00000004 /* H: Execute permission */
+#define _PAGE_EXEC 0x00000004 /* H: Execute permission */
#define _PAGE_ACCESSED 0x00000008 /* S: Page referenced */
#define _PAGE_DIRTY 0x00000010 /* S: Page dirty */
#define _PAGE_SPECIAL 0x00000020 /* S: Special page */
diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/pte-8xx.h
index 8c6e312..94e9797 100644
--- a/arch/powerpc/include/asm/pte-8xx.h
+++ b/arch/powerpc/include/asm/pte-8xx.h
@@ -36,7 +36,6 @@
/* These five software bits must be masked out when the entry is loaded
* into the TLB.
*/
-#define _PAGE_EXEC 0x0008 /* software: i-cache coherency required */
#define _PAGE_GUARDED 0x0010 /* software: guarded access */
#define _PAGE_DIRTY 0x0020 /* software: page changed */
#define _PAGE_RW 0x0040 /* software: user write access allowed */
diff --git a/arch/powerpc/include/asm/pte-book3e.h b/arch/powerpc/include/asm/pte-book3e.h
index 1d27c77..9800565 100644
--- a/arch/powerpc/include/asm/pte-book3e.h
+++ b/arch/powerpc/include/asm/pte-book3e.h
@@ -37,12 +37,13 @@
#define _PAGE_WRITETHRU 0x800000 /* W: cache write-through */
/* "Higher level" linux bit combinations */
-#define _PAGE_EXEC _PAGE_BAP_SX /* Can be executed from potentially */
-#define _PAGE_HWEXEC _PAGE_BAP_UX /* .. and was cache cleaned */
-#define _PAGE_RW (_PAGE_BAP_SW | _PAGE_BAP_UW) /* User write permission */
-#define _PAGE_KERNEL_RW (_PAGE_BAP_SW | _PAGE_BAP_SR | _PAGE_DIRTY)
-#define _PAGE_KERNEL_RO (_PAGE_BAP_SR)
-#define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
+#define _PAGE_EXEC _PAGE_BAP_UX /* .. and was cache cleaned */
+#define _PAGE_RW (_PAGE_BAP_SW | _PAGE_BAP_UW) /* User write permission */
+#define _PAGE_KERNEL_RW (_PAGE_BAP_SW | _PAGE_BAP_SR | _PAGE_DIRTY)
+#define _PAGE_KERNEL_RO (_PAGE_BAP_SR)
+#define _PAGE_KERNEL_RWX (_PAGE_BAP_SW | _PAGE_BAP_SR | _PAGE_DIRTY | _PAGE_BAP_SX)
+#define _PAGE_KERNEL_ROX (_PAGE_BAP_SR | _PAGE_BAP_SX)
+#define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
#define _PAGE_HASHPTE 0
#define _PAGE_BUSY 0
diff --git a/arch/powerpc/include/asm/pte-common.h b/arch/powerpc/include/asm/pte-common.h
index 8bb6464..c3b6507 100644
--- a/arch/powerpc/include/asm/pte-common.h
+++ b/arch/powerpc/include/asm/pte-common.h
@@ -13,9 +13,6 @@
#ifndef _PAGE_HWWRITE
#define _PAGE_HWWRITE 0
#endif
-#ifndef _PAGE_HWEXEC
-#define _PAGE_HWEXEC 0
-#endif
#ifndef _PAGE_EXEC
#define _PAGE_EXEC 0
#endif
@@ -48,10 +45,16 @@
#define PMD_PAGE_SIZE(pmd) bad_call_to_PMD_PAGE_SIZE()
#endif
#ifndef _PAGE_KERNEL_RO
-#define _PAGE_KERNEL_RO 0
+#define _PAGE_KERNEL_RO 0
+#endif
+#ifndef _PAGE_KERNEL_ROX
+#define _PAGE_KERNEL_ROX (_PAGE_EXEC)
#endif
#ifndef _PAGE_KERNEL_RW
-#define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW | _PAGE_HWWRITE)
+#define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW | _PAGE_HWWRITE)
+#endif
+#ifndef _PAGE_KERNEL_RWX
+#define _PAGE_KERNEL_RWX (_PAGE_DIRTY | _PAGE_RW | _PAGE_HWWRITE | _PAGE_EXEC)
#endif
#ifndef _PAGE_HPTEFLAGS
#define _PAGE_HPTEFLAGS _PAGE_HASHPTE
@@ -96,8 +99,7 @@ extern unsigned long bad_call_to_PMD_PAGE_SIZE(void);
#define PAGE_PROT_BITS (_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
_PAGE_WRITETHRU | _PAGE_ENDIAN | _PAGE_4K_PFN | \
_PAGE_USER | _PAGE_ACCESSED | \
- _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | \
- _PAGE_EXEC | _PAGE_HWEXEC)
+ _PAGE_RW | _PAGE_HWWRITE | _PAGE_DIRTY | _PAGE_EXEC)
/*
* We define 2 sets of base prot bits, one for basic pages (ie,
@@ -154,11 +156,9 @@ extern unsigned long bad_call_to_PMD_PAGE_SIZE(void);
_PAGE_NO_CACHE)
#define PAGE_KERNEL_NCG __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
_PAGE_NO_CACHE | _PAGE_GUARDED)
-#define PAGE_KERNEL_X __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW | _PAGE_EXEC | \
- _PAGE_HWEXEC)
+#define PAGE_KERNEL_X __pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
#define PAGE_KERNEL_RO __pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
-#define PAGE_KERNEL_ROX __pgprot(_PAGE_BASE | _PAGE_KERNEL_RO | _PAGE_EXEC | \
- _PAGE_HWEXEC)
+#define PAGE_KERNEL_ROX __pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
/* Protection used for kernel text. We want the debuggers to be able to
* set breakpoints anywhere, so don't write protect the kernel text
diff --git a/arch/powerpc/include/asm/pte-fsl-booke.h b/arch/powerpc/include/asm/pte-fsl-booke.h
index 10820f5..ce8a9e9 100644
--- a/arch/powerpc/include/asm/pte-fsl-booke.h
+++ b/arch/powerpc/include/asm/pte-fsl-booke.h
@@ -23,7 +23,7 @@
#define _PAGE_FILE 0x00002 /* S: when !present: nonlinear file mapping */
#define _PAGE_RW 0x00004 /* S: Write permission (SW) */
#define _PAGE_DIRTY 0x00008 /* S: Page dirty */
-#define _PAGE_HWEXEC 0x00010 /* H: SX permission */
+#define _PAGE_EXEC 0x00010 /* H: SX permission */
#define _PAGE_ACCESSED 0x00020 /* S: Page referenced */
#define _PAGE_ENDIAN 0x00040 /* H: E bit */
diff --git a/arch/powerpc/include/asm/pte-hash32.h b/arch/powerpc/include/asm/pte-hash32.h
index 16e571c..4aad413 100644
--- a/arch/powerpc/include/asm/pte-hash32.h
+++ b/arch/powerpc/include/asm/pte-hash32.h
@@ -26,7 +26,6 @@
#define _PAGE_WRITETHRU 0x040 /* W: cache write-through */
#define _PAGE_DIRTY 0x080 /* C: page changed */
#define _PAGE_ACCESSED 0x100 /* R: page referenced */
-#define _PAGE_EXEC 0x200 /* software: i-cache coherency required */
#define _PAGE_RW 0x400 /* software: user write access allowed */
#define _PAGE_SPECIAL 0x800 /* software: Special page */
diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S
index 656cfb2..711368b 100644
--- a/arch/powerpc/kernel/head_44x.S
+++ b/arch/powerpc/kernel/head_44x.S
@@ -497,7 +497,7 @@ tlb_44x_patch_hwater_D:
mtspr SPRN_MMUCR,r12
/* Make up the required permissions */
- li r13,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_HWEXEC
+ li r13,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC
/* Compute pgdir/pmd offset */
rlwinm r12, r10, PPC44x_PGD_OFF_SHIFT, PPC44x_PGD_OFF_MASK_BIT, 29
diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
index eca8048..2c5af52 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -643,7 +643,7 @@ interrupt_base:
4:
/* Make up the required permissions */
- li r13,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_HWEXEC
+ li r13,_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_EXEC
FIND_PTE
andc. r13,r13,r11 /* Check permission */
@@ -742,7 +742,7 @@ finish_tlb_load:
#endif
mtspr SPRN_MAS2, r12
- li r10, (_PAGE_HWEXEC | _PAGE_PRESENT)
+ li r10, (_PAGE_EXEC | _PAGE_PRESENT)
rlwimi r10, r11, 31, 29, 29 /* extract _PAGE_DIRTY into SW */
and r12, r11, r10
andi. r10, r11, _PAGE_USER /* Test for _PAGE_USER */
diff --git a/arch/powerpc/mm/40x_mmu.c b/arch/powerpc/mm/40x_mmu.c
index 29954dc..f5e7b9c 100644
--- a/arch/powerpc/mm/40x_mmu.c
+++ b/arch/powerpc/mm/40x_mmu.c
@@ -105,7 +105,7 @@ unsigned long __init mmu_mapin_ram(void)
while (s >= LARGE_PAGE_SIZE_16M) {
pmd_t *pmdp;
- unsigned long val = p | _PMD_SIZE_16M | _PAGE_HWEXEC | _PAGE_HWWRITE;
+ unsigned long val = p | _PMD_SIZE_16M | _PAGE_EXEC | _PAGE_HWWRITE;
pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
pmd_val(*pmdp++) = val;
@@ -120,7 +120,7 @@ unsigned long __init mmu_mapin_ram(void)
while (s >= LARGE_PAGE_SIZE_4M) {
pmd_t *pmdp;
- unsigned long val = p | _PMD_SIZE_4M | _PAGE_HWEXEC | _PAGE_HWWRITE;
+ unsigned long val = p | _PMD_SIZE_4M | _PAGE_EXEC | _PAGE_HWWRITE;
pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
pmd_val(*pmdp) = val;
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index cafb2a2..cfc2499 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -128,28 +128,6 @@ void pte_free_finish(void)
#endif /* CONFIG_SMP */
-/*
- * Handle i/d cache flushing, called from set_pte_at() or ptep_set_access_flags()
- */
-static pte_t do_dcache_icache_coherency(pte_t pte)
-{
- unsigned long pfn = pte_pfn(pte);
- struct page *page;
-
- if (unlikely(!pfn_valid(pfn)))
- return pte;
- page = pfn_to_page(pfn);
-
- if (!PageReserved(page) && !test_bit(PG_arch_1, &page->flags)) {
- pr_devel("do_dcache_icache_coherency... flushing\n");
- flush_dcache_icache_page(page);
- set_bit(PG_arch_1, &page->flags);
- }
- else
- pr_devel("do_dcache_icache_coherency... already clean\n");
- return __pte(pte_val(pte) | _PAGE_HWEXEC);
-}
-
static inline int is_exec_fault(void)
{
return current->thread.regs && TRAP(current->thread.regs) == 0x400;
@@ -157,49 +135,139 @@ static inline int is_exec_fault(void)
/* We only try to do i/d cache coherency on stuff that looks like
* reasonably "normal" PTEs. We currently require a PTE to be present
- * and we avoid _PAGE_SPECIAL and _PAGE_NO_CACHE
+ * and we avoid _PAGE_SPECIAL and _PAGE_NO_CACHE. We also only do that
+ * on userspace PTEs
*/
static inline int pte_looks_normal(pte_t pte)
{
return (pte_val(pte) &
- (_PAGE_PRESENT | _PAGE_SPECIAL | _PAGE_NO_CACHE)) ==
- (_PAGE_PRESENT);
+ (_PAGE_PRESENT | _PAGE_SPECIAL | _PAGE_NO_CACHE | _PAGE_USER)) ==
+ (_PAGE_PRESENT | _PAGE_USER);
}
-#if defined(CONFIG_PPC_STD_MMU)
+struct page * maybe_pte_to_page(pte_t pte)
+{
+ unsigned long pfn = pte_pfn(pte);
+ struct page *page;
+
+ if (unlikely(!pfn_valid(pfn)))
+ return NULL;
+ page = pfn_to_page(pfn);
+ if (PageReserved(page))
+ return NULL;
+ return page;
+}
+
+#if defined(CONFIG_PPC_STD_MMU) || _PAGE_EXEC == 0
+
/* Server-style MMU handles coherency when hashing if HW exec permission
- * is supposed per page (currently 64-bit only). Else, we always flush
- * valid PTEs in set_pte.
+ * is supposed per page (currently 64-bit only). If not, then, we always
+ * flush the cache for valid PTEs in set_pte. Embedded CPU without HW exec
+ * support falls into the same category.
*/
-static inline int pte_need_exec_flush(pte_t pte, int set_pte)
+
+static pte_t set_pte_filter(pte_t pte)
{
- return set_pte && pte_looks_normal(pte) &&
- !(cpu_has_feature(CPU_FTR_COHERENT_ICACHE) ||
- cpu_has_feature(CPU_FTR_NOEXECUTE));
+ pte = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS);
+ if (pte_looks_normal(pte) && !(cpu_has_feature(CPU_FTR_COHERENT_ICACHE) ||
+ cpu_has_feature(CPU_FTR_NOEXECUTE))) {
+ struct page *pg = maybe_pte_to_page(pte);
+ if (!pg)
+ return pte;
+ if (!test_bit(PG_arch_1, &pg->flags)) {
+ flush_dcache_icache_page(pg);
+ set_bit(PG_arch_1, &pg->flags);
+ }
+ }
+ return pte;
}
-#elif _PAGE_HWEXEC == 0
-/* Embedded type MMU without HW exec support (8xx only so far), we flush
- * the cache for any present PTE
- */
-static inline int pte_need_exec_flush(pte_t pte, int set_pte)
+
+static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
+ int dirty)
{
- return set_pte && pte_looks_normal(pte);
+ return pte;
}
-#else
-/* Other embedded CPUs with HW exec support per-page, we flush on exec
- * fault if HWEXEC is not set
+
+#else /* defined(CONFIG_PPC_STD_MMU) || _PAGE_EXEC == 0 */
+
+/* Embedded type MMU with HW exec support. This is a bit more complicated
+ * as we don't have two bits to spare for _PAGE_EXEC and _PAGE_HWEXEC so
+ * instead we "filter out" the exec permission for non clean pages.
*/
-static inline int pte_need_exec_flush(pte_t pte, int set_pte)
+static pte_t set_pte_filter(pte_t pte)
{
- return pte_looks_normal(pte) && is_exec_fault() &&
- !(pte_val(pte) & _PAGE_HWEXEC);
+ struct page *pg;
+
+ /* No exec permission in the first place, move on */
+ if (!(pte_val(pte) & _PAGE_EXEC) || !pte_looks_normal(pte))
+ return pte;
+
+ /* If you set _PAGE_EXEC on weird pages you're on your own */
+ pg = maybe_pte_to_page(pte);
+ if (unlikely(!pg))
+ return pte;
+
+ /* If the page clean, we move on */
+ if (test_bit(PG_arch_1, &pg->flags))
+ return pte;
+
+ /* If it's an exec fault, we flush the cache and make it clean */
+ if (is_exec_fault()) {
+ flush_dcache_icache_page(pg);
+ set_bit(PG_arch_1, &pg->flags);
+ return pte;
+ }
+
+ /* Else, we filter out _PAGE_EXEC */
+ return __pte(pte_val(pte) & ~_PAGE_EXEC);
}
-#endif
+
+static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
+ int dirty)
+{
+ struct page *pg;
+
+ /* So here, we only care about exec faults, as we use them
+ * to recover lost _PAGE_EXEC and perform I$/D$ coherency
+ * if necessary. Also if _PAGE_EXEC is already set, same deal,
+ * we just bail out
+ */
+ if (dirty || (pte_val(pte) & _PAGE_EXEC) || !is_exec_fault())
+ return pte;
+
+#ifdef CONFIG_DEBUG_VM
+ /* So this is an exec fault, _PAGE_EXEC is not set. If it was
+ * an error we would have bailed out earlier in do_page_fault()
+ * but let's make sure of it
+ */
+ if (WARN_ON(!(vma->vm_flags & VM_EXEC)))
+ return pte;
+#endif /* CONFIG_DEBUG_VM */
+
+ /* If you set _PAGE_EXEC on weird pages you're on your own */
+ pg = maybe_pte_to_page(pte);
+ if (unlikely(!pg))
+ goto bail;
+
+ /* If the page is already clean, we move on */
+ if (test_bit(PG_arch_1, &pg->flags))
+ goto bail;
+
+ /* Clean the page and set PG_arch_1 */
+ flush_dcache_icache_page(pg);
+ set_bit(PG_arch_1, &pg->flags);
+
+ bail:
+ return __pte(pte_val(pte) | _PAGE_EXEC);
+}
+
+#endif /* !(defined(CONFIG_PPC_STD_MMU) || _PAGE_EXEC == 0) */
/*
* set_pte stores a linux PTE into the linux page table.
*/
-void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte)
+void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+ pte_t pte)
{
#ifdef CONFIG_DEBUG_VM
WARN_ON(pte_present(*ptep));
@@ -208,9 +276,7 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte
* this context might not have been activated yet when this
* is called.
*/
- pte = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS);
- if (pte_need_exec_flush(pte, 1))
- pte = do_dcache_icache_coherency(pte);
+ pte = set_pte_filter(pte);
/* Perform the setting of the PTE */
__set_pte_at(mm, addr, ptep, pte, 0);
@@ -227,8 +293,7 @@ int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
pte_t *ptep, pte_t entry, int dirty)
{
int changed;
- if (!dirty && pte_need_exec_flush(entry, 0))
- entry = do_dcache_icache_coherency(entry);
+ entry = set_access_flags_filter(entry, vma, dirty);
changed = !pte_same(*(ptep), entry);
if (changed) {
if (!(vma->vm_flags & VM_HUGETLB))
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 5422169..cb96cb2 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -142,7 +142,7 @@ ioremap_flags(phys_addr_t addr, unsigned long size, unsigned long flags)
flags |= _PAGE_DIRTY | _PAGE_HWWRITE;
/* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
- flags &= ~(_PAGE_USER | _PAGE_EXEC | _PAGE_HWEXEC);
+ flags &= ~(_PAGE_USER | _PAGE_EXEC);
return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
}
diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index 10d524d..cd92f62 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -133,7 +133,7 @@
/* We do the user/kernel test for the PID here along with the RW test
*/
- li r11,_PAGE_PRESENT|_PAGE_HWEXEC /* Base perm */
+ li r11,_PAGE_PRESENT|_PAGE_EXEC /* Base perm */
oris r11,r11,_PAGE_ACCESSED@h
cmpldi cr0,r15,0 /* Check for user region */
@@ -256,7 +256,7 @@ normal_tlb_miss_done:
normal_tlb_miss_access_fault:
/* We need to check if it was an instruction miss */
- andi. r10,r11,_PAGE_HWEXEC
+ andi. r10,r11,_PAGE_EXEC
bne 1f
ld r14,EX_TLB_DEAR(r12)
ld r15,EX_TLB_ESR(r12)
--
1.6.1.2.14.gf26b5
* Re: [PATCH] powerpc/mm: Fix assert_pte_locked to work properly on uniprocessor
From: Benjamin Herrenschmidt @ 2009-08-19 3:35 UTC (permalink / raw)
To: Kumar Gala; +Cc: linuxppc-dev
In-Reply-To: <1250644900-32133-1-git-send-email-galak@kernel.crashing.org>
On Tue, 2009-08-18 at 20:21 -0500, Kumar Gala wrote:
> Since the pte_lockptr is a spinlock it gets optimized away on
> uniprocessor builds so using spin_is_locked is not correct. We can use
> assert_spin_locked instead and get the proper behavior between UP and
> SMP builds.
Thanks. Applied.
Cheers,
Ben.
> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
> ---
> arch/powerpc/mm/pgtable.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index 627767d..4b9a27d 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -242,7 +242,7 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> BUG_ON(pud_none(*pud));
> pmd = pmd_offset(pud, addr);
> BUG_ON(!pmd_present(*pmd));
> - BUG_ON(!spin_is_locked(pte_lockptr(mm, pmd)));
> + assert_spin_locked(pte_lockptr(mm, pmd));
> }
> #endif /* CONFIG_DEBUG_VM */
>
* Re: [PATCH] USB: fsl_qe_udc: Add fsl,mpc8323-qe-usb compatible entry
From: Li Yang @ 2009-08-19 3:19 UTC (permalink / raw)
To: Anton Vorontsov
Cc: David Brownell, Greg Kroah-Hartman, linuxppc-dev, linux-usb,
Scott Wood
In-Reply-To: <20090818222335.GA28135@oksana.dev.rtsoft.ru>
On Wed, Aug 19, 2009 at 6:23 AM, Anton Vorontsov <avorontsov@ru.mvista.com> wrote:
> Current bindings specify that "fsl,mpc8323-qe-usb" compatible entry
> should be used as a base match for QE UDCs, so update the driver to
> comply with the bindings.
>
> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Li Yang <leoli@freescale.com>
- Leo
* Re: [PATCH] spinlock: __raw_spin_is_locked() should return true for UP
From: Linus Torvalds @ 2009-08-19 2:40 UTC (permalink / raw)
To: Kumar Gala
Cc: peterz, linux-kernel, Steven Rostedt, linuxppc-dev, mingo, tglx
In-Reply-To: <BB2762BC-C760-4D4C-BDCF-76EFD3E1B18D@kernel.crashing.org>
On Tue, 18 Aug 2009, Kumar Gala wrote:
>
> I agree it's a little too easy to abuse spin_is_locked. However we should be
> consistent in spin_is_locked on UP with and without
> CONFIG_DEBUG_SPINLOCK enabled.
No we shouldn't.
With CONFIG_DEBUG_SPINLOCK, you have an actual lock variable for debugging
purposes, so spin_is_locked() can clearly return a _valid_ answer, and
should do so.
Without DEBUG_SPINLOCK, there isn't any answer to return.
So there's no way we can or should be consistent. In one case an answer
exists, in another one the answer is meaningless and doesn't exist.
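A small user-space model may help show the difference. It is only a sketch under assumptions: the names are hypothetical, and it assumes that on a uniprocessor build without DEBUG_SPINLOCK the lock has no state, that the "is locked" query therefore returns a fixed value (false here), and that the assertion form compiles to nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a UP, non-debug spinlock: it carries no state. */
struct up_lock {
	char unused;	/* the real type is empty; ISO C needs a member */
};

/* There is nothing to inspect, so whatever this returns is made up. */
static bool up_spin_is_locked(const struct up_lock *l)
{
	assert(l != NULL);
	return false;
}

/* The assertion form has nothing to check on such a build. */
static void up_assert_spin_locked(const struct up_lock *l)
{
	(void)l;
}

/* A BUG_ON(!spin_is_locked()) style check: true means it would fire. */
static bool bug_on_check_fires(const struct up_lock *l)
{
	return !up_spin_is_locked(l);
}
```

In this model the BUG_ON style check fires even for a caller that took the lock, while the assertion form is silent, which is the behaviour wanted from a debugging check on a build with no lock variable.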
> How much of this do we want to try and address in .31?
Absolutely nothing.
> The PPC test really should be using assert_spin_locked and I'll send a patch
> to Ben for that.
Yes, that's the correct fix.
Linus