Netdev List
* [PATCH net-next 6/6] be2net: fix to set ecmd->autoneg correctly
From: Ajit Khaparde @ 2011-08-05 20:01 UTC
  To: davem; +Cc: netdev

Set the autonegotiation settings correctly based on the port speed.

Signed-off-by: Suresh Reddy <suresh.reddy@emulex.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
---
 drivers/net/benet/be_ethtool.c |   21 +++++++++++++++++----
 1 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/benet/be_ethtool.c b/drivers/net/benet/be_ethtool.c
index 5dd3ed6..2177c8c 100644
--- a/drivers/net/benet/be_ethtool.c
+++ b/drivers/net/benet/be_ethtool.c
@@ -353,6 +353,8 @@ static int be_get_settings(struct net_device *netdev, struct ethtool_cmd *ecmd)
 	u8 mac_speed = 0;
 	u16 link_speed = 0;
 	int status;
+	u16 port_speed = 0;
+	u16 dac_cable_len = 0;
 
 	if ((adapter->link_speed < 0) || (!(netdev->flags & IFF_UP))) {
 		status = be_cmd_link_status_query(adapter, &mac_speed,
@@ -397,11 +399,9 @@ static int be_get_settings(struct net_device *netdev, struct ethtool_cmd *ecmd)
 			switch (phy_info.interface_type) {
 			case PHY_TYPE_KR_10GB:
 			case PHY_TYPE_KX4_10GB:
-				ecmd->autoneg = AUTONEG_ENABLE;
 				ecmd->transceiver = XCVR_INTERNAL;
 				break;
 			default:
-				ecmd->autoneg = AUTONEG_DISABLE;
 				ecmd->transceiver = XCVR_EXTERNAL;
 				break;
 			}
@@ -411,12 +411,25 @@ static int be_get_settings(struct net_device *netdev, struct ethtool_cmd *ecmd)
 		adapter->link_speed = ethtool_cmd_speed(ecmd);
 		adapter->port_type = ecmd->port;
 		adapter->transceiver = ecmd->transceiver;
-		adapter->autoneg = ecmd->autoneg;
 	} else {
 		ethtool_cmd_speed_set(ecmd, adapter->link_speed);
 		ecmd->port = adapter->port_type;
 		ecmd->transceiver = adapter->transceiver;
-		ecmd->autoneg = adapter->autoneg;
+	}
+
+	be_cmd_get_port_speed(adapter, adapter->port_num,
+			&dac_cable_len, &port_speed);
+	switch (port_speed) {
+	case SPEED_FORCED_10GB:
+	case SPEED_FORCED_1GB:
+	case SPEED_FORCED_100MB:
+	case SPEED_FORCED_10MB:
+	case SPEED_DEFAULT:
+		ecmd->autoneg = AUTONEG_DISABLE;
+		break;
+	default:
+		ecmd->autoneg = AUTONEG_ENABLE;
+		break;
 	}
 
 	ecmd->duplex = DUPLEX_FULL;
-- 
1.7.4.1



* Always send NETDEV_CHANGEADDR up
From: Andrei Warkentin @ 2011-08-05 21:04 UTC
  To: netdev
In-Reply-To: <CALfQTu7MVjfO7vHB-mAC=RwokBRsi7vR6_XVfQX0+vU2ZCVHOw@mail.gmail.com>

Here is the v4 of the patch, now rebased on net-next-2.6.

ToC:
[PATCHv4] Bridge: Always send NETDEV_CHANGEADDR up on br MAC change.

Thank you,
A


* [PATCHv4] Bridge: Always send NETDEV_CHANGEADDR up on br MAC change.
From: Andrei Warkentin @ 2011-08-05 21:04 UTC
  To: netdev; +Cc: Andrei Warkentin, Stephen Hemminger
In-Reply-To: <CALfQTu7MVjfO7vHB-mAC=RwokBRsi7vR6_XVfQX0+vU2ZCVHOw@mail.gmail.com>

This ensures the neighbor entries associated with the bridge
dev are flushed, also invalidating the associated cached L2 headers.

This means we can br_add_if/br_del_if ports to implement hand-over
without bridge packets going out with a stale MAC.

It also means we can change the MAC of a port device without bridge
packets going out with a stale MAC.

This builds on Stephen Hemminger's patch, also handling the br_del_if
case and the port MAC change case.
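
The pattern described above (recalculate the bridge ID, and raise the
address-change notification only if the MAC really changed) can be
sketched in userspace as follows.  This is only an illustration: the
toy_* names are hypothetical stand-ins, not kernel symbols, and the
counter takes the place of call_netdevice_notifiers().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_ALEN 6	/* length of a MAC address in bytes */

/* Hypothetical stand-in for struct net_bridge; not the kernel type. */
struct toy_bridge {
	unsigned char addr[TOY_ALEN];	/* current bridge MAC */
	int changeaddr_events;		/* simulated NETDEV_CHANGEADDR count */
};

/* Returns true only when the bridge MAC actually changed. */
static bool toy_recalculate_bridge_id(struct toy_bridge *br,
				      const unsigned char *best_port_addr)
{
	assert(br && best_port_addr);
	if (memcmp(br->addr, best_port_addr, TOY_ALEN) == 0)
		return false;
	memcpy(br->addr, best_port_addr, TOY_ALEN);
	return true;
}

/* Stand-in for the port-removal and port-MAC-change paths. */
static void toy_port_event(struct toy_bridge *br,
			   const unsigned char *best_port_addr)
{
	bool changed_addr;

	/* In the patch this call sits between spin_lock_bh/spin_unlock_bh. */
	changed_addr = toy_recalculate_bridge_id(br, best_port_addr);

	/* Notify after the locked section, and only on a real change. */
	if (changed_addr)
		br->changeaddr_events++;
}
```

Calling toy_port_event() with an unchanged address leaves the counter
alone; calling it with a different address bumps it once.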

Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Andrei Warkentin <andreiw@motorola.com>
---
 net/bridge/br_if.c     |    6 +++++-
 net/bridge/br_notify.c |    7 ++++++-
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 3176e2e..2cdf007 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -417,6 +417,7 @@ put_back:
 int br_del_if(struct net_bridge *br, struct net_device *dev)
 {
 	struct net_bridge_port *p;
+	bool changed_addr;
 
 	p = br_port_get_rtnl(dev);
 	if (!p || p->br != br)
@@ -425,9 +426,12 @@ int br_del_if(struct net_bridge *br, struct net_device *dev)
 	del_nbp(p);
 
 	spin_lock_bh(&br->lock);
-	br_stp_recalculate_bridge_id(br);
+	changed_addr = br_stp_recalculate_bridge_id(br);
 	spin_unlock_bh(&br->lock);
 
+	if (changed_addr)
+		call_netdevice_notifiers(NETDEV_CHANGEADDR, br->dev);
+
 	netdev_update_features(br->dev);
 
 	return 0;
diff --git a/net/bridge/br_notify.c b/net/bridge/br_notify.c
index 6545ee9..a76b621 100644
--- a/net/bridge/br_notify.c
+++ b/net/bridge/br_notify.c
@@ -34,6 +34,7 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v
 	struct net_device *dev = ptr;
 	struct net_bridge_port *p;
 	struct net_bridge *br;
+	bool changed_addr;
 	int err;
 
 	/* register of bridge completed, add sysfs entries */
@@ -57,8 +58,12 @@ static int br_device_event(struct notifier_block *unused, unsigned long event, v
 	case NETDEV_CHANGEADDR:
 		spin_lock_bh(&br->lock);
 		br_fdb_changeaddr(p, dev->dev_addr);
-		br_stp_recalculate_bridge_id(br);
+		changed_addr = br_stp_recalculate_bridge_id(br);
 		spin_unlock_bh(&br->lock);
+
+		if (changed_addr)
+			call_netdevice_notifiers(NETDEV_CHANGEADDR, br->dev);
+
 		break;
 
 	case NETDEV_CHANGE:
-- 
1.7.0.4



* Re: [PATCH] sunrpc: use better NUMA affinities
From: J. Bruce Fields @ 2011-08-05 21:28 UTC
  To: Eric Dumazet
  Cc: Trond Myklebust, Neil Brown, David Miller,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA, netdev, linux-kernel
In-Reply-To: <1311876249.2346.39.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>

On Thu, Jul 28, 2011 at 08:04:09PM +0200, Eric Dumazet wrote:
> Use NUMA aware allocations to reduce latencies and increase throughput.
> 
> sunrpc kthreads can use kthread_create_on_node() if pool_mode is
> "percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can
> also take into account NUMA node affinity for memory allocations.

By the way, thanks, applying for 3.2 with one minor fixup below.--b.

diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
index ce620b5..516f337 100644
--- a/fs/nfs/callback.c
+++ b/fs/nfs/callback.c
@@ -199,7 +199,7 @@ nfs41_callback_up(struct svc_serv *serv, struct rpc_xprt *xprt)
 	INIT_LIST_HEAD(&serv->sv_cb_list);
 	spin_lock_init(&serv->sv_cb_lock);
 	init_waitqueue_head(&serv->sv_cb_waitq);
-	rqstp = svc_prepare_thread(serv, &serv->sv_pools[0]);
+	rqstp = svc_prepare_thread(serv, &serv->sv_pools[0], NUMA_NO_NODE);
 	if (IS_ERR(rqstp)) {
 		svc_xprt_put(serv->sv_bc_xprt);
 		serv->sv_bc_xprt = NULL;

> 
> Signed-off-by: Eric Dumazet <eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> CC: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
> CC: Neil Brown <neilb-l3A5Bk7waGM@public.gmane.org>
> CC: David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
> ---
>  fs/lockd/svc.c             |    2 +-
>  fs/nfs/callback.c          |    2 +-
>  include/linux/sunrpc/svc.h |    2 +-
>  net/sunrpc/svc.c           |   33 ++++++++++++++++++++++++---------
>  4 files changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
> index abfff9d..c061b9a 100644
> --- a/fs/lockd/svc.c
> +++ b/fs/lockd/svc.c
> @@ -282,7 +282,7 @@ int lockd_up(void)
>  	/*
>  	 * Create the kernel thread and wait for it to start.
>  	 */
> -	nlmsvc_rqst = svc_prepare_thread(serv, &serv->sv_pools[0]);
> +	nlmsvc_rqst = svc_prepare_thread(serv, &serv->sv_pools[0], NUMA_NO_NODE);
>  	if (IS_ERR(nlmsvc_rqst)) {
>  		error = PTR_ERR(nlmsvc_rqst);
>  		nlmsvc_rqst = NULL;
> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> index e3d2942..ce620b5 100644
> --- a/fs/nfs/callback.c
> +++ b/fs/nfs/callback.c
> @@ -125,7 +125,7 @@ nfs4_callback_up(struct svc_serv *serv)
>  	else
>  		goto out_err;
>  
> -	return svc_prepare_thread(serv, &serv->sv_pools[0]);
> +	return svc_prepare_thread(serv, &serv->sv_pools[0], NUMA_NO_NODE);
>  
>  out_err:
>  	if (ret == 0)
> diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
> index 223588a..a78a51e 100644
> --- a/include/linux/sunrpc/svc.h
> +++ b/include/linux/sunrpc/svc.h
> @@ -404,7 +404,7 @@ struct svc_procedure {
>  struct svc_serv *svc_create(struct svc_program *, unsigned int,
>  			    void (*shutdown)(struct svc_serv *));
>  struct svc_rqst *svc_prepare_thread(struct svc_serv *serv,
> -					struct svc_pool *pool);
> +					struct svc_pool *pool, int node);
>  void		   svc_exit_thread(struct svc_rqst *);
>  struct svc_serv *  svc_create_pooled(struct svc_program *, unsigned int,
>  			void (*shutdown)(struct svc_serv *),
> diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
> index 6a69a11..30d70ab 100644
> --- a/net/sunrpc/svc.c
> +++ b/net/sunrpc/svc.c
> @@ -295,6 +295,18 @@ svc_pool_map_put(void)
>  }
>  
>  
> +static int svc_pool_map_get_node(unsigned int pidx)
> +{
> +	const struct svc_pool_map *m = &svc_pool_map;
> +
> +	if (m->count) {
> +		if (m->mode == SVC_POOL_PERCPU)
> +			return cpu_to_node(m->pool_to[pidx]);
> +		if (m->mode == SVC_POOL_PERNODE)
> +			return m->pool_to[pidx];
> +	}
> +	return NUMA_NO_NODE;
> +}
>  /*
>   * Set the given thread's cpus_allowed mask so that it
>   * will only run on cpus in the given pool.
> @@ -499,7 +511,7 @@ EXPORT_SYMBOL_GPL(svc_destroy);
>   * We allocate pages and place them in rq_argpages.
>   */
>  static int
> -svc_init_buffer(struct svc_rqst *rqstp, unsigned int size)
> +svc_init_buffer(struct svc_rqst *rqstp, unsigned int size, int node)
>  {
>  	unsigned int pages, arghi;
>  
> @@ -513,7 +525,7 @@ svc_init_buffer(struct svc_rqst *rqstp, unsigned int size)
>  	arghi = 0;
>  	BUG_ON(pages > RPCSVC_MAXPAGES);
>  	while (pages) {
> -		struct page *p = alloc_page(GFP_KERNEL);
> +		struct page *p = alloc_pages_node(node, GFP_KERNEL, 0);
>  		if (!p)
>  			break;
>  		rqstp->rq_pages[arghi++] = p;
> @@ -536,11 +548,11 @@ svc_release_buffer(struct svc_rqst *rqstp)
>  }
>  
>  struct svc_rqst *
> -svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool)
> +svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool, int node)
>  {
>  	struct svc_rqst	*rqstp;
>  
> -	rqstp = kzalloc(sizeof(*rqstp), GFP_KERNEL);
> +	rqstp = kzalloc_node(sizeof(*rqstp), GFP_KERNEL, node);
>  	if (!rqstp)
>  		goto out_enomem;
>  
> @@ -554,15 +566,15 @@ svc_prepare_thread(struct svc_serv *serv, struct svc_pool *pool)
>  	rqstp->rq_server = serv;
>  	rqstp->rq_pool = pool;
>  
> -	rqstp->rq_argp = kmalloc(serv->sv_xdrsize, GFP_KERNEL);
> +	rqstp->rq_argp = kmalloc_node(serv->sv_xdrsize, GFP_KERNEL, node);
>  	if (!rqstp->rq_argp)
>  		goto out_thread;
>  
> -	rqstp->rq_resp = kmalloc(serv->sv_xdrsize, GFP_KERNEL);
> +	rqstp->rq_resp = kmalloc_node(serv->sv_xdrsize, GFP_KERNEL, node);
>  	if (!rqstp->rq_resp)
>  		goto out_thread;
>  
> -	if (!svc_init_buffer(rqstp, serv->sv_max_mesg))
> +	if (!svc_init_buffer(rqstp, serv->sv_max_mesg, node))
>  		goto out_thread;
>  
>  	return rqstp;
> @@ -647,6 +659,7 @@ svc_set_num_threads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
>  	struct svc_pool *chosen_pool;
>  	int error = 0;
>  	unsigned int state = serv->sv_nrthreads-1;
> +	int node;
>  
>  	if (pool == NULL) {
>  		/* The -1 assumes caller has done a svc_get() */
> @@ -662,14 +675,16 @@ svc_set_num_threads(struct svc_serv *serv, struct svc_pool *pool, int nrservs)
>  		nrservs--;
>  		chosen_pool = choose_pool(serv, pool, &state);
>  
> -		rqstp = svc_prepare_thread(serv, chosen_pool);
> +		node = svc_pool_map_get_node(chosen_pool->sp_id);
> +		rqstp = svc_prepare_thread(serv, chosen_pool, node);
>  		if (IS_ERR(rqstp)) {
>  			error = PTR_ERR(rqstp);
>  			break;
>  		}
>  
>  		__module_get(serv->sv_module);
> -		task = kthread_create(serv->sv_function, rqstp, serv->sv_name);
> +		task = kthread_create_on_node(serv->sv_function, rqstp,
> +					      node, serv->sv_name);
>  		if (IS_ERR(task)) {
>  			error = PTR_ERR(task);
>  			module_put(serv->sv_module);
> 
> 


* Re: [PATCH net-next 4/6] be2net: add ethtool::set_settings support
From: Ben Hutchings @ 2011-08-05 21:43 UTC
  To: Ajit Khaparde; +Cc: davem, netdev
In-Reply-To: <20110805200036.GA13585@akhaparde-VBox>

On Fri, 2011-08-05 at 15:00 -0500, Ajit Khaparde wrote:
> Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
> ---
>  drivers/net/benet/be_ethtool.c |   63 ++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 63 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/benet/be_ethtool.c b/drivers/net/benet/be_ethtool.c
> index f144a6f..5dd3ed6 100644
> --- a/drivers/net/benet/be_ethtool.c
> +++ b/drivers/net/benet/be_ethtool.c
> @@ -443,6 +443,68 @@ static int be_get_settings(struct net_device *netdev, struct ethtool_cmd *ecmd)
>  	return 0;
>  }
>  
> +static int be_set_settings(struct net_device *netdev,
> +				struct ethtool_cmd *ecmd)
> +{
> +	struct be_adapter *adapter = netdev_priv(netdev);
> +	struct be_phy_info phy_info;
> +	u16 mac_speed = 0;
> +	u16 dac_cable_len = 0;
> +	u16 port_speed = 0;
> +	int status;
> +
> +	status = be_cmd_get_phy_info(adapter, &phy_info);
> +	if (status) {
> +		dev_err(&adapter->pdev->dev, "Get phy info cmd failed.\n");
> +		return status;
> +	}
> +
> +	if (ecmd->autoneg == AUTONEG_ENABLE) {
> +		switch (phy_info.interface_type) {
> +		case PHY_TYPE_SFP_1GB:
> +		case PHY_TYPE_BASET_1GB:
> +		case PHY_TYPE_BASEX_1GB:
> +		case PHY_TYPE_SGMII:
> +			mac_speed = SPEED_AUTONEG_1GB_100MB_10MB;
> +			break;
> +		case PHY_TYPE_SFP_PLUS_10GB:
> +			 dev_warn(&adapter->pdev->dev,
> +				"Autoneg not supported on this module.\n");
> +			 return -EINVAL;
> +		case PHY_TYPE_KR_10GB:
> +		case PHY_TYPE_KX4_10GB:
> +			 mac_speed = SPEED_AUTONEG_10GB_1GB;
> +			 break;
> +		case PHY_TYPE_BASET_10GB:
> +			 mac_speed = SPEED_AUTONEG_10GB_1GB_100MB;
> +			 break;
> +		}
[....]

This is wrong.  When autoneg is enabled, you have to look at the
'advertised' field to find out which link modes you are supposed to
enable.
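
A rough userspace sketch of that idea, assuming the firmware only
accepts the three autoneg combinations named in the quoted patch.  The
TOY_* names are hypothetical; a real driver would test the ADVERTISED_*
flags in the advertising member of struct ethtool_cmd instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical link-mode bits standing in for the ADVERTISED_* flags. */
enum {
	TOY_ADV_10M  = 1u << 0,
	TOY_ADV_100M = 1u << 1,
	TOY_ADV_1G   = 1u << 2,
	TOY_ADV_10G  = 1u << 3,
};

/* Hypothetical firmware modes, modelled on the SPEED_AUTONEG_* names. */
enum toy_fw_mode {
	TOY_FW_INVALID = 0,
	TOY_FW_AUTONEG_1G_100M_10M,
	TOY_FW_AUTONEG_10G_1G,
	TOY_FW_AUTONEG_10G_1G_100M,
};

/* Choose the firmware mode from what the user asked to advertise. */
static enum toy_fw_mode toy_mode_from_advertised(uint32_t advertised,
						 uint32_t supported)
{
	assert(supported != 0);	/* a PHY with no modes is a caller bug */

	/* Reject empty requests and modes the PHY cannot do. */
	if (advertised == 0 || (advertised & ~supported))
		return TOY_FW_INVALID;

	switch (advertised) {
	case TOY_ADV_1G | TOY_ADV_100M | TOY_ADV_10M:
		return TOY_FW_AUTONEG_1G_100M_10M;
	case TOY_ADV_10G | TOY_ADV_1G:
		return TOY_FW_AUTONEG_10G_1G;
	case TOY_ADV_10G | TOY_ADV_1G | TOY_ADV_100M:
		return TOY_FW_AUTONEG_10G_1G_100M;
	default:
		/* No firmware mode matches this exact set. */
		return TOY_FW_INVALID;
	}
}
```

The point of the sketch is that the requested set drives the choice,
and a set the hardware cannot express is refused rather than silently
widened.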

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* Re: [PATCH net-next 6/6] be2net: fix to set ecmd->autoneg correctly
From: Ben Hutchings @ 2011-08-05 21:51 UTC
  To: Ajit Khaparde; +Cc: davem, netdev
In-Reply-To: <20110805200107.GA13631@akhaparde-VBox>

On Fri, 2011-08-05 at 15:01 -0500, Ajit Khaparde wrote:
> Set the autonegotiation settings correctly based on the port speed.
[...]

Autonegotiation and multi-speed are two entirely different things.

Selecting a single speed doesn't mean turning autoneg off.  If you
enable only 1000BASE-T or only 10GBASE-T, the PHY must still go through
autoneg to determine whether it is the clock master for the link.

Enabling multiple speeds doesn't mean turning autoneg on.  An SFP+ slot
can take 1G and 10G modules but the modules generally won't support
autoneg.
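
To illustrate the distinction with the two cases above, here is a small
sketch in which the autoneg answer depends on the media type and never
on how many speeds are enabled.  The toy_* types are hypothetical, and
the SFP+ case encodes only the "generally" of the statement above, so a
real driver would need to ask the hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical media types covering just the two cases discussed. */
enum toy_media {
	TOY_MEDIA_BASET,	/* 1000BASE-T or 10GBASE-T */
	TOY_MEDIA_SFP_PLUS,	/* SFP+ slot with a 1G or 10G module */
};

struct toy_port {
	enum toy_media media;
	unsigned int n_speeds;	/* number of enabled speeds */
};

/* Autoneg follows the media; n_speeds is deliberately not consulted. */
static bool toy_port_autoneg(const struct toy_port *p)
{
	assert(p);
	switch (p->media) {
	case TOY_MEDIA_BASET:
		/* Needed to resolve the clock master, even at one speed. */
		return true;
	case TOY_MEDIA_SFP_PLUS:
		/* Assumed default: such modules generally lack autoneg. */
		return false;
	}
	return false;
}
```

A single-speed BASE-T port therefore reports autoneg on, and a
dual-speed SFP+ port reports it off.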

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



* include/linux/netlink.h: problem when included by an application
From: Michel Machado @ 2011-08-05 21:45 UTC
  To: netdev

Hi there,

   When an application includes header <linux/netlink.h> obtained with
'make headers_install' or from /usr/include/, it produces the following
error:

/usr/include/linux/netlink.h:31:2: error: expected
specifier-qualifier-list before ‘sa_family_t’

   The error doesn't come up in the kernel because
include/linux/netlink.h has the following line:

#include <linux/socket.h> /* for sa_family_t */

   However, <linux/socket.h> from /usr/include/ doesn't have sa_family_t
because it's protected by an #ifdef __KERNEL__ in
include/linux/socket.h.

   A workaround for an application is to include <sys/socket.h> before
<linux/netlink.h>. However, shouldn't include/linux/netlink.h be fixed?

   The simplest solution that I came up with is replacing sa_family_t in
include/linux/netlink.h with 'unsigned short', as header
include/linux/socket.h does for struct __kernel_sockaddr_storage,
which is available to applications.
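
   For reference, the include-order workaround looks roughly like this
in an application.  It is only a sketch for a Linux userspace build;
toy_kernel_nladdr() is a made-up helper name, and the assert assumes a
C library where sa_family_t is an unsigned short.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>		/* defines sa_family_t; included first */
#include <linux/netlink.h>	/* struct sockaddr_nl */

/* Build a netlink address that targets the kernel (pid 0, no groups). */
static struct sockaddr_nl toy_kernel_nladdr(void)
{
	struct sockaddr_nl addr;

	/* The family field is expected to be as wide as sa_family_t. */
	assert(sizeof(addr.nl_family) == sizeof(sa_family_t));

	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	return addr;
}
```

   With the two includes swapped, headers that show the problem above
would fail at the nl_family member of struct sockaddr_nl.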

-- 
[ ]'s
Michel Machado





* [RFC 0/4] [flexcan] Add support for powerpc (freescale p1010) -V5
From: Robin Holt @ 2011-08-06  4:05 UTC
  To: Robin Holt, Marc Kleine-Budde, Wolfgang Grandegger,
	U Bhaskar-B22300
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA

Marc, Wolfgang or U Bhaskar,

This patch set should have all your comments included.

I did implement a very simple clock source in the p1010rdb.c file, which,
unfortunately, your tree will not have so please do not apply the last
patch in the series.  That will need to go to the powerpc folks and
follow the p1010rdb patch from freescale.

Could you please apply the first three patches to a test branch, compile
and test them on an arm based system?  I would like to at least feel
comfortable that I have not broken anything there.

I have tested the full set on a p1010rdb with an external PSOC based
can communicator.  That PSOC code has a bunch of erroneous can comms it
can generate, but I do not know how the developer of that code injects
those errors.  As a result, no error handling from the can input has been
tested.  I have tested both flexcan interfaces on the board and both work
with these patches in addition to the other p1010rdb patches not included.

Thanks,
Robin Holt


* [RFC 1/4] [flexcan] Abstract off read/write for big/little endian.
From: Robin Holt @ 2011-08-06  4:05 UTC
  To: Robin Holt, Marc Kleine-Budde, Wolfgang Grandegger,
	U Bhaskar-B22300
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA, Marc Kleine-Budde
In-Reply-To: <1312603504-30282-1-git-send-email-holt-sJ/iWh9BUns@public.gmane.org>

First step in converting the flexcan driver from supporting just arm to
supporting both arm and powerpc architectures.
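
As a userspace illustration of why the accessors need to differ, the
sketch below decodes the same four register bytes both ways.  The
toy_* helpers are hypothetical and work on a plain byte buffer; they
are not the kernel's in_be32()/out_be32() or readl()/writel().

```c
#include <assert.h>
#include <stdint.h>

/* Decode four register bytes as a big-endian 32-bit value. */
static uint32_t toy_read_be32(const uint8_t *reg)
{
	assert(reg);
	return ((uint32_t)reg[0] << 24) | ((uint32_t)reg[1] << 16) |
	       ((uint32_t)reg[2] << 8)  |  (uint32_t)reg[3];
}

/* Decode the same four bytes as a little-endian 32-bit value. */
static uint32_t toy_read_le32(const uint8_t *reg)
{
	assert(reg);
	return ((uint32_t)reg[3] << 24) | ((uint32_t)reg[2] << 16) |
	       ((uint32_t)reg[1] << 8)  |  (uint32_t)reg[0];
}

/* Store a value big-endian, most significant byte first. */
static void toy_write_be32(uint32_t val, uint8_t *reg)
{
	assert(reg);
	reg[0] = (uint8_t)(val >> 24);
	reg[1] = (uint8_t)(val >> 16);
	reg[2] = (uint8_t)(val >> 8);
	reg[3] = (uint8_t)val;
}
```

Reading a register with the wrong accessor gives a byte-swapped value,
which is why the driver hides the choice behind flexcan_read() and
flexcan_write().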

Signed-off-by: Robin Holt <holt-sJ/iWh9BUns@public.gmane.org>
Acked-by: Marc Kleine-Budde <mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
To: Wolfgang Grandegger <wg-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
To: U Bhaskar-B22300 <B22300-KZfg59tc24xl57MIdRCFDg@public.gmane.org>
Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 drivers/net/can/flexcan.c |  140 ++++++++++++++++++++++++++------------------
 1 files changed, 83 insertions(+), 57 deletions(-)

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index 67d9fc0..74b1706 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -196,6 +196,31 @@ static struct can_bittiming_const flexcan_bittiming_const = {
 };
 
 /*
+ * Abstract off the read/write for arm versus ppc.
+ */
+#if defined(__BIG_ENDIAN)
+static inline u32 flexcan_read(void __iomem *addr)
+{
+	return in_be32(addr);
+}
+
+static inline void flexcan_write(u32 val, void __iomem *addr)
+{
+	out_be32(addr, val);
+}
+#else
+static inline u32 flexcan_read(void __iomem *addr)
+{
+	return readl(addr);
+}
+
+static inline void flexcan_write(u32 val, void __iomem *addr)
+{
+	writel(val, addr);
+}
+#endif
+
+/*
  * Swtich transceiver on or off
  */
 static void flexcan_transceiver_switch(const struct flexcan_priv *priv, int on)
@@ -216,9 +241,9 @@ static inline void flexcan_chip_enable(struct flexcan_priv *priv)
 	struct flexcan_regs __iomem *regs = priv->base;
 	u32 reg;
 
-	reg = readl(&regs->mcr);
+	reg = flexcan_read(&regs->mcr);
 	reg &= ~FLEXCAN_MCR_MDIS;
-	writel(reg, &regs->mcr);
+	flexcan_write(reg, &regs->mcr);
 
 	udelay(10);
 }
@@ -228,9 +253,9 @@ static inline void flexcan_chip_disable(struct flexcan_priv *priv)
 	struct flexcan_regs __iomem *regs = priv->base;
 	u32 reg;
 
-	reg = readl(&regs->mcr);
+	reg = flexcan_read(&regs->mcr);
 	reg |= FLEXCAN_MCR_MDIS;
-	writel(reg, &regs->mcr);
+	flexcan_write(reg, &regs->mcr);
 }
 
 static int flexcan_get_berr_counter(const struct net_device *dev,
@@ -238,7 +263,7 @@ static int flexcan_get_berr_counter(const struct net_device *dev,
 {
 	const struct flexcan_priv *priv = netdev_priv(dev);
 	struct flexcan_regs __iomem *regs = priv->base;
-	u32 reg = readl(&regs->ecr);
+	u32 reg = flexcan_read(&regs->ecr);
 
 	bec->txerr = (reg >> 0) & 0xff;
 	bec->rxerr = (reg >> 8) & 0xff;
@@ -272,15 +297,15 @@ static int flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	if (cf->can_dlc > 0) {
 		u32 data = be32_to_cpup((__be32 *)&cf->data[0]);
-		writel(data, &regs->cantxfg[FLEXCAN_TX_BUF_ID].data[0]);
+		flexcan_write(data, &regs->cantxfg[FLEXCAN_TX_BUF_ID].data[0]);
 	}
 	if (cf->can_dlc > 3) {
 		u32 data = be32_to_cpup((__be32 *)&cf->data[4]);
-		writel(data, &regs->cantxfg[FLEXCAN_TX_BUF_ID].data[1]);
+		flexcan_write(data, &regs->cantxfg[FLEXCAN_TX_BUF_ID].data[1]);
 	}
 
-	writel(can_id, &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_id);
-	writel(ctrl, &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_ctrl);
+	flexcan_write(can_id, &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_id);
+	flexcan_write(ctrl, &regs->cantxfg[FLEXCAN_TX_BUF_ID].can_ctrl);
 
 	kfree_skb(skb);
 
@@ -468,8 +493,8 @@ static void flexcan_read_fifo(const struct net_device *dev,
 	struct flexcan_mb __iomem *mb = &regs->cantxfg[0];
 	u32 reg_ctrl, reg_id;
 
-	reg_ctrl = readl(&mb->can_ctrl);
-	reg_id = readl(&mb->can_id);
+	reg_ctrl = flexcan_read(&mb->can_ctrl);
+	reg_id = flexcan_read(&mb->can_id);
 	if (reg_ctrl & FLEXCAN_MB_CNT_IDE)
 		cf->can_id = ((reg_id >> 0) & CAN_EFF_MASK) | CAN_EFF_FLAG;
 	else
@@ -479,12 +504,12 @@ static void flexcan_read_fifo(const struct net_device *dev,
 		cf->can_id |= CAN_RTR_FLAG;
 	cf->can_dlc = get_can_dlc((reg_ctrl >> 16) & 0xf);
 
-	*(__be32 *)(cf->data + 0) = cpu_to_be32(readl(&mb->data[0]));
-	*(__be32 *)(cf->data + 4) = cpu_to_be32(readl(&mb->data[1]));
+	*(__be32 *)(cf->data + 0) = cpu_to_be32(flexcan_read(&mb->data[0]));
+	*(__be32 *)(cf->data + 4) = cpu_to_be32(flexcan_read(&mb->data[1]));
 
 	/* mark as read */
-	writel(FLEXCAN_IFLAG_RX_FIFO_AVAILABLE, &regs->iflag1);
-	readl(&regs->timer);
+	flexcan_write(FLEXCAN_IFLAG_RX_FIFO_AVAILABLE, &regs->iflag1);
+	flexcan_read(&regs->timer);
 }
 
 static int flexcan_read_frame(struct net_device *dev)
@@ -520,17 +545,17 @@ static int flexcan_poll(struct napi_struct *napi, int quota)
 	 * The error bits are cleared on read,
 	 * use saved value from irq handler.
 	 */
-	reg_esr = readl(&regs->esr) | priv->reg_esr;
+	reg_esr = flexcan_read(&regs->esr) | priv->reg_esr;
 
 	/* handle state changes */
 	work_done += flexcan_poll_state(dev, reg_esr);
 
 	/* handle RX-FIFO */
-	reg_iflag1 = readl(&regs->iflag1);
+	reg_iflag1 = flexcan_read(&regs->iflag1);
 	while (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE &&
 	       work_done < quota) {
 		work_done += flexcan_read_frame(dev);
-		reg_iflag1 = readl(&regs->iflag1);
+		reg_iflag1 = flexcan_read(&regs->iflag1);
 	}
 
 	/* report bus errors */
@@ -540,8 +565,8 @@ static int flexcan_poll(struct napi_struct *napi, int quota)
 	if (work_done < quota) {
 		napi_complete(napi);
 		/* enable IRQs */
-		writel(FLEXCAN_IFLAG_DEFAULT, &regs->imask1);
-		writel(priv->reg_ctrl_default, &regs->ctrl);
+		flexcan_write(FLEXCAN_IFLAG_DEFAULT, &regs->imask1);
+		flexcan_write(priv->reg_ctrl_default, &regs->ctrl);
 	}
 
 	return work_done;
@@ -555,9 +580,9 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
 	struct flexcan_regs __iomem *regs = priv->base;
 	u32 reg_iflag1, reg_esr;
 
-	reg_iflag1 = readl(&regs->iflag1);
-	reg_esr = readl(&regs->esr);
-	writel(FLEXCAN_ESR_ERR_INT, &regs->esr);	/* ACK err IRQ */
+	reg_iflag1 = flexcan_read(&regs->iflag1);
+	reg_esr = flexcan_read(&regs->esr);
+	flexcan_write(FLEXCAN_ESR_ERR_INT, &regs->esr);	/* ACK err IRQ */
 
 	/*
 	 * schedule NAPI in case of:
@@ -573,16 +598,16 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
 		 * save them for later use.
 		 */
 		priv->reg_esr = reg_esr & FLEXCAN_ESR_ERR_BUS;
-		writel(FLEXCAN_IFLAG_DEFAULT & ~FLEXCAN_IFLAG_RX_FIFO_AVAILABLE,
-		       &regs->imask1);
-		writel(priv->reg_ctrl_default & ~FLEXCAN_CTRL_ERR_ALL,
+		flexcan_write(FLEXCAN_IFLAG_DEFAULT &
+			~FLEXCAN_IFLAG_RX_FIFO_AVAILABLE, &regs->imask1);
+		flexcan_write(priv->reg_ctrl_default & ~FLEXCAN_CTRL_ERR_ALL,
 		       &regs->ctrl);
 		napi_schedule(&priv->napi);
 	}
 
 	/* FIFO overflow */
 	if (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_OVERFLOW) {
-		writel(FLEXCAN_IFLAG_RX_FIFO_OVERFLOW, &regs->iflag1);
+		flexcan_write(FLEXCAN_IFLAG_RX_FIFO_OVERFLOW, &regs->iflag1);
 		dev->stats.rx_over_errors++;
 		dev->stats.rx_errors++;
 	}
@@ -591,7 +616,7 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
 	if (reg_iflag1 & (1 << FLEXCAN_TX_BUF_ID)) {
 		/* tx_bytes is incremented in flexcan_start_xmit */
 		stats->tx_packets++;
-		writel((1 << FLEXCAN_TX_BUF_ID), &regs->iflag1);
+		flexcan_write((1 << FLEXCAN_TX_BUF_ID), &regs->iflag1);
 		netif_wake_queue(dev);
 	}
 
@@ -605,7 +630,7 @@ static void flexcan_set_bittiming(struct net_device *dev)
 	struct flexcan_regs __iomem *regs = priv->base;
 	u32 reg;
 
-	reg = readl(&regs->ctrl);
+	reg = flexcan_read(&regs->ctrl);
 	reg &= ~(FLEXCAN_CTRL_PRESDIV(0xff) |
 		 FLEXCAN_CTRL_RJW(0x3) |
 		 FLEXCAN_CTRL_PSEG1(0x7) |
@@ -629,11 +654,11 @@ static void flexcan_set_bittiming(struct net_device *dev)
 		reg |= FLEXCAN_CTRL_SMP;
 
 	dev_info(dev->dev.parent, "writing ctrl=0x%08x\n", reg);
-	writel(reg, &regs->ctrl);
+	flexcan_write(reg, &regs->ctrl);
 
 	/* print chip status */
 	dev_dbg(dev->dev.parent, "%s: mcr=0x%08x ctrl=0x%08x\n", __func__,
-		readl(&regs->mcr), readl(&regs->ctrl));
+		flexcan_read(&regs->mcr), flexcan_read(&regs->ctrl));
 }
 
 /*
@@ -654,10 +679,10 @@ static int flexcan_chip_start(struct net_device *dev)
 	flexcan_chip_enable(priv);
 
 	/* soft reset */
-	writel(FLEXCAN_MCR_SOFTRST, &regs->mcr);
+	flexcan_write(FLEXCAN_MCR_SOFTRST, &regs->mcr);
 	udelay(10);
 
-	reg_mcr = readl(&regs->mcr);
+	reg_mcr = flexcan_read(&regs->mcr);
 	if (reg_mcr & FLEXCAN_MCR_SOFTRST) {
 		dev_err(dev->dev.parent,
 			"Failed to softreset can module (mcr=0x%08x)\n",
@@ -679,12 +704,12 @@ static int flexcan_chip_start(struct net_device *dev)
 	 * choose format C
 	 *
 	 */
-	reg_mcr = readl(&regs->mcr);
+	reg_mcr = flexcan_read(&regs->mcr);
 	reg_mcr |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_FEN | FLEXCAN_MCR_HALT |
 		FLEXCAN_MCR_SUPV | FLEXCAN_MCR_WRN_EN |
 		FLEXCAN_MCR_IDAM_C;
 	dev_dbg(dev->dev.parent, "%s: writing mcr=0x%08x", __func__, reg_mcr);
-	writel(reg_mcr, &regs->mcr);
+	flexcan_write(reg_mcr, &regs->mcr);
 
 	/*
 	 * CTRL
@@ -702,7 +727,7 @@ static int flexcan_chip_start(struct net_device *dev)
 	 * (FLEXCAN_CTRL_ERR_MSK), too. Otherwise we don't get any
 	 * warning or bus passive interrupts.
 	 */
-	reg_ctrl = readl(&regs->ctrl);
+	reg_ctrl = flexcan_read(&regs->ctrl);
 	reg_ctrl &= ~FLEXCAN_CTRL_TSYN;
 	reg_ctrl |= FLEXCAN_CTRL_BOFF_REC | FLEXCAN_CTRL_LBUF |
 		FLEXCAN_CTRL_ERR_STATE | FLEXCAN_CTRL_ERR_MSK;
@@ -710,38 +735,39 @@ static int flexcan_chip_start(struct net_device *dev)
 	/* save for later use */
 	priv->reg_ctrl_default = reg_ctrl;
 	dev_dbg(dev->dev.parent, "%s: writing ctrl=0x%08x", __func__, reg_ctrl);
-	writel(reg_ctrl, &regs->ctrl);
+	flexcan_write(reg_ctrl, &regs->ctrl);
 
 	for (i = 0; i < ARRAY_SIZE(regs->cantxfg); i++) {
-		writel(0, &regs->cantxfg[i].can_ctrl);
-		writel(0, &regs->cantxfg[i].can_id);
-		writel(0, &regs->cantxfg[i].data[0]);
-		writel(0, &regs->cantxfg[i].data[1]);
+		flexcan_write(0, &regs->cantxfg[i].can_ctrl);
+		flexcan_write(0, &regs->cantxfg[i].can_id);
+		flexcan_write(0, &regs->cantxfg[i].data[0]);
+		flexcan_write(0, &regs->cantxfg[i].data[1]);
 
 		/* put MB into rx queue */
-		writel(FLEXCAN_MB_CNT_CODE(0x4), &regs->cantxfg[i].can_ctrl);
+		flexcan_write(FLEXCAN_MB_CNT_CODE(0x4),
+			&regs->cantxfg[i].can_ctrl);
 	}
 
 	/* acceptance mask/acceptance code (accept everything) */
-	writel(0x0, &regs->rxgmask);
-	writel(0x0, &regs->rx14mask);
-	writel(0x0, &regs->rx15mask);
+	flexcan_write(0x0, &regs->rxgmask);
+	flexcan_write(0x0, &regs->rx14mask);
+	flexcan_write(0x0, &regs->rx15mask);
 
 	flexcan_transceiver_switch(priv, 1);
 
 	/* synchronize with the can bus */
-	reg_mcr = readl(&regs->mcr);
+	reg_mcr = flexcan_read(&regs->mcr);
 	reg_mcr &= ~FLEXCAN_MCR_HALT;
-	writel(reg_mcr, &regs->mcr);
+	flexcan_write(reg_mcr, &regs->mcr);
 
 	priv->can.state = CAN_STATE_ERROR_ACTIVE;
 
 	/* enable FIFO interrupts */
-	writel(FLEXCAN_IFLAG_DEFAULT, &regs->imask1);
+	flexcan_write(FLEXCAN_IFLAG_DEFAULT, &regs->imask1);
 
 	/* print chip status */
 	dev_dbg(dev->dev.parent, "%s: reading mcr=0x%08x ctrl=0x%08x\n",
-		__func__, readl(&regs->mcr), readl(&regs->ctrl));
+		__func__, flexcan_read(&regs->mcr), flexcan_read(&regs->ctrl));
 
 	return 0;
 
@@ -763,12 +789,12 @@ static void flexcan_chip_stop(struct net_device *dev)
 	u32 reg;
 
 	/* Disable all interrupts */
-	writel(0, &regs->imask1);
+	flexcan_write(0, &regs->imask1);
 
 	/* Disable + halt module */
-	reg = readl(&regs->mcr);
+	reg = flexcan_read(&regs->mcr);
 	reg |= FLEXCAN_MCR_MDIS | FLEXCAN_MCR_HALT;
-	writel(reg, &regs->mcr);
+	flexcan_write(reg, &regs->mcr);
 
 	flexcan_transceiver_switch(priv, 0);
 	priv->can.state = CAN_STATE_STOPPED;
@@ -860,24 +886,24 @@ static int __devinit register_flexcandev(struct net_device *dev)
 
 	/* select "bus clock", chip must be disabled */
 	flexcan_chip_disable(priv);
-	reg = readl(&regs->ctrl);
+	reg = flexcan_read(&regs->ctrl);
 	reg |= FLEXCAN_CTRL_CLK_SRC;
-	writel(reg, &regs->ctrl);
+	flexcan_write(reg, &regs->ctrl);
 
 	flexcan_chip_enable(priv);
 
 	/* set freeze, halt and activate FIFO, restrict register access */
-	reg = readl(&regs->mcr);
+	reg = flexcan_read(&regs->mcr);
 	reg |= FLEXCAN_MCR_FRZ | FLEXCAN_MCR_HALT |
 		FLEXCAN_MCR_FEN | FLEXCAN_MCR_SUPV;
-	writel(reg, &regs->mcr);
+	flexcan_write(reg, &regs->mcr);
 
 	/*
 	 * Currently we only support newer versions of this core
 	 * featuring a RX FIFO. Older cores found on some Coldfire
 	 * derivates are not yet supported.
 	 */
-	reg = readl(&regs->mcr);
+	reg = flexcan_read(&regs->mcr);
 	if (!(reg & FLEXCAN_MCR_FEN)) {
 		dev_err(dev->dev.parent,
 			"Could not enable RX FIFO, unsupported core\n");
-- 
1.7.2.1


* [RFC 2/4] [flexcan] Add of_match to platform_device definition.
From: Robin Holt @ 2011-08-06  4:05 UTC
  To: Robin Holt, Marc Kleine-Budde, Wolfgang Grandegger,
	U Bhaskar-B22300
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1312603504-30282-1-git-send-email-holt-sJ/iWh9BUns@public.gmane.org>

The OpenFirmware devices are not matched without specifying
an of_match array.  Introduce that array as that is used for
matching on the Freescale P1010 processor.

Signed-off-by: Robin Holt <holt-sJ/iWh9BUns@public.gmane.org>
To: Marc Kleine-Budde <mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
To: Wolfgang Grandegger <wg-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
To: U Bhaskar-B22300 <B22300-KZfg59tc24xl57MIdRCFDg@public.gmane.org>
Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 drivers/net/can/flexcan.c |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index 74b1706..c20d673 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -1033,8 +1033,22 @@ static int __devexit flexcan_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static struct of_device_id flexcan_of_match[] = {
+	{
+		.compatible = "fsl,flexcan-v1.0",
+	},
+	{
+		.compatible = "fsl,flexcan",
+	},
+	{},
+};
+
 static struct platform_driver flexcan_driver = {
-	.driver.name = DRV_NAME,
+	.driver = {
+		.name = DRV_NAME,
+		.owner = THIS_MODULE,
+		.of_match_table = flexcan_of_match,
+	},
 	.probe = flexcan_probe,
 	.remove = __devexit_p(flexcan_remove),
 };
-- 
1.7.2.1


* [RFC 4/4] [powerpc] Implement a p1010rdb clock source.
From: Robin Holt @ 2011-08-06  4:05 UTC (permalink / raw)
  To: Robin Holt, Marc Kleine-Budde, Wolfgang Grandegger,
	U Bhaskar-B22300
  Cc: socketcan-core, netdev
In-Reply-To: <1312603504-30282-1-git-send-email-holt@sgi.com>

flexcan driver needs the clk_get, clk_get_rate, etc functions
to work.  This patch provides the minimum functionality.

Signed-off-by: Robin Holt <holt@sgi.com>
To: Marc Kleine-Budde <mkl@pengutronix.de>
To: Wolfgang Grandegger <wg@grandegger.com>
To: U Bhaskar-B22300 <B22300@freescale.com>
Cc: socketcan-core@lists.berlios.de
Cc: netdev@vger.kernel.org
---
 arch/powerpc/platforms/85xx/p1010rdb.c |   78 ++++++++++++++++++++++++++++++++
 1 files changed, 78 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/p1010rdb.c b/arch/powerpc/platforms/85xx/p1010rdb.c
index 3540a88..8f78ddd 100644
--- a/arch/powerpc/platforms/85xx/p1010rdb.c
+++ b/arch/powerpc/platforms/85xx/p1010rdb.c
@@ -28,6 +28,7 @@
 #include <asm/udbg.h>
 #include <asm/mpic.h>
 #include <asm/swiotlb.h>
+#include <asm/clk_interface.h>
 
 #include <sysdev/fsl_soc.h>
 #include <sysdev/fsl_pci.h>
@@ -164,6 +165,82 @@ static void __init p1010_rdb_setup_arch(void)
 	printk(KERN_INFO "P1010 RDB board from Freescale Semiconductor\n");
 }
 
+/*
+ * p1010rdb needs to provide a clock source for the flexcan driver.
+ */
+struct clk {
+	unsigned long rate;
+} p1010rdb_system_clk;
+
+static struct clk *p1010_rdb_clk_get(struct device *dev, const char *id)
+{
+	struct clk *clk;
+	u32 *of_property;
+	unsigned long clock_freq, clock_divider;
+	const char *dev_init_name;
+
+	if (!dev)
+		return ERR_PTR(-ENOENT);
+
+	/*
+	 * The can devices are named ffe1c000.can0 and ffe1d000.can1 on
+	 * the p1010rdb.  Check for the "can" portion of that name before
+	 * returning a clock source.
+	 */
+	dev_init_name = dev_name(dev);
+	if (strlen(dev_init_name) != 13)
+		return ERR_PTR(-ENOENT);
+	dev_init_name += 9;
+	if (strncmp(dev_init_name, "can", 3))
+		return ERR_PTR(-ENOENT);
+
+	of_property = (u32 *)of_get_property(dev->of_node, "clock_freq", NULL);
+	if (!of_property)
+		return ERR_PTR(-ENOENT);
+	clock_freq = *of_property;
+
+	of_property = (u32 *)of_get_property(dev->of_node,
+					     "fsl,flexcan-clock-divider", NULL);
+	if (!of_property)
+		return ERR_PTR(-ENOENT);
+	clock_divider = *of_property;
+
+	clk = kmalloc(sizeof(struct clk), GFP_KERNEL);
+	if (!clk)
+		return ERR_PTR(-ENOMEM);
+
+	clk->rate = DIV_ROUND_CLOSEST(clock_freq / clock_divider, 1000);
+	clk->rate *= 1000;
+
+	return clk;
+}
+
+static void p1010_rdb_clk_put(struct clk *clk)
+{
+	kfree(clk);
+}
+
+static unsigned long p1010_rdb_clk_get_rate(struct clk *clk)
+{
+	return clk->rate;
+}
+
+static struct clk_interface p1010_rdb_clk_functions = {
+	.clk_get		= p1010_rdb_clk_get,
+	.clk_get_rate		= p1010_rdb_clk_get_rate,
+	.clk_put		= p1010_rdb_clk_put,
+};
+
+static void __init p1010_rdb_clk_init(void)
+{
+	clk_functions = p1010_rdb_clk_functions;
+}
+
+static void __init p1010_rdb_init(void)
+{
+	p1010_rdb_clk_init();
+}
+
 static struct of_device_id __initdata p1010rdb_ids[] = {
 	{ .type = "soc", },
 	{ .compatible = "soc", },
@@ -195,6 +272,7 @@ define_machine(p1010_rdb) {
 	.name			= "P1010 RDB",
 	.probe			= p1010_rdb_probe,
 	.setup_arch		= p1010_rdb_setup_arch,
+	.init			= p1010_rdb_init,
 	.init_IRQ		= p1010_rdb_pic_init,
 #ifdef CONFIG_PCI
 	.pcibios_fixup_bus	= fsl_pcibios_fixup_bus,
-- 
1.7.2.1


* [RFC 3/4] [flexcan] Add support for FLEXCAN_DEBUG
From: Robin Holt @ 2011-08-06  4:05 UTC (permalink / raw)
  To: Robin Holt, Marc Kleine-Budde, Wolfgang Grandegger,
	U Bhaskar-B22300
  Cc: Robin Holt, socketcan-core, netdev
In-Reply-To: <1312603504-30282-1-git-send-email-holt@sgi.com>

Add a wrapper function for a register dump when a developer defines
FLEXCAN_DEBUG.

Signed-off-by: Robin Holt <holt@sgi.com>
To: Marc Kleine-Budde <mkl@pengutronix.de>
To: Wolfgang Grandegger <wg@grandegger.com>
To: U Bhaskar-B22300 <B22300@freescale.com>
Cc: socketcan-core@lists.berlios.de
Cc: netdev@vger.kernel.org
---
 drivers/net/can/flexcan.c |   37 +++++++++++++++++++++++++++++++++++++
 1 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index c20d673..941b99e 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -220,6 +220,35 @@ static inline void flexcan_write(u32 val, void __iomem *addr)
 }
 #endif
 
+#if defined(FLEXCAN_DEBUG)
+void _flexcan_reg_dump(struct net_device *dev, const char *file, int line,
+		       const char *func)
+{
+	const struct flexcan_priv *priv = netdev_priv(dev);
+	struct flexcan_regs __iomem *regs = priv->base;
+
+	netdev_info(dev, "flexcan_reg_dump:%s:%d:%s()\n", file, line, func);
+	netdev_info(dev, "\t  mcr 0x%08x  ctrl 0x%08x timer 0x%08x   rxg 0x%08x\n",
+		flexcan_read(&regs->mcr),
+		flexcan_read(&regs->ctrl),
+		flexcan_read(&regs->timer),
+		flexcan_read(&regs->rxgmask));
+	netdev_info(dev, "\t rx14 0x%08x  rx15 0x%08x   ecr 0x%08x   esr 0x%08x\n",
+		flexcan_read(&regs->rx14mask),
+		flexcan_read(&regs->rx15mask),
+		flexcan_read(&regs->ecr),
+		flexcan_read(&regs->esr));
+	netdev_info(dev, "\timsk2 0x%08x imsk1 0x%08x iflg2 0x%08x iflg1 0x%08x\n",
+		flexcan_read(&regs->imask2),
+		flexcan_read(&regs->imask1),
+		flexcan_read(&regs->iflag2),
+		flexcan_read(&regs->iflag1));
+}
+#define flexcan_reg_dump(_d) _flexcan_reg_dump(_d, __FILE__, __LINE__, __func__)
+#else
+#define flexcan_reg_dump(_d)
+#endif
+
 /*
  * Swtich transceiver on or off
  */
@@ -280,6 +309,8 @@ static int flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	u32 can_id;
 	u32 ctrl = FLEXCAN_MB_CNT_CODE(0xc) | (cf->can_dlc << 16);
 
+	flexcan_reg_dump(dev);
+
 	if (can_dropped_invalid_skb(dev, skb))
 		return NETDEV_TX_OK;
 
@@ -312,6 +343,8 @@ static int flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* tx_packets is incremented in flexcan_irq */
 	stats->tx_bytes += cf->can_dlc;
 
+	flexcan_reg_dump(dev);
+
 	return NETDEV_TX_OK;
 }
 
@@ -580,6 +613,8 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
 	struct flexcan_regs __iomem *regs = priv->base;
 	u32 reg_iflag1, reg_esr;
 
+	flexcan_reg_dump(dev);
+
 	reg_iflag1 = flexcan_read(&regs->iflag1);
 	reg_esr = flexcan_read(&regs->esr);
 	flexcan_write(FLEXCAN_ESR_ERR_INT, &regs->esr);	/* ACK err IRQ */
@@ -620,6 +655,8 @@ static irqreturn_t flexcan_irq(int irq, void *dev_id)
 		netif_wake_queue(dev);
 	}
 
+	flexcan_reg_dump(dev);
+
 	return IRQ_HANDLED;
 }
 
-- 
1.7.2.1



* Re: [net-next-2.6 PATCH] enic: Add timestamp to network interface stats
From: David Miller @ 2011-08-06  6:25 UTC (permalink / raw)
  To: dannguo; +Cc: netdev
In-Reply-To: <20110805001124.32402.42919.stgit@savbu-pc100.cisco.com>

From: Danny Guo <dannguo@cisco.com>
Date: Thu, 04 Aug 2011 17:11:24 -0700

> From: Danny Guo <dannguo@cisco.com>
> 
> This patch adds timestamps in ethtool stats. It makes it easier to provide scripts to users to calculate throughput, etc. It also allows software to synchronize timestamps with host time for correlating host events with stats collection.
> 
> Signed-off-by: Danny Guo <dannguo@cisco.com>
> Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
> Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
> Signed-off-by: David Wang <dwang2@cisco.com>

It's "easier" but only for your specific driver if we let this patch
go in.

Everyone tends to come up with the "easy" but localized and selfish
solution.

We have not one but several rate estimators in the kernel, if that's
unusable then fix them instead of just going for private facilities
that only will work with your driver.

I'm not applying this, sorry.


* Re: [RFC 0/4] [flexcan] Add support for powerpc (freescale p1010) -V5
From: Robin Holt @ 2011-08-06 11:06 UTC (permalink / raw)
  To: Robin Holt, Marc Kleine-Budde, Wolfgang Grandegger,
	U Bhaskar-B22300
  Cc: socketcan-core, netdev
In-Reply-To: <1312603504-30282-1-git-send-email-holt@sgi.com>

On Fri, Aug 05, 2011 at 11:05:00PM -0500, Robin Holt wrote:
> Marc, Wolfgang or U Bhaskar,
> 
> This patch set should have all your comments included.
> 
> I did implement a very simple clock source in the p1010rdb.c file, which,
> unfortunately, your tree will not have so please do not apply the last
> patch in the series.  That will need to go to the powerpc folks and
> follow the p1010rdb patch from freescale.
> 
> Could you please apply the first three patches to a test branch, compile
> and test them on an arm based system?  I would like to at least feel
> comfortable that I have not broken anything there.
> 
> I have tested the full set on a p1010rdb with an external PSOC based
> can communicator.  That PSOC code has a bunch of erroneous can comms it
> can generate, but I do not know how the developer of that code injects
> those errors.  As a result, no error handling from the can input has been
> tested.  I have tested both flexcan interfaces on the board and both work
> with these patches in addition to the other p1010rdb patches not included.

ARGH!

I just did a quick look back at my git log, and I have one other patch
earlier in the series where I committed a one-line change to flexcan.c
which is probably very relevant to you, but not so much to me.  I removed
the mach/clock.h which does not seem to exist for powerpc.

Can any of you tell me if that is relevant for the arm flexcan build?
If not, does it seem reasonable to just remove it early on?

Thanks,
Robin


* Re: [RFC 0/4] [flexcan] Add support for powerpc (freescale p1010) -V5
From: Robin Holt @ 2011-08-06 11:26 UTC (permalink / raw)
  To: Robin Holt, Marc Kleine-Budde, Wolfgang Grandegger,
	U Bhaskar-B22300
  Cc: socketcan-core, netdev
In-Reply-To: <20110806110602.GO4926@sgi.com>

On Sat, Aug 06, 2011 at 06:06:02AM -0500, Robin Holt wrote:
> On Fri, Aug 05, 2011 at 11:05:00PM -0500, Robin Holt wrote:
> > Marc, Wolfgang or U Bhaskar,
> > 
> > This patch set should have all your comments included.
> > 
> > I did implement a very simple clock source in the p1010rdb.c file, which,
> > unfortunately, your tree will not have so please do not apply the last
> > patch in the series.  That will need to go to the powerpc folks and
> > follow the p1010rdb patch from freescale.
> > 
> > Could you please apply the first three patches to a test branch, compile
> > and test them on an arm based system?  I would like to at least feel
> > comfortable that I have not broken anything there.
> > 
> > I have tested the full set on a p1010rdb with an external PSOC based
> > can communicator.  That PSOC code has a bunch of erroneous can comms it
> > can generate, but I do not know how the developer of that code injects
> > those errors.  As a result, no error handling from the can input has been
> > tested.  I have tested both flexcan interfaces on the board and both work
> > with these patches in addition to the other p1010rdb patches not included.
> 
> ARGH!
> 
> I just did a quick look back at my git log, and I have one other patch
> earlier in the series where I committed a one-line change to flexcan.c
> which is probably very relevant to you, but not so much to me.  I removed
> the mach/clock.h which does not seem to exist for powerpc.
> 
> Can any of you tell me if that is relevant for the arm flexcan build?
> If not, does it seem reasonable to just remove it early on?

It looks like the more-nearly right thing to do is to #include
<linux/clkdev.h> but powerpc does not implement one.

Thanks,
Robin


* ethtool 3.0 released
From: Ben Hutchings @ 2011-08-06 11:58 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 998 bytes --]

ethtool version 3.0 has been released.

Home page: https://ftp.kernel.org/pub/software/network/ethtool/
Download link:
https://ftp.kernel.org/pub/software/network/ethtool/ethtool-3.0.tar.gz

Release notes:

        * Feature: Report supported pause frame modes
        * Feature: Support firmware dump (-w and -W options)
        * Feature: Report advertised and supported 20G link modes
        * Feature: Add an 'l4data' option for ip4 filters (-U option)
        * Fix: Correct swapped h_source and h_dest fields for ether filters
          (-U option)
        * Fix: Set ip_ver field correctly for ip4 filters (-U option)
        * Fix: Correct parameter validation for -e and -E options; in
          particular, treat the 'magic' value as unsigned

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]


* ethtool 3.0 released
From: Ben Hutchings @ 2011-08-06 12:00 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 1067 bytes --]

[OK, Evolution seems to be confused about which account I'm using.]

ethtool version 3.0 has been released.

Home page: https://ftp.kernel.org/pub/software/network/ethtool/
Download link:
https://ftp.kernel.org/pub/software/network/ethtool/ethtool-3.0.tar.gz

Release notes:

        * Feature: Report supported pause frame modes
        * Feature: Support firmware dump (-w and -W options)
        * Feature: Report advertised and supported 20G link modes
        * Feature: Add an 'l4data' option for ip4 filters (-U option)
        * Fix: Correct swapped h_source and h_dest fields for ether filters
          (-U option)
        * Fix: Set ip_ver field correctly for ip4 filters (-U option)
        * Fix: Correct parameter validation for -e and -E options; in
          particular, treat the 'magic' value as unsigned

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 482 bytes --]


* [EXAMPLE CODE] Parasite thread injection and TCP connection hijacking
From: Tejun Heo @ 2011-08-06 12:12 UTC (permalink / raw)
  To: Matt Helsley, Pavel Emelyanov, Nathan Lynch, Oren Laadan,
	Daniel Lezcano, S
  Cc: James E.J. Bottomley, David S. Miller, linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 23047 bytes --]

Hello, guys.

So, here's transparent TCP connection hijacking (ie. checkpointing in
one process and restoring in another) which adds only relatively small
pieces to the kernel.  It's by no means complete but already works
rather reliably in my test setup even with heavy delay induced with
tc.

I wrote a rather long README describing how it works and what's
missing.  It is appended at the end of this mail, so if you're
interested in the details please go ahead and read.

Several ioctls are added to enable TCP connection CR, which adds
around 130 lines of code.  Note that the interface is ugly.  As said
above, it's proof-of-concept.  We'll need a bit more information
exported and knobs to turn and hopefully prettier interface.

As my knowledge of networking is fairly rudimentary, I only tried to
get the basics working.  e.g. I didn't try to store negotiated options
and re-establish them on restoration (ie. window scaling, mss, various
extra features), and am likely to have made wrong assumptions even on
the basics.  If you spot some, please shout.

The source is available in the following git branch (just git clone
from the URL)

  https://htejun@code.google.com/p/ptrace-parasite/

and can be browsed at

  http://code.google.com/p/ptrace-parasite/source/browse/

I'm attaching the tcp-ioctls.patch and the source tarball.

Thanks.

--
tejun


Parasite thread injection and TCP connection hijacking example code
===================================================================

This example code is for linux >= 3.1-rc1 on x86_64.

The goal is to demonstrate the following.

* Transparent injection of a thread into the target process using new
  ptrace commands - PTRACE_SEIZE and INTERRUPT.

* Using the injected thread to capture a TCP connection and restoring
  it in another process.

Both are primarily meant to serve as a starting point for a mostly
userland checkpoint-restart implementation.  The latter is likely to
be of interest to virtual server farms and high availability setups
too.

The code contained here is by no means ready for production.  It's
more of a proof-of-concept; however, I'll try to document the missing
pieces here and in the comments.


Organization
============

The following files are for the executable 'parasite'.

 main.c		The bulk of the example program, which seizes the
		target process, injects the parasite and sequences it
		to hijack a TCP connection.

 parasite.c	Self contained code which is compiled as PIC and
		injected into target process.  Parasite thread
		executes this code.

 parasite.h	Included by both main.c and parasite.c.  Defines
		protocol used between the main program and injected
		parasite thread.

 syscall.h	Syscall interface used by parasite.c.

 setup-nfqueue	Helper script to setup iptables rules to send packets
		for a specified connection to nfqueue.  Don't mix with
		working firewall setup.  Invoked by the executable
		while hijacking TCP connection and assumed to be in
		the same directory.

 flush-nfqueue	Naive script to reverse setup-nfqueue.  It just clears
		INPUT and OUTPUT tables.  Again, don't mix with
		working firewall setup.  Invoked by the executable
		while hijacking TCP connection and assumed to be in
		the same directory.

Build procedure is a bit unusual.  parasite.c is compiled as PIC,
linked into raw binary parasite.bin using parasite.lds and hexdumped
into C array 'char parasite_blob[]' in parasite-blob.h.  main.c
includes this file and the final executable embeds the parasite blob
in it.

The following are example programs which serve as hosts to inject the
parasite into.

 simple-host.c	Simple pthread program.  Five threads print out
		heartbeat messages each second.  Has simple SIGUSR1/2
		handler for signal testing.

 net-host.c	TCP connection test program.  Depending on parameter,
		it either listens for connection or connects as
		directed.  Once connection is established, both
		parties keep sending incrementing uint64_t and verify
		that received data is incrementing uint64_t's.

		Has bandwidth throttling on the receiver side.  This
		ensures that both local rx and remote tx queues are
		populated.

		SIGUSR1 injects uint64_t which isn't part of the
		sequence which will be detected and reported by the
		remote side.  This can be used for verification and
		measuring send(2) to recv(2) latency.

		SIGUSR2 tests SIOCGOUTSEQS and SIOCPEEKOUTQ.  Just
		added to verify kernel features.
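
The stream check net-host performs can be pictured with the small
sketch below.  It is not taken from net-host.c: find_foreign() is a
hypothetical helper name, and it assumes that a foreign value is
reported and skipped without disturbing the expected counter.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the code from net-host.c.  Scans 'n'
 * received values which should continue the incrementing sequence
 * starting at '*expect'.  Returns the index of the first value which
 * doesn't match, or -1 if all of them fit.  '*expect' is advanced
 * only past matching values, so a foreign value leaves it alone.
 */
static long find_foreign(const uint64_t *vals, size_t n, uint64_t *expect)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (vals[i] != *expect)
			return (long)i;
		(*expect)++;
	}
	return -1;
}
```

A caller would report the offending value and resume scanning from the
following index, which is roughly what the "foreign data" messages in
the execution examples correspond to.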

tcp-ioctls.patch is a patch to implement extra TCP ioctls for
connection hijacking.  This will be discussed further later.
Applicable to kernel 3.1-rc1.


Parasite thread injection
=========================

ptrace provides access to full process memory space and register
states, so it could always have manipulated the tracee however it
pleased, including making it execute arbitrary code.  Unfortunately,
the previously existing commands depended on signals to interrupt the
tracee, and interaction with job control was poorly designed and
poorly implemented.  This meant that although ptrace could be used to
inject arbitrary code into the tracee, it couldn't do that without
affecting signal and job control states.

Broken parts of ptrace have been fixed and three new ptrace requests
are available from kernel 3.1 under a development flag (which is
scheduled to be removed in 3.2) - PTRACE_SEIZE, INTERRUPT and
LISTEN.  These new requests allow transparent monitoring and
manipulation of the tracee.  Note that the transparency is not
absolute: ptrace operations behave like signal delivery attempts,
which can affect execution of certain system calls.  However, userland
is already required to handle that condition regardless of ptrace, so
the transparency, while not absolute, is complete within the scope
defined by the API.

Once all threads of the target process are seized, tracer can execute
arbitrary code using the following sequence.

1. PTRACE_SEIZE and INTERRUPT all threads belonging to the target
   process.  This is implemented in main.c::seize_process().  As noted
   in the source, the implementation isn't complete.  Proper
   implementation requires verification and retries in case thread
   creations and/or destructions race against seizing.

   Once INTERRUPT is issued, tracee either stops at job control trap
   or in exec path.  If tracee is about to exec, there isn't much to
   do anyway, so the example code simply aborts in such cases.

   From now on, all operations are assumed to be performed on one of
   the threads (any thread will do).

2. Decide where to inject the foreign code and save the original code
   with PTRACE_PEEKDATA.  Tracer can poke any mapped area regardless
   of protection flags but it can't add execution permission to the
   code, so it needs to choose memory area which already has X flag
   set.  The example code uses the page the %rip is in.

   To allow synchronization, the foreign code raises debug trap (int3)
   after execution finishes.

3. Inject the foreign code using PTRACE_POKEDATA.  The foreign code
   would usually have to be position independent and self-contained.

   Note that modifying this page with PTRACE_POKEDATA is likely to
   trigger COW.  If a process is to be manipulated multiple times,
   it might be beneficial to use the same page every time.

4. Acquire and save the current register states using PTRACE_GETREGS
   and modify the register states for execution with PTRACE_SETREGS.
   Among others, %rip should point to the start address of the
   injected code and %orig_rax should be set to -1 to avoid
   end-of-syscall processing while returning to userland (register
   state will be restored later and end-of-syscall processing should
   happen after that).

5. Issue PTRACE_CONT to let the tracee return to userland and execute
   the injected code.  Tracer wait(2)s for tracee to enter stop state.
   Only two things can happen - signal delivery or end of execution
   notification via int3.

   Signal delivery either changes job control stop state, kills the
   process or schedules a userland signal handler.  Nothing special
   needs to be done about the first two.  For userland signal handler
   scheduling, issuing PTRACE_INTERRUPT before telling the tracee to
   deliver the signal with PTRACE_CONT makes the tracee re-trap after
   the userland signal handler is scheduled without actually executing
   any userland code.  Once scheduling is complete, retry from #4.

   After successful execution, tracee would be trapped indicating
   SIGTRAP delivery.  Squash it and put tracee back into job control
   trap by first issuing PTRACE_INTERRUPT followed by PTRACE_CONT.

   This step is implemented in execute_blob().

6. Restore saved registers and memory, and PTRACE_DETACH from all
   threads.
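
Steps 2 and 3 of the sequence above can be sketched as follows.  This
is a minimal illustration and not the code from main.c; merge_word()
and inject_blob() are hypothetical names, and the tracee is assumed to
be already seized and stopped.  ptrace transfers one word at a time,
so the blob is written word by word and a trailing partial word is
merged with the original contents.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/*
 * Overlay up to sizeof(long) bytes of 'blob', starting at 'off', onto
 * 'orig', the word currently in tracee memory.  Bytes past the end of
 * the blob keep their original value.
 */
static long merge_word(long orig, const unsigned char *blob, size_t len,
		       size_t off)
{
	size_t n = len - off < sizeof(long) ? len - off : sizeof(long);

	memcpy(&orig, blob + off, n);
	return orig;
}

/*
 * Save the original words at 'addr' into 'saved' and replace them
 * with 'blob'.  'saved' needs room for every touched word, i.e.
 * (len + sizeof(long) - 1) / sizeof(long) longs.  Returns 0 on
 * success, -1 on failure.
 */
static int inject_blob(pid_t pid, unsigned long addr,
		       const unsigned char *blob, size_t len, long *saved)
{
	size_t off;

	for (off = 0; off < len; off += sizeof(long)) {
		long word;

		/* PEEKDATA returns the data itself; errno flags errors */
		errno = 0;
		word = ptrace(PTRACE_PEEKDATA, pid, (void *)(addr + off),
			      NULL);
		if (word == -1 && errno)
			return -1;
		saved[off / sizeof(long)] = word;

		word = merge_word(word, blob, len, off);
		if (ptrace(PTRACE_POKEDATA, pid, (void *)(addr + off),
			   (void *)word) < 0)
			return -1;
	}
	return 0;
}
```

Restoring memory in step 6 is the same loop with PTRACE_POKEDATA
writing the words kept in 'saved'.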

As arbitrary syscalls can be issued using injected code, it isn't
difficult to inject a larger chunk of code and create a parasite
thread on it.  The example code blocks all signals, uses mmap to
allocate memory, fills it with parasite_blob[], creates the parasite
thread using clone(2) and lets it execute the injected code.


EXECUTION EXAMPLE

Running simple-host in one session first and parasite in another
yields the following output.  The letters in the first column are
referenced below to explain what's going on.

  # ./simple-host
  thread 01(1330): alive
  thread 02(1331): alive
  thread 03(1332): alive
  thread 04(1333): alive
  thread 00(1329): alive
A BLOB: hello, world!
B PARASITE STARTED
C PARASITE SAY: ah ah! mic test!
  thread 01(1330): alive
  thread 02(1331): alive
  thread 03(1332): alive
  thread 00(1329): alive
  ...

  # parasite `pidof simple-host`
  Seizing 1329
  Seizing 1330
  Seizing 1331
  Seizing 1332
  Seizing 1333
A executing test blob
  blocking all signals = 0, prev_sigmask 0
  executing mmap blob = 0x7f16f3024000
  executing clone blob = 1336
B executing parasite
C waiting for connection... connected
  executing munmap blob = 0
  restoring sigmask = 0, prev_sigmask 0xfffffffffffbfeef

At <A>, a simple test blob which says hello to the world is injected
into the host and executed by one of the host threads to verify that
blob execution works.

A series of blobs is executed afterwards to prepare for thread
injection.  The parasite thread is created in the host and released
for execution at <B>.  The first thing it does is print out the
STARTED message.

After that the injected thread connects back to the main 'parasite'
program using a TCP connection, at which point it's directed to print
out mic test message via SAY command.  This is happening on <C>.

After that, the prep steps are reversed and the target process is
released to continue normal execution.

Job control and USR signals directed at the target process should
behave as expected (sans the extra latency introduced by parasite) no
matter when they are generated.


TCP connection hijacking
========================

This part is much less complete and really a proof-of-concept.  The
goal is to show that a TCP connection can be checkpointed in one
process and restored in another with only small additions to the
networking stack.  This also demonstrates that, with parasite threads,
most information is already available to the checkpointer and adding
mechanisms to extract and manipulate more state can be done in a very
unobtrusive manner by extending the existing API.  There are no new
security, locking or visibility boundary issues.

Note that CR in many cases wouldn't need this transparent snapshotting
of TCP connections.  For example, when CRing whole distributed HPC
workload, there's no reason to maintain TCP details which aren't
visible to applications at all.  Checkpointing threads injected to
both ends can simply drain the connection using recv(2) and restore
them by opening a new connection and repopulating the send buffer on
the other side with send(2).  DMTCP already uses this method to CR TCP
connections.

In general, unless the target connection is going to be terminated
from the target process and restored somewhere else immediately
(connection migration), there is little point in saving and restoring
TCP details including send and receive buffers as they become invalid
as soon as the target process exchanges further packets with the peer
after the checkpointing, and, if the peer is being checkpointed
together, draining and repopulating from each end point as described
above is far better and simpler.

With the above said, the basic states of a TCP connection can be
checkpointed and restarted with the following extra ioctls.  Note that
these ioctls should be considered as proof-of-concept.

 SIOCGINSEQ	Determine TCP sequence to be read on the next
		recv(2) - ie. tp->copied_seq.

 SIOCGOUTSEQS	Determine TCP sequences scheduled for transmission in
		reverse order.  ie. If the seq after SYN was 6, and
		20, 30 and 40 byte packets are in the tx queue without
		receiving any ack, it would return 96, 56, 26 and 6.

 SIOCSOUTSEQ	Set initial TCP sequence to use when establishing a
		connection.  Only valid on a not-yet-connected or
		listening socket.  The next connection established
		will start with the specified sequence.

 SIOCPEEKOUTQ	Peek the content of tx queue.

 SIOCFORCEOUTBD	Force packet separation on the next send(2).
		ie. data from the next send(2) won't be merged into
		the same packet with currently queued data.
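
The SIOCGOUTSEQS example above is plain sequence arithmetic, which the
following sketch reproduces.  It does not call the ioctl (the argument
layout isn't described here); out_seqs() is a hypothetical helper used
only to illustrate the ordering of the returned values.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustration only.  Computes the values described for SIOCGOUTSEQS
 * from the sequence at the head of the tx queue and the sizes of the
 * queued packets.  'out' must hold npkts + 1 entries and receives the
 * sequences in reverse order.  uint32_t arithmetic wraps modulo 2^32
 * just like TCP sequence numbers do.
 */
static void out_seqs(uint32_t first_seq, const uint32_t *pkt_len,
		     size_t npkts, uint32_t *out)
{
	uint32_t seq = first_seq;
	size_t i;

	out[npkts] = seq;		/* head of the tx queue comes last */
	for (i = 0; i < npkts; i++) {
		seq += pkt_len[i];
		out[npkts - 1 - i] = seq;
	}
}
```

With first_seq 6 and packet sizes 20, 30 and 40, 'out' ends up holding
96, 56, 26 and 6, matching the description above.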

A TCP connection can be snapshotted using the following sequence.

s1. Seize target process and inject a parasite thread.

s2. Acquire basic target socket information - IPs and ports.

s3. Block both incoming and outgoing packets belonging to the
    connection.

s4. Acquire rx queue information - the sequence number of the next
    byte to be read and the content of recv buffer.  The former is
    available through SIOCGINSEQ and the latter with recvmsg(2) w/
    MSG_PEEK.

s5. Acquire tx queue information - the sequence numbers of all pending
    packets and the content of send buffer.  The former is available
    through SIOCGOUTSEQS and the latter SIOCPEEKOUTQ.
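
The recv buffer part of s4 relies only on the standard recvmsg(2)
MSG_PEEK flag and can be sketched as below.  peek_rx() is a
hypothetical helper name, not a function from the example code, and
SIOCGINSEQ is left out because it exists only with tcp-ioctls.patch
applied.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Copy up to 'len' bytes of pending receive data into 'buf' without
 * consuming it, so the socket can be resumed untouched.  Returns the
 * number of bytes copied or -1 on error.
 */
static ssize_t peek_rx(int fd, void *buf, size_t len)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	/* MSG_DONTWAIT: return what is queued instead of blocking */
	return recvmsg(fd, &msg, MSG_PEEK | MSG_DONTWAIT);
}
```

A later recv(2) on the same socket still returns the peeked bytes.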

None of the above steps has an irreversible side effect and the
connection can be safely resumed.  To restore the connection, the
following steps can be used.

r1. Packets for the connection are still blocked from s3.  Create a
    way to intercept those packets and inject packets - nf_queue works
    for the former and raw socket for the latter.  It should drop all
    packets other than the ones injected via raw socket.

r2. Create a TCP socket, set outgoing sequence with SIOCSOUTSEQ so
    that it matches the sequence number at the head of the stored send
    queue, and initiate connection.

r3. Upon intercepting SYN, inject SYN/ACK with the sequence number
    matching the head of the stored rx queue.

r4. Upon intercepting ACK reply for SYN/ACK, repopulate the rx queue
    from the stored copy by injecting data packets and waiting for
    ACKs.

r5. Repopulate tx queue with send(2) with interleaving SIOCFORCEOUTBD
    calls to preserve the original packet boundaries.

r6. Connection is ready now.  Let the packets pass through.

The following points are worth mentioning regarding the above
sequences.

* As long as queue information is acquired after packets are blocked,
  there's no danger of data loss due to race condition on both rx and
  tx queues.  If data is received after rx queue is stored, the ack
  wouldn't reach the peer, so it will be retransmitted.  If ack is
  received after tx queue is stored, it just has extra data which will
  be acked and discarded again later.

* Both recv and send buffers need to be blown up before repopulating
  them with stored data.  SO_RCV/SNDBUFFORCE are used for this, which
  disables automatic buffer sizing.  It would be nice if there were a
  way to tell the TCP stack to resume auto resizing afterwards.

* Packet boundaries in tx queue need to be preserved, at least between
  the tx queue head and tp->snd_nxt.  This is because queue
  restoration can result in different packet division and if the peer
  already had received some of the packets before, stream can't be
  resumed with sequences falling inside existing packets.  Note that
  having more divisions is fine as long as the original boundaries are
  still there.

* Another subtlety with tx queue is that the TCP socket needs to
  think that all packets which were transmitted by the original
  connection are already transmitted before the packet barrier comes
  down - ie. its tp->snd_nxt needs to be the same as or after the
  original tp->snd_nxt; otherwise, it might end up ignoring ACKs
  stalling the connection.

  This currently is achieved by advertising maximum window on injected
  response packets so that the TCP socket sends out all queued data
  immediately.  If this isn't a guaranteed behavior, it would make
  sense to provide a way to manipulate tp->snd_nxt.

* The above sequence makes the new socket connect(2) but it would be
  better to reverse the direction to enable restoring server
  connections with N:1 port mapping.

Note that the implemented example code is incomplete in the following
aspects.

* No URG handling.  As OOB data can be acquired inline with other
  data, adding a mechanism to export URG offset from the tail of
  queue should be enough.

* Assumes ESTABLISHED.  Proof-of-concept! I get to be lazy! :P

* Doesn't handle options properly during connection negotiation.  I
  was being lazy but also at the same time am not a network expert and
  can't tell which ones should do what.  Needs more trained eyes here.

* Connection faking isn't robust at all.  Again, needs more work and
  some love from network gurus.

* No IPv6.


EXECUTION EXAMPLE

Incomplete as it may be, the example implementation actually works
rather reliably.  The parasite needs to be run as root as it uses
SO_RCV/SNDBUFFORCE and executes setup-nfqueue and flush-nfqueue
scripts which manipulate netfilter tables (don't run it on your
production machine with working firewall).  Also, it assumes that the
peer of the target TCP connection is on a remote machine and only
packets injected via raw socket pass through the loopback device.

Two instances of net-host keep talking to each other on 10.7.7.1 and
10.7.8.1, verifying that the received stream is a sequence of
incrementing uint64_t's.  We want to hijack the socket from the
net-host instance on 10.7.8.1 and splice ourselves inside the
connection so that the end result looks like the following.

 net-host on 10.7.7.1 <---> parasite on 10.7.8.1 <---> net-host on 10.7.8.1

On 10.7.7.1,

  # ./net-host 9999 1024
A Connected to 10.7.8.2:40986
H signal 10 si_code=0
  inserting contaminant @0x22f682
G foreign data @0x2a7500 : 0xdeadbeefbeefdead

On 10.7.8.1,

  # ./net-host 10.7.7.1:9999 1024
A Connected to 10.7.7.1:9999
  BLOB: hello, world!
  PARASITE STARTED
B PARASITE SAY: ah ah! mic test!
H foreign data @0x22f682 : 0xdeadbeefbeefdead
G signal 10 si_code=0
  inserting contaminant @0x2a7500

On a different session on 10.7.8.1,

  # ls -l /proc/`pidof net-host`/fd
  total 0
  lrwx------ 1 root root 64 Aug  6 12:45 0 -> /dev/ttyS0
  lrwx------ 1 root root 64 Aug  6 12:45 1 -> /dev/ttyS0
  lrwx------ 1 root root 64 Aug  6 12:45 2 -> /dev/ttyS0
  lrwx------ 1 root root 64 Aug  6 12:45 3 -> socket:[12199]
A # ./parasite `pidof net-host` 3
  Seizing 1388
  Seizing 1389
  executing test blob
  blocking all signals = 0, prev_sigmask 0
  executing mmap blob = 0x7fa1e7caa000
  executing clone blob = 1397
  executing parasite
B waiting for connection... connected
  target socket: 10.7.8.2:40986 -> 10.7.7.1:9999 in 65392@0x185a3439 out 31856@0xb0e08180
C peeked socket buffer in 65392 out 26064
  executing munmap blob = 0
  restoring sigmask = 0, prev_sigmask 0xfffffffffffbfeef
D restoring connection, connecting...
  pkt: R->L S 185b33a9 A b0e02158 D 00000 a___ DROP
  pkt: R->L S 185b33a9 A b0e02158 D 00000 a___ DROP
  pkt: L->R S b0e02158 A 185b33a9 D 00000 a__r DROP
  pkt: L->R S b0e01baf A 00000000 D 00000 _s__ DROP DONE
  got SYN, replying with SYN/ACK
  pkt: R->L S 185a3438 A b0e01bb0 D 00000 as__ ACPT
  pkt: L->R S b0e01bb0 A 185a3439 D 00000 a___ DROP DONE
  connection established, repopulating rx/tx queues
E pkt: R->L S 185a3439 A b0e01bb0 D 01360 a___ ACPT
  pkt: L->R S b0e01bb0 A 185a3989 D 00000 a___ DROP DONE
  pkt: R->L S 185a3989 A b0e01bb0 D 01360 a___ ACPT
  pkt: L->R S b0e01bb0 A 185a3ed9 D 00000 a___ DROP DONE
  ...
  pkt: R->L S 185b3339 A b0e01bb0 D 00112 a___ ACPT
  pkt: L->R S b0e01bb0 A 185b33a9 D 00000 a___ DROP DONE
F snd: ---- S b0e01bb0 A -------- D 00000
  snd: ---- S b0e01bb0 A -------- D 01448
  ...
  snd: ---- S b0e07bd8 A -------- D 01448
G connection restored
  pkt: L->R S b0e01bb0 A 185b33a9 D 01448 a___ ACPT
  pkt: L->R S b0e02158 A 185b33a9 D 01448 a___ ACPT
  ...
  pkt: L->R S b0e04e98 A 185b33a9 D 01448 a___ ACPT
  pkt: R->L S 185b3629 A b0e02158 D 00000 a___ ACPT
  pkt: R->L S 185b3629 A b0e02700 D 00000 a___ ACPT
  ...
  pkt: R->L S 185b3629 A b0e05440 D 00000 a___ ACPT

On yet another session on 10.7.7.1,

H # killall -USR1 net-host

On yet another session on 10.7.8.1,

G # killall -USR1 net-host

<A> Two net-host instances are connected to each other sending and
    verifying each other's stream.  From /proc the socket fd is
    determined to be 3 and parasite is executed to hijack the socket.

<B> Up to this point, it's the same as the previous thread injection
    example.  Parasite thread is injected and test command is
    executed.

<C> Snapshot steps s3 - s5 are executed and the fd 3 is dupped over by
    a connection to the main program.  This means on 10.7.8.1, the
    connection doesn't exist anymore; however, 10.7.7.1 doesn't know
    this as packets belonging to the connection are being dropped.

<D> Restoration steps r1 - r3 are executed.

<E> rx queue is being repopulated.

<F> tx queue is being repopulated.

<G> Connection restored and the main parasite program now owns the
    original connection to net-host on 10.7.7.1 and a new connection
    to net-host on 10.7.8.1.  It pipes data between the two.

<H> Verify data is still flowing by triggering net-host on 10.7.7.1 to
    insert foreign data in the stream, which is soon received by
    net-host on 10.7.8.1.

<G> Vice-versa from 10.7.8.1 to 10.7.7.1.
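
The pkt: lines in the log can be cross-checked by hand.  A small
sketch, assuming S is the sequence number in hex, A the ACK number in
hex and D the payload length in decimal (the format isn't spelled out
in this README, so treat that reading as an assumption).  SYN and FIN,
which also consume a sequence number, are ignored here, and the
function name is made up.

```c
#include <stdint.h>

/* ACK number expected in response to one data packet. */
static uint32_t expected_ack(uint32_t seq, uint32_t data_len)
{
	/* uint32_t arithmetic wraps modulo 2^32, like TCP sequence space */
	return seq + data_len;
}
```

For the first packet under <E>, 0x185a3439 + 1360 gives 0x185a3989,
which is the A value of the reply on the following line.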


NOTES

* As said above, the ioctls are only proof-of-concept.  We'll probably
  need more information exported and maybe a few more ways to
  manipulate the states.  As long as the state manipulations stay out
  of usual stream processing - ie. they only affect connection setup,
  I don't think the added complexity or maintenance overhead would be
  noticeable.

* Having socket inode # match in iptables would solve the packet
  matching problem.  Note that even on a busy system, this connection
  intervention shouldn't add much overhead.  While not snapshotting,
  no firewall rule is needed.  During snapshotting, only packets of
  the target connections are ipqueue'd and ipset can match many
  connections without much overhead.

* While writing, I had more things I wanted to talk about in this
  section but apparently forgot them. :( I'll add as they come back.

[-- Attachment #2: parasite-20110806.tar.gz --]
[-- Type: application/x-gzip, Size: 25006 bytes --]

[-- Attachment #3: tcp-ioctls.patch --]
[-- Type: text/plain, Size: 6650 bytes --]

 include/linux/sockios.h |    6 ++
 include/linux/tcp.h     |    2 
 net/ipv4/tcp.c          |  113 +++++++++++++++++++++++++++++++++++++++++++++++-
 net/ipv4/tcp_ipv4.c     |   16 ++++--
 4 files changed, 130 insertions(+), 7 deletions(-)

diff --git a/include/linux/sockios.h b/include/linux/sockios.h
index 7997a50..f5c3e41 100644
--- a/include/linux/sockios.h
+++ b/include/linux/sockios.h
@@ -127,6 +127,12 @@
 /* hardware time stamping: parameters in linux/net_tstamp.h */
 #define SIOCSHWTSTAMP   0x89b0
 
+#define SIOCGINSEQ	0x89b1		/* get copied_seq */
+#define SIOCGOUTSEQS	0x89b2		/* get seqs for pending tx pkts */
+#define SIOCSOUTSEQ	0x89b3		/* set write_seq */
+#define SIOCPEEKOUTQ	0x89b4		/* peek output queue */
+#define SIOCFORCEOUTBD	0x89b5		/* force output packet boundary */
+
 /* Device private ioctl calls */
 
 /*
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 531ede8..c0945fe 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -365,6 +365,8 @@ struct tcp_sock {
 	u32	snd_up;		/* Urgent pointer		*/
 
 	u8	keepalive_probes; /* num of allowed keep alive probes	*/
+	u8	wseq_set    : 1;/* Write sequence set via setsockopt	*/
+	u8	force_outbd : 1;/* force packet boundary on next send	*/
 /*
  *      Options received (usually on last packet, some only on SYN packets).
  */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 46febca..3389827 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -464,12 +464,118 @@ unsigned int tcp_poll(struct file *file, struct socket *sock, poll_table *wait)
 }
 EXPORT_SYMBOL(tcp_poll);
 
+static int tcp_get_out_seqs(struct sock *sk, u32 __user *p, int size)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb;
+	int pos = 0, cnt = size / sizeof(u32);
+
+	if (pos < cnt && put_user(tp->write_seq, &p[pos++]))
+		return -EFAULT;
+
+	skb_queue_reverse_walk(&sk->sk_write_queue, skb) {
+		struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
+
+		if (pos < cnt && put_user(tcb->seq, &p[pos++]))
+			return -EFAULT;
+	}
+	return pos * sizeof(u32);
+}
+
+static int tcp_peek_outq(struct sock *sk, void __user *arg, int size)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct iovec iov = { .iov_base = arg, .iov_len = size };
+	struct sk_buff *skb;
+	int copied = 0, err = 0;
+	int outq, skip;
+
+	lock_sock(sk);
+
+	/* XXX: why doesn't SIOCOUTQ[NSD] account for queued fin? */
+	outq = tp->write_seq - tp->snd_una;
+	skb = skb_peek_tail(&sk->sk_write_queue);
+	if (outq && skb)
+		outq -= tcp_hdr(skb)->fin;
+
+	skip = outq - min(size, outq);
+
+	skb_queue_walk(&sk->sk_write_queue, skb) {
+		int off = 0, todo;
+
+		if (skip) {
+			off = min_t(int, skip, skb->len);
+			skip -= off;
+		}
+
+		if (!(todo = skb->len - off))
+			continue;
+
+		if (WARN_ON_ONCE(iov.iov_len < todo)) {
+			err = -EINVAL;
+			break;
+		}
+
+		err = skb_copy_datagram_iovec(skb, off, &iov, todo);
+		if (err)
+			break;
+		copied += todo;
+	}
+
+	release_sock(sk);
+
+	return err ?: copied;
+}
+
 int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	int answ;
 
 	switch (cmd) {
+	case SIOCGOUTSEQS: {
+		s32 size;
+
+		if (get_user(size, (s32 __user *)arg))
+			return -EFAULT;
+		if (size < 0)
+			return -EINVAL;
+		return tcp_get_out_seqs(sk, (u32 __user *)arg, size);
+	}
+	case SIOCSOUTSEQ: {
+		u32 seq;
+
+		if (get_user(seq, (u32 __user *)arg))
+			return -EFAULT;
+
+		lock_sock(sk);
+		answ = -EISCONN;
+		if ((sk->sk_socket->state == SS_UNCONNECTED &&
+		     sk->sk_state == TCP_CLOSE) || sk->sk_state == TCP_LISTEN) {
+			tp->write_seq = seq;
+			tp->wseq_set = true;
+			answ = 0;
+		}
+		release_sock(sk);
+		return answ;
+	}
+	case SIOCPEEKOUTQ: {
+		u32 size;
+
+		if (get_user(size, (u32 __user *)arg))
+			return -EFAULT;
+		if ((int)size < size)
+			return -EINVAL;
+		return tcp_peek_outq(sk, (void __user *)arg, size);
+	}
+	case SIOCFORCEOUTBD:
+		lock_sock(sk);
+		tp->force_outbd = true;
+		release_sock(sk);
+		return 0;
+	}
+
+	switch (cmd) {
 	case SIOCINQ:
 		if (sk->sk_state == TCP_LISTEN)
 			return -EINVAL;
@@ -514,6 +620,9 @@ int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg)
 		else
 			answ = tp->write_seq - tp->snd_nxt;
 		break;
+	case SIOCGINSEQ:
+		answ = tp->copied_seq;
+		break;
 	default:
 		return -ENOIOCTLCMD;
 	}
@@ -965,7 +1074,7 @@ int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 				copy = max - skb->len;
 			}
 
-			if (copy <= 0) {
+			if (copy <= 0 || unlikely(tp->force_outbd)) {
 new_segment:
 				/* Allocate new segment. If the interface is SG,
 				 * allocate skb fitting to single page.
@@ -979,6 +1088,8 @@ new_segment:
 				if (!skb)
 					goto wait_for_memory;
 
+				tp->force_outbd = false;
+
 				/*
 				 * Check whether we can use HW checksum.
 				 */
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 955b8e6..579234c 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -201,7 +201,8 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 		/* Reset inherited state */
 		tp->rx_opt.ts_recent	   = 0;
 		tp->rx_opt.ts_recent_stamp = 0;
-		tp->write_seq		   = 0;
+		if (!tp->wseq_set)
+			tp->write_seq      = 0;
 	}
 
 	if (tcp_death_row.sysctl_tw_recycle &&
@@ -252,12 +253,12 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	sk->sk_gso_type = SKB_GSO_TCPV4;
 	sk_setup_caps(sk, &rt->dst);
 
-	if (!tp->write_seq)
+	if (!tp->write_seq && !tp->wseq_set)
 		tp->write_seq = secure_tcp_sequence_number(inet->inet_saddr,
 							   inet->inet_daddr,
 							   inet->inet_sport,
 							   usin->sin_port);
-
+	tp->wseq_set = false;
 	inet->inet_id = tp->write_seq ^ jiffies;
 
 	err = tcp_connect(sk);
@@ -1252,7 +1253,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 		if (net_ratelimit())
 			syn_flood_warning(skb);
 #ifdef CONFIG_SYN_COOKIES
-		if (sysctl_tcp_syncookies) {
+		if (sysctl_tcp_syncookies && !tp->wseq_set) {
 			want_cookie = 1;
 		} else
 #endif
@@ -1334,7 +1335,10 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
 	if (!want_cookie || tmp_opt.tstamp_ok)
 		TCP_ECN_create_request(req, tcp_hdr(skb));
 
-	if (want_cookie) {
+	if (unlikely(tp->wseq_set)) {
+		isn = tp->write_seq;
+		tp->wseq_set = false;
+	} else if (want_cookie) {
 		isn = cookie_v4_init_sequence(sk, skb, &req->mss);
 		req->cookie_ts = tmp_opt.tstamp_ok;
 	} else if (!isn) {
@@ -1526,7 +1530,7 @@ static struct sock *tcp_v4_hnd_req(struct sock *sk, struct sk_buff *skb)
 	}
 
 #ifdef CONFIG_SYN_COOKIES
-	if (!th->syn)
+	if (!th->syn && !tcp_sk(sk)->wseq_set)
 		sk = cookie_v4_check(sk, skb, &(IPCB(skb)->opt));
 #endif
 	return sk;


* Re: [EXAMPLE CODE] Parasite thread injection and TCP connection hijacking
From: Andy Lutomirski @ 2011-08-06 12:45 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Matt Helsley, Pavel Emelyanov, Nathan Lynch, Oren Laadan,
	Daniel Lezcano, S, James E.J. Bottomley, David S. Miller,
	linux-kernel, netdev
In-Reply-To: <20110806121247.GC23937@htj.dyndns.org>

On 08/06/2011 08:12 AM, Tejun Heo wrote:
> Hello, guys.
>
> So, here's transparent TCP connection hijacking (ie. checkpointing in
> one process and restoring in another) which adds only relatively small
> pieces to the kernel.  It's by no means complete but already works
> rather reliably in my test setup even with heavy delay induced with
> tc.
>
> I wrote a rather long README describing how it's working, what's
> missing which is appended at the end of this mail so if you're
> interested in the details please go ahead and read.

That's a little gross but quite cool.

I think you have an annoying corner case, though:

 > 2. Decide where to inject the foreign code and save the original code
 >    with PTRACE_PEEKDATA.  Tracer can poke any mapped area regardless
 >    of protection flags but it can't add execution permission to the
 >    code, so it needs to choose memory area which already has X flag
 >    set.  The example code uses the page the %rip is in.

If the process is executing from the vsyscall page, then you'll probably 
fail.  (Admittedly, this is rather unlikely, given that the vsyscalls 
are now exactly one instruction.)  Presumably you also fail if executing 
from a read-only MAP_SHARED mapping.

Windows has a facility to more-or-less call mmap on behalf of another 
process, and another one to directly inject a thread into a remote 
process.  It's traditional to use them for this type of manipulation. 
Perhaps Linux should get the same thing.  (Although you could accomplish 
much the same thing if you could create a task with your mm but the 
tracee's fs.)

--Andy


* Re: [EXAMPLE CODE] Parasite thread injection and TCP connection hijacking
From: Tejun Heo @ 2011-08-06 13:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Matt Helsley, Pavel Emelyanov, Nathan Lynch, Oren Laadan,
	Daniel Lezcano, S, James E.J. Bottomley, David S. Miller,
	linux-kernel, netdev
In-Reply-To: <4E3D3768.3070108@mit.edu>

Hello,

On Sat, Aug 06, 2011 at 08:45:28AM -0400, Andy Lutomirski wrote:
> > 2. Decide where to inject the foreign code and save the original code
> >    with PTRACE_PEEKDATA.  Tracer can poke any mapped area regardless
> >    of protection flags but it can't add execution permission to the
> >    code, so it needs to choose memory area which already has X flag
> >    set.  The example code uses the page the %rip is in.
> 
> If the process is executing from the vsyscall page, then you'll
> probably fail.  (Admittedly, this is rather unlikely, given that the
> vsyscalls are now exactly one instruction.)  Presumably you also
> fail if executing from a read-only MAP_SHARED mapping.

Heh, yeah, I originally thought about scanning /proc/PID/maps to look
for the page to use but was lazy and just used %rip.  I think that
should work.  I'll note the problem in README.
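
For illustration, a minimal sketch of what such a scan could look
like.  It only classifies a single /proc/PID/maps line; the function
name is made up, and treating "executable and private" as the
selection rule is an assumption based on the corner cases raised
above, not something the example code does today.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/PID/maps.  Returns 1 and fills @start/@end
 * if the mapping is executable and private, 0 if it is not, and -1 if
 * the line can't be parsed.
 */
static int maps_line_is_candidate(const char *line, unsigned long *start,
				  unsigned long *end)
{
	char perms[5];

	/* "start-end perms offset dev inode path", perms is e.g. "r-xp" */
	if (sscanf(line, "%lx-%lx %4s", start, end, perms) != 3)
		return -1;
	if (strlen(perms) != 4)
		return -1;
	return perms[2] == 'x' && perms[3] == 'p';
}
```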

> Windows has a facility to more-or-less call mmap on behalf of
> another process, and another one to directly inject a thread into a
> remote process.  It's traditional to use them for this type of
> manipulation. Perhaps Linux should get the same thing.  (Although
> you could accomplish much the same thing if you could create a task
> with your mm but the tracee's fs.)

Actually, the only thing we need on x86_64 is two bytes for the
syscall instruction because all params are passed through registers
anyway.  We can just set up parameters for mmap, turn on single step,
point %rip to syscall in the vsyscall page.  So, either way, I don't
think this would be too difficult to solve.

Thanks.

-- 
tejun


* Re: [EXAMPLE CODE] Parasite thread injection and TCP connection hijacking
From: Andrew Lutomirski @ 2011-08-06 13:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Matt Helsley, Pavel Emelyanov, Nathan Lynch, Oren Laadan,
	Daniel Lezcano, S, James E.J. Bottomley, David S. Miller,
	linux-kernel, netdev
In-Reply-To: <20110806130037.GD23937@htj.dyndns.org>

On Sat, Aug 6, 2011 at 9:00 AM, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Sat, Aug 06, 2011 at 08:45:28AM -0400, Andy Lutomirski wrote:
>> > 2. Decide where to inject the foreign code and save the original code
>> >    with PTRACE_PEEKDATA.  Tracer can poke any mapped area regardless
>> >    of protection flags but it can't add execution permission to the
>> >    code, so it needs to choose memory area which already has X flag
>> >    set.  The example code uses the page the %rip is in.
>>
>> If the process is executing from the vsyscall page, then you'll
>> probably fail.  (Admittedly, this is rather unlikely, given that the
>> vsyscalls are now exactly one instruction.)  Presumably you also
>> fail if executing from a read-only MAP_SHARED mapping.
>
> Heh, yeah, I originally thought about scanning /proc/PID/maps to look
> for the page to use but was lazy and just used %rip.  I think that
> should work.  I'll note the problem in README.
>
>> Windows has a facility to more-or-less call mmap on behalf of
>> another process, and another one to directly inject a thread into a
>> remote process.  It's traditional to use them for this type of
>> manipulation. Perhaps Linux should get the same thing.  (Although
>> you could accomplish much the same thing if you could create a task
>> with your mm but the tracee's fs.)
>
> Actually, the only thing we need on x86_64 is two bytes for the
> syscall instruction because all params are passed through registers
> anyway.  We can just set up parameters for mmap, turn on single step,
> point %rip to syscall in the vsyscall page.  So, either way, I don't
> think this would be too difficult to solve.

Not any more -- that syscall instruction is gone as of 3.1.  You could
search through the vdso to find a syscall, but that seems fragile.
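
To make the fragility concrete, here is a rough sketch of such a
search.  It assumes the vdso (or any other executable page) has
already been copied into a local buffer; the function name is made up.
The x86_64 syscall instruction encodes as the two bytes 0x0f 0x05, but
a plain byte scan can just as well land in the middle of a longer
instruction or an immediate.

```c
#include <stddef.h>

/*
 * Return the offset of the first 0x0f 0x05 byte pair in @buf, or -1
 * if there is none.  No instruction decoding is done, so a hit is
 * only a candidate, not a guaranteed instruction boundary.
 */
static long find_syscall_insn(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i++)
		if (buf[i] == 0x0f && buf[i + 1] == 0x05)
			return (long)i;
	return -1;
}
```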

Why not just add a ptrace command to issue a syscall?

--Andy


* Re: [EXAMPLE CODE] Parasite thread injection and TCP connection hijacking
From: Tejun Heo @ 2011-08-06 13:20 UTC (permalink / raw)
  To: Andrew Lutomirski
  Cc: Matt Helsley, Pavel Emelyanov, Nathan Lynch, Oren Laadan,
	Daniel Lezcano, S, James E.J. Bottomley, David S. Miller,
	linux-kernel, netdev
In-Reply-To: <CAObL_7FeZQpnOzpXHqSviZdBRUCL+rODbQOBb+6JPSd3PB=fig@mail.gmail.com>

Hello,

On Sat, Aug 06, 2011 at 09:15:45AM -0400, Andrew Lutomirski wrote:
> On Sat, Aug 6, 2011 at 9:00 AM, Tejun Heo <tj@kernel.org> wrote:
> > Actually, the only thing we need on x86_64 is two bytes for the
> > syscall instruction because all params are passed through registers
> > anyway.  We can just set up parameters for mmap, turn on single step,
> > point %rip to syscall in the vsyscall page.  So, either way, I don't
> > think this would be too difficult to solve.
> 
> Not any more -- that syscall instruction is gone as of 3.1.  You could
> search through the vdso to find a syscall, but that seems fragile.
> 
> Why not just add a ptrace command to issue a syscall?

Yeah, maybe.  If this thing proves to be useful enough and looking for
a page to poke under proc too cumbersome.  I'm not against it but
don't really see strong need either at this point.

Thanks.

-- 
tejun


* Re: [RFC 4/4] [powerpc] Implement a p1010rdb clock source.
From: Marc Kleine-Budde @ 2011-08-06 13:58 UTC (permalink / raw)
  To: Robin Holt
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA, U Bhaskar-B22300,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, Wolfgang Grandegger
In-Reply-To: <1312603504-30282-5-git-send-email-holt-sJ/iWh9BUns@public.gmane.org>


[-- Attachment #1.1: Type: text/plain, Size: 4113 bytes --]

On 08/06/2011 06:05 AM, Robin Holt wrote:
> flexcan driver needs the clk_get, clk_get_rate, etc functions
> to work.  This patch provides the minimum functionality.

This patch has to go via the powerpc git tree. Added
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org on CC.

> Signed-off-by: Robin Holt <holt-sJ/iWh9BUns@public.gmane.org>
> To: Marc Kleine-Budde <mkl-bIcnvbaLZ9MEGnE8C9+IrQ@public.gmane.org>
> To: Wolfgang Grandegger <wg-5Yr1BZd7O62+XT7JhA+gdA@public.gmane.org>
> To: U Bhaskar-B22300 <B22300-KZfg59tc24xl57MIdRCFDg@public.gmane.org>
> Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> ---
>  arch/powerpc/platforms/85xx/p1010rdb.c |   78 ++++++++++++++++++++++++++++++++
>  1 files changed, 78 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/85xx/p1010rdb.c b/arch/powerpc/platforms/85xx/p1010rdb.c
> index 3540a88..8f78ddd 100644
> --- a/arch/powerpc/platforms/85xx/p1010rdb.c
> +++ b/arch/powerpc/platforms/85xx/p1010rdb.c
> @@ -28,6 +28,7 @@
>  #include <asm/udbg.h>
>  #include <asm/mpic.h>
>  #include <asm/swiotlb.h>
> +#include <asm/clk_interface.h>
>  
>  #include <sysdev/fsl_soc.h>
>  #include <sysdev/fsl_pci.h>
> @@ -164,6 +165,82 @@ static void __init p1010_rdb_setup_arch(void)
>  	printk(KERN_INFO "P1010 RDB board from Freescale Semiconductor\n");
>  }
>  
> +/*
> + * p1010rdb needs to provide a clock source for the flexcan driver.
> + */
> +struct clk {
> +	unsigned long rate;
> +} p1010rdb_system_clk;
> +
> +static struct clk *p1010_rdb_clk_get(struct device *dev, const char *id)
> +{
> +	struct clk *clk;
> +	u32 *of_property;
> +	unsigned long clock_freq, clock_divider;
> +	const char *dev_init_name;
> +
> +	if (!dev)
> +		return ERR_PTR(-ENOENT);
> +
> +	/*
> +	 * The can devices are named ffe1c000.can0 and ffe1d000.can1 on
> +	 * the p1010rdb.  Check for the "can" portion of that name before
> +	 * returning a clock source.
> +	 */
> +	dev_init_name = dev_name(dev);
> +	if (strlen(dev_init_name) != 13)
> +		return ERR_PTR(-ENOENT);
> +	dev_init_name += 9;
> +	if (strncmp(dev_init_name, "can", 3))
> +		return ERR_PTR(-ENOENT);
> +
> +	of_property = (u32 *)of_get_property(dev->of_node, "clock_freq", NULL);
> +	if (!of_property)
> +		return ERR_PTR(-ENOENT);
> +	clock_freq = *of_property;
> +
> +	of_property = (u32 *)of_get_property(dev->of_node,
> +					     "fsl,flexcan-clock-divider", NULL);
> +	if (!of_property)
> +		return ERR_PTR(-ENOENT);
> +	clock_divider = *of_property;
> +
> +	clk = kmalloc(sizeof(struct clk), GFP_KERNEL);
> +	if (!clk)
> +		return ERR_PTR(-ENOMEM);
> +
> +	clk->rate = DIV_ROUND_CLOSEST(clock_freq / clock_divider, 1000);
> +	clk->rate *= 1000;
> +
> +	return clk;
> +}
> +
> +static void p1010_rdb_clk_put(struct clk *clk)
> +{
> +	kfree(clk);
> +}
> +
> +static unsigned long p1010_rdb_clk_get_rate(struct clk *clk)
> +{
> +	return clk->rate;
> +}
> +
> +static struct clk_interface p1010_rdb_clk_functions = {
> +	.clk_get		= p1010_rdb_clk_get,
> +	.clk_get_rate		= p1010_rdb_clk_get_rate,
> +	.clk_put		= p1010_rdb_clk_put,
> +};
> +
> +static void __init p1010_rdb_clk_init(void)
> +{
> +	clk_functions = p1010_rdb_clk_functions;
> +}
> +
> +static void __init p1010_rdb_init(void)
> +{
> +	p1010_rdb_clk_init();
> +}
> +
>  static struct of_device_id __initdata p1010rdb_ids[] = {
>  	{ .type = "soc", },
>  	{ .compatible = "soc", },
> @@ -195,6 +272,7 @@ define_machine(p1010_rdb) {
>  	.name			= "P1010 RDB",
>  	.probe			= p1010_rdb_probe,
>  	.setup_arch		= p1010_rdb_setup_arch,
> +	.init			= p1010_rdb_init,
>  	.init_IRQ		= p1010_rdb_pic_init,
>  #ifdef CONFIG_PCI
>  	.pcibios_fixup_bus	= fsl_pcibios_fixup_bus,

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |




* [patch] qla3xxx: remove an extra semi-colon
From: Dan Carpenter @ 2011-08-06 14:26 UTC (permalink / raw)
  To: Ron Mercer
  Cc: supporter:QLOGIC QLA3XXX NE..., open list:QLOGIC QLA3XXX NE...,
	kernel-janitors

The define is only used in one place, and it's at the end of a line
so the semi-colon doesn't affect anything.  But let's clean it up
anyway.
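
As a side note on why this is still worth cleaning up, here is a tiny
sketch with made-up flag values (not the ones from ethtool.h) showing
what would go wrong if such a define were ever used in the middle of
an expression.

```c
/* Hypothetical flag values, for illustration only. */
#define FLAG_A 0x1
#define FLAG_B 0x2

/* Usable anywhere an expression fits. */
#define MODES_OK  (FLAG_A | FLAG_B)
/* Only works when it happens to end a statement. */
#define MODES_BAD (FLAG_A | FLAG_B);

static unsigned int modes_plus(unsigned int extra)
{
	/* "MODES_BAD | extra" expands to "(...); | extra" and won't build */
	return MODES_OK | extra;
}
```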

Signed-off-by: Dan Carpenter <error27@gmail.com>

diff --git a/drivers/net/qla3xxx.c b/drivers/net/qla3xxx.c
index 2f69140..ccde806 100644
--- a/drivers/net/qla3xxx.c
+++ b/drivers/net/qla3xxx.c
@@ -1650,7 +1650,7 @@ static int ql_mii_setup(struct ql3_adapter *qdev)
 				 SUPPORTED_1000baseT_Half |	\
 				 SUPPORTED_1000baseT_Full |	\
 				 SUPPORTED_Autoneg |		\
-				 SUPPORTED_TP);			\
+				 SUPPORTED_TP)			\
 
 static u32 ql_supported_modes(struct ql3_adapter *qdev)
 {


* [RFC 0/4] [flexcan] Add support for powerpc (freescale p1010) -V6
From: Robin Holt @ 2011-08-06 14:34 UTC (permalink / raw)
  To: Robin Holt, Marc Kleine-Budde, Wolfgang Grandegger,
	U Bhaskar-B22300
  Cc: socketcan-core-0fE9KPoRgkgATYTw5x5z8w,
	netdev-u79uwXL29TY76Z2rM5mHXA

Marc, Wolfgang or U Bhaskar,

This patch set should have all your comments included.

I did implement a very simple clock source in the p1010rdb.c file, which,
unfortunately, your tree will not have so please do not apply the last
patch in the series.  That will need to go to the powerpc folks and
follow the p1010rdb patch from freescale.

Could you please apply the first three patches to a test branch, compile
and test them on an arm based system?  I would like to at least feel
comfortable that I have not broken anything there.

I have tested the full set on a p1010rdb with an external PSOC based
can communicator.  That PSOC code has a bunch of erroneous can comms it
can generate, but I do not know how the developer of that code injects
those errors.  As a result, no error handling from the can input has been
tested.  I have tested both flexcan interfaces on the board and both work
with these patches in addition to the other p1010rdb patches not included.

This series has one additional patch which replaces the mach/clock.h
with the linux/clkdev.h include.

Thanks,
Robin Holt


