Netdev List


* RE: [PATCH] net: use hardware buffer pool to allocate skb
From: Jiafei.Pan @ 2014-10-16  2:30 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: davem@davemloft.net, jkosina@suse.cz, netdev@vger.kernel.org,
	LeoLi@freescale.com, linux-doc@vger.kernel.org,
	Jiafei.Pan@freescale.com
In-Reply-To: <20141015113323.5321b2f7@uryu.home.lan>



> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, October 15, 2014 5:33 PM
> To: Pan Jiafei-B37022
> Cc: davem@davemloft.net; jkosina@suse.cz; netdev@vger.kernel.org; Li Yang-Leo-
> R58472; linux-doc@vger.kernel.org
> Subject: Re: [PATCH] net: use hardware buffer pool to allocate skb
> 
> Since an skb can sit forever in an application queue, you have created
> an easy way to livelock the system when enough skb's are waiting to be
> read.

I do not think this can livelock the system. In my patch,
__netdev_alloc_skb() first tries to allocate from the hardware buffer pool
if dev->alloc_hw_skb is set, but it falls back to allocating a normal
skb buffer if the hardware buffer allocation fails.
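
A minimal user-space sketch of the fallback order described above. It is not
the code from the patch: struct mock_dev, alloc_hw_buf, hw_pool_empty and
alloc_buf are hypothetical names, and malloc() stands in for the normal skb
allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a net device; only the optional hook matters. */
struct mock_dev {
	void *(*alloc_hw_buf)(size_t len);	/* NULL when there is no hardware pool */
};

/* Simulates an exhausted hardware pool: every request fails. */
static void *hw_pool_empty(size_t len)
{
	(void)len;
	return NULL;
}

/* Try the hardware pool first; fall back to the normal allocator on failure.
 * *from_hw reports which path supplied the buffer.
 */
static void *alloc_buf(const struct mock_dev *dev, size_t len, int *from_hw)
{
	void *buf;

	*from_hw = 0;
	if (dev->alloc_hw_buf) {
		buf = dev->alloc_hw_buf(len);
		if (buf) {
			*from_hw = 1;
			return buf;
		}
	}
	return malloc(len);	/* normal path, also used when the pool is empty */
}
```

The sketch only shows that an empty hardware pool does not make the allocation
itself fail. It says nothing about how long hardware buffers may stay queued.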


* RE: [PATCH] net: use hardware buffer pool to allocate skb
From: Jiafei.Pan @ 2014-10-16  2:17 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, jkosina@suse.cz, netdev@vger.kernel.org,
	LeoLi@freescale.com, linux-doc@vger.kernel.org,
	Jiafei.Pan@freescale.com
In-Reply-To: <1413364533.12304.44.camel@edumazet-glaptop2.roam.corp.google.com>


> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Wednesday, October 15, 2014 5:16 PM
> To: Pan Jiafei-B37022
> Cc: David Miller; jkosina@suse.cz; netdev@vger.kernel.org; Li Yang-Leo-R58472;
> linux-doc@vger.kernel.org
> Subject: Re: [PATCH] net: use hardware buffer pool to allocate skb
> 
> On Wed, 2014-10-15 at 05:34 +0000, Jiafei.Pan@freescale.com wrote:
> 
> > Yes, for this matter, in order to do this we should modify the Ethernet
> > drivers. For example, driver A want to driver B, C, D.. to support driver
> > A's Hardware block access functions, so we have to modify driver B, C, D...
> > It will be so complex for this matter.
> >
> > But by using my patch, I just modify a Ethernet device (I don't care
> > Which driver it is used) flag in driver A in order to implement this
> > Ethernet device using hardware block access functions provided by
> > Driver A.
> 
> We care a lot of all the bugs added by your patches. You have little
> idea of how many of them were added. We do not want to spend days of
> work explaining everything or fixing all the details for you.
> 
> Carefully read net/core/skbuff.c, net/core/dev.c, GRO layer, you'll see
> how many spots you missed.
> 
> You cannot control how skbs are cooked before reaching your driver
> ndo_start_xmit(). You are not going to add hooks in UDP , TCP, or other
> drivers RX path. This would be absolutely insane.
> 
> Trying to control how skb are cooked in RX path is absolutely something
> drivers do, using page frags that are read-only by all the stack.
> 
> Fix your driver to use existing infra, your suggestion is not going to
> be accepted.
> 
I think my patch can connect a hardware buffer pool with third-party
network card drivers. This should be a general requirement for getting
better performance. Yes, there may be some defects in my patch, but any
comments and suggestions toward this goal are welcome. Thanks.



* RE: [PATCH] net: use hardware buffer pool to allocate skb
From: Jiafei.Pan @ 2014-10-16  2:17 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, jkosina@suse.cz, netdev@vger.kernel.org,
	LeoLi@freescale.com, linux-doc@vger.kernel.org,
	Jiafei.Pan@freescale.com
In-Reply-To: <1413364533.12304.44.camel@edumazet-glaptop2.roam.corp.google.com>


> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Wednesday, October 15, 2014 5:16 PM
> To: Pan Jiafei-B37022
> Cc: David Miller; jkosina@suse.cz; netdev@vger.kernel.org; Li Yang-Leo-R58472;
> linux-doc@vger.kernel.org
> Subject: Re: [PATCH] net: use hardware buffer pool to allocate skb
> 
> On Wed, 2014-10-15 at 05:34 +0000, Jiafei.Pan@freescale.com wrote:
> 
> > Yes, for this matter, in order to do this we should modify the Ethernet
> > drivers. For example, driver A want to driver B, C, D.. to support driver
> > A's Hardware block access functions, so we have to modify driver B, C, D...
> > It will be so complex for this matter.
> >
> > But by using my patch, I just modify a Ethernet device (I don't care
> > Which driver it is used) flag in driver A in order to implement this
> > Ethernet device using hardware block access functions provided by
> > Driver A.
> 
> We care a lot of all the bugs added by your patches. You have little
> idea of how many of them were added. We do not want to spend days of
> work explaining everything or fixing all the details for you.
> 
> Carefully read net/core/skbuff.c, net/core/dev.c, GRO layer, you'll see
> how many spots you missed.
> 
> You cannot control how skbs are cooked before reaching your driver
> ndo_start_xmit(). You are not going to add hooks in UDP , TCP, or other
> drivers RX path. This would be absolutely insane.
> 

Thanks for your comments and suggestions. In my case, I want to build skbs
from memory specified by the hardware block. I can see only two ways: one is
to modify the network card driver to replace the common skb allocation
function with my special functions; the other is to hack the common skb
allocation function so that it redirects to my special functions. My patch
takes the second approach. Apart from these two ways, could you please
suggest any other approach for my case? Thanks.

> Trying to control how skb are cooked in RX path is absolutely something
> drivers do, using page frags that are read-only by all the stack.
> 
> Fix your driver to use existing infra, your suggestion is not going to
> be accepted.
> 



* [PATCH net-next] cxgb4 : Improve handling of DCB negotiation or loss thereof
From: Anish Bhatt @ 2014-10-16  2:08 UTC (permalink / raw)
  To: netdev; +Cc: davem, hariprasad, leedom, Anish Bhatt

Clear out any DCB apps we might have added to kernel table when we lose DCB
sync (or IEEE equivalent event). IEEE allows individual components to work
independently, so improve check for IEEE completion by specifying individual
components.

Signed-off-by: Anish Bhatt <anish@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c | 48 ++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
index 8edf0f5bd679..ee819fd12bd2 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_dcb.c
@@ -60,6 +60,42 @@ void cxgb4_dcb_version_init(struct net_device *dev)
 	dcb->dcb_version = FW_PORT_DCB_VER_AUTO;
 }
 
+static void cxgb4_dcb_cleanup_apps(struct net_device *dev)
+{
+	struct port_info *pi = netdev2pinfo(dev);
+	struct adapter *adap = pi->adapter;
+	struct port_dcb_info *dcb = &pi->dcb;
+	struct dcb_app app;
+	int i, err;
+
+	/* zero priority implies remove */
+	app.priority = 0;
+
+	for (i = 0; i < CXGB4_MAX_DCBX_APP_SUPPORTED; i++) {
+		/* Check if app list is exhausted */
+		if (!dcb->app_priority[i].protocolid)
+			break;
+
+		app.protocol = dcb->app_priority[i].protocolid;
+
+		if (dcb->dcb_version == FW_PORT_DCB_VER_IEEE) {
+			app.selector = dcb->app_priority[i].sel_field + 1;
+			err = dcb_ieee_setapp(dev, &app);
+		} else {
+			app.selector = !!(dcb->app_priority[i].sel_field);
+			err = dcb_setapp(dev, &app);
+		}
+
+		if (err) {
+			dev_err(adap->pdev_dev,
+				"Failed DCB Clear %s Application Priority: sel=%d, prot=%d, err=%d\n",
+				dcb_ver_array[dcb->dcb_version], app.selector,
+				app.protocol, -err);
+			break;
+		}
+	}
+}
+
 /* Finite State machine for Data Center Bridging.
  */
 void cxgb4_dcb_state_fsm(struct net_device *dev,
@@ -145,6 +181,7 @@ void cxgb4_dcb_state_fsm(struct net_device *dev,
 			 * state.  We need to reset back to a ground state
 			 * of incomplete.
 			 */
+			cxgb4_dcb_cleanup_apps(dev);
 			cxgb4_dcb_state_init(dev);
 			dcb->state = CXGB4_DCB_STATE_FW_INCOMPLETE;
 			dcb->supported = CXGB4_DCBX_FW_SUPPORT;
@@ -833,11 +870,16 @@ static int cxgb4_setapp(struct net_device *dev, u8 app_idtype, u16 app_id,
 
 /* Return whether IEEE Data Center Bridging has been negotiated.
  */
-static inline int cxgb4_ieee_negotiation_complete(struct net_device *dev)
+static inline int
+cxgb4_ieee_negotiation_complete(struct net_device *dev,
+				enum cxgb4_dcb_fw_msgs dcb_subtype)
 {
 	struct port_info *pi = netdev2pinfo(dev);
 	struct port_dcb_info *dcb = &pi->dcb;
 
+	if (dcb_subtype && !(dcb->msgs & dcb_subtype))
+		return 0;
+
 	return (dcb->state == CXGB4_DCB_STATE_FW_ALLSYNCED &&
 		(dcb->supported & DCB_CAP_DCBX_VER_IEEE));
 }
@@ -850,7 +892,7 @@ static int cxgb4_ieee_getapp(struct net_device *dev, struct dcb_app *app)
 {
 	int prio;
 
-	if (!cxgb4_ieee_negotiation_complete(dev))
+	if (!cxgb4_ieee_negotiation_complete(dev, CXGB4_DCB_FW_APP_ID))
 		return -EINVAL;
 	if (!(app->selector && app->protocol))
 		return -EINVAL;
@@ -872,7 +914,7 @@ static int cxgb4_ieee_setapp(struct net_device *dev, struct dcb_app *app)
 {
 	int ret;
 
-	if (!cxgb4_ieee_negotiation_complete(dev))
+	if (!cxgb4_ieee_negotiation_complete(dev, CXGB4_DCB_FW_APP_ID))
 		return -EINVAL;
 	if (!(app->selector && app->protocol))
 		return -EINVAL;
-- 
2.1.2
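
The per-component check added to cxgb4_ieee_negotiation_complete() can be
sketched in user space as follows. This is an illustration under assumptions,
not driver code: struct mock_dcb and the MOCK_DCB_FW_* bit values are
hypothetical stand-ins for the driver's port_dcb_info and cxgb4_dcb_fw_msgs.

```c
#include <stdint.h>

/* Hypothetical bit values; the driver defines its own in cxgb4_dcb.h. */
enum mock_dcb_msg {
	MOCK_DCB_FW_PGID   = 1 << 0,
	MOCK_DCB_FW_PFC    = 1 << 1,
	MOCK_DCB_FW_APP_ID = 1 << 2,
};

/* Hypothetical, reduced view of the per-port DCB state. */
struct mock_dcb {
	int all_synced;		/* state == ..._FW_ALLSYNCED in the driver */
	int ieee_supported;	/* supported & DCB_CAP_DCBX_VER_IEEE */
	uint32_t msgs;		/* bitmask of components received so far */
};

/* A non-zero subtype must have been received; zero means "no specific
 * component", so only the overall state is checked.
 */
static int negotiation_complete(const struct mock_dcb *dcb, uint32_t subtype)
{
	if (subtype && !(dcb->msgs & subtype))
		return 0;

	return dcb->all_synced && dcb->ieee_supported;
}
```

With only PFC received, a query for the APP_ID component reports incomplete,
while a query with subtype 0 still depends only on the overall state.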


* [PATCH] vxlan: using pskb_may_pull as early as possible
From: roy.qing.li @ 2014-10-16  1:17 UTC (permalink / raw)
  To: netdev; +Cc: xiyou.wangcong

From: Li RongQing <roy.qing.li@gmail.com>

pskb_may_pull() should be used to check that enough data is available at
skb->data; skb->len alone cannot ensure that.

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 drivers/net/vxlan.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index faf1bd1..77ab844 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1437,9 +1437,6 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb)
 	if (!in6_dev)
 		goto out;
 
-	if (!pskb_may_pull(skb, skb->len))
-		goto out;
-
 	iphdr = ipv6_hdr(skb);
 	saddr = &iphdr->saddr;
 	daddr = &iphdr->daddr;
@@ -1880,7 +1877,8 @@ static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev)
 			return arp_reduce(dev, skb);
 #if IS_ENABLED(CONFIG_IPV6)
 		else if (ntohs(eth->h_proto) == ETH_P_IPV6 &&
-			 skb->len >= sizeof(struct ipv6hdr) + sizeof(struct nd_msg) &&
+			 pskb_may_pull(skb, sizeof(struct ipv6hdr)
+				       + sizeof(struct nd_msg)) &&
 			 ipv6_hdr(skb)->nexthdr == IPPROTO_ICMPV6) {
 				struct nd_msg *msg;
 
-- 
1.7.10.4
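
The difference between the two checks in the vxlan patch above can be
illustrated with a simplified user-space model. Everything here is
hypothetical (struct mock_skb, mock_may_pull, len_check_only); the real
pskb_may_pull() copies data from fragments into the linear area, which this
sketch only imitates by adjusting a counter.

```c
#include <stddef.h>

/* Hypothetical model of a packet buffer: 'len' is the total payload length,
 * 'headlen' is the part that is directly readable at the data pointer.
 */
struct mock_skb {
	size_t len;
	size_t headlen;
};

/* In the spirit of pskb_may_pull(): succeed only if 'need' bytes can be made
 * linear. On success the linear area covers at least 'need' bytes.
 */
static int mock_may_pull(struct mock_skb *skb, size_t need)
{
	if (need <= skb->headlen)
		return 1;
	if (need > skb->len)
		return 0;
	skb->headlen = need;	/* the real helper copies from fragments here */
	return 1;
}

/* The check the patch replaces: it looks at the total length only. */
static int len_check_only(const struct mock_skb *skb, size_t need)
{
	return skb->len >= need;
}
```

For a buffer with 100 bytes in total but only 20 linear bytes, the length-only
check passes for a 72-byte header even though those bytes are not yet readable
through the data pointer.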


* [PATCH net-next 4/4] cxgb4: Add support for mps_tcam debugfs
From: Hariprasad Shenai @ 2014-10-16  1:22 UTC (permalink / raw)
  To: netdev; +Cc: davem, leedom, kumaras, nirranjan, santosh, anish,
	Hariprasad Shenai
In-Reply-To: <1413422524-14054-1-git-send-email-hariprasad@chelsio.com>

Add a debugfs entry (mps_tcam) to dump the MPS TCAM table.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c |  130 ++++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h       |   16 +++
 2 files changed, 146 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index e8b84dc..7d114ed 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -435,6 +435,135 @@ static const struct file_operations devlog_fops = {
 	.release = seq_release_private
 };
 
+static inline void tcamxy2valmask(u64 x, u64 y, u8 *addr, u64 *mask)
+{
+	*mask = x | y;
+	y = (__force u64)cpu_to_be64(y);
+	memcpy(addr, (char *)&y + 2, ETH_ALEN);
+}
+
+static int mps_tcam_show(struct seq_file *seq, void *v)
+{
+	if (v == SEQ_START_TOKEN)
+		seq_puts(seq, "Idx  Ethernet address     Mask     Vld Ports PF"
+			 "  VF              Replication             P0 P1 P2 P3  ML\n");
+	else {
+		u64 mask;
+		u8 addr[ETH_ALEN];
+		struct adapter *adap = seq->private;
+		unsigned int idx = (uintptr_t)v - 2;
+		u64 tcamy = t4_read_reg64(adap, MPS_CLS_TCAM_Y_L(idx));
+		u64 tcamx = t4_read_reg64(adap, MPS_CLS_TCAM_X_L(idx));
+		u32 cls_lo = t4_read_reg(adap, MPS_CLS_SRAM_L(idx));
+		u32 cls_hi = t4_read_reg(adap, MPS_CLS_SRAM_H(idx));
+		u32 rplc[4] = {0, 0, 0, 0};
+
+		if (tcamx & tcamy) {
+			seq_printf(seq, "%3u         -\n", idx);
+			goto out;
+		}
+
+		if (cls_lo & REPLICATE) {
+			struct fw_ldst_cmd ldst_cmd;
+			int ret;
+
+			memset(&ldst_cmd, 0, sizeof(ldst_cmd));
+			ldst_cmd.op_to_addrspace =
+				htonl(FW_CMD_OP(FW_LDST_CMD) |
+				      FW_CMD_REQUEST |
+				      FW_CMD_READ |
+				      FW_LDST_CMD_ADDRSPACE(
+					      FW_LDST_ADDRSPC_MPS));
+			ldst_cmd.cycles_to_len16 = htonl(FW_LEN16(ldst_cmd));
+			ldst_cmd.u.mps.fid_ctl =
+				htons(FW_LDST_CMD_FID(FW_LDST_MPS_RPLC) |
+				      FW_LDST_CMD_CTL(idx));
+			ret = t4_wr_mbox(adap, adap->mbox, &ldst_cmd,
+					 sizeof(ldst_cmd), &ldst_cmd);
+			if (ret)
+				dev_warn(adap->pdev_dev, "Can't read MPS "
+					 "replication map for idx %d: %d\n",
+					 idx, -ret);
+			else {
+				rplc[0] = ntohl(ldst_cmd.u.mps.rplc31_0);
+				rplc[1] = ntohl(ldst_cmd.u.mps.rplc63_32);
+				rplc[2] = ntohl(ldst_cmd.u.mps.rplc95_64);
+				rplc[3] = ntohl(ldst_cmd.u.mps.rplc127_96);
+			}
+		}
+
+		tcamxy2valmask(tcamx, tcamy, addr, &mask);
+		seq_printf(seq, "%3u %02x:%02x:%02x:%02x:%02x:%02x %012llx"
+			   "%3c   %#x%4u%4d",
+			   idx, addr[0], addr[1], addr[2], addr[3], addr[4],
+			   addr[5], (unsigned long long)mask,
+			   (cls_lo & SRAM_VLD) ? 'Y' : 'N', PORTMAP_GET(cls_hi),
+			   PF_GET(cls_lo),
+			   (cls_lo & VF_VALID) ? VF_GET(cls_lo) : -1);
+		if (cls_lo & REPLICATE)
+			seq_printf(seq, " %08x %08x %08x %08x",
+				   rplc[3], rplc[2], rplc[1], rplc[0]);
+		else
+			seq_printf(seq, "%36c", ' ');
+		seq_printf(seq, "%4u%3u%3u%3u %#x\n",
+			   SRAM_PRIO0_GET(cls_lo), SRAM_PRIO1_GET(cls_lo),
+			   SRAM_PRIO2_GET(cls_lo), SRAM_PRIO3_GET(cls_lo),
+			   (cls_lo >> MULTILISTEN0_SHIFT) & 0xf);
+	}
+out:	return 0;
+}
+
+static inline void *mps_tcam_get_idx(struct seq_file *seq, loff_t pos)
+{
+	struct adapter *adap = seq->private;
+	int max_mac_addr = is_t4(adap->params.chip) ?
+				NUM_MPS_CLS_SRAM_L_INSTANCES :
+				NUM_MPS_T5_CLS_SRAM_L_INSTANCES;
+	return ((pos <= max_mac_addr) ? (void *)(uintptr_t)(pos + 1) : NULL);
+}
+
+static void *mps_tcam_start(struct seq_file *seq, loff_t *pos)
+{
+	return *pos ? mps_tcam_get_idx(seq, *pos) : SEQ_START_TOKEN;
+}
+
+static void *mps_tcam_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	++*pos;
+	return mps_tcam_get_idx(seq, *pos);
+}
+
+static void mps_tcam_stop(struct seq_file *seq, void *v)
+{
+}
+
+static const struct seq_operations mps_tcam_seq_ops = {
+	.start = mps_tcam_start,
+	.next  = mps_tcam_next,
+	.stop  = mps_tcam_stop,
+	.show  = mps_tcam_show
+};
+
+static int mps_tcam_open(struct inode *inode, struct file *file)
+{
+	int res = seq_open(file, &mps_tcam_seq_ops);
+
+	if (!res) {
+		struct seq_file *seq = file->private_data;
+
+		seq->private = inode->i_private;
+	}
+	return res;
+}
+
+static const struct file_operations mps_tcam_debugfs_fops = {
+	.owner   = THIS_MODULE,
+	.open    = mps_tcam_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = seq_release,
+};
+
 static ssize_t mem_read(struct file *file, char __user *buf, size_t count,
 			loff_t *ppos)
 {
@@ -517,6 +646,7 @@ int t4_setup_debugfs(struct adapter *adap)
 		{ "cim_qcfg", &cim_qcfg_fops, S_IRUSR, 0 },
 		{ "devlog", &devlog_fops, S_IRUSR, 0 },
 		{ "l2t", &t4_l2t_fops, S_IRUSR, 0},
+		{ "mps_tcam", &mps_tcam_debugfs_fops, S_IRUSR, 0 },
 	};
 
 	add_debugfs_files(adap,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
index 562cd70..6c4369e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
@@ -1054,6 +1054,22 @@
 
 #define MPS_RX_PERR_INT_CAUSE 0x11074
 
+#define MPS_CLS_TCAM_Y_L(idx) (0xf000 + (idx) * 16)
+#define MPS_CLS_TCAM_X_L(idx) (0xf008 + (idx) * 16)
+#define MPS_CLS_SRAM_L(idx) (0xe000 + (idx) * 8)
+#define MPS_CLS_SRAM_H(idx) (0xe004 + (idx) * 8)
+#define REPLICATE    (1U << 11)
+#define SRAM_VLD    (1U << 12)
+#define SRAM_PRIO0_GET(x) (((x) >> 13) & 0x7U)
+#define SRAM_PRIO1_GET(x) (((x) >> 16) & 0x7U)
+#define SRAM_PRIO2_GET(x) (((x) >> 19) & 0x7U)
+#define SRAM_PRIO3_GET(x) (((x) >> 22) & 0x7U)
+#define PORTMAP_GET(x) (((x) >> 0) & 0xfU)
+#define PF_GET(x) (((x) >> 8) & 0x7U)
+#define VF_GET(x) (((x) >> 0) & 0x7fU)
+#define VF_VALID    (1U << 7)
+#define MULTILISTEN0_SHIFT    25
+
 #define CPL_INTR_CAUSE 0x19054
 #define  CIM_OP_MAP_PERR   0x00000020U
 #define  CIM_OVFL_ERROR    0x00000010U
-- 
1.7.1
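
A portable user-space sketch of what tcamxy2valmask() in the patch above
computes: the mask is X | Y, and the Ethernet address is the low 48 bits of Y
in network byte order. The names xy_to_val_mask and entry_is_unused are
hypothetical, and the shift loop replaces the driver's cpu_to_be64() plus
memcpy() so the sketch does not depend on host endianness.

```c
#include <stdint.h>

#define MAC_LEN 6

/* Derive the address and mask from a TCAM X/Y pair. */
static void xy_to_val_mask(uint64_t x, uint64_t y, uint8_t *addr,
			   uint64_t *mask)
{
	int i;

	*mask = x | y;
	/* Most significant of the low 48 bits first, as on the wire. */
	for (i = 0; i < MAC_LEN; i++)
		addr[i] = (uint8_t)(y >> (8 * (MAC_LEN - 1 - i)));
}

/* The patch prints a dash for entries where X and Y have a bit in common. */
static int entry_is_unused(uint64_t x, uint64_t y)
{
	return (x & y) != 0;
}
```

For example, Y = 0x112233445566 with X holding the complementary 48 bits
yields the address 11:22:33:44:55:66 and a mask of 0xffffffffffff.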


* [PATCH net-next 3/4] cxgb4: Add support for cim_la_qcfg entry in debugfs
From: Hariprasad Shenai @ 2014-10-16  1:22 UTC (permalink / raw)
  To: netdev; +Cc: davem, leedom, kumaras, nirranjan, santosh, anish,
	Hariprasad Shenai
In-Reply-To: <1413422524-14054-1-git-send-email-hariprasad@chelsio.com>

Add a debugfs entry (cim_qcfg) to dump the CIM queue configuration.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |    1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c |   70 ++++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c         |   35 ++++++++++
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.h         |    3 +
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h       |   22 ++++++
 5 files changed, 131 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index d8a62cc..ab3f925 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -1020,6 +1020,7 @@ int t4_cim_read(struct adapter *adap, unsigned int addr, unsigned int n,
 int t4_cim_write(struct adapter *adap, unsigned int addr, unsigned int n,
 		 const unsigned int *valp);
 int t4_cim_read_la(struct adapter *adap, u32 *la_buf, unsigned int *wrptr);
+void t4_read_cimq_cfg(struct adapter *adap, u16 *base, u16 *size, u16 *thres);
 const char *t4_get_port_type_description(enum fw_port_type port_type);
 void t4_get_port_stats(struct adapter *adap, int idx, struct port_stats *p);
 void t4_read_mtu_tbl(struct adapter *adap, u16 *mtus, u8 *mtu_log);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 04b490e..e8b84dc 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -168,6 +168,75 @@ static const struct file_operations cim_la_fops = {
 	.release = seq_release_private
 };
 
+static int cim_qcfg_show(struct seq_file *seq, void *v)
+{
+	static const char * const qname[] = {
+		"TP0", "TP1", "ULP", "SGE0", "SGE1", "NC-SI",
+		"ULP0", "ULP1", "ULP2", "ULP3", "SGE", "NC-SI",
+		"SGE0-RX", "SGE1-RX"
+	};
+
+	int i;
+	struct adapter *adap = seq->private;
+	u16 base[CIM_NUM_IBQ + CIM_NUM_OBQ_T5];
+	u16 size[CIM_NUM_IBQ + CIM_NUM_OBQ_T5];
+	u32 stat[(4 * (CIM_NUM_IBQ + CIM_NUM_OBQ_T5))];
+	u16 thres[CIM_NUM_IBQ];
+	u32 obq_wr_t4[2 * CIM_NUM_OBQ], *wr;
+	u32 obq_wr_t5[2 * CIM_NUM_OBQ_T5];
+	u32 *p = stat;
+	int cim_num_obq = is_t4(adap->params.chip) ?
+				CIM_NUM_OBQ : CIM_NUM_OBQ_T5;
+
+	i = t4_cim_read(adap, is_t4(adap->params.chip) ? UP_IBQ_0_RDADDR :
+			UP_IBQ_0_SHADOW_RDADDR,
+			ARRAY_SIZE(stat), stat);
+	if (!i) {
+		if (is_t4(adap->params.chip)) {
+			i = t4_cim_read(adap, UP_OBQ_0_REALADDR,
+					ARRAY_SIZE(obq_wr_t4), obq_wr_t4);
+			wr = obq_wr_t4;
+		} else {
+			i = t4_cim_read(adap, UP_OBQ_0_SHADOW_REALADDR,
+					ARRAY_SIZE(obq_wr_t5), obq_wr_t5);
+			wr = obq_wr_t5;
+		}
+	}
+	if (i)
+		return i;
+
+	t4_read_cimq_cfg(adap, base, size, thres);
+
+	seq_printf(seq,
+		   "  Queue  Base  Size Thres  RdPtr WrPtr  SOP  EOP Avail\n");
+	for (i = 0; i < CIM_NUM_IBQ; i++, p += 4)
+		seq_printf(seq, "%7s %5x %5u %5u %6x  %4x %4u %4u %5u\n",
+			   qname[i], base[i], size[i], thres[i],
+			   IBQRDADDR_GET(p[0]), IBQWRADDR_GET(p[1]),
+			   QUESOPCNT_GET(p[3]), QUEEOPCNT_GET(p[3]),
+			   QUEREMFLITS_GET(p[2]) * 16);
+	for ( ; i < CIM_NUM_IBQ + cim_num_obq; i++, p += 4, wr += 2)
+		seq_printf(seq, "%7s %5x %5u %12x  %4x %4u %4u %5u\n",
+			   qname[i], base[i], size[i],
+			   QUERDADDR_GET(p[0]) & 0x3fff, wr[0] - base[i],
+			   QUESOPCNT_GET(p[3]), QUEEOPCNT_GET(p[3]),
+			   QUEREMFLITS_GET(p[2]) * 16);
+	return 0;
+}
+
+static int cim_qcfg_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, cim_qcfg_show, inode->i_private);
+}
+
+static const struct file_operations cim_qcfg_fops = {
+	.owner   = THIS_MODULE,
+	.open    = cim_qcfg_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = single_release,
+};
+
 /* Firmware Device Log dump. */
 static const char * const devlog_level_strings[] = {
 	[FW_DEVLOG_LEVEL_EMERG]		= "EMERG",
@@ -445,6 +514,7 @@ int t4_setup_debugfs(struct adapter *adap)
 
 	static struct t4_debugfs_entry t4_debugfs_files[] = {
 		{ "cim_la", &cim_la_fops, S_IRUSR, 0 },
+		{ "cim_qcfg", &cim_qcfg_fops, S_IRUSR, 0 },
 		{ "devlog", &devlog_fops, S_IRUSR, 0 },
 		{ "l2t", &t4_l2t_fops, S_IRUSR, 0},
 	};
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 81e1ef2..900f275 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -4128,6 +4128,41 @@ int t4_port_init(struct adapter *adap, int mbox, int pf, int vf)
 }
 
 /**
+ *	t4_read_cimq_cfg - read CIM queue configuration
+ *	@adap: the adapter
+ *	@base: holds the queue base addresses in bytes
+ *	@size: holds the queue sizes in bytes
+ *	@thres: holds the queue full thresholds in bytes
+ *
+ *	Returns the current configuration of the CIM queues, starting with
+ *	the IBQs, then the OBQs.
+ */
+void t4_read_cimq_cfg(struct adapter *adap, u16 *base, u16 *size, u16 *thres)
+{
+	unsigned int i, v;
+	int cim_num_obq = is_t4(adap->params.chip) ?
+				CIM_NUM_OBQ : CIM_NUM_OBQ_T5;
+
+	for (i = 0; i < CIM_NUM_IBQ; i++) {
+		t4_write_reg(adap, CIM_QUEUE_CONFIG_REF, IBQSELECT |
+			     QUENUMSELECT(i));
+		v = t4_read_reg(adap, CIM_QUEUE_CONFIG_CTRL);
+		/* value is in 256-byte units */
+		*base++ = CIMQBASE_GET(v) * 256;
+		*size++ = CIMQSIZE_GET(v) * 256;
+		*thres++ = QUEFULLTHRSH_GET(v) * 8; /* 8-byte unit */
+	}
+	for (i = 0; i < cim_num_obq; i++) {
+		t4_write_reg(adap, CIM_QUEUE_CONFIG_REF, OBQSELECT |
+			     QUENUMSELECT(i));
+		v = t4_read_reg(adap, CIM_QUEUE_CONFIG_CTRL);
+		/* value is in 256-byte units */
+		*base++ = CIMQBASE_GET(v) * 256;
+		*size++ = CIMQSIZE_GET(v) * 256;
+	}
+}
+
+/**
  *	t4_cim_read - read a block from CIM internal address space
  *	@adap: the adapter
  *	@addr: the start address within the CIM address space
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
index bcc925b..f6b82da 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
@@ -56,6 +56,9 @@ enum {
 };
 
 enum {
+	CIM_NUM_IBQ    = 6,     /* # of CIM IBQs */
+	CIM_NUM_OBQ    = 6,     /* # of CIM OBQs */
+	CIM_NUM_OBQ_T5 = 8,     /* # of CIM OBQs for T5 adapter */
 	CIMLA_SIZE     = 2048,  /* # of 32-bit words in CIM LA */
 };
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
index 4f6d6fa..562cd70 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
@@ -623,6 +623,28 @@
 #define  ILLTRANSINT      0x00000002U
 #define  RSVDSPACEINT     0x00000001U
 
+#define CIM_QUEUE_CONFIG_REF 0x7b48
+#define CIM_QUEUE_CONFIG_CTRL 0x7b4c
+
+#define CIMQBASE_GET(x) (((x) >> 16) & 0x3fU)
+#define CIMQSIZE_GET(x) (((x) >> 24) & 0x3fU)
+#define QUEFULLTHRSH_GET(x) (((x) >> 0) & 0x1ffU)
+
+#define UP_IBQ_0_RDADDR 0x10
+#define UP_IBQ_0_SHADOW_RDADDR 0x280
+#define UP_OBQ_0_REALADDR 0x104
+#define UP_OBQ_0_SHADOW_REALADDR 0x394
+
+#define IBQRDADDR_GET(x) (((x) >> 0) & 0x1fffU)
+#define IBQWRADDR_GET(x) (((x) >> 0) & 0x1fffU)
+#define QUERDADDR_GET(x) (((x) >> 0) & 0x7fffU)
+#define QUESOPCNT_GET(x) (((x) >> 0) & 0xfffU)
+#define QUEEOPCNT_GET(x) (((x) >> 16) & 0xfffU)
+#define QUEREMFLITS_GET(x) (((x) >> 0) & 0x7ffU)
+#define OBQSELECT	(1U << 4)
+#define IBQSELECT	(1U << 3)
+#define QUENUMSELECT(x) ((x) << 0)
+
 #define TP_OUT_CONFIG 0x7d04
 #define  VLANEXTENABLE_MASK  0x0000f000U
 #define  VLANEXTENABLE_SHIFT 12
-- 
1.7.1
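
The field decoding done by t4_read_cimq_cfg() in the patch above can be tried
out in user space. The shifts, masks and unit scaling are copied from the
CIMQBASE_GET, CIMQSIZE_GET and QUEFULLTHRSH_GET macros and the comments in the
patch; struct cimq_cfg and decode_cimq are hypothetical names, and no
hardware register is read.

```c
#include <stdint.h>

/* Hypothetical container for one decoded queue configuration word. */
struct cimq_cfg {
	uint16_t base;	/* queue base address in bytes */
	uint16_t size;	/* queue size in bytes */
	uint16_t thres;	/* queue-full threshold in bytes */
};

/* Decode one CIM_QUEUE_CONFIG_CTRL value as the patch does. */
static struct cimq_cfg decode_cimq(uint32_t v)
{
	struct cimq_cfg c;

	c.base  = (uint16_t)(((v >> 16) & 0x3fU) * 256);	/* 256-byte units */
	c.size  = (uint16_t)(((v >> 24) & 0x3fU) * 256);	/* 256-byte units */
	c.thres = (uint16_t)((v & 0x1ffU) * 8);		/* 8-byte units */
	return c;
}
```

The largest 6-bit field value is 0x3f, so base and size never exceed
0x3f * 256 = 16128 and fit in the u16 arrays the patch uses.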


* [PATCH net-next 2/4] cxgb4: Add support for cim_la entry in debugfs
From: Hariprasad Shenai @ 2014-10-16  1:22 UTC (permalink / raw)
  To: netdev; +Cc: davem, leedom, kumaras, nirranjan, santosh, anish,
	Hariprasad Shenai
In-Reply-To: <1413422524-14054-1-git-send-email-hariprasad@chelsio.com>

The CIM LA captures the embedded processor’s internal state. Optionally, it can
also trace the flow of data in and out of the embedded processor. Therefore, the
CIM LA output contains detailed information of what code the embedded processor
executed prior to the CIM LA capture.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |    7 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c |  129 +++++++++++++++++++-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h |   12 ++
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c         |  120 ++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.h         |    4 +
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h       |   14 ++
 6 files changed, 284 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 081b7ef..d8a62cc 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -297,6 +297,8 @@ struct adapter_params {
 	struct devlog_params devlog;
 	enum pcie_memwin drv_memwin;
 
+	unsigned int cim_la_size;
+
 	unsigned int sf_size;             /* serial flash size in bytes */
 	unsigned int sf_nsec;             /* # of flash sectors */
 	unsigned int sf_fw_start;         /* start of FW image in flash */
@@ -1013,6 +1015,11 @@ int t4_mc_read(struct adapter *adap, int idx, u32 addr, __be32 *data,
 	       u64 *parity);
 int t4_edc_read(struct adapter *adap, int idx, u32 addr, __be32 *data,
 		u64 *parity);
+int t4_cim_read(struct adapter *adap, unsigned int addr, unsigned int n,
+		unsigned int *valp);
+int t4_cim_write(struct adapter *adap, unsigned int addr, unsigned int n,
+		 const unsigned int *valp);
+int t4_cim_read_la(struct adapter *adap, u32 *la_buf, unsigned int *wrptr);
 const char *t4_get_port_type_description(enum fw_port_type port_type);
 void t4_get_port_stats(struct adapter *adap, int idx, struct port_stats *p);
 void t4_read_mtu_tbl(struct adapter *adap, u16 *mtus, u8 *mtu_log);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index 0704af5..04b490e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -43,8 +43,132 @@
 #include "cxgb4_debugfs.h"
 #include "l2t.h"
 
-/* Firmware Device Log dump.
- */
+/* generic seq_file support for showing a table of size rows x width. */
+static void *seq_tab_get_idx(struct seq_tab *tb, loff_t pos)
+{
+	pos -= tb->skip_first;
+	return pos >= tb->rows ? NULL : &tb->data[pos * tb->width];
+}
+
+static void *seq_tab_start(struct seq_file *seq, loff_t *pos)
+{
+	struct seq_tab *tb = seq->private;
+
+	if (tb->skip_first && *pos == 0)
+		return SEQ_START_TOKEN;
+
+	return seq_tab_get_idx(tb, *pos);
+}
+
+static void *seq_tab_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	v = seq_tab_get_idx(seq->private, *pos + 1);
+	if (v)
+		++*pos;
+	return v;
+}
+
+static void seq_tab_stop(struct seq_file *seq, void *v)
+{
+}
+
+static int seq_tab_show(struct seq_file *seq, void *v)
+{
+	const struct seq_tab *tb = seq->private;
+
+	return tb->show(seq, v, ((char *)v - tb->data) / tb->width);
+}
+
+static const struct seq_operations seq_tab_ops = {
+	.start = seq_tab_start,
+	.next  = seq_tab_next,
+	.stop  = seq_tab_stop,
+	.show  = seq_tab_show
+};
+
+struct seq_tab *seq_open_tab(struct file *f, unsigned int rows,
+			     unsigned int width, unsigned int have_header,
+			     int (*show)(struct seq_file *seq, void *v, int i))
+{
+	struct seq_tab *p;
+
+	p = __seq_open_private(f, &seq_tab_ops, sizeof(*p) + rows * width);
+	if (p) {
+		p->show = show;
+		p->rows = rows;
+		p->width = width;
+		p->skip_first = have_header != 0;
+	}
+	return p;
+}
+
+static int cim_la_show(struct seq_file *seq, void *v, int idx)
+{
+	if (v == SEQ_START_TOKEN)
+		seq_puts(seq, "Status   Data      PC     LS0Stat  LS0Addr "
+			 "            LS0Data\n");
+	else {
+		const u32 *p = v;
+
+		seq_printf(seq,
+			   "  %02x  %x%07x %x%07x %08x %08x %08x%08x%08x%08x\n",
+			   (p[0] >> 4) & 0xff, p[0] & 0xf, p[1] >> 4,
+			   p[1] & 0xf, p[2] >> 4, p[2] & 0xf, p[3], p[4], p[5],
+			   p[6], p[7]);
+	}
+	return 0;
+}
+
+static int cim_la_show_3in1(struct seq_file *seq, void *v, int idx)
+{
+	if (v == SEQ_START_TOKEN) {
+		seq_puts(seq, "Status   Data      PC\n");
+	} else {
+		const u32 *p = v;
+
+		seq_printf(seq, "  %02x   %08x %08x\n", p[5] & 0xff, p[6],
+			   p[7]);
+		seq_printf(seq, "  %02x   %02x%06x %02x%06x\n",
+			   (p[3] >> 8) & 0xff, p[3] & 0xff, p[4] >> 8,
+			   p[4] & 0xff, p[5] >> 8);
+		seq_printf(seq, "  %02x   %x%07x %x%07x\n", (p[0] >> 4) & 0xff,
+			   p[0] & 0xf, p[1] >> 4, p[1] & 0xf, p[2] >> 4);
+	}
+	return 0;
+}
+
+static int cim_la_open(struct inode *inode, struct file *file)
+{
+	int ret;
+	unsigned int cfg;
+	struct seq_tab *p;
+	struct adapter *adap = inode->i_private;
+
+	ret = t4_cim_read(adap, UP_UP_DBG_LA_CFG, 1, &cfg);
+	if (ret)
+		return ret;
+
+	p = seq_open_tab(file, adap->params.cim_la_size / 8, 8 * sizeof(u32), 1,
+			 cfg & UPDBGLACAPTPCONLY ?
+			 cim_la_show_3in1 : cim_la_show);
+	if (!p)
+		return -ENOMEM;
+
+	ret = t4_cim_read_la(adap, (u32 *)p->data, NULL);
+	if (ret)
+		seq_release_private(inode, file);
+	return ret;
+}
+
+static const struct file_operations cim_la_fops = {
+	.owner   = THIS_MODULE,
+	.open    = cim_la_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = seq_release_private
+};
+
+/* Firmware Device Log dump. */
 static const char * const devlog_level_strings[] = {
 	[FW_DEVLOG_LEVEL_EMERG]		= "EMERG",
 	[FW_DEVLOG_LEVEL_CRIT]		= "CRIT",
@@ -320,6 +444,7 @@ int t4_setup_debugfs(struct adapter *adap)
 	u32 size;
 
 	static struct t4_debugfs_entry t4_debugfs_files[] = {
+		{ "cim_la", &cim_la_fops, S_IRUSR, 0 },
 		{ "devlog", &devlog_fops, S_IRUSR, 0 },
 		{ "l2t", &t4_l2t_fops, S_IRUSR, 0},
 	};
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h
index a3d8867..70fcbc9 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h
@@ -44,6 +44,18 @@ struct t4_debugfs_entry {
 	unsigned char data;
 };
 
+struct seq_tab {
+	int (*show)(struct seq_file *seq, void *v, int idx);
+	unsigned int rows;        /* # of entries */
+	unsigned char width;      /* size in bytes of each entry */
+	unsigned char skip_first; /* whether the first line is a header */
+	char data[0];             /* the table data */
+};
+
+struct seq_tab *seq_open_tab(struct file *f, unsigned int rows,
+			     unsigned int width, unsigned int have_header,
+			     int (*show)(struct seq_file *seq, void *v, int i));
+
 int t4_setup_debugfs(struct adapter *adap);
 void add_debugfs_files(struct adapter *adap,
 		       struct t4_debugfs_entry *files,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 1fff149..81e1ef2 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -3953,6 +3953,7 @@ int t4_prep_adapter(struct adapter *adapter)
 		return -EINVAL;
 	}
 
+	adapter->params.cim_la_size = CIMLA_SIZE;
 	init_cong_ctrl(adapter->params.a_wnd, adapter->params.b_wnd);
 
 	/*
@@ -4125,3 +4126,122 @@ int t4_port_init(struct adapter *adap, int mbox, int pf, int vf)
 	}
 	return 0;
 }
+
+/**
+ *	t4_cim_read - read a block from CIM internal address space
+ *	@adap: the adapter
+ *	@addr: the start address within the CIM address space
+ *	@n: number of words to read
+ *	@valp: where to store the result
+ *
+ *	Reads a block of 4-byte words from the CIM internal address space.
+ */
+int t4_cim_read(struct adapter *adap, unsigned int addr, unsigned int n,
+		unsigned int *valp)
+{
+	int ret = 0;
+
+	if (t4_read_reg(adap, CIM_HOST_ACC_CTRL) & HOSTBUSY)
+		return -EBUSY;
+
+	for ( ; !ret && n--; addr += 4) {
+		t4_write_reg(adap, CIM_HOST_ACC_CTRL, addr);
+		ret = t4_wait_op_done(adap, CIM_HOST_ACC_CTRL, HOSTBUSY,
+				      0, 5, 2);
+		if (!ret)
+			*valp++ = t4_read_reg(adap, CIM_HOST_ACC_DATA);
+	}
+	return ret;
+}
+
+/**
+ *	t4_cim_write - write a block into CIM internal address space
+ *	@adap: the adapter
+ *	@addr: the start address within the CIM address space
+ *	@n: number of words to write
+ *	@valp: set of values to write
+ *
+ *	Writes a block of 4-byte words into the CIM internal address space.
+ */
+int t4_cim_write(struct adapter *adap, unsigned int addr, unsigned int n,
+		 const unsigned int *valp)
+{
+	int ret = 0;
+
+	if (t4_read_reg(adap, CIM_HOST_ACC_CTRL) & HOSTBUSY)
+		return -EBUSY;
+
+	for ( ; !ret && n--; addr += 4) {
+		t4_write_reg(adap, CIM_HOST_ACC_DATA, *valp++);
+		t4_write_reg(adap, CIM_HOST_ACC_CTRL, addr | HOSTWRITE);
+		ret = t4_wait_op_done(adap, CIM_HOST_ACC_CTRL, HOSTBUSY,
+				      0, 5, 2);
+	}
+	return ret;
+}
+
+static int t4_cim_write1(struct adapter *adap, unsigned int addr,
+			 unsigned int val)
+{
+	return t4_cim_write(adap, addr, 1, &val);
+}
+
+/**
+ *	t4_cim_read_la - read CIM LA capture buffer
+ *	@adap: the adapter
+ *	@la_buf: where to store the LA data
+ *	@wrptr: the HW write pointer within the capture buffer
+ *
+ *	Reads the contents of the CIM LA buffer with the most recent entry at
+ *	the end of the returned data and with the entry at @wrptr first.
+ *	We try to leave the LA in the running state we find it in.
+ */
+int t4_cim_read_la(struct adapter *adap, u32 *la_buf, unsigned int *wrptr)
+{
+	int i, ret;
+	unsigned int cfg, val, idx;
+
+	ret = t4_cim_read(adap, UP_UP_DBG_LA_CFG, 1, &cfg);
+	if (ret)
+		return ret;
+
+	if (cfg & UPDBGLAEN) {                /* LA is running, freeze it */
+		ret = t4_cim_write1(adap, UP_UP_DBG_LA_CFG, 0);
+		if (ret)
+			return ret;
+	}
+
+	ret = t4_cim_read(adap, UP_UP_DBG_LA_CFG, 1, &val);
+	if (ret)
+		goto restart;
+
+	idx = UPDBGLAWRPTR_GET(val);
+	if (wrptr)
+		*wrptr = idx;
+
+	for (i = 0; i < adap->params.cim_la_size; i++) {
+		ret = t4_cim_write1(adap, UP_UP_DBG_LA_CFG,
+				    UPDBGLARDPTR(idx) | UPDBGLARDEN);
+		if (ret)
+			break;
+		ret = t4_cim_read(adap, UP_UP_DBG_LA_CFG, 1, &val);
+		if (ret)
+			break;
+		if (val & UPDBGLARDEN) {
+			ret = -ETIMEDOUT;
+			break;
+		}
+		ret = t4_cim_read(adap, UP_UP_DBG_LA_DATA, 1, &la_buf[i]);
+		if (ret)
+			break;
+		idx = (idx + 1) & UPDBGLARDPTR_MASK;
+	}
+restart:
+	if (cfg & UPDBGLAEN) {
+		int r = t4_cim_write1(adap, UP_UP_DBG_LA_CFG,
+				      cfg & ~UPDBGLARDEN);
+		if (!ret)
+			ret = r;
+	}
+	return ret;
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
index 5e5eee6..bcc925b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
@@ -56,6 +56,10 @@ enum {
 };
 
 enum {
+	CIMLA_SIZE     = 2048,  /* # of 32-bit words in CIM LA */
+};
+
+enum {
 	SF_PAGE_SIZE = 256,           /* serial flash page size */
 	SF_SEC_SIZE = 64 * 1024,      /* serial flash sector size */
 };
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
index a1024db..4f6d6fa 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
@@ -1331,4 +1331,18 @@
 #define S_FT_VNID_ID_VLD                16
 #define V_FT_VNID_ID_VLD(x)             ((x) << S_FT_VNID_ID_VLD)
 
+/* registers for module CIM */
+#define CIM_HOST_ACC_CTRL	0x7b50
+#define CIM_HOST_ACC_DATA	0x7b54
+#define UP_UP_DBG_LA_CFG	0x140
+#define HOSTBUSY	(1U << 17)
+#define HOSTWRITE	(1U << 16)
+#define UPDBGLAEN	(1U << 0)
+#define UPDBGLARDEN	(1U << 1)
+#define UP_UP_DBG_LA_DATA	0x144
+#define UPDBGLARDPTR_MASK	0xfffU
+#define UPDBGLARDPTR(x)	((x) << 2)
+#define UPDBGLAWRPTR_GET(x)	(((x) >> 16) & 0xfffU)
+#define UPDBGLACAPTPCONLY	(1U << 30)
+
 #endif /* __T4_REGS_H */
-- 
1.7.1
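
Note on the capture-buffer ordering in t4_cim_read_la(): the function starts
reading at the hardware write pointer and wraps the index with a mask, so the
oldest entry comes first and the most recent one lands last. A minimal
user-space sketch of that unrolling follows. It is only an illustration:
LA_ENTRIES, LA_PTR_MASK and la_unroll() are hypothetical names, the buffer is
assumed to be a power of two in size, and only UPDBGLAWRPTR_GET() is copied
from the t4_regs.h hunk above.

```c
#include <stdint.h>

/* Copied from the t4_regs.h hunk: write pointer lives in bits 27:16. */
#define UPDBGLAWRPTR_GET(x)	(((x) >> 16) & 0xfffU)

/* Hypothetical ring geometry for the sketch; must be a power of two. */
#define LA_ENTRIES	8u
#define LA_PTR_MASK	(LA_ENTRIES - 1u)

/*
 * Copy a circular capture buffer into linear order, starting at the
 * write pointer (the slot that would be overwritten next, i.e. the
 * oldest entry) so that the newest entry ends up in out[LA_ENTRIES - 1].
 */
static void la_unroll(const uint32_t *ring, unsigned int wrptr, uint32_t *out)
{
	unsigned int i, idx = wrptr & LA_PTR_MASK;

	for (i = 0; i < LA_ENTRIES; i++) {
		out[i] = ring[idx];
		idx = (idx + 1) & LA_PTR_MASK;	/* wrap instead of running off the end */
	}
}
```

The real driver reads each word through CIM register accesses rather than
from memory, and restores the LA run state afterwards; the sketch only covers
the index arithmetic.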


* [PATCH net-next 1/4] cxgb4: Add cxgb4_debugfs.c and devlog support
From: Hariprasad Shenai @ 2014-10-16  1:22 UTC
  To: netdev; +Cc: davem, leedom, kumaras, nirranjan, santosh, anish,
	Hariprasad Shenai
In-Reply-To: <1413422524-14054-1-git-send-email-hariprasad@chelsio.com>

Move all of the debugfs code to cxgb4_debugfs.c.
Also add support for a firmware device log (devlog) entry in debugfs.

Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
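
Note on the devlog ordering: devlog_open() treats the log as a ring. It picks
the used entry with the lowest sequence number as the first one, and
devlog_show() then walks forward from there with wraparound. A minimal
user-space sketch of that logic follows. It is an illustration only: struct
log_entry, oldest_entry() and slot_for() are hypothetical stand-ins, not
driver code, and the byte-order conversion done by the driver is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct fw_devlog_e. */
struct log_entry {
	uint64_t timestamp;	/* 0 marks an unused slot */
	uint32_t seqno;
};

/*
 * Return the index of the used entry with the smallest sequence number,
 * or 0 if every slot is unused.
 */
static size_t oldest_entry(const struct log_entry *log, size_t n)
{
	uint32_t lowest = UINT32_MAX;
	size_t first = 0, i;

	for (i = 0; i < n; i++) {
		if (log[i].timestamp == 0)
			continue;		/* skip unused slots */
		if (log[i].seqno < lowest) {
			lowest = log[i].seqno;
			first = i;
		}
	}
	return first;
}

/* Map the k-th displayed entry (k < n) to its slot in the ring. */
static size_t slot_for(size_t first, size_t k, size_t n)
{
	size_t idx = first + k;

	if (idx >= n)
		idx -= n;			/* single wrap is enough for k < n */
	return idx;
}
```

With four slots holding sequence numbers 7, 8, 5, 6, the walk starts at slot 2
and visits slots 2, 3, 0, 1.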
 drivers/net/ethernet/chelsio/cxgb4/Makefile        |    2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |    9 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c |  358 ++++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h |   52 +++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    |  121 ++-----
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.h         |   12 +
 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h      |   72 ++++
 7 files changed, 532 insertions(+), 94 deletions(-)
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h

diff --git a/drivers/net/ethernet/chelsio/cxgb4/Makefile b/drivers/net/ethernet/chelsio/cxgb4/Makefile
index 1df65c9..de34d4b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/Makefile
+++ b/drivers/net/ethernet/chelsio/cxgb4/Makefile
@@ -4,5 +4,5 @@
 
 obj-$(CONFIG_CHELSIO_T4) += cxgb4.o
 
-cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o
+cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o cxgb4_debugfs.o
 cxgb4-$(CONFIG_CHELSIO_T4_DCB) +=  cxgb4_dcb.o
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 410ed58..081b7ef 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -284,10 +284,18 @@ enum chip_type {
 	T5_LAST_REV	= T5_A1,
 };
 
+struct devlog_params {
+	u32 memtype;                    /* which memory (EDC0, EDC1, MC) */
+	u32 start;                      /* start of log in firmware memory */
+	u32 size;                       /* size of log */
+};
+
 struct adapter_params {
 	struct tp_params  tp;
 	struct vpd_params vpd;
 	struct pci_params pci;
+	struct devlog_params devlog;
+	enum pcie_memwin drv_memwin;
 
 	unsigned int sf_size;             /* serial flash size in bytes */
 	unsigned int sf_nsec;             /* # of flash sectors */
@@ -1083,4 +1091,5 @@ void t4_db_dropped(struct adapter *adapter);
 int t4_fwaddrspace_write(struct adapter *adap, unsigned int mbox,
 			 u32 addr, u32 val);
 void t4_sge_decode_idma_state(struct adapter *adapter, int state);
+void t4_free_mem(void *addr);
 #endif /* __CXGB4_H__ */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
new file mode 100644
index 0000000..0704af5
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -0,0 +1,358 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2003-2014 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/seq_file.h>
+#include <linux/debugfs.h>
+#include <linux/string_helpers.h>
+#include <linux/sort.h>
+
+#include "cxgb4.h"
+#include "t4_regs.h"
+#include "t4fw_api.h"
+#include "cxgb4_debugfs.h"
+#include "l2t.h"
+
+/* Firmware Device Log dump.
+ */
+static const char * const devlog_level_strings[] = {
+	[FW_DEVLOG_LEVEL_EMERG]		= "EMERG",
+	[FW_DEVLOG_LEVEL_CRIT]		= "CRIT",
+	[FW_DEVLOG_LEVEL_ERR]		= "ERR",
+	[FW_DEVLOG_LEVEL_NOTICE]	= "NOTICE",
+	[FW_DEVLOG_LEVEL_INFO]		= "INFO",
+	[FW_DEVLOG_LEVEL_DEBUG]		= "DEBUG"
+};
+
+static const char * const devlog_facility_strings[] = {
+	[FW_DEVLOG_FACILITY_CORE]	= "CORE",
+	[FW_DEVLOG_FACILITY_SCHED]	= "SCHED",
+	[FW_DEVLOG_FACILITY_TIMER]	= "TIMER",
+	[FW_DEVLOG_FACILITY_RES]	= "RES",
+	[FW_DEVLOG_FACILITY_HW]		= "HW",
+	[FW_DEVLOG_FACILITY_FLR]	= "FLR",
+	[FW_DEVLOG_FACILITY_DMAQ]	= "DMAQ",
+	[FW_DEVLOG_FACILITY_PHY]	= "PHY",
+	[FW_DEVLOG_FACILITY_MAC]	= "MAC",
+	[FW_DEVLOG_FACILITY_PORT]	= "PORT",
+	[FW_DEVLOG_FACILITY_VI]		= "VI",
+	[FW_DEVLOG_FACILITY_FILTER]	= "FILTER",
+	[FW_DEVLOG_FACILITY_ACL]	= "ACL",
+	[FW_DEVLOG_FACILITY_TM]		= "TM",
+	[FW_DEVLOG_FACILITY_QFC]	= "QFC",
+	[FW_DEVLOG_FACILITY_DCB]	= "DCB",
+	[FW_DEVLOG_FACILITY_ETH]	= "ETH",
+	[FW_DEVLOG_FACILITY_OFLD]	= "OFLD",
+	[FW_DEVLOG_FACILITY_RI]		= "RI",
+	[FW_DEVLOG_FACILITY_ISCSI]	= "ISCSI",
+	[FW_DEVLOG_FACILITY_FCOE]	= "FCOE",
+	[FW_DEVLOG_FACILITY_FOISCSI]	= "FOISCSI",
+	[FW_DEVLOG_FACILITY_FOFCOE]	= "FOFCOE"
+};
+
+/* Information gathered by Device Log Open routine for the display routine.
+ */
+struct devlog_info {
+	unsigned int nentries;		/* number of entries in log[] */
+	unsigned int first;		/* first [temporal] entry in log[] */
+	struct fw_devlog_e log[0];	/* Firmware Device Log */
+};
+
+/* Dump a Firmware Device Log entry.
+ */
+static int devlog_show(struct seq_file *seq, void *v)
+{
+	if (v == SEQ_START_TOKEN)
+		seq_printf(seq, "%10s  %15s  %8s  %8s  %s\n",
+			   "Seq#", "Tstamp", "Level", "Facility", "Message");
+	else {
+		struct devlog_info *dinfo = seq->private;
+		int fidx = (uintptr_t)v - 2;
+		unsigned long index;
+		struct fw_devlog_e *e;
+
+		/* Get a pointer to the log entry to display.  Skip unused log
+		 * entries.
+		 */
+		index = dinfo->first + fidx;
+		if (index >= dinfo->nentries)
+			index -= dinfo->nentries;
+		e = &dinfo->log[index];
+		if (e->timestamp == 0)
+			return 0;
+
+		/* Print the message.  This depends on the firmware using
+		 * exactly the same formatting strings as the kernel so we may
+		 * eventually have to put a format interpreter in here ...
+		 */
+		seq_printf(seq, "%10d  %15llu  %8s  %8s  ",
+			   e->seqno, e->timestamp,
+			   (e->level < ARRAY_SIZE(devlog_level_strings)
+			    ? devlog_level_strings[e->level]
+			    : "UNKNOWN"),
+			   (e->facility < ARRAY_SIZE(devlog_facility_strings)
+			    ? devlog_facility_strings[e->facility]
+			    : "UNKNOWN"));
+		seq_printf(seq, e->fmt, e->params[0], e->params[1],
+			   e->params[2], e->params[3], e->params[4],
+			   e->params[5], e->params[6], e->params[7]);
+	}
+
+	return 0;
+}
+
+/* Sequential File Operations for Device Log.
+ */
+static inline void *devlog_get_idx(struct devlog_info *dinfo, loff_t pos)
+{
+	if (pos > dinfo->nentries)
+		return NULL;
+
+	return (void *)(uintptr_t)(pos + 1);
+}
+
+static void *devlog_start(struct seq_file *seq, loff_t *pos)
+{
+	struct devlog_info *dinfo = seq->private;
+
+	return (*pos
+		? devlog_get_idx(dinfo, *pos)
+		: SEQ_START_TOKEN);
+}
+
+static void *devlog_next(struct seq_file *seq, void *v, loff_t *pos)
+{
+	struct devlog_info *dinfo = seq->private;
+
+	(*pos)++;
+	return devlog_get_idx(dinfo, *pos);
+}
+
+static void devlog_stop(struct seq_file *seq, void *v)
+{
+}
+
+static const struct seq_operations devlog_seq_ops = {
+	.start = devlog_start,
+	.next  = devlog_next,
+	.stop  = devlog_stop,
+	.show  = devlog_show
+};
+
+/* Set up for reading the firmware's device log.  We read the entire log here
+ * and then display it incrementally in devlog_show().
+ */
+static int devlog_open(struct inode *inode, struct file *file)
+{
+	struct adapter *adap = inode->i_private;
+	struct devlog_params *dparams = &adap->params.devlog;
+	struct devlog_info *dinfo;
+	unsigned int index;
+	u32 fseqno;
+	int ret;
+
+	/* If we don't know where the log is we can't do anything.
+	 */
+	if (dparams->start == 0)
+		return -ENXIO;
+
+	/* Allocate the space to read in the firmware's device log and set up
+	 * for the iterated call to our display function.
+	 */
+	dinfo = __seq_open_private(file, &devlog_seq_ops,
+				   sizeof(*dinfo) + dparams->size);
+	if (dinfo == NULL)
+		return -ENOMEM;
+
+	/* Record the basic log buffer information and read in the raw log.
+	 */
+	dinfo->nentries = (dparams->size / sizeof(struct fw_devlog_e));
+	dinfo->first = 0;
+	spin_lock(&adap->win0_lock);
+	ret = t4_memory_rw(adap, adap->params.drv_memwin, dparams->memtype,
+			   dparams->start, dparams->size, (__be32 *)dinfo->log,
+			   T4_MEMORY_READ);
+	spin_unlock(&adap->win0_lock);
+	if (ret) {
+		seq_release_private(inode, file);
+		return ret;
+	}
+
+	/* Translate log multi-byte integral elements into host native format
+	 * and determine where the first entry in the log is.
+	 */
+	for (fseqno = ~((u32)0), index = 0; index < dinfo->nentries; index++) {
+		struct fw_devlog_e *e = &dinfo->log[index];
+		int i;
+		__u32 seqno;
+
+		if (e->timestamp == 0)
+			continue;
+
+		e->timestamp = (__force __be64)be64_to_cpu(e->timestamp);
+		seqno = be32_to_cpu(e->seqno);
+		for (i = 0; i < 8; i++)
+			e->params[i] =
+				(__force __be32)be32_to_cpu(e->params[i]);
+
+		if (seqno < fseqno) {
+			fseqno = seqno;
+			dinfo->first = index;
+		}
+	}
+
+	return 0;
+}
+
+static const struct file_operations devlog_fops = {
+	.owner   = THIS_MODULE,
+	.open    = devlog_open,
+	.read    = seq_read,
+	.llseek  = seq_lseek,
+	.release = seq_release_private
+};
+
+static ssize_t mem_read(struct file *file, char __user *buf, size_t count,
+			loff_t *ppos)
+{
+	loff_t pos = *ppos;
+	loff_t avail = file_inode(file)->i_size;
+	unsigned int mem = (uintptr_t)file->private_data & 3;
+	struct adapter *adap = file->private_data - mem;
+	__be32 *data;
+	int ret;
+
+	if (pos < 0)
+		return -EINVAL;
+	if (pos >= avail)
+		return 0;
+	if (count > avail - pos)
+		count = avail - pos;
+
+	data = t4_alloc_mem(count);
+	if (!data)
+		return -ENOMEM;
+
+	spin_lock(&adap->win0_lock);
+	ret = t4_memory_rw(adap, 0, mem, pos, count, data, T4_MEMORY_READ);
+	spin_unlock(&adap->win0_lock);
+	if (ret) {
+		t4_free_mem(data);
+		return ret;
+	}
+	ret = copy_to_user(buf, data, count);
+
+	t4_free_mem(data);
+	if (ret)
+		return -EFAULT;
+
+	*ppos = pos + count;
+	return count;
+}
+
+static const struct file_operations mem_debugfs_fops = {
+	.owner   = THIS_MODULE,
+	.open    = simple_open,
+	.read    = mem_read,
+	.llseek  = default_llseek,
+};
+
+static void add_debugfs_mem(struct adapter *adap, const char *name,
+			    unsigned int idx, unsigned int size_mb)
+{
+	struct dentry *de;
+
+	de = debugfs_create_file(name, S_IRUSR, adap->debugfs_root,
+				 (void *)adap + idx, &mem_debugfs_fops);
+	if (de && de->d_inode)
+		de->d_inode->i_size = size_mb << 20;
+}
+
+/* Add an array of Debug FS files.
+ */
+void add_debugfs_files(struct adapter *adap,
+		       struct t4_debugfs_entry *files,
+		       unsigned int nfiles)
+{
+	int i;
+
+	/* debugfs support is best effort */
+	for (i = 0; i < nfiles; i++)
+		debugfs_create_file(files[i].name, files[i].mode,
+				    adap->debugfs_root,
+				    (void *)adap + files[i].data,
+				    files[i].ops);
+}
+
+int t4_setup_debugfs(struct adapter *adap)
+{
+	int i;
+	u32 size;
+
+	static struct t4_debugfs_entry t4_debugfs_files[] = {
+		{ "devlog", &devlog_fops, S_IRUSR, 0 },
+		{ "l2t", &t4_l2t_fops, S_IRUSR, 0},
+	};
+
+	add_debugfs_files(adap,
+			  t4_debugfs_files,
+			  ARRAY_SIZE(t4_debugfs_files));
+
+	i = t4_read_reg(adap, MA_TARGET_MEM_ENABLE);
+	if (i & EDRAM0_ENABLE) {
+		size = t4_read_reg(adap, MA_EDRAM0_BAR);
+		add_debugfs_mem(adap, "edc0", MEM_EDC0, EDRAM_SIZE_GET(size));
+	}
+	if (i & EDRAM1_ENABLE) {
+		size = t4_read_reg(adap, MA_EDRAM1_BAR);
+		add_debugfs_mem(adap, "edc1", MEM_EDC1, EDRAM_SIZE_GET(size));
+	}
+	if (is_t4(adap->params.chip)) {
+		size = t4_read_reg(adap, MA_EXT_MEMORY_BAR);
+		if (i & EXT_MEM_ENABLE)
+			add_debugfs_mem(adap, "mc", MEM_MC,
+					EXT_MEM_SIZE_GET(size));
+	} else {
+		if (i & EXT_MEM_ENABLE) {
+			size = t4_read_reg(adap, MA_EXT_MEMORY_BAR);
+			add_debugfs_mem(adap, "mc0", MEM_MC0,
+					EXT_MEM_SIZE_GET(size));
+		}
+		if (i & EXT_MEM1_ENABLE) {
+			size = t4_read_reg(adap, MA_EXT_MEMORY1_BAR);
+			add_debugfs_mem(adap, "mc1", MEM_MC1,
+					EXT_MEM_SIZE_GET(size));
+		}
+	}
+	return 0;
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h
new file mode 100644
index 0000000..a3d8867
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h
@@ -0,0 +1,52 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2003-2014 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_DEBUGFS_H
+#define __CXGB4_DEBUGFS_H
+
+#include <linux/export.h>
+
+struct t4_debugfs_entry {
+	const char *name;
+	const struct file_operations *ops;
+	mode_t mode;
+	unsigned char data;
+};
+
+int t4_setup_debugfs(struct adapter *adap);
+void add_debugfs_files(struct adapter *adap,
+		       struct t4_debugfs_entry *files,
+		       unsigned int nfiles);
+
+#endif
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 5b38e95..645defa 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -68,6 +68,7 @@
 #include "t4_msg.h"
 #include "t4fw_api.h"
 #include "cxgb4_dcb.h"
+#include "cxgb4_debugfs.h"
 #include "l2t.h"
 
 #include <../drivers/net/bonding/bonding.h>
@@ -1283,7 +1284,7 @@ void *t4_alloc_mem(size_t size)
 /*
  * Free memory allocated through alloc_mem().
  */
-static void t4_free_mem(void *addr)
+void t4_free_mem(void *addr)
 {
 	if (is_vmalloc_addr(addr))
 		vfree(addr);
@@ -3113,103 +3114,12 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
 	.flash_device      = set_flash,
 };
 
-/*
- * debugfs support
- */
-static ssize_t mem_read(struct file *file, char __user *buf, size_t count,
-			loff_t *ppos)
-{
-	loff_t pos = *ppos;
-	loff_t avail = file_inode(file)->i_size;
-	unsigned int mem = (uintptr_t)file->private_data & 3;
-	struct adapter *adap = file->private_data - mem;
-	__be32 *data;
-	int ret;
-
-	if (pos < 0)
-		return -EINVAL;
-	if (pos >= avail)
-		return 0;
-	if (count > avail - pos)
-		count = avail - pos;
-
-	data = t4_alloc_mem(count);
-	if (!data)
-		return -ENOMEM;
-
-	spin_lock(&adap->win0_lock);
-	ret = t4_memory_rw(adap, 0, mem, pos, count, data, T4_MEMORY_READ);
-	spin_unlock(&adap->win0_lock);
-	if (ret) {
-		t4_free_mem(data);
-		return ret;
-	}
-	ret = copy_to_user(buf, data, count);
-
-	t4_free_mem(data);
-	if (ret)
-		return -EFAULT;
-
-	*ppos = pos + count;
-	return count;
-}
-
-static const struct file_operations mem_debugfs_fops = {
-	.owner   = THIS_MODULE,
-	.open    = simple_open,
-	.read    = mem_read,
-	.llseek  = default_llseek,
-};
-
-static void add_debugfs_mem(struct adapter *adap, const char *name,
-			    unsigned int idx, unsigned int size_mb)
-{
-	struct dentry *de;
-
-	de = debugfs_create_file(name, S_IRUSR, adap->debugfs_root,
-				 (void *)adap + idx, &mem_debugfs_fops);
-	if (de && de->d_inode)
-		de->d_inode->i_size = size_mb << 20;
-}
-
 static int setup_debugfs(struct adapter *adap)
 {
-	int i;
-	u32 size;
-
 	if (IS_ERR_OR_NULL(adap->debugfs_root))
 		return -1;
 
-	i = t4_read_reg(adap, MA_TARGET_MEM_ENABLE);
-	if (i & EDRAM0_ENABLE) {
-		size = t4_read_reg(adap, MA_EDRAM0_BAR);
-		add_debugfs_mem(adap, "edc0", MEM_EDC0,	EDRAM_SIZE_GET(size));
-	}
-	if (i & EDRAM1_ENABLE) {
-		size = t4_read_reg(adap, MA_EDRAM1_BAR);
-		add_debugfs_mem(adap, "edc1", MEM_EDC1, EDRAM_SIZE_GET(size));
-	}
-	if (is_t4(adap->params.chip)) {
-		size = t4_read_reg(adap, MA_EXT_MEMORY_BAR);
-		if (i & EXT_MEM_ENABLE)
-			add_debugfs_mem(adap, "mc", MEM_MC,
-					EXT_MEM_SIZE_GET(size));
-	} else {
-		if (i & EXT_MEM_ENABLE) {
-			size = t4_read_reg(adap, MA_EXT_MEMORY_BAR);
-			add_debugfs_mem(adap, "mc0", MEM_MC0,
-					EXT_MEM_SIZE_GET(size));
-		}
-		if (i & EXT_MEM1_ENABLE) {
-			size = t4_read_reg(adap, MA_EXT_MEMORY1_BAR);
-			add_debugfs_mem(adap, "mc1", MEM_MC1,
-					EXT_MEM_SIZE_GET(size));
-		}
-	}
-	if (adap->l2t)
-		debugfs_create_file("l2t", S_IRUSR, adap->debugfs_root, adap,
-				    &t4_l2t_fops);
-	return 0;
+	return t4_setup_debugfs(adap);
 }
 
 /*
@@ -5656,6 +5566,8 @@ static int adap_init0(struct adapter *adap)
 	enum dev_state state;
 	u32 params[7], val[7];
 	struct fw_caps_config_cmd caps_cmd;
+	struct fw_devlog_cmd devlog_cmd;
+	u32 devlog_meminfo;
 	int reset = 1;
 
 	/*
@@ -5744,6 +5656,29 @@ static int adap_init0(struct adapter *adap)
 	if (ret < 0)
 		goto bye;
 
+	/* Read firmware device log parameters.  We really need to find a way
+	 * to get these parameters initialized with some default values (which
+	 * are likely to be correct) for the case where we either don't
+	 * attach to the firmware or it's crashed when we probe the adapter.
+	 * That way we'll still be able to perform early firmware startup
+	 * debugging ...  If the request to get the Firmware's Device Log
+	 * parameters fails, we'll live so we don't make that a fatal error.
+	 */
+	memset(&devlog_cmd, 0, sizeof(devlog_cmd));
+	devlog_cmd.op_to_write = htonl(FW_CMD_OP(FW_DEVLOG_CMD) |
+				       FW_CMD_REQUEST | FW_CMD_READ);
+	devlog_cmd.retval_len16 = htonl(FW_LEN16(devlog_cmd));
+	ret = t4_wr_mbox(adap, adap->mbox, &devlog_cmd, sizeof(devlog_cmd),
+			 &devlog_cmd);
+	if (ret == 0) {
+		devlog_meminfo =
+			ntohl(devlog_cmd.memtype_devlog_memaddr16_devlog);
+		adap->params.devlog.memtype =
+			FW_DEVLOG_CMD_MEMTYPE_DEVLOG_GET(devlog_meminfo);
+		adap->params.devlog.start =
+			FW_DEVLOG_CMD_MEMADDR16_DEVLOG_GET(devlog_meminfo) << 4;
+		adap->params.devlog.size = ntohl(devlog_cmd.memsize_devlog);
+	}
 	/*
 	 * Find out what ports are available to us.  Note that we need to do
 	 * this before calling adap_init0_no_config() since it needs nports
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
index c19a90e..5e5eee6 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.h
@@ -110,6 +110,18 @@ enum {
 	SGE_INGPADBOUNDARY_SHIFT = 5,/* ingress queue pad boundary */
 };
 
+/* PCI-e memory window access */
+enum pcie_memwin {
+	MEMWIN_NIC      = 0,
+	MEMWIN_RSVD1    = 1,
+	MEMWIN_RSVD2    = 2,
+	MEMWIN_RDMA     = 3,
+	MEMWIN_RSVD4    = 4,
+	MEMWIN_FOISCSI  = 5,
+	MEMWIN_CSIOSTOR = 6,
+	MEMWIN_RSVD7    = 7,
+};
+
 struct sge_qstat {                /* data written to SGE queue status entries */
 	__be32 qid;
 	__be16 cidx;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h b/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
index 3409756..e9ceb3a 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h
@@ -618,6 +618,7 @@ enum fw_cmd_opcodes {
 	FW_RSS_IND_TBL_CMD             = 0x20,
 	FW_RSS_GLB_CONFIG_CMD          = 0x22,
 	FW_RSS_VI_CONFIG_CMD           = 0x23,
+	FW_DEVLOG_CMD                  = 0x25,
 	FW_CLIP_CMD                    = 0x28,
 	FW_LASTC2E_CMD                 = 0x40,
 	FW_ERROR_CMD                   = 0x80,
@@ -2279,4 +2280,75 @@ enum fw_hdr_flags {
 	FW_HDR_FLAGS_RESET_HALT = 0x00000001,
 };
 
+/* length of the formatting string  */
+#define FW_DEVLOG_FMT_LEN	192
+
+/* maximum number of the formatting string parameters */
+#define FW_DEVLOG_FMT_PARAMS_NUM 8
+
+/* priority levels */
+enum fw_devlog_level {
+	FW_DEVLOG_LEVEL_EMERG	= 0x0,
+	FW_DEVLOG_LEVEL_CRIT	= 0x1,
+	FW_DEVLOG_LEVEL_ERR	= 0x2,
+	FW_DEVLOG_LEVEL_NOTICE	= 0x3,
+	FW_DEVLOG_LEVEL_INFO	= 0x4,
+	FW_DEVLOG_LEVEL_DEBUG	= 0x5,
+	FW_DEVLOG_LEVEL_MAX	= 0x5,
+};
+
+/* facilities that may send a log message */
+enum fw_devlog_facility {
+	FW_DEVLOG_FACILITY_CORE		= 0x00,
+	FW_DEVLOG_FACILITY_CF		= 0x01,
+	FW_DEVLOG_FACILITY_SCHED	= 0x02,
+	FW_DEVLOG_FACILITY_TIMER	= 0x04,
+	FW_DEVLOG_FACILITY_RES		= 0x06,
+	FW_DEVLOG_FACILITY_HW		= 0x08,
+	FW_DEVLOG_FACILITY_FLR		= 0x10,
+	FW_DEVLOG_FACILITY_DMAQ		= 0x12,
+	FW_DEVLOG_FACILITY_PHY		= 0x14,
+	FW_DEVLOG_FACILITY_MAC		= 0x16,
+	FW_DEVLOG_FACILITY_PORT		= 0x18,
+	FW_DEVLOG_FACILITY_VI		= 0x1A,
+	FW_DEVLOG_FACILITY_FILTER	= 0x1C,
+	FW_DEVLOG_FACILITY_ACL		= 0x1E,
+	FW_DEVLOG_FACILITY_TM		= 0x20,
+	FW_DEVLOG_FACILITY_QFC		= 0x22,
+	FW_DEVLOG_FACILITY_DCB		= 0x24,
+	FW_DEVLOG_FACILITY_ETH		= 0x26,
+	FW_DEVLOG_FACILITY_OFLD		= 0x28,
+	FW_DEVLOG_FACILITY_RI		= 0x2A,
+	FW_DEVLOG_FACILITY_ISCSI	= 0x2C,
+	FW_DEVLOG_FACILITY_FCOE		= 0x2E,
+	FW_DEVLOG_FACILITY_FOISCSI	= 0x30,
+	FW_DEVLOG_FACILITY_FOFCOE	= 0x32,
+	FW_DEVLOG_FACILITY_MAX		= 0x32,
+};
+
+/* log message format */
+struct fw_devlog_e {
+	__be64	timestamp;
+	__be32	seqno;
+	__be16	reserved1;
+	__u8	level;
+	__u8	facility;
+	__u8	fmt[FW_DEVLOG_FMT_LEN];
+	__be32	params[FW_DEVLOG_FMT_PARAMS_NUM];
+	__be32	reserved3[4];
+};
+
+struct fw_devlog_cmd {
+	__be32 op_to_write;
+	__be32 retval_len16;
+	__u8   level;
+	__u8   r2[7];
+	__be32 memtype_devlog_memaddr16_devlog;
+	__be32 memsize_devlog;
+	__be32 r3[2];
+};
+
+#define FW_DEVLOG_CMD_MEMTYPE_DEVLOG_GET(x)	(((x) >> 28) & 0xf)
+#define FW_DEVLOG_CMD_MEMADDR16_DEVLOG_GET(x)	(((x) >> 0) & 0xfffffff)
+
 #endif /* _T4FW_INTERFACE_H_ */
-- 
1.7.1


* [PATCH net-next 0/4] Add support for few debugfs entries
From: Hariprasad Shenai @ 2014-10-16  1:22 UTC
  To: netdev; +Cc: davem, leedom, kumaras, nirranjan, santosh, anish,
	Hariprasad Shenai

Hi,

This patch series adds a new source file for debugfs and support for the
devlog, cim_la, cim_la_qcfg and mps_tcam debugfs entries.

The series is created against the 'net-next' tree and contains patches
for the cxgb4 driver.

We have included the maintainers of the respective drivers. Kindly review the
changes and let us know if you have any comments.

Thanks

Hariprasad Shenai (4):
  cxgb4: Add cxgb4_debugfs.c and devlog support
  cxgb4: Add support for cim_la entry in debugfs
  cxgb4: Add support for cim_la_qcfg entry in debugfs
  cxgb4: Add support for mps_tcam debugfs

 drivers/net/ethernet/chelsio/cxgb4/Makefile        |    2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |   17 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c |  683 ++++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h |   64 ++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    |  121 +---
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.c         |  155 +++++
 drivers/net/ethernet/chelsio/cxgb4/t4_hw.h         |   19 +
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h       |   52 ++
 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h      |   72 ++
 9 files changed, 1091 insertions(+), 94 deletions(-)
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.h


* Re: [PATCH 1/1 net-next] openvswitch: kerneldoc warning fix
From: Pravin Shelar @ 2014-10-16  1:12 UTC
  To: Fabian Frederick; +Cc: LKML, David S. Miller, dev@openvswitch.org, netdev
In-Reply-To: <1413399798-8514-1-git-send-email-fabf@skynet.be>

On Wed, Oct 15, 2014 at 12:03 PM, Fabian Frederick <fabf@skynet.be> wrote:
> s/sock/gs
>
> Signed-off-by: Fabian Frederick <fabf@skynet.be>

Looks good.

Acked-by: Pravin B Shelar <pshelar@nicira.com>

Thanks.

> ---
>  net/openvswitch/vport-geneve.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/vport-geneve.c b/net/openvswitch/vport-geneve.c
> index 910b3ef..106a9d8 100644
> --- a/net/openvswitch/vport-geneve.c
> +++ b/net/openvswitch/vport-geneve.c
> @@ -30,7 +30,7 @@
>
>  /**
>   * struct geneve_port - Keeps track of open UDP ports
> - * @sock: The socket created for this port number.
> + * @gs: The socket created for this port number.
>   * @name: vport name.
>   */
>  struct geneve_port {
> --
> 1.9.3
>


* Re: [PATCH 1/1 net-next] openvswitch: use vport instead of p
From: Pravin Shelar @ 2014-10-16  1:11 UTC
  To: Fabian Frederick; +Cc: LKML, David S. Miller, dev@openvswitch.org, netdev
In-Reply-To: <1413399821-8558-1-git-send-email-fabf@skynet.be>

On Wed, Oct 15, 2014 at 12:03 PM, Fabian Frederick <fabf@skynet.be> wrote:
> All functions used struct vport *vport except
> ovs_vport_find_upcall_portid.
>
> This fixes 1 kerneldoc warning
>
> Signed-off-by: Fabian Frederick <fabf@skynet.be>


Acked-by: Pravin B Shelar <pshelar@nicira.com>

Thanks.

> ---
>  net/openvswitch/vport.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> index 53001b0..6015802 100644
> --- a/net/openvswitch/vport.c
> +++ b/net/openvswitch/vport.c
> @@ -408,13 +408,13 @@ int ovs_vport_get_upcall_portids(const struct vport *vport,
>   *
>   * Returns the portid of the target socket.  Must be called with rcu_read_lock.
>   */
> -u32 ovs_vport_find_upcall_portid(const struct vport *p, struct sk_buff *skb)
> +u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb)
>  {
>         struct vport_portids *ids;
>         u32 ids_index;
>         u32 hash;
>
> -       ids = rcu_dereference(p->upcall_portids);
> +       ids = rcu_dereference(vport->upcall_portids);
>
>         if (ids->n_ids == 1 && ids->ids[0] == 0)
>                 return 0;
> --
> 1.9.3
>


* [PATCH] vxlan: fix a use after free in vxlan_encap_bypass
From: roy.qing.li @ 2014-10-16  0:49 UTC (permalink / raw)
  To: netdev

From: Li RongQing <roy.qing.li@gmail.com>

Once netif_rx() returns, the skb that was handed to it may already have
been freed, so it must not be used afterwards.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
---
 drivers/net/vxlan.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 6f83be7..f677cd0 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1668,6 +1668,8 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
 	struct pcpu_sw_netstats *tx_stats, *rx_stats;
 	union vxlan_addr loopback;
 	union vxlan_addr *remote_ip = &dst_vxlan->default_dst.remote_ip;
+	struct net_device *dev = skb->dev;
+	int len = skb->len;
 
 	tx_stats = this_cpu_ptr(src_vxlan->dev->tstats);
 	rx_stats = this_cpu_ptr(dst_vxlan->dev->tstats);
@@ -1691,16 +1693,16 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan,
 
 	u64_stats_update_begin(&tx_stats->syncp);
 	tx_stats->tx_packets++;
-	tx_stats->tx_bytes += skb->len;
+	tx_stats->tx_bytes += len;
 	u64_stats_update_end(&tx_stats->syncp);
 
 	if (netif_rx(skb) == NET_RX_SUCCESS) {
 		u64_stats_update_begin(&rx_stats->syncp);
 		rx_stats->rx_packets++;
-		rx_stats->rx_bytes += skb->len;
+		rx_stats->rx_bytes += len;
 		u64_stats_update_end(&rx_stats->syncp);
 	} else {
-		skb->dev->stats.rx_dropped++;
+		dev->stats.rx_dropped++;
 	}
 }
 
-- 
1.7.10.4
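
The pattern in the patch above (copy the fields you still need before handing
the buffer to a consumer that may free it) can be sketched in plain userspace C.
This is only an illustration, not kernel code: struct pkt, struct stats,
consume() and deliver() are hypothetical stand-ins for the skb, the per-cpu
stats, netif_rx() and vxlan_encap_bypass().

```c
#include <stdlib.h>

/* Hypothetical stand-ins for an skb and the per-device counters. */
struct pkt { size_t len; };
struct stats { unsigned long packets, bytes, dropped; };

/*
 * Stand-in for netif_rx(): takes ownership of the packet and frees it
 * in every case. Returns 0 if the packet was accepted, -1 if dropped.
 */
static int consume(struct pkt *p, int accept)
{
    free(p);
    return accept ? 0 : -1;
}

static void deliver(struct pkt *p, struct stats *st, int accept)
{
    /* Snapshot everything needed later, before ownership is handed off. */
    size_t len = p->len;

    if (consume(p, accept) == 0) {
        st->packets++;
        st->bytes += len;   /* uses the snapshot, never p->len */
    } else {
        st->dropped++;
    }
}
```

After consume() returns, deliver() touches only its local copy, which mirrors
what the patch does with the new 'dev' and 'len' locals.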


* Re: feature suggestion: implement SO_PEERCRED on local AF_INET/AF_INET6 sockets (allow uid-based identification on localhost)
From: Andy Lutomirski @ 2014-10-16  0:11 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list, Linux network mailing-list
In-Reply-To: <20141016000705.GA12208@achernar.madore.org>

On Wed, Oct 15, 2014 at 5:07 PM, David Madore <david+ml@madore.org> wrote:
> On Wed, Oct 15, 2014 at 03:54:08PM -0700, Andy Lutomirski wrote:
>> On Wed, Oct 15, 2014 at 3:30 PM, David Madore <david+ml@madore.org> wrote:
>> > Note that since the possibility of using SO_PEERCRED on AF_INET
>> > sockets does not hitherto exist on Linux, we can be sure that nobody
>> > uses it, so it's not like it might open vulnerabilities in existing
>> > code.  If you think it's insecure, it can be documented as such (by
>> > comparing it with identd): I still think it's better than having no
>> > control at all when binding to localhost, which is the present
>> > situation (causing, e.g., CVE-2014-2914).
>>
>> This doesn't follow.  *Everybody* uses connect on AF_INET.
>>
>> IMO anything that sends a caller's credentials needs to be explicit and opt-in.
>
> I'm confused as to whether you mean "opt-in" on the side of the caller
> (=process requesting the endpoint's credentials), or on that of the
> endpoint (=authenticated process).  On the one hand I don't understand
> what it could mean on the caller side, on the other hand you mention
> explicit support in OpenSSH, which would be the caller in my scenario.

I mean the authenticated process, not the process doing the authentication.

>
> So, in case I haven't been clear enough, the situation I have in mind
> is: on "thishost", I run "ssh -L 14321:remotehost:4321 somehost" to
> forward connections from local port 14321 of thishost (where ssh
> listens on the loopback) to port 4321 of remotehost.
> Unfortunately, now everyone with an account on thishost can connect
> to port 14321 and effectively emit a connection from somehost to
> remotehost on my behalf.  I think everyone agrees that this is a huge
> problem.  But I don't understand how you propose to remedy this.

Unfortunately, I think that you need client changes.  These could be
semi-transparent (using LD_PRELOAD) or almost completely transparent
(using network namespaces).

Actually, a network namespace-based proxying tool could be very useful.

>
> Patching ssh is an option, but I don't see how to do it (ssh needs to
> make sure that the connections it receives on 14321 are from the same
> uid, and this seems impossible without the feature I'm discussing).
> Patching the kernel is an option.  Patching clients that connect to
> 14321, on the other hand, is not, because there are many different
> ones, and their protocol is defined by immutable Internet standards,
> so we have no latitude there (for example, we can't ask a Web browser
> to connect to Unix domain sockets: there simply isn't a URL scheme to
> refer to them).  Adding iptables rules is not an option if I'm not the
> system administrator on thishost.

I misunderstood.  I thought that you wanted a server-side solution.

>
> So, how can we solve this problem securely?
>
>> I believe that there is no secure way to authenticate clients that
>> currently don't authenticate themselves without changing the clients.
>> That's the whole point: currently-secure programs are written under the
>> assumption that they are not exercising their credentials.  You can't
>> safely change that without making it opt-in.
>
> Then what are we to do, given that modifying the clients is
> impossible?
>
> What about my proposal that user credentials would be returned only if
> they refer to the same user as the caller user and that the caller is
> permitted to ptrace the endpoint?  This answers your objection of
> leaking credentials: the caller could do anything at all with the
> other side since it could ptrace it - we're just permitting a user to
> authenticate their own sockets.  A further sysctl could enable the use
> of the call in more general cases, for those administrators who think
> it should be allowed.
>

Ugh.

That's probably safe, but it's quite disgusting IMO.

--Andy


* Re: feature suggestion: implement SO_PEERCRED on local AF_INET/AF_INET6 sockets (allow uid-based identification on localhost)
From: David Madore @ 2014-10-16  0:07 UTC (permalink / raw)
  To: Linux Kernel mailing-list, Linux network mailing-list; +Cc: Andy Lutomirski
In-Reply-To: <CALCETrXnfU6cGaYNBy-X=GKwBZ07Wf8VvnUz1R-qgKVtGrS=hQ@mail.gmail.com>

On Wed, Oct 15, 2014 at 03:54:08PM -0700, Andy Lutomirski wrote:
> On Wed, Oct 15, 2014 at 3:30 PM, David Madore <david+ml@madore.org> wrote:
> > Note that since the possibility of using SO_PEERCRED on AF_INET
> > sockets does not hitherto exist on Linux, we can be sure that nobody
> > uses it, so it's not like it might open vulnerabilities in existing
> > code.  If you think it's insecure, it can be documented as such (by
> > comparing it with identd): I still think it's better than having no
> > control at all when binding to localhost, which is the present
> > situation (causing, e.g., CVE-2014-2914).
> 
> This doesn't follow.  *Everybody* uses connect on AF_INET.
> 
> IMO anything that sends a caller's credentials needs to be explicit and opt-in.

I'm confused as to whether you mean "opt-in" on the side of the caller
(=process requesting the endpoint's credentials), or on that of the
endpoint (=authenticated process).  On the one hand I don't understand
what it could mean on the caller side, on the other hand you mention
explicit support in OpenSSH, which would be the caller in my scenario.

So, in case I haven't been clear enough, the situation I have in mind
is: on "thishost", I run "ssh -L 14321:remotehost:4321 somehost" to
forward connections from local port 14321 of thishost (where ssh
listens on the loopback) to port 4321 of remotehost.
Unfortunately, now everyone with an account on thishost can connect
to port 14321 and effectively emit a connection from somehost to
remotehost on my behalf.  I think everyone agrees that this is a huge
problem.  But I don't understand how you propose to remedy this.

Patching ssh is an option, but I don't see how to do it (ssh needs to
make sure that the connections it receives on 14321 are from the same
uid, and this seems impossible without the feature I'm discussing).
Patching the kernel is an option.  Patching clients that connect to
14321, on the other hand, is not, because there are many different
ones, and their protocol is defined by immutable Internet standards,
so we have no latitude there (for example, we can't ask a Web browser
to connect to Unix domain sockets: there simply isn't a URL scheme to
refer to them).  Adding iptables rules is not an option if I'm not the
system administrator on thishost.

So, how can we solve this problem securely?

> I believe that there is no secure way to authenticate clients that
> currently don't authenticate themselves without changing the clients.
> That's the whole point: currently-secure programs are written under the
> assumption that they are not exercising their credentials.  You can't
> safely change that without making it opt-in.

Then what are we to do, given that modifying the clients is
impossible?

What about my proposal that user credentials would be returned only if
they refer to the same user as the caller user and that the caller is
permitted to ptrace the endpoint?  This answers your objection of
leaking credentials: the caller could do anything at all with the
other side since it could ptrace it - we're just permitting a user to
authenticate their own sockets.  A further sysctl could enable the use
of the call in more general cases, for those administrators who think
it should be allowed.

-- 
     David A. Madore
   ( http://www.madore.org/~david/ )


* Re: [PATCH 1/1 net-next] openvswitch: kerneldoc warning fix
From: David Miller @ 2014-10-16  0:01 UTC (permalink / raw)
  To: fabf; +Cc: linux-kernel, pshelar, dev, netdev
In-Reply-To: <1413399798-8514-1-git-send-email-fabf@skynet.be>


Pravin could you please review these two openvswitch kerneldoc
fixes from Fabian?

Thanks.


* Re: Netlink mmap tx security?
From: Andy Lutomirski @ 2014-10-15 23:58 UTC (permalink / raw)
  To: David Miller
  Cc: Daniel Borkmann, Linus Torvalds, Patrick McHardy,
	Network Development, Thomas Graf
In-Reply-To: <20141015.195737.1429281929513331763.davem@davemloft.net>

On Wed, Oct 15, 2014 at 4:57 PM, David Miller <davem@davemloft.net> wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Thu, 16 Oct 2014 01:45:22 +0200
>
>> On 10/15/2014 04:01 AM, David Miller wrote:
>>> From: Andy Lutomirski <luto@amacapital.net>
>>> Date: Tue, 14 Oct 2014 15:16:46 -0700
>>>
>>>> It's at least remotely possible that there's something that assumes
>>>> that the availability of NETLINK_RX_RING implies
>>>> NETLINK_TX_RING, which would be unfortunate.
>>>
>>> I already found one such case, nlmon :-/
>>
>> Hmm, can you elaborate? I currently don't think that nlmon cares
>> actually.
>
> nlmon cares, openvswitch cares, etc:

It'll still work, just slower, right?

+    if (setsockopt(sock->fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0
+        || setsockopt(sock->fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0) {
+        VLOG_INFO("mmap netlink is not supported");
+        return 0;
+    }
+

--Andy


* Re: Netlink mmap tx security?
From: David Miller @ 2014-10-15 23:57 UTC (permalink / raw)
  To: dborkman; +Cc: luto, torvalds, kaber, netdev, tgraf
In-Reply-To: <543F0712.8080503@redhat.com>

From: Daniel Borkmann <dborkman@redhat.com>
Date: Thu, 16 Oct 2014 01:45:22 +0200

> On 10/15/2014 04:01 AM, David Miller wrote:
>> From: Andy Lutomirski <luto@amacapital.net>
>> Date: Tue, 14 Oct 2014 15:16:46 -0700
>>
>>> It's at least remotely possible that there's something that assumes
>>> that the availability of NETLINK_RX_RING implies
>>> NETLINK_TX_RING, which would be unfortunate.
>>
>> I already found one such case, nlmon :-/
> 
> Hmm, can you elaborate? I currently don't think that nlmon cares
> actually.

nlmon cares, openvswitch cares, etc:

http://openvswitch.org/pipermail/dev/2013-December/034496.html


* RE: ixgbe: Question about Flow Control on 10G
From: Tantilov, Emil S @ 2014-10-15 23:46 UTC (permalink / raw)
  To: dom, Skidmore, Donald C, netdev@vger.kernel.org
In-Reply-To: <543EC362.5060507@citrix.com>

>-----Original Message-----
>From: dom [mailto:dominic.curran@citrix.com]
>Sent: Wednesday, October 15, 2014 11:57 AM
>To: Tantilov, Emil S; Skidmore, Donald C;
>netdev@vger.kernel.org
>Subject: Re: ixgbe: Question about Flow Control on 10G
>
>On 10/14/2014 01:54 PM, Tantilov, Emil S wrote:
>>> -----Original Message-----
>>> From: netdev-owner@vger.kernel.org [mailto:netdev-
>>> owner@vger.kernel.org] On Behalf Of dom
>>> Sent: Tuesday, October 14, 2014 12:01 PM
>>> To: Skidmore, Donald C; netdev@vger.kernel.org
>>> Subject: ixgbe: Question about Flow Control on 10G
>>>
>>> Hi
>>>
>>> I have a question about the ixgbe driver's handling of
>>> 'ethtool -a ethX'
>>> when the NIC is using fibre.
>>>
>>> Specifically I don't understand the code introduced by this
>>> commit:
>>>
>>> commit 73d80953dfd1d5a92948005798c857c311c2834b
>>> Author: Don Skidmore <donald.c.skidmore@intel.com>
>>> Date:   Wed Jul 31 02:19:24 2013 +0000
>>> Subject: ixgbe: fix fc autoneg ethtool reporting.
>>>
>>> The commit introduced the function:
>>>         ixgbe_device_supports_autoneg_fc()
>>>
>>> which gets called by
>>> ixgbe_get_pauseparam()/ixgbe_set_pauseparam().
>>>
>>> specifically there is a  case in
>>> ixgbe_device_supports_autoneg_fc()
>>>
>>>      case ixgbe_media_type_fiber_qsfp:
>>>      case ixgbe_media_type_fiber:
>>>          hw->mac.ops.check_link(hw, &speed, &link_up,
>>> false);
>>>          /* if link is down, assume supported */
>>>          if (link_up)
>>>              supported = speed == IXGBE_LINK_SPEED_1GB_FULL ?
>>>                  true : false;
>>>
>>> If link_up=1 then why is supported only true for a
>>> speed=IXGBE_LINK_SPEED_1GB_FULL ?
>>>
>>> Why is Flow Control not supported for IXGBE_LINK_SPEED_10GB_FULL ?
>> For SFP modules (media_type_fiber) flow control autoneg is not supported at 10gig.
>> You can still set flow control manually to enabled/disabled, just not autoneg.
>>
>> Thanks,
>> Emil
>Hi Emil
>
>Thank you for the quick answer.  I have one follow-up question if I may...
>
>We noticed that back in 3.2.9 (before 73d80953dfd basically) the
>behaviour was different for 10G fibre.  i.e.  autonegotiate showed 'on'.
>
># ethtool -a eth1
>Pause parameters for eth1:
>Autonegotiate:  on
>RX:             on
>TX:             on
>
>
>The code:
>     if (hw->fc.disable_fc_autoneg ||
>         (hw->fc.current_mode == ixgbe_fc_none))
>         pause->autoneg = 0;
>     else
>         pause->autoneg = 1;
>
>So I assume this old output from 'ethtool -a' for autoneg was just
>wrong, is that correct ?

Correct.

Thanks,
Emil

>
>[I'm asking cos I _know_ my h/w colleagues are going to ask why the change.]
>
>Thanks again
>dom


* Re: Netlink mmap tx security?
From: Daniel Borkmann @ 2014-10-15 23:45 UTC (permalink / raw)
  To: David Miller; +Cc: luto, torvalds, kaber, netdev, tgraf
In-Reply-To: <20141014.220111.179628329028952302.davem@davemloft.net>

On 10/15/2014 04:01 AM, David Miller wrote:
> From: Andy Lutomirski <luto@amacapital.net>
> Date: Tue, 14 Oct 2014 15:16:46 -0700
>
>> It's at least remotely possible that there's something that assumes
>> that the availability of NETLINK_RX_RING implies
>> NETLINK_TX_RING, which would be unfortunate.
>
> I already found one such case, nlmon :-/

Hmm, can you elaborate? I currently don't think that nlmon cares
actually.


* Re: [PATCH net 2/3] net: sctp: fix panic on duplicate ASCONF chunks
From: Daniel Borkmann @ 2014-10-15 23:13 UTC (permalink / raw)
  To: Neil Horman; +Cc: davem, linux-sctp, netdev, Vlad Yasevich
In-Reply-To: <20141015025833.GA6493@localhost.localdomain>

On 10/15/2014 04:58 AM, Neil Horman wrote:
> On Mon, Oct 13, 2014 at 01:25:11AM +0200, Daniel Borkmann wrote:
>> On 10/12/2014 03:42 AM, Neil Horman wrote:
>>> On Sat, Oct 11, 2014 at 12:02:31AM +0200, Daniel Borkmann wrote:
>>>> On 10/10/2014 05:39 PM, Neil Horman wrote:
>>>> ...
>>>>> Is it worth adding a WARN_ON, to indicate that two ASCONF chunks have been
>>>>> received with duplicate serials?
>>>>
>>>> Don't think so, as this would be triggerable from outside.
>>>>
>>> WARN_ON_ONCE then, per serial number?
>>
>> Sorry, but no. If someone seriously runs that in production and it
>> triggers a WARN() from outside, admins will start sending us bug
>> reports that apparently something with the kernel code is wrong.
>>
>> WARN() should only be used if we have some *internal* unexpected bug,
>> but can still fail gracefully. This would neither be an actual code bug
>> nor would it be an internally triggered one, plus we add unnecessary
>> complexity to the code. Similarly, for those reasons we don't WARN()
>> and throw a stack trace when we receive, say, an skb of invalid length
>> elsewhere.
>>
>> I'd also like to avoid any additional pr_debug().
>>
>> I don't think people enable them in production, and if they really do,
>> it's too late anyway as we already have received this chunk. If anything,
>> I'd rather like to see debugging code further removed as we have already
>> different facilities in the kernel for runtime debugging that are much
>> more powerful.
>
> What do you suggest then?  It seems like this is a protocol error that an
> administrator will want to be made aware of.  I'm open to other options, but
> just saying "no" isn't sufficient for me.

So what do you want an admin to do with this information?

Currently, we don't throw WARN() stack traces or whatever on other 'malformed'
inputs, we either discard them silently in SCTP or send an ABORT, for example.

Do you suggest the kernel should now start throwing a WARN() on every possible
skb that we discard in order to inform the admin?

Given that we can handle this gracefully by basically ignoring the 2nd identical
ASCONF chunk in a packet and send out a single reply, that's just handled fine.


* tcpdump's capture filter: "vlan" doesn't match
From: Lukas Tribus @ 2014-10-15 22:58 UTC (permalink / raw)
  To: netdev@vger.kernel.org
  Cc: John Fastabend, Michał Mirosław, Jiri Pirko,
	Ben Hutchings, Atzm Watanabe, Patrick McHardy, Jesse Gross

Hi,

since 2.6.39 (including -rc1), tcpdump "vlan" capture filters don't match
anymore. All 2.6.38 and older kernels are fine.

I reproduced this specifically on a r8169 NIC on 2.6.39-rc1, but I found
this problem initially on bnx2 and e1000e nics.

How to reproduce: just run tcpdump with a "not vlan", "vlan" or "vlan <vlanid>"
capture filter on a passive eth interface (dot1q/vlan/ip config not necessary).

Actual behavior is that a "vlan [vlanid]" capture filter doesn't match the
(tagged) packet, and a "not vlan" capture filter matches everything.

Disabling rx-vlan-offloading via
ethtool -K eth0 rxvlan off

doesn't change anything.

Here we are filtering for "not vlan" and we can see that the matched frame
is vlan tagged:

# tcpdump -Uenc1 not vlan
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
22:03:39.077584 70:ca:9b:01:23:34> 00:18:f8:01:23:34, \
*ethertype 802.1Q (0x8100), length 70: vlan 7, p 0*, ethertype IPv4, \
192.168.47.9.443> 192.168.32.30.39436: Flags [.], ack 255248912, \
[...]
1 packet captured
169 packets received by filter
0 packets dropped by kernel
59 packets dropped by interface
#

As suggested here [1], we can pipe everything through another tcpdump
instance:
tcpdump -Uw - | tcpdump -en -r - vlan <vlanid>

But that is not something that works for my specific use-case (dedicated
sniffer box, dedicated interface connected to a Cisco SPAN/mirror port,
un/single/double-tagged packets, remotely accessible via remote-pcap [2]).

The sniffer should also be able to:
- maintain the frame as-is, including dot1q, dot1p (preferably
  without artificial recreation of header fields/values and including CFI/DEI)
- "direct" capture filter based on vlan (not through multiple userspace
  instances)

Kernel <= 2.6.38 perfectly satisfies those requirements.

Isn't disabling rx-vlan-offloading supposed to remedy those problems?

Thanks,

Lukas

[1] https://bugzilla.redhat.com/show_bug.cgi?id=498981
[2] https://github.com/frgtn/rpcapd-linux
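
For readers following along, the check a "vlan" capture filter needs can be
sketched in userspace C: look at the EtherType at byte offset 12 of the frame
and, if it is 0x8100, read the VLAN ID from the following TCI field. This is
an assumption about how such a filter behaves, not something confirmed in
this thread; frame_vlan_id() and TPID_8021Q are hypothetical names. A check
like this can only match when the tag is still present in the captured bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define TPID_8021Q 0x8100u  /* EtherType value that marks an 802.1Q tag */

/*
 * Return the VLAN ID (0..4095) if the Ethernet frame carries an in-band
 * 802.1Q tag, or -1 if it is untagged or too short to tell.
 */
static int frame_vlan_id(const uint8_t *frame, size_t len)
{
    /* 6 bytes dst MAC + 6 bytes src MAC + 4 bytes tag + 2 bytes inner type */
    if (len < 18)
        return -1;

    uint16_t tpid = (uint16_t)((frame[12] << 8) | frame[13]);
    if (tpid != TPID_8021Q)
        return -1;

    /* TCI layout: 3 bits priority, 1 bit CFI/DEI, 12 bits VLAN ID */
    uint16_t tci = (uint16_t)((frame[14] << 8) | frame[15]);
    return tci & 0x0fff;
}
```

Applied to the first 18 bytes of the frame shown in the tcpdump output above
(EtherType 0x8100, vlan 7, p 0), the function returns 7.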


* Re: feature suggestion: implement SO_PEERCRED on local AF_INET/AF_INET6 sockets (allow uid-based identification on localhost)
From: Andy Lutomirski @ 2014-10-15 22:54 UTC (permalink / raw)
  To: David Madore; +Cc: Linux Kernel mailing-list, Linux network mailing-list
In-Reply-To: <20141015223033.GA11458@achernar.madore.org>

On Wed, Oct 15, 2014 at 3:30 PM, David Madore <david+ml@madore.org> wrote:
> On Wed, Oct 15, 2014 at 07:41:48AM -0700, Andy Lutomirski wrote:
>> On 10/15/2014 06:35 AM, David Madore wrote:
>> > Given an AF_UNIX socket, the getsockopt(, SOL_SOCKET, SO_PEERCRED,,)
>> > call allows one endpoint to authenticate the other endpoint's pid, uid
>> > and gid.
>> >
>> > The call is valid on AF_INET and AF_INET6 sockets but returns no data
>> > (pid=0, uid=-1, gid=-1).  Obviously it is meaningless to try to get
>> > such credentials from an INET/INET6 socket in general, but there is one
>> > case where it would make sense: namely, when the endpoint is local
>> > (i.e., when the socket is a connection to the same machine, e.g., when
>> > connecting to 127.0.0.0/8 or ::1/128).
>>
>> I will object to adding it as described, for the same reason that I
>> object to anything that extends the current model of socket-based
>> credential passing.  Ideally, credentials would *never* be implicitly
>> captured by socket syscalls.  We live in the real world, and SO_****CRED
>> exists, so I think the best we can do is to try to minimize its use.
>>
>> I can elaborate further, or you can IIRC search the archives for
>> SCM_IDENTITY, and you can also look at CVE-2013-1979 for a nasty example
>> of why this model is broken.
>
> From what I understand, what was broken is mainly that the credentials
> were evaluated when the write() system call took place rather than
> when socket() or bind() was called: this violates the Unix security
> model (privilege control occurs when the file descriptor is created,
> not when it is used).  On the contrary, it conforms to Unix security
> principles that credentials are checked implicitly when binding a
> socket (this happens when permissions are being checked on the path
> when binding or connecting on a Unix domain socket; and to allow
> binding to secure ports in the INET domain; and so on).  It seems to
> me that a suid program that is willing to create or bind a socket on
> behalf of its caller without knowing exactly what it will be
> connecting to should intrinsically be treated as a security
> vulnerability, even when it is not obviously exploitable.

socket has little precedent for checking credentials and none in
POSIX.  And you're talking about connect, not bind.

>
> Also, to go along the real world examples, identd exists and is used
> for identification on local networks (e.g. localhost), so the capture
> of credentials already takes place.  Unix programmers are aware of
> this, and know that a privileged program should not bind a socket if
> they don't want to leak privileges.  (Another example is the use of -m
> owner in iptables.)

Ugh.

Identd is completely insecure.  Quite a few years ago I personally broke
Stanford's entire single sign-on mechanism by exploiting it, against a
hardened, kerberized version, no less.  And iptables -m owner is all
about what you can connect to, not what is assumed by the recipient.
(And it's probably insecure, too, in many cases.)

>
> And, of course, if Solaris already has this feature, there is some
> experience for it.  Has there been any documented vulnerability
> relating to the fact that Solaris allows getpeerucred() to
> authenticate locally connected AF_INET sockets?

Does it matter?  CVE-2013-1979 existed for many years before anyone noticed.

>
> Note that since the possibility of using SO_PEERCRED on AF_INET
> sockets does not hitherto exist on Linux, we can be sure that nobody
> uses it, so it's not like it might open vulnerabilities in existing
> code.  If you think it's insecure, it can be documented as such (by
> comparing it with identd): I still think it's better than having no
> control at all when binding to localhost, which is the present
> situation (causing, e.g., CVE-2014-2914).

This doesn't follow.  *Everybody* uses connect on AF_INET.

IMO anything that sends a caller's credentials needs to be explicit and opt-in.

>
> Because SO_PEERCRED currently returns {pid=0,uid=-1,gid=-1} on
> AF_INET, we might still return this value if there is any risk that
> the endpoint would be unwilling to share its credentials: for example,
> this value might be returned if the other endpoint is not ptraceable
> by the caller - this would still cover the essential use case, which
> is for unprivileged users to authenticate the connections from their
> own processes.  Would this limitation assuage your worries about the
> proposed feature?
>
> The thing is, I don't see any other way the ssh port forwarding mess
> can ever be improved.  Do you have another solution in mind?

UNIX sockets.  Firewall rules.  An opt-in mechanism like SCM_IDENTITY
that has explicit support in OpenSSH and doesn't happen unless the
administrator actually requests it in sshd_config.

>
> Any attempt to have some kind of authentication of local sockets that
> required participation on the client (authenticatee)'s part is doomed:
> if modifying the protocol and/or client code is an option, we might as
> well use some form of crypto / TLS.  Or Unix-domain sockets.  But what
> are we supposed to do when modifying the client (to make it send
> credentials, use crypto or connect on AF_UNIX) is not an option?

Exactly.

I believe that there is no secure way to authenticate clients that
currently don't authenticate themselves without changing the clients.
That's the whole point: currently-secure programs are written under the
assumption that they are not exercising their credentials.  You can't
safely change that without making it opt-in.

--Andy


* RE: [PATCH v3 net] ixgbe: check adapter->vfinfo before dereference
From: Tantilov, Emil S @ 2014-10-15 22:50 UTC (permalink / raw)
  To: Thierry Herbelot, Kirsher, Jeffrey T, Brandeburg, Jesse,
	Allan, Bruce W, netdev@vger.kernel.org
In-Reply-To: <1413367080-31540-1-git-send-email-thierry.herbelot@6wind.com>

>-----Original Message-----
>From: Thierry Herbelot [mailto:thierry.herbelot@6wind.com]
>Sent: Wednesday, October 15, 2014 2:58 AM
>To: Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce W;
>netdev@vger.kernel.org; Tantilov, Emil S
>Cc: Thierry Herbelot
>Subject: [PATCH v3 net] ixgbe: check adapter->vfinfo before dereference
>
>this protects against the following panic:
>(before a VF was actually created on p96p1 PF Ethernet port)
>
>ip link set p96p1 vf 0 spoofchk off
>BUG: unable to handle kernel NULL pointer dereference at 0000000000000052
>IP: [<ffffffffa044a1c1>]
>ixgbe_ndo_set_vf_spoofchk+0x51/0x150 [ixgbe]
>
>Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
>---
>
>v2:
>  compilation fixes
>
>v3:
>  remove checks in functions where vfinfo is known not to be NULL
>  return -EINVAL as error code
>
> drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c |   42
>++++++++++++++++++++++--
> 1 file changed, 40 insertions(+), 2 deletions(-)

Actually, looking into this a bit more, the check for vfinfo is not sufficient
because it does not protect against specifying a vf that is outside of the sriov_num_vfs range.

All of the ndo functions have a check for it except for ixgbe_ndo_set_vf_spoofchk().

The following patch should be all we need to protect against this panic:

This patch adds a check to return -EINVAL when setting spoofcheck on
a VF that is not configured.

Reported-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 706fc69..97c85b8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -1261,6 +1261,9 @@ int ixgbe_ndo_set_vf_spoofchk(struct net_device *netdev, int vf, bool setting)
 	struct ixgbe_hw *hw = &adapter->hw;
 	u32 regval;
 
+	if (vf >= adapter->num_vfs)
+		return -EINVAL;
+
 	adapter->vfinfo[vf].spoofchk_enabled = setting;
 
 	regval = IXGBE_READ_REG(hw, IXGBE_PFVFSPOOF(vf_target_reg));
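
The effect of the range check above can be sketched in userspace C. This is
an illustration under stated assumptions, not driver code: struct adapter,
struct vf_info and set_vf_spoofchk() are simplified, hypothetical stand-ins,
and the explicit 'vf < 0' test is an addition of this sketch that the patch
itself does not contain.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the driver structures. */
struct vf_info { bool spoofchk_enabled; };
struct adapter {
    struct vf_info *vfinfo;  /* NULL until VFs are created */
    int num_vfs;             /* 0 until VFs are created */
};

static int set_vf_spoofchk(struct adapter *a, int vf, bool setting)
{
    /*
     * With no VFs configured, num_vfs is 0, so every index is rejected
     * here and the NULL vfinfo table is never dereferenced.
     */
    if (vf < 0 || vf >= a->num_vfs)
        return -EINVAL;

    a->vfinfo[vf].spoofchk_enabled = setting;
    return 0;
}
```

The single range check covers both the "no VF created yet" case from the
original report and the "vf index too large" case.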


* Re: Regarding tx-nocache-copy in the Sheevaplug
From: Eric Dumazet @ 2014-10-15 22:45 UTC (permalink / raw)
  To: Benjamin Poirier
  Cc: Lluís Batlle i Rossell, linux-kernel, netdev,
	Carles Pagès, linux-arm-kernel
In-Reply-To: <20141015215701.GA4109@f1.synalogic.ca>

On Wed, 2014-10-15 at 14:57 -0700, Benjamin Poirier wrote:
> On 2014/10/13 12:52, Lluís Batlle i Rossell wrote:
> > Hello,
> > 
> > on the 7th of January 2014 this patch was applied:
> > https://lkml.org/lkml/2014/1/7/307
> > 
> > [PATCH v2] net: Do not enable tx-nocache-copy by default
> >         
> > On the Sheevaplug (ARM Feroceon 88FR131 from Marvell) this caused packets to be
> > sent corrupted. I think this machine has something special about the cache.
> > 
> > With tx-nocache-copy re-enabled (as it was before the patch), the
> > transfers work fine again. I think that most people encountering this problem
> > disable the tx offload completely instead of re-enabling this setting.
> > 
> > Is this an ARM kernel problem regarding this platform?
> 
> This is odd, only x86 defines ARCH_HAS_NOCACHE_UACCESS. On arm,
> skb_do_copy_data_nocache() should end up using __copy_from_user()
> regardless of tx-nocache-copy.

 kmap_atomic()/kunmap_atomic() is missing, so we lack
__cpuc_flush_dcache_area() operations.

