* [PATCH net-next v12 1/3] Documentation: add Device tree bindings for Hisilicon hip04 ethernet
From: Ding Tianhong @ 2015-01-13 9:11 UTC
To: arnd, robh+dt, davem, grant.likely, agraf
Cc: sergei.shtylyov, linux-arm-kernel, eric.dumazet, xuwei5,
zhangfei.gao, netdev, devicetree, linux
In-Reply-To: <1421140290-5492-1-git-send-email-dingtianhong@huawei.com>
From: Zhangfei Gao <zhangfei.gao@linaro.org>
This patch adds the Device Tree bindings for the Hisilicon hip04
Ethernet controller, covering both the 100M and 1000M controllers.
Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
.../bindings/net/hisilicon-hip04-net.txt | 88 ++++++++++++++++++++++
1 file changed, 88 insertions(+)
create mode 100644 Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
diff --git a/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt b/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
new file mode 100644
index 0000000..988fc69
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/hisilicon-hip04-net.txt
@@ -0,0 +1,88 @@
+Hisilicon hip04 Ethernet Controller
+
+* Ethernet controller node
+
+Required properties:
+- compatible: should be "hisilicon,hip04-mac".
+- reg: address and length of the register set for the device.
+- interrupts: interrupt for the device.
+- port-handle: <phandle port channel>
+ phandle, specifies a reference to the syscon ppe node
+ port, port number connected to the controller
+ channel, the controller's recv channels start at channel * RX_DESC_NUM
+- phy-mode: see ethernet.txt [1].
+
+Optional properties:
+- phy-handle: see ethernet.txt [1].
+
+[1] Documentation/devicetree/bindings/net/ethernet.txt
+
+
+* Ethernet ppe node:
+Controls the rx & tx FIFOs of all ethernet controllers.
+Has 2048 recv channels, shared by all ethernet controllers; ranges must not overlap.
+Each controller's recv channels start at channel * RX_DESC_NUM.
+
+Required properties:
+- compatible: "hisilicon,hip04-ppe", "syscon".
+- reg: address and length of the register set for the device.
+
+
+* MDIO bus node:
+
+Required properties:
+
+- compatible: should be "hisilicon,hip04-mdio".
+- Inherits from MDIO bus node binding [2]
+[2] Documentation/devicetree/bindings/net/phy.txt
+
+Example:
+ mdio {
+ compatible = "hisilicon,hip04-mdio";
+ reg = <0x28f1000 0x1000>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ phy0: ethernet-phy@0 {
+ compatible = "ethernet-phy-ieee802.3-c22";
+ reg = <0>;
+ marvell,reg-init = <18 0x14 0 0x8001>;
+ };
+
+ phy1: ethernet-phy@1 {
+ compatible = "ethernet-phy-ieee802.3-c22";
+ reg = <1>;
+ marvell,reg-init = <18 0x14 0 0x8001>;
+ };
+ };
+
+ ppe: ppe@28c0000 {
+ compatible = "hisilicon,hip04-ppe", "syscon";
+ reg = <0x28c0000 0x10000>;
+ };
+
+ fe: ethernet@28b0000 {
+ compatible = "hisilicon,hip04-mac";
+ reg = <0x28b0000 0x10000>;
+ interrupts = <0 413 4>;
+ phy-mode = "mii";
+ port-handle = <&ppe 31 0>;
+ };
+
+ ge0: ethernet@2800000 {
+ compatible = "hisilicon,hip04-mac";
+ reg = <0x2800000 0x10000>;
+ interrupts = <0 402 4>;
+ phy-mode = "sgmii";
+ port-handle = <&ppe 0 1>;
+ phy-handle = <&phy0>;
+ };
+
+ ge8: ethernet@2880000 {
+ compatible = "hisilicon,hip04-mac";
+ reg = <0x2880000 0x10000>;
+ interrupts = <0 410 4>;
+ phy-mode = "sgmii";
+ port-handle = <&ppe 8 2>;
+ phy-handle = <&phy1>;
+ };
--
1.8.0
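The channel arithmetic described in the binding above can be sketched in plain C. This is only an illustration under stated assumptions: the value 64 for RX_DESC_NUM is a placeholder (the binding does not give it), and the helper names are hypothetical rather than taken from the driver. Only the 2048-channel total and the "channel * RX_DESC_NUM" rule come from the binding text.

```c
#include <assert.h>
#include <stdbool.h>

#define PPE_RX_CHANNELS 2048u /* from the binding: 2048 recv channels in the ppe */
#define RX_DESC_NUM     64u   /* assumption: placeholder value, not stated in the binding */

/* The per-controller ranges tile the ppe only if the stride divides the total. */
static_assert(PPE_RX_CHANNELS % RX_DESC_NUM == 0, "stride must divide channel count");

/* First recv channel used by a controller whose port-handle channel cell is `channel`. */
unsigned int rx_chan_start(unsigned int channel)
{
	return channel * RX_DESC_NUM;
}

/* True if the range [start, start + RX_DESC_NUM) lies inside the ppe. */
bool rx_chan_fits(unsigned int channel)
{
	return rx_chan_start(channel) + RX_DESC_NUM <= PPE_RX_CHANNELS;
}

/* True if the ranges of two controllers overlap (hypothetical helper). */
bool rx_chan_overlap(unsigned int a, unsigned int b)
{
	unsigned int sa = rx_chan_start(a);
	unsigned int sb = rx_chan_start(b);

	return sa < sb + RX_DESC_NUM && sb < sa + RX_DESC_NUM;
}
```

With the example nodes above (channel cells 0, 1 and 2 for fe, ge0 and ge8) and the assumed stride of 64, the ranges would start at 0, 64 and 128 and would not overlap.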
* Re: [PATCH net-next] rhashtable: unnecessary to use delayed work
From: Thomas Graf @ 2015-01-13 9:35 UTC
To: Ying Xue; +Cc: davem, netdev
In-Reply-To: <1421139645-1588-1-git-send-email-ying.xue@windriver.com>
On 01/13/15 at 05:00pm, Ying Xue wrote:
> When we put our declared work task in the global workqueue with
> schedule_delayed_work(), its delay parameter is always zero.
> Therefore, we should define a normal work in rhashtable structure
> instead of a delayed work.
>
> Signed-off-by: Ying Xue <ying.xue@windriver.com>
> Cc: Thomas Graf <tgraf@suug.ch>
> @@ -914,7 +914,7 @@ void rhashtable_destroy(struct rhashtable *ht)
>
> mutex_lock(&ht->mutex);
>
> - cancel_delayed_work(&ht->run_work);
> + cancel_work_sync(&ht->run_work);
> bucket_table_free(rht_dereference(ht->tbl, ht));
>
> mutex_unlock(&ht->mutex);
I like the patch!
I think it introduces a possible deadlock though (see below). OTOH, it
could actually explain the 0day lock debug splat that was reported.
Deadlock: The worker could already have been kicked off but been
interrupted before it acquired ht->mutex. rhashtable_destroy() is then
called and acquires ht->mutex. cancel_work_sync() waits for the worker to
finish while holding ht->mutex. The worker can't finish because it needs to
acquire ht->mutex to do so.
The reported warning could have been triggered for the very same reason.
Instead of deadlocking, the current code would have called
bucket_table_free() with a deferred resizer still underway.
How about we do something like this?
void rhashtable_destroy(struct rhashtable *ht)
{
	ht->being_destroyed = true;
	cancel_work_sync(&ht->run_work);

	mutex_lock(&ht->mutex);
	bucket_table_free(rht_dereference(ht->tbl, ht));
	mutex_unlock(&ht->mutex);
}
If you agree, we can explain this briefly in the commit message and add:
Fixes: 97defe1 ("rhashtable: Per bucket locks & deferred expansion/shrinking")
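The ordering proposed above (flag the table, wait for the worker, and only then take the mutex) can be tried out in userspace. The following is a rough pthreads analogue, not kernel code: pthread_join() stands in for cancel_work_sync(), a thread stands in for the work item, and struct table, resize_worker() and the worker's check of being_destroyed are hypothetical, since the mail does not show how the worker would use the flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct table {
	pthread_mutex_t mutex;
	pthread_t worker;            /* stands in for ht->run_work */
	atomic_bool being_destroyed;
	int *buckets;
	int resizes;
};

/* Deferred "resize": needs the mutex, and skips its work once teardown started. */
static void *resize_worker(void *arg)
{
	struct table *t = arg;

	pthread_mutex_lock(&t->mutex);
	if (!atomic_load(&t->being_destroyed))
		t->resizes++;
	pthread_mutex_unlock(&t->mutex);
	return NULL;
}

int table_init(struct table *t)
{
	t->buckets = calloc(16, sizeof(*t->buckets));
	if (!t->buckets)
		return -1;
	t->resizes = 0;
	atomic_init(&t->being_destroyed, false);
	pthread_mutex_init(&t->mutex, NULL);
	if (pthread_create(&t->worker, NULL, resize_worker, t) != 0) {
		pthread_mutex_destroy(&t->mutex);
		free(t->buckets);
		t->buckets = NULL;
		return -1;
	}
	return 0;
}

void table_destroy(struct table *t)
{
	int rc;

	atomic_store(&t->being_destroyed, true);

	/* Wait for the worker WITHOUT holding the mutex, so it can still
	 * take the mutex and finish. Joining under the mutex would deadlock. */
	rc = pthread_join(t->worker, NULL);
	assert(rc == 0);
	(void)rc;

	pthread_mutex_lock(&t->mutex);
	free(t->buckets);
	t->buckets = NULL;
	pthread_mutex_unlock(&t->mutex);
	pthread_mutex_destroy(&t->mutex);
}
```

Moving the pthread_join() call below pthread_mutex_lock() reproduces the hang whenever the worker has not yet taken the mutex, which is the scenario described in the mail.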
* [PATCH net-next] cxgb4: Ripping out old hard-wired initialization code in driver
From: Hariprasad Shenai @ 2015-01-13 9:49 UTC
To: netdev; +Cc: davem, leedom, nirranjan, Hariprasad Shenai
Remove the old hard-wired initialization code in the driver, which is no
longer used. Also deprecate a few module parameters.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 480 +++--------------------
drivers/net/ethernet/chelsio/cxgb4/sge.c | 98 +----
2 files changed, 58 insertions(+), 520 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 23ae0b7..082a596 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -79,99 +79,6 @@
#define DRV_VERSION "2.0.0-ko"
#define DRV_DESC "Chelsio T4/T5 Network Driver"
-/*
- * Max interrupt hold-off timer value in us. Queues fall back to this value
- * under extreme memory pressure so it's largish to give the system time to
- * recover.
- */
-#define MAX_SGE_TIMERVAL 200U
-
-enum {
- /*
- * Physical Function provisioning constants.
- */
- PFRES_NVI = 4, /* # of Virtual Interfaces */
- PFRES_NETHCTRL = 128, /* # of EQs used for ETH or CTRL Qs */
- PFRES_NIQFLINT = 128, /* # of ingress Qs/w Free List(s)/intr
- */
- PFRES_NEQ = 256, /* # of egress queues */
- PFRES_NIQ = 0, /* # of ingress queues */
- PFRES_TC = 0, /* PCI-E traffic class */
- PFRES_NEXACTF = 128, /* # of exact MPS filters */
-
- PFRES_R_CAPS = FW_CMD_CAP_PF,
- PFRES_WX_CAPS = FW_CMD_CAP_PF,
-
-#ifdef CONFIG_PCI_IOV
- /*
- * Virtual Function provisioning constants. We need two extra Ingress
- * Queues with Interrupt capability to serve as the VF's Firmware
- * Event Queue and Forwarded Interrupt Queue (when using MSI mode) --
- * neither will have Free Lists associated with them). For each
- * Ethernet/Control Egress Queue and for each Free List, we need an
- * Egress Context.
- */
- VFRES_NPORTS = 1, /* # of "ports" per VF */
- VFRES_NQSETS = 2, /* # of "Queue Sets" per VF */
-
- VFRES_NVI = VFRES_NPORTS, /* # of Virtual Interfaces */
- VFRES_NETHCTRL = VFRES_NQSETS, /* # of EQs used for ETH or CTRL Qs */
- VFRES_NIQFLINT = VFRES_NQSETS+2,/* # of ingress Qs/w Free List(s)/intr */
- VFRES_NEQ = VFRES_NQSETS*2, /* # of egress queues */
- VFRES_NIQ = 0, /* # of non-fl/int ingress queues */
- VFRES_TC = 0, /* PCI-E traffic class */
- VFRES_NEXACTF = 16, /* # of exact MPS filters */
-
- VFRES_R_CAPS = FW_CMD_CAP_DMAQ|FW_CMD_CAP_VF|FW_CMD_CAP_PORT,
- VFRES_WX_CAPS = FW_CMD_CAP_DMAQ|FW_CMD_CAP_VF,
-#endif
-};
-
-/*
- * Provide a Port Access Rights Mask for the specified PF/VF. This is very
- * static and likely not to be useful in the long run. We really need to
- * implement some form of persistent configuration which the firmware
- * controls.
- */
-static unsigned int pfvfres_pmask(struct adapter *adapter,
- unsigned int pf, unsigned int vf)
-{
- unsigned int portn, portvec;
-
- /*
- * Give PF's access to all of the ports.
- */
- if (vf == 0)
- return FW_PFVF_CMD_PMASK_M;
-
- /*
- * For VFs, we'll assign them access to the ports based purely on the
- * PF. We assign active ports in order, wrapping around if there are
- * fewer active ports than PFs: e.g. active port[pf % nports].
- * Unfortunately the adapter's port_info structs haven't been
- * initialized yet so we have to compute this.
- */
- if (adapter->params.nports == 0)
- return 0;
-
- portn = pf % adapter->params.nports;
- portvec = adapter->params.portvec;
- for (;;) {
- /*
- * Isolate the lowest set bit in the port vector. If we're at
- * the port number that we want, return that as the pmask.
- * otherwise mask that bit out of the port vector and
- * decrement our port number ...
- */
- unsigned int pmask = portvec ^ (portvec & (portvec-1));
- if (portn == 0)
- return pmask;
- portn--;
- portvec &= ~pmask;
- }
- /*NOTREACHED*/
-}
-
enum {
MAX_TXQ_ENTRIES = 16384,
MAX_CTRL_TXQ_ENTRIES = 1024,
@@ -264,7 +171,8 @@ MODULE_PARM_DESC(force_init, "Forcibly become Master PF and initialize adapter")
static uint force_old_init;
module_param(force_old_init, uint, 0644);
-MODULE_PARM_DESC(force_old_init, "Force old initialization sequence");
+MODULE_PARM_DESC(force_old_init, "Force old initialization sequence, deprecated"
+ " parameter");
static int dflt_msg_enable = DFLT_MSG_ENABLE;
@@ -293,13 +201,14 @@ static unsigned int intr_holdoff[SGE_NTIMERS - 1] = { 5, 10, 20, 50, 100 };
module_param_array(intr_holdoff, uint, NULL, 0644);
MODULE_PARM_DESC(intr_holdoff, "values for queue interrupt hold-off timers "
- "0..4 in microseconds");
+ "0..4 in microseconds, deprecated parameter");
static unsigned int intr_cnt[SGE_NCOUNTERS - 1] = { 4, 8, 16 };
module_param_array(intr_cnt, uint, NULL, 0644);
MODULE_PARM_DESC(intr_cnt,
- "thresholds 1..3 for queue interrupt packet counters");
+ "thresholds 1..3 for queue interrupt packet counters, "
+ "deprecated parameter");
/*
* Normally we tell the chip to deliver Ingress Packets into our DMA buffers
@@ -319,7 +228,8 @@ static bool vf_acls;
#ifdef CONFIG_PCI_IOV
module_param(vf_acls, bool, 0644);
-MODULE_PARM_DESC(vf_acls, "if set enable virtualization L2 ACL enforcement");
+MODULE_PARM_DESC(vf_acls, "if set enable virtualization L2 ACL enforcement, "
+ "deprecated parameter");
/* Configure the number of PCI-E Virtual Function which are to be instantiated
* on SR-IOV Capable Physical Functions.
@@ -341,32 +251,11 @@ module_param(select_queue, int, 0644);
MODULE_PARM_DESC(select_queue,
"Select between kernel provided method of selecting or driver method of selecting TX queue. Default is kernel method.");
-/*
- * The filter TCAM has a fixed portion and a variable portion. The fixed
- * portion can match on source/destination IP IPv4/IPv6 addresses and TCP/UDP
- * ports. The variable portion is 36 bits which can include things like Exact
- * Match MAC Index (9 bits), Ether Type (16 bits), IP Protocol (8 bits),
- * [Inner] VLAN Tag (17 bits), etc. which, if all were somehow selected, would
- * far exceed the 36-bit budget for this "compressed" header portion of the
- * filter. Thus, we have a scarce resource which must be carefully managed.
- *
- * By default we set this up to mostly match the set of filter matching
- * capabilities of T3 but with accommodations for some of T4's more
- * interesting features:
- *
- * { IP Fragment (1), MPS Match Type (3), IP Protocol (8),
- * [Inner] VLAN (17), Port (3), FCoE (1) }
- */
-enum {
- TP_VLAN_PRI_MAP_DEFAULT = HW_TPL_FR_MT_PR_IV_P_FC,
- TP_VLAN_PRI_MAP_FIRST = FCOE_S,
- TP_VLAN_PRI_MAP_LAST = FRAGMENTATION_S,
-};
-
-static unsigned int tp_vlan_pri_map = TP_VLAN_PRI_MAP_DEFAULT;
+static unsigned int tp_vlan_pri_map = HW_TPL_FR_MT_PR_IV_P_FC;
module_param(tp_vlan_pri_map, uint, 0644);
-MODULE_PARM_DESC(tp_vlan_pri_map, "global compressed filter configuration");
+MODULE_PARM_DESC(tp_vlan_pri_map, "global compressed filter configuration, "
+ "deprecated parameter");
static struct dentry *cxgb4_debugfs_root;
@@ -5225,12 +5114,9 @@ static int adap_init0_config(struct adapter *adapter, int reset)
if (ret < 0)
goto bye;
- /*
- * Return successfully and note that we're operating with parameters
- * not supplied by the driver, rather than from hard-wired
- * initialization constants burried in the driver.
+ /* Emit Firmware Configuration File information and return
+ * successfully.
*/
- adapter->flags |= USING_SOFT_PARAMS;
dev_info(adapter->pdev_dev, "Successfully configured using Firmware "\
"Configuration File \"%s\", version %#x, computed checksum %#x\n",
config_name, finiver, cfcsum);
@@ -5248,248 +5134,6 @@ bye:
return ret;
}
-/*
- * Attempt to initialize the adapter via hard-coded, driver supplied
- * parameters ...
- */
-static int adap_init0_no_config(struct adapter *adapter, int reset)
-{
- struct sge *s = &adapter->sge;
- struct fw_caps_config_cmd caps_cmd;
- u32 v;
- int i, ret;
-
- /*
- * Reset device if necessary
- */
- if (reset) {
- ret = t4_fw_reset(adapter, adapter->mbox,
- PIORSTMODE_F | PIORST_F);
- if (ret < 0)
- goto bye;
- }
-
- /*
- * Get device capabilities and select which we'll be using.
- */
- memset(&caps_cmd, 0, sizeof(caps_cmd));
- caps_cmd.op_to_write = htonl(FW_CMD_OP_V(FW_CAPS_CONFIG_CMD) |
- FW_CMD_REQUEST_F | FW_CMD_READ_F);
- caps_cmd.cfvalid_to_len16 = htonl(FW_LEN16(caps_cmd));
- ret = t4_wr_mbox(adapter, adapter->mbox, &caps_cmd, sizeof(caps_cmd),
- &caps_cmd);
- if (ret < 0)
- goto bye;
-
- if (caps_cmd.niccaps & htons(FW_CAPS_CONFIG_NIC_VM)) {
- if (!vf_acls)
- caps_cmd.niccaps ^= htons(FW_CAPS_CONFIG_NIC_VM);
- else
- caps_cmd.niccaps = htons(FW_CAPS_CONFIG_NIC_VM);
- } else if (vf_acls) {
- dev_err(adapter->pdev_dev, "virtualization ACLs not supported");
- goto bye;
- }
- caps_cmd.op_to_write = htonl(FW_CMD_OP_V(FW_CAPS_CONFIG_CMD) |
- FW_CMD_REQUEST_F | FW_CMD_WRITE_F);
- ret = t4_wr_mbox(adapter, adapter->mbox, &caps_cmd, sizeof(caps_cmd),
- NULL);
- if (ret < 0)
- goto bye;
-
- /*
- * Tweak configuration based on system architecture, module
- * parameters, etc.
- */
- ret = adap_init0_tweaks(adapter);
- if (ret < 0)
- goto bye;
-
- /*
- * Select RSS Global Mode we want to use. We use "Basic Virtual"
- * mode which maps each Virtual Interface to its own section of
- * the RSS Table and we turn on all map and hash enables ...
- */
- adapter->flags |= RSS_TNLALLLOOKUP;
- ret = t4_config_glbl_rss(adapter, adapter->mbox,
- FW_RSS_GLB_CONFIG_CMD_MODE_BASICVIRTUAL,
- FW_RSS_GLB_CONFIG_CMD_TNLMAPEN_F |
- FW_RSS_GLB_CONFIG_CMD_HASHTOEPLITZ_F |
- ((adapter->flags & RSS_TNLALLLOOKUP) ?
- FW_RSS_GLB_CONFIG_CMD_TNLALLLKP_F : 0));
- if (ret < 0)
- goto bye;
-
- /*
- * Set up our own fundamental resource provisioning ...
- */
- ret = t4_cfg_pfvf(adapter, adapter->mbox, adapter->fn, 0,
- PFRES_NEQ, PFRES_NETHCTRL,
- PFRES_NIQFLINT, PFRES_NIQ,
- PFRES_TC, PFRES_NVI,
- FW_PFVF_CMD_CMASK_M,
- pfvfres_pmask(adapter, adapter->fn, 0),
- PFRES_NEXACTF,
- PFRES_R_CAPS, PFRES_WX_CAPS);
- if (ret < 0)
- goto bye;
-
- /*
- * Perform low level SGE initialization. We need to do this before we
- * send the firmware the INITIALIZE command because that will cause
- * any other PF Drivers which are waiting for the Master
- * Initialization to proceed forward.
- */
- for (i = 0; i < SGE_NTIMERS - 1; i++)
- s->timer_val[i] = min(intr_holdoff[i], MAX_SGE_TIMERVAL);
- s->timer_val[SGE_NTIMERS - 1] = MAX_SGE_TIMERVAL;
- s->counter_val[0] = 1;
- for (i = 1; i < SGE_NCOUNTERS; i++)
- s->counter_val[i] = min(intr_cnt[i - 1], THRESHOLD_0_M);
- t4_sge_init(adapter);
-
-#ifdef CONFIG_PCI_IOV
- /*
- * Provision resource limits for Virtual Functions. We currently
- * grant them all the same static resource limits except for the Port
- * Access Rights Mask which we're assigning based on the PF. All of
- * the static provisioning stuff for both the PF and VF really needs
- * to be managed in a persistent manner for each device which the
- * firmware controls.
- */
- {
- int pf, vf;
-
- for (pf = 0; pf < ARRAY_SIZE(num_vf); pf++) {
- if (num_vf[pf] <= 0)
- continue;
-
- /* VF numbering starts at 1! */
- for (vf = 1; vf <= num_vf[pf]; vf++) {
- ret = t4_cfg_pfvf(adapter, adapter->mbox,
- pf, vf,
- VFRES_NEQ, VFRES_NETHCTRL,
- VFRES_NIQFLINT, VFRES_NIQ,
- VFRES_TC, VFRES_NVI,
- FW_PFVF_CMD_CMASK_M,
- pfvfres_pmask(
- adapter, pf, vf),
- VFRES_NEXACTF,
- VFRES_R_CAPS, VFRES_WX_CAPS);
- if (ret < 0)
- dev_warn(adapter->pdev_dev,
- "failed to "\
- "provision pf/vf=%d/%d; "
- "err=%d\n", pf, vf, ret);
- }
- }
- }
-#endif
-
- /*
- * Set up the default filter mode. Later we'll want to implement this
- * via a firmware command, etc. ... This needs to be done before the
- * firmare initialization command ... If the selected set of fields
- * isn't equal to the default value, we'll need to make sure that the
- * field selections will fit in the 36-bit budget.
- */
- if (tp_vlan_pri_map != TP_VLAN_PRI_MAP_DEFAULT) {
- int j, bits = 0;
-
- for (j = TP_VLAN_PRI_MAP_FIRST; j <= TP_VLAN_PRI_MAP_LAST; j++)
- switch (tp_vlan_pri_map & (1 << j)) {
- case 0:
- /* compressed filter field not enabled */
- break;
- case FCOE_F:
- bits += 1;
- break;
- case PORT_F:
- bits += 3;
- break;
- case VNIC_F:
- bits += 17;
- break;
- case VLAN_F:
- bits += 17;
- break;
- case TOS_F:
- bits += 8;
- break;
- case PROTOCOL_F:
- bits += 8;
- break;
- case ETHERTYPE_F:
- bits += 16;
- break;
- case MACMATCH_F:
- bits += 9;
- break;
- case MPSHITTYPE_F:
- bits += 3;
- break;
- case FRAGMENTATION_F:
- bits += 1;
- break;
- }
-
- if (bits > 36) {
- dev_err(adapter->pdev_dev,
- "tp_vlan_pri_map=%#x needs %d bits > 36;"\
- " using %#x\n", tp_vlan_pri_map, bits,
- TP_VLAN_PRI_MAP_DEFAULT);
- tp_vlan_pri_map = TP_VLAN_PRI_MAP_DEFAULT;
- }
- }
- v = tp_vlan_pri_map;
- t4_write_indirect(adapter, TP_PIO_ADDR_A, TP_PIO_DATA_A,
- &v, 1, TP_VLAN_PRI_MAP_A);
-
- /*
- * We need Five Tuple Lookup mode to be set in TP_GLOBAL_CONFIG order
- * to support any of the compressed filter fields above. Newer
- * versions of the firmware do this automatically but it doesn't hurt
- * to set it here. Meanwhile, we do _not_ need to set Lookup Every
- * Packet in TP_INGRESS_CONFIG to support matching non-TCP packets
- * since the firmware automatically turns this on and off when we have
- * a non-zero number of filters active (since it does have a
- * performance impact).
- */
- if (tp_vlan_pri_map)
- t4_set_reg_field(adapter, TP_GLOBAL_CONFIG_A,
- FIVETUPLELOOKUP_V(FIVETUPLELOOKUP_M),
- FIVETUPLELOOKUP_V(FIVETUPLELOOKUP_M));
-
- /*
- * Tweak some settings.
- */
- t4_write_reg(adapter, TP_SHIFT_CNT_A, SYNSHIFTMAX_V(6) |
- RXTSHIFTMAXR1_V(4) | RXTSHIFTMAXR2_V(15) |
- PERSHIFTBACKOFFMAX_V(8) | PERSHIFTMAX_V(8) |
- KEEPALIVEMAXR1_V(4) | KEEPALIVEMAXR2_V(9));
-
- /*
- * Get basic stuff going by issuing the Firmware Initialize command.
- * Note that this _must_ be after all PFVF commands ...
- */
- ret = t4_fw_initialize(adapter, adapter->mbox);
- if (ret < 0)
- goto bye;
-
- /*
- * Return successfully!
- */
- dev_info(adapter->pdev_dev, "Successfully configured using built-in "\
- "driver parameters\n");
- return 0;
-
- /*
- * Something bad happened. Return the error ...
- */
-bye:
- return ret;
-}
-
static struct fw_info fw_info_array[] = {
{
.chip = CHELSIO_T4,
@@ -5662,88 +5306,58 @@ static int adap_init0(struct adapter *adap)
adap->params.nports = hweight32(port_vec);
adap->params.portvec = port_vec;
- /*
- * If the firmware is initialized already (and we're not forcing a
- * master initialization), note that we're living with existing
- * adapter parameters. Otherwise, it's time to try initializing the
- * adapter ...
+ /* If the firmware is initialized already, emit a simple note to that
+ * effect. Otherwise, it's time to try initializing the adapter.
*/
if (state == DEV_STATE_INIT) {
dev_info(adap->pdev_dev, "Coming up as %s: "\
"Adapter already initialized\n",
adap->flags & MASTER_PF ? "MASTER" : "SLAVE");
- adap->flags |= USING_SOFT_PARAMS;
} else {
dev_info(adap->pdev_dev, "Coming up as MASTER: "\
"Initializing adapter\n");
- /*
- * If the firmware doesn't support Configuration
- * Files warn user and exit,
+
+ /* Find out whether we're dealing with a version of the
+ * firmware which has configuration file support.
*/
- if (ret < 0)
- dev_warn(adap->pdev_dev, "Firmware doesn't support "
- "configuration file.\n");
- if (force_old_init)
- ret = adap_init0_no_config(adap, reset);
- else {
- /*
- * Find out whether we're dealing with a version of
- * the firmware which has configuration file support.
- */
- params[0] = (FW_PARAMS_MNEM_V(FW_PARAMS_MNEM_DEV) |
- FW_PARAMS_PARAM_X_V(
- FW_PARAMS_PARAM_DEV_CF));
- ret = t4_query_params(adap, adap->mbox, adap->fn, 0, 1,
- params, val);
-
- /*
- * If the firmware doesn't support Configuration
- * Files, use the old Driver-based, hard-wired
- * initialization. Otherwise, try using the
- * Configuration File support and fall back to the
- * Driver-based initialization if there's no
- * Configuration File found.
- */
- if (ret < 0)
- ret = adap_init0_no_config(adap, reset);
- else {
- /*
- * The firmware provides us with a memory
- * buffer where we can load a Configuration
- * File from the host if we want to override
- * the Configuration File in flash.
- */
+ params[0] = (FW_PARAMS_MNEM_V(FW_PARAMS_MNEM_DEV) |
+ FW_PARAMS_PARAM_X_V(FW_PARAMS_PARAM_DEV_CF));
+ ret = t4_query_params(adap, adap->mbox, adap->fn, 0, 1,
+ params, val);
- ret = adap_init0_config(adap, reset);
- if (ret == -ENOENT) {
- dev_info(adap->pdev_dev,
- "No Configuration File present "
- "on adapter. Using hard-wired "
- "configuration parameters.\n");
- ret = adap_init0_no_config(adap, reset);
- }
- }
+ /* If the firmware doesn't support Configuration Files,
+ * return an error.
+ */
+ if (ret < 0) {
+ dev_err(adap->pdev_dev, "firmware doesn't support "
+ "Firmware Configuration Files\n");
+ goto bye;
+ }
+
+ /* The firmware provides us with a memory buffer where we can
+ * load a Configuration File from the host if we want to
+ * override the Configuration File in flash.
+ */
+ ret = adap_init0_config(adap, reset);
+ if (ret == -ENOENT) {
+ dev_err(adap->pdev_dev, "no Configuration File "
+ "present on adapter.\n");
+ goto bye;
}
if (ret < 0) {
- dev_err(adap->pdev_dev,
- "could not initialize adapter, error %d\n",
- -ret);
+ dev_err(adap->pdev_dev, "could not initialize "
+ "adapter, error %d\n", -ret);
goto bye;
}
}
- /*
- * If we're living with non-hard-coded parameters (either from a
- * Firmware Configuration File or values programmed by a different PF
- * Driver), give the SGE code a chance to pull in anything that it
- * needs ... Note that this must be called after we retrieve our VPD
- * parameters in order to know how to convert core ticks to seconds.
+ /* Give the SGE code a chance to pull in anything that it needs ...
+ * Note that this must be called after we retrieve our VPD parameters
+ * in order to know how to convert core ticks to seconds, etc.
*/
- if (adap->flags & USING_SOFT_PARAMS) {
- ret = t4_sge_init(adap);
- if (ret < 0)
- goto bye;
- }
+ ret = t4_sge_init(adap);
+ if (ret < 0)
+ goto bye;
if (is_bypass_device(adap->pdev->device))
adap->params.bypass = 1;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index a79fa6a..ca42e2e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -2742,24 +2742,11 @@ void t4_sge_stop(struct adapter *adap)
}
/**
- * t4_sge_init - initialize SGE
+ * t4_sge_init_soft - grab core SGE values needed by SGE code
* @adap: the adapter
*
- * Performs SGE initialization needed every time after a chip reset.
- * We do not initialize any of the queues here, instead the driver
- * top-level must request them individually.
- *
- * Called in two different modes:
- *
- * 1. Perform actual hardware initialization and record hard-coded
- * parameters which were used. This gets used when we're the
- * Master PF and the Firmware Configuration File support didn't
- * work for some reason.
- *
- * 2. We're not the Master PF or initialization was performed with
- * a Firmware Configuration File. In this case we need to grab
- * any of the SGE operating parameters that we need to have in
- * order to do our job and make sure we can live with them ...
+ * We need to grab the SGE operating parameters that we need to have
+ * in order to do our job and make sure we can live with them.
*/
static int t4_sge_init_soft(struct adapter *adap)
@@ -2852,73 +2839,13 @@ static int t4_sge_init_soft(struct adapter *adap)
return 0;
}
-static int t4_sge_init_hard(struct adapter *adap)
-{
- struct sge *s = &adap->sge;
-
- /*
- * Set up our basic SGE mode to deliver CPL messages to our Ingress
- * Queue and Packet Date to the Free List.
- */
- t4_set_reg_field(adap, SGE_CONTROL_A, RXPKTCPLMODE_F, RXPKTCPLMODE_F);
-
- /*
- * Set up to drop DOORBELL writes when the DOORBELL FIFO overflows
- * and generate an interrupt when this occurs so we can recover.
- */
- if (is_t4(adap->params.chip)) {
- t4_set_reg_field(adap, SGE_DBFIFO_STATUS_A,
- HP_INT_THRESH_V(HP_INT_THRESH_M) |
- LP_INT_THRESH_V(LP_INT_THRESH_M),
- HP_INT_THRESH_V(dbfifo_int_thresh) |
- LP_INT_THRESH_V(dbfifo_int_thresh));
- } else {
- t4_set_reg_field(adap, SGE_DBFIFO_STATUS_A,
- LP_INT_THRESH_T5_V(LP_INT_THRESH_T5_M),
- LP_INT_THRESH_T5_V(dbfifo_int_thresh));
- t4_set_reg_field(adap, SGE_DBFIFO_STATUS2_A,
- HP_INT_THRESH_T5_V(HP_INT_THRESH_T5_M),
- HP_INT_THRESH_T5_V(dbfifo_int_thresh));
- }
- t4_set_reg_field(adap, SGE_DOORBELL_CONTROL_A, ENABLE_DROP_F,
- ENABLE_DROP_F);
-
- /*
- * SGE_FL_BUFFER_SIZE0 (RX_SMALL_PG_BUF) is set up by
- * t4_fixup_host_params().
- */
- s->fl_pg_order = FL_PG_ORDER;
- if (s->fl_pg_order)
- t4_write_reg(adap,
- SGE_FL_BUFFER_SIZE0_A+RX_LARGE_PG_BUF*sizeof(u32),
- PAGE_SIZE << FL_PG_ORDER);
- t4_write_reg(adap, SGE_FL_BUFFER_SIZE0_A+RX_SMALL_MTU_BUF*sizeof(u32),
- FL_MTU_SMALL_BUFSIZE(adap));
- t4_write_reg(adap, SGE_FL_BUFFER_SIZE0_A+RX_LARGE_MTU_BUF*sizeof(u32),
- FL_MTU_LARGE_BUFSIZE(adap));
-
- /*
- * Note that the SGE Ingress Packet Count Interrupt Threshold and
- * Timer Holdoff values must be supplied by our caller.
- */
- t4_write_reg(adap, SGE_INGRESS_RX_THRESHOLD_A,
- THRESHOLD_0_V(s->counter_val[0]) |
- THRESHOLD_1_V(s->counter_val[1]) |
- THRESHOLD_2_V(s->counter_val[2]) |
- THRESHOLD_3_V(s->counter_val[3]));
- t4_write_reg(adap, SGE_TIMER_VALUE_0_AND_1_A,
- TIMERVALUE0_V(us_to_core_ticks(adap, s->timer_val[0])) |
- TIMERVALUE1_V(us_to_core_ticks(adap, s->timer_val[1])));
- t4_write_reg(adap, SGE_TIMER_VALUE_2_AND_3_A,
- TIMERVALUE2_V(us_to_core_ticks(adap, s->timer_val[2])) |
- TIMERVALUE3_V(us_to_core_ticks(adap, s->timer_val[3])));
- t4_write_reg(adap, SGE_TIMER_VALUE_4_AND_5_A,
- TIMERVALUE4_V(us_to_core_ticks(adap, s->timer_val[4])) |
- TIMERVALUE5_V(us_to_core_ticks(adap, s->timer_val[5])));
-
- return 0;
-}
-
+/**
+ * t4_sge_init - initialize SGE
+ * @adap: the adapter
+ *
+ * Perform low-level SGE code initialization needed every time after a
+ * chip reset.
+ */
int t4_sge_init(struct adapter *adap)
{
struct sge *s = &adap->sge;
@@ -2959,10 +2886,7 @@ int t4_sge_init(struct adapter *adap)
s->fl_align = max(ingpadboundary, ingpackboundary);
}
- if (adap->flags & USING_SOFT_PARAMS)
- ret = t4_sge_init_soft(adap);
- else
- ret = t4_sge_init_hard(adap);
+ ret = t4_sge_init_soft(adap);
if (ret < 0)
return ret;
--
1.7.1
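For readers following what this patch removes: the deleted adap_init0_no_config() summed the widths of the enabled compressed filter fields and rejected any tp_vlan_pri_map needing more than 36 bits. A standalone sketch of that budget check follows. The field widths are the ones in the removed switch statement; the bit positions, the enum and the function names are hypothetical, since the real FCOE_F ... FRAGMENTATION_F values live in the driver's register headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit positions, in the order of the removed switch statement. */
enum filter_field {
	F_FCOE, F_PORT, F_VNIC, F_VLAN, F_TOS, F_PROTOCOL,
	F_ETHERTYPE, F_MACMATCH, F_MPSHITTYPE, F_FRAGMENTATION,
	F_COUNT
};

static_assert(F_COUNT <= 32, "field map must fit in an unsigned int");

/* Field widths in bits, as listed in the removed code. */
static const unsigned int field_bits[F_COUNT] = {
	[F_FCOE] = 1,       [F_PORT] = 3,      [F_VNIC] = 17,
	[F_VLAN] = 17,      [F_TOS] = 8,       [F_PROTOCOL] = 8,
	[F_ETHERTYPE] = 16, [F_MACMATCH] = 9,  [F_MPSHITTYPE] = 3,
	[F_FRAGMENTATION] = 1,
};

#define FILTER_BUDGET_BITS 36u

/* Sketch of the default set named in the removed comment:
 * fragment, MPS match type, IP protocol, VLAN, port, FCoE. */
#define FILTER_MAP_DEFAULT_SKETCH \
	((1u << F_FRAGMENTATION) | (1u << F_MPSHITTYPE) | (1u << F_PROTOCOL) | \
	 (1u << F_VLAN) | (1u << F_PORT) | (1u << F_FCOE))

/* Sum the widths of all fields enabled in `map`. */
unsigned int filter_bits_needed(unsigned int map)
{
	unsigned int bits = 0;

	for (size_t i = 0; i < F_COUNT; i++)
		if (map & (1u << i))
			bits += field_bits[i];
	return bits;
}

/* Non-zero if the selection fits the 36-bit compressed filter budget. */
int filter_map_fits(unsigned int map)
{
	return filter_bits_needed(map) <= FILTER_BUDGET_BITS;
}
```

The default selection adds up to 1 + 3 + 8 + 17 + 3 + 1 = 33 bits, so it fits; also enabling the 16-bit Ether Type field would need 49 bits and exceed the budget.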
* RE: [PATCH net-next] rhashtable: Lower/upper bucket may map to same lock while shrinking
From: David Laight @ 2015-01-13 9:49 UTC
To: 'Thomas Graf', davem@davemloft.net, Fengguang Wu
Cc: LKP, linux-kernel@vger.kernel.org,
netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
netdev@vger.kernel.org
In-Reply-To: <20150112235821.GB16617@casper.infradead.org>
From: Thomas Graf
> Each per bucket lock covers a configurable number of buckets. While
> shrinking, two buckets in the old table contain entries for a single
> bucket in the new table. We need to lock down both while linking.
> Check if they are protected by different locks to avoid a recursive
> lock.
A thought: could the shrunk table use the same locks as the lower half
of the old table?
I also wonder whether shrinking hash tables is ever actually worth
the effort. Most likely they'll need to grow again very quickly.
> spin_lock_bh(old_bucket_lock1);
> - spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> - spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> +
> + /* Depending on the lock per buckets mapping, the bucket in
> + * the lower and upper region may map to the same lock.
> + */
> + if (old_bucket_lock1 != old_bucket_lock2) {
> + spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> + spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> + } else {
> + spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
> + }
Acquiring 3 locks of much the same type looks like a locking hierarchy
violation just waiting to happen.
David
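To see when the two old buckets in the quoted hunk end up behind the same lock, here is a small sketch. It assumes the lock for a bucket is picked by masking the bucket index with (number of locks - 1), and that new bucket n collects old buckets n and n + old_size / 2. Both are assumptions for illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed mapping: lock index = bucket index & (number of locks - 1). */
unsigned int bucket_lock_index(unsigned int bucket, unsigned int locks_mask)
{
	return bucket & locks_mask;
}

/* While shrinking from old_size to old_size / 2, new bucket `n` receives the
 * entries of old buckets `n` (lower region) and `n + old_size / 2` (upper
 * region). Returns true if both old buckets are covered by the same lock. */
bool old_buckets_share_lock(unsigned int n, unsigned int old_size,
			    unsigned int locks_mask)
{
	unsigned int lower = n;
	unsigned int upper = n + old_size / 2;

	/* Table sizes are assumed to be powers of two. */
	assert(old_size != 0 && (old_size & (old_size - 1)) == 0);

	return bucket_lock_index(lower, locks_mask) ==
	       bucket_lock_index(upper, locks_mask);
}
```

For an old table of 16 buckets with 8 locks (mask 7), old buckets 3 and 11 share lock 3, so taking both would be a recursive lock; with 16 locks (mask 15) they map to locks 3 and 11 and both must be taken.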
* Re: [PATCH net-next] rhashtable: unnecessary to use delayed work
From: Ying Xue @ 2015-01-13 9:48 UTC
To: Thomas Graf; +Cc: davem, netdev
In-Reply-To: <20150113093550.GG20387@casper.infradead.org>
On 01/13/2015 05:35 PM, Thomas Graf wrote:
> On 01/13/15 at 05:00pm, Ying Xue wrote:
>> When we put our declared work task in the global workqueue with
>> schedule_delayed_work(), its delay parameter is always zero.
>> Therefore, we should define a normal work in rhashtable structure
>> instead of a delayed work.
>>
>> Signed-off-by: Ying Xue <ying.xue@windriver.com>
>> Cc: Thomas Graf <tgraf@suug.ch>
>
>> @@ -914,7 +914,7 @@ void rhashtable_destroy(struct rhashtable *ht)
>>
>> mutex_lock(&ht->mutex);
>>
>> - cancel_delayed_work(&ht->run_work);
>> + cancel_work_sync(&ht->run_work);
>> bucket_table_free(rht_dereference(ht->tbl, ht));
>>
>> mutex_unlock(&ht->mutex);
>
> I like the patch!
>
> I think it introduces a possible deadlock though (see below). OTOH, it
> could actually explain the 0day lock debug splat that was reported.
>
> Deadlock: The worker could already have been kicked off but been
> interrupted before it acquired ht->mutex. rhashtable_destroy() is then
> called and acquires ht->mutex. cancel_work_sync() waits for the worker to
> finish while holding ht->mutex. The worker can't finish because it needs to
> acquire ht->mutex to do so.
>
> The reported warning could have been triggered for the very same reason.
> Instead of deadlocking, the current code would have called
> bucket_table_free() with a deferred resizer still underway.
>
> How about we do something like this?
>
> void rhashtable_destroy(struct rhashtable *ht)
> {
> ht->being_destroyed = true;
> cancel_work_sync(&ht->run_work);
>
> mutex_lock(&ht->mutex);
> bucket_table_free(rht_dereference(ht->tbl, ht));
> mutex_unlock(&ht->mutex);
> }
>
Damn! I knew your above described deadlock scenario. Thank you for the
nice catch!
> If you agree, we can explain this briefly in the commit message and add:
> Fixes: 97defe1 ("rhashtable: Per bucket locks & deferred expansion/shrinking")
>
OK, I will deliver the next version.
By the way, I think we should check the following condition before calling
cancel_work_sync(); otherwise, we may cancel an uninitialized work item.
(ht->p.grow_decision || ht->p.shrink_decision)
What do you think?
Regards,
Ying
>
* Re: [PATCH net] ipv6: Prevent ipv6_find_hdr() from returning ENOENT for valid non-first fragments
From: Hannes Frederic Sowa @ 2015-01-13 10:11 UTC
To: Rahul Sharma; +Cc: Pablo Neira Ayuso, netdev, linux-kernel, netfilter-devel
In-Reply-To: <CAFB3abxcg4gdEh4CJHd_Vx8mZKFmO3kMG=pDjkQYS5awTzFbSQ@mail.gmail.com>
On Tue, 2015-01-13 at 09:53 +0530, Rahul Sharma wrote:
> On Mon, Jan 12, 2015 at 5:21 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Mon, Jan 12, 2015 at 04:38:16PM +0530, Rahul Sharma wrote:
> >> Hi Pablo, Hannes
> >>
> >> On Fri, Jan 9, 2015 at 9:20 PM, Hannes Frederic Sowa
> >> <hannes@stressinduktion.org> wrote:
> >> > On Fri, 2015-01-09 at 12:45 +0100, Pablo Neira Ayuso wrote:
> >> >> Hi Hannes,
> >> >>
> >> >> On Fri, Jan 09, 2015 at 12:34:15PM +0100, Hannes Frederic Sowa wrote:
> >> >> > On Fri, Jan 9, 2015, at 08:18, Rahul Sharma wrote:
> >> >> > > Hi Pablo,
> >> >> > >
> >> >> > > On Fri, Jan 9, 2015 at 5:35 AM, Pablo Neira Ayuso <pablo@netfilter.org>
> >> >> > > wrote:
> >> >> > > > On Thu, Jan 08, 2015 at 11:39:16PM +0100, Hannes Frederic Sowa wrote:
> >> >> > > >> Hi Pablo,
> >> >> > > >>
> >> >> > > >> On Thu, Jan 8, 2015, at 21:53, Pablo Neira Ayuso wrote:
> >> >> > > >> > I'm afraid we cannot just get rid of that !ipv6_ext_hdr() check. The
> >> >> > > >> > ipv6_find_hdr() function is designed to return the transport protocol.
> >> >> > > >> > After the proposed change, it will return extension header numbers.
> >> >> > > >> > This will break existing ip6tables rulesets since the `-p' option
> >> >> > > >> > relies on this function to match the transport protocol.
> >> >> > > >> >
> >> >> > > >> > Note that the AH header is skipped (see code a bit below this
> >> >> > > >> > problematic fragmentation handling) so the follow up header after the
> >> >> > > >> > AH header is returned as the transport header.
> >> >> > > >> >
> >> >> > > >> > We can probably return the AH protocol number for non-1st fragments.
> >> >> > > >> > However, that would be something new to ip6tables since nobody has
> >> >> > > >> > ever seen packet matching `-p ah' rules. Thus, we restore control to
> >> >> > > >> > the user to allow this, but we would accept all kind of fragmented AH
> >> >> > > >> > traffic through the firewall since we cannot know what transport
> >> >> > > >> > protocol contains from non-1st fragments (unless I'm missing anything,
> >> >> > > >> > I need to have a closer look at this again tomorrow with fresher
> >> >> > > >> > mind).
> >> >> > > >>
> >> >> > > >> The code in question is guarded by (_frag_off != 0), so we are
> >> >> > > >> definitely processing a non-1st fragment currently. The -p match would
> >> >> > > >> happen at the time when the packet is reassembled and thus ipv6_find_hdr
> >> >> > > >> will find the real transport (final) header at this point (I hope I
> >> >> > > >> followed the code correctly here).
> >> >> > > >
> >> >> > > > Then, Rahul should get things working by modprobing nf_defrag_ipv6.
> >> >> > >
> >> >> > > I already had nf_defrag_ipv6 installed when the issue occurred. But I
> >> >> > > see ip6table_raw_hook returning NF_DROP for the second fragment.
> >> >> >
> >> >> > That's what I expected. I think the change only affects hooks before
> >> >> > reassembly.
> >> >>
> >> >> reassembly happens at NF_IP6_PRI_CONNTRACK_DEFRAG (-400), so that
> >> >> happens before NF_IP6_PRI_RAW (-300) in IPv6 which is where the raw
> >> >> table is placed.
> >> >
> >> > I tried to reproduce it, but couldn't get non-1st fragments to be
> >> > dropped during traversal of the raw table. They either get dropped
> >> > earlier, during reassembly, or pass.
> >> >
> >> > I agree with Pablo, I also would like to see more data.
> >> >
> >> > Thanks,
> >> > Hannes
> >> >
> >> >
> >>
> >> I enabled pr_debug() and there was no error in nf_ct_frag6_gather().
> >> It seems to have defragmented the packet correctly. As expected,
> >> ipv6_defrag() returns NF_STOLEN for the first packet after queuing it.
> >> For the next fragment, ipv6_defrag() calls nf_ct_frag6_output() after
> >> reassembling it.
> >
> > nf_ct_frag6_output() doesn't exist anymore. You're using an old
> > kernel, you should have started by telling so in your report.
> >
> > See 6aafeef ("netfilter: push reasm skb through instead of original
> > frag skbs").
>
> I apologize for not mentioning the kernel version in my first mail. I
> had suspected a problem in ipv6_find_hdr(), the code for which was the
> same. Anyway, thanks for the help. I'll try to figure out how to make
> this work in my kernel.
If you have time, could you quickly test a recent net-next kernel?
Thanks,
Hannes
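
For reference, the hook ordering cited in the quoted discussion (defrag
at -400 runs before raw at -300) can be sketched in plain C. This is only
an illustrative userspace model: struct model_hook and model_order_hooks()
are hypothetical names, and only the two priority values come from the
thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Priority values as quoted in the thread; lower values run first. */
enum {
	MODEL_PRI_CONNTRACK_DEFRAG = -400,
	MODEL_PRI_RAW = -300,
};

/* Hypothetical stand-in for a registered hook. */
struct model_hook {
	const char *name;
	int priority;
};

/* qsort() comparator: ascending priority. */
static int model_cmp_hook(const void *a, const void *b)
{
	const struct model_hook *x = a;
	const struct model_hook *y = b;

	return (x->priority > y->priority) - (x->priority < y->priority);
}

/* Sort hooks into the order in which they would be traversed. */
static void model_order_hooks(struct model_hook *hooks, size_t n)
{
	assert(hooks != NULL || n == 0);
	qsort(hooks, n, sizeof(*hooks), model_cmp_hook);
}
```

Sorting a table that lists "raw" before "defrag" moves "defrag" to the
front, i.e. a packet reaching the raw table has already passed the
reassembly hook.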
^ permalink raw reply
* Re: why are IPv6 addresses removed on link down
From: Hannes Frederic Sowa @ 2015-01-13 10:35 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Ahern, netdev@vger.kernel.org
In-Reply-To: <20150112231021.316648e3@urahara>
On Mon, 2015-01-12 at 23:10 -0800, Stephen Hemminger wrote:
> On Mon, 12 Jan 2015 22:06:44 -0700
> David Ahern <dsahern@gmail.com> wrote:
>
> > We noticed that IPv6 addresses are removed on a link down. e.g.,
> > ip link set dev eth1
> >
> >
> > Looking at the code it appears to be this code path in addrconf.c:
> >
> > case NETDEV_DOWN:
> > case NETDEV_UNREGISTER:
> > /*
> > * Remove all addresses from this interface.
> > */
> > addrconf_ifdown(dev, event != NETDEV_DOWN);
> > break;
> >
> > IPv4 addresses are NOT removed on a link down. Is there a particular
> > reason IPv6 addresses are?
> >
> > Thanks,
> > David
>
> See the RFCs which describe how IPv6 does Duplicate Address Detection.
> An address is not valid when the link is down, since DAD is not possible.
It should be no problem if the kernel reacquired them on ifup and
did proper DAD. We simply must not use them while the interface is dead
(also making sure they don't get used for loopback routing).
That the IPv6 addresses get removed is much more of a historical
artifact nowadays, I think. It is part of the user space API and scripts
deal with that already.
Bye,
Hannes
^ permalink raw reply
* Re: [PATCH v5] can: Convert to runtime_pm
From: Marc Kleine-Budde @ 2015-01-13 11:08 UTC (permalink / raw)
To: Sören Brinkmann, Kedareswara rao Appana
Cc: wg, michal.simek, grant.likely, robh+dt, linux-can, netdev,
linux-arm-kernel, linux-kernel, devicetree,
Kedareswara rao Appana
In-Reply-To: <3a3437c5c8ff48d9a45fee7e81fa8dca@BY2FFO11FD058.protection.gbl>
[-- Attachment #1: Type: text/plain, Size: 3138 bytes --]
On 01/12/2015 07:45 PM, Sören Brinkmann wrote:
> On Mon, 2015-01-12 at 08:34PM +0530, Kedareswara rao Appana wrote:
>> Instead of enabling/disabling clocks at several locations in the driver,
>> use the runtime_pm framework. This consolidates the actions for runtime PM
>> in the appropriate callbacks and makes the driver more readable and maintainable.
>>
>> Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
>> Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
>> ---
>> Changes for v5:
>> - Updated with the review comments.
>> Updated the remove function to use runtime_pm.
>> Changes for v4:
>> - Updated with the review comments.
>> Changes for v3:
>> - Converted the driver to use runtime_pm.
>> Changes for v2:
>> - Removed the struct platform_device* from suspend/resume
>> as suggested by Lothar.
>>
>> drivers/net/can/xilinx_can.c | 157 ++++++++++++++++++++++++++++-------------
>> 1 files changed, 107 insertions(+), 50 deletions(-)
> [..]
>> +static int __maybe_unused xcan_runtime_resume(struct device *dev)
>> {
>> - struct platform_device *pdev = dev_get_drvdata(dev);
>> - struct net_device *ndev = platform_get_drvdata(pdev);
>> + struct net_device *ndev = dev_get_drvdata(dev);
>> struct xcan_priv *priv = netdev_priv(ndev);
>> int ret;
>> + u32 isr, status;
>>
>> ret = clk_enable(priv->bus_clk);
>> if (ret) {
>> @@ -1014,15 +1030,28 @@ static int __maybe_unused xcan_resume(struct device *dev)
>> ret = clk_enable(priv->can_clk);
>> if (ret) {
>> dev_err(dev, "Cannot enable clock.\n");
>> - clk_disable_unprepare(priv->bus_clk);
>> + clk_disable(priv->bus_clk);
> [...]
>> @@ -1173,12 +1219,23 @@ static int xcan_remove(struct platform_device *pdev)
>> {
>> struct net_device *ndev = platform_get_drvdata(pdev);
>> struct xcan_priv *priv = netdev_priv(ndev);
>> + int ret;
>> +
>> + ret = pm_runtime_get_sync(&pdev->dev);
>> + if (ret < 0) {
>> + netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n",
>> + __func__, ret);
>> + return ret;
>> + }
>>
>> if (set_reset_mode(ndev) < 0)
>> netdev_err(ndev, "mode resetting failed!\n");
>>
>> unregister_candev(ndev);
>> + pm_runtime_disable(&pdev->dev);
>> netif_napi_del(&priv->napi);
>> + clk_disable_unprepare(priv->bus_clk);
>> + clk_disable_unprepare(priv->can_clk);
>
> Shouldn't pretty much all these occurrences of clk_disable/enable
> disappear? This should all be handled by the runtime_pm framework now.
We have:
- clk_prepare_enable() in probe
- clk_disable_unprepare() in remove
- clk_enable() in runtime_resume
- clk_disable() in runtime_suspend
Which is, as far as I understand, the right way to do it. Maybe
Kedareswara can post the clock debug output again with this patch
iteration. Have I missed something?
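
To make the pairing above concrete, here is a minimal userspace model of
the clock reference counts. It is only a sketch: struct model_clk and the
model_* helpers are hypothetical stand-ins, not the real clk API, and the
model assumes remove() runs with the device resumed (which is what the
pm_runtime_get_sync() call in the patch is meant to ensure).

```c
#include <assert.h>

/* Hypothetical model of a clock's prepare/enable reference counts. */
struct model_clk {
	int prepare_count;
	int enable_count;
};

static void model_clk_prepare_enable(struct model_clk *c)
{
	c->prepare_count++;
	c->enable_count++;
}

static void model_clk_disable_unprepare(struct model_clk *c)
{
	assert(c->enable_count > 0 && c->prepare_count > 0);
	c->enable_count--;
	c->prepare_count--;
}

static void model_clk_enable(struct model_clk *c)
{
	/* Enabling is only valid on a prepared clock. */
	assert(c->prepare_count > 0);
	c->enable_count++;
}

static void model_clk_disable(struct model_clk *c)
{
	assert(c->enable_count > 0);
	c->enable_count--;
}

/* The four driver entry points, reduced to their clock handling. */
static void model_probe(struct model_clk *c)           { model_clk_prepare_enable(c); }
static void model_runtime_suspend(struct model_clk *c) { model_clk_disable(c); }
static void model_runtime_resume(struct model_clk *c)  { model_clk_enable(c); }
static void model_remove(struct model_clk *c)          { model_clk_disable_unprepare(c); }
```

Walking probe, runtime_suspend, runtime_resume and remove through this
model leaves both counts at zero; while suspended, the clock stays
prepared but is not enabled.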
regards,
Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
* [RFC 1/2] misc: uidstat: Add uid stat driver to collect network statistics.
From: Kiran Raparthy @ 2015-01-13 11:14 UTC (permalink / raw)
To: linux-kernel
Cc: Mike Chan, Arnd Bergmann, Greg Kroah-Hartman, David S. Miller,
Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
Patrick McHardy, netdev, Android Kernel Team, John Stultz,
Sumit Semwal, JP Abgrall, Arve Hjønnevåg,
Kiran Raparthy
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 9395 bytes --]
From: Mike Chan <mike@android.com>
misc: uidstat: Add uid stat driver to collect network statistics.
To analyze an application's network statistics, we need a mechanism to export
the UID based statistics to userspace so that userspace tools can use the
exported numbers and generate the report against the UID.
This patch allows the user to explore the UID based network statistics
exported to /proc/uid_stat.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Mike Chan <mike@android.com>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Android Kernel Team <kernel-team@android.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: JP Abgrall <jpa@google.com>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Mike Chan <mike@android.com>
[Kiran: Added context to commit message.
Included build fixes from JP Abgrall and Arve.
Used single_release instead of seq_release.
Internally continued to use uid_t but used conversion
from kuid_t wherever necessary]
Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org>
---
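The counters in this patch are biased by INT_MIN so that a 32-bit
atomic_t can cover up to 4GB of traffic. A minimal userspace sketch of
that encoding follows. It assumes the usual two's complement conversion
between signed and unsigned 32-bit values; counter_init(), counter_add()
and counter_decode() are hypothetical helper names used only for
illustration, and the arithmetic is done in unsigned types to stay clear
of signed overflow in portable C.

```c
#include <assert.h>
#include <stdint.h>

/* Counters start at INT32_MIN, mirroring atomic_set(..., INT_MIN). */
static int32_t counter_init(void)
{
	return INT32_MIN;
}

/* Add n bytes; unsigned arithmetic wraps modulo 2^32. */
static int32_t counter_add(int32_t counter, uint32_t n)
{
	return (int32_t)((uint32_t)counter + n);
}

/* Remove the INT32_MIN bias to recover the byte count as unsigned. */
static uint32_t counter_decode(int32_t counter)
{
	return (uint32_t)counter + (uint32_t)INT32_MIN;
}

/* Usage example: a fresh counter decodes to 0, and 3 GB still fits. */
static void counter_example(void)
{
	int32_t c = counter_init();

	assert(counter_decode(c) == 0);
	c = counter_add(c, 3000000000u);
	assert(counter_decode(c) == 3000000000u);
}
```
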
drivers/misc/Kconfig | 7 +++
drivers/misc/Makefile | 1 +
drivers/misc/uid_stat.c | 157 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/uid_stat.h | 33 ++++++++++
net/ipv4/tcp.c | 10 +++
5 files changed, 208 insertions(+)
create mode 100644 drivers/misc/uid_stat.c
create mode 100644 include/linux/uid_stat.h
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 006242c..1c79b94 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -402,6 +402,13 @@ config TI_DAC7512
This driver can also be built as a module. If so, the module
will be called ti_dac7512.
+config UID_STAT
+ bool "UID based statistics tracking"
+ default n
+ help
+ Enable UID based network statistics tracking.
+ Statistics are exported to /proc/uid_stat.
+
config VMWARE_BALLOON
tristate "VMware Balloon Driver"
depends on X86 && HYPERVISOR_GUEST
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 7d5c4cd..3754347 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -34,6 +34,7 @@ obj-$(CONFIG_ISL29020) += isl29020.o
obj-$(CONFIG_SENSORS_TSL2550) += tsl2550.o
obj-$(CONFIG_DS1682) += ds1682.o
obj-$(CONFIG_TI_DAC7512) += ti_dac7512.o
+obj-$(CONFIG_UID_STAT) += uid_stat.o
obj-$(CONFIG_C2PORT) += c2port/
obj-$(CONFIG_HMC6352) += hmc6352.o
obj-y += eeprom/
diff --git a/drivers/misc/uid_stat.c b/drivers/misc/uid_stat.c
new file mode 100644
index 0000000..aaaa406
--- /dev/null
+++ b/drivers/misc/uid_stat.c
@@ -0,0 +1,157 @@
+/* drivers/misc/uid_stat.c
+ *
+ * Copyright (C) 2008 - 2009 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/atomic.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/list.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/stat.h>
+#include <linux/uid_stat.h>
+
+static DEFINE_SPINLOCK(uid_lock);
+static LIST_HEAD(uid_list);
+static struct proc_dir_entry *parent;
+
+struct uid_stat {
+ struct list_head link;
+ uid_t uid;
+ atomic_t tcp_rcv;
+ atomic_t tcp_snd;
+};
+
+static struct uid_stat *find_uid_stat(uid_t uid)
+{
+ struct uid_stat *entry;
+
+ list_for_each_entry(entry, &uid_list, link) {
+ if (entry->uid == uid)
+ return entry;
+ }
+ return NULL;
+}
+
+static int uid_stat_atomic_int_show(struct seq_file *m, void *v)
+{
+ unsigned int bytes;
+ atomic_t *counter = m->private;
+
+ bytes = (unsigned int) (atomic_read(counter) + INT_MIN);
+ return seq_printf(m, "%u\n", bytes);
+}
+
+static int uid_stat_read_atomic_int_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, uid_stat_atomic_int_show, PDE_DATA(inode));
+}
+
+static const struct file_operations uid_stat_read_atomic_int_fops = {
+ .open = uid_stat_read_atomic_int_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+/* Create a new entry for tracking the specified uid. */
+static struct uid_stat *create_stat(uid_t uid)
+{
+ struct uid_stat *new_uid;
+ /* Create the uid stat struct and append it to the list. */
+ new_uid = kmalloc(sizeof(struct uid_stat), GFP_ATOMIC);
+ if (!new_uid)
+ return NULL;
+
+ new_uid->uid = uid;
+ /* Counters start at INT_MIN, so we can track 4GB of network traffic. */
+ atomic_set(&new_uid->tcp_rcv, INT_MIN);
+ atomic_set(&new_uid->tcp_snd, INT_MIN);
+
+ list_add_tail(&new_uid->link, &uid_list);
+ return new_uid;
+}
+
+static void create_stat_proc(struct uid_stat *new_uid)
+{
+ char uid_s[32];
+ struct proc_dir_entry *entry;
+
+ sprintf(uid_s, "%u", new_uid->uid);
+ entry = proc_mkdir(uid_s, parent);
+
+ /* Keep reference to uid_stat so we know what uid to read stats from. */
+ proc_create_data("tcp_snd", S_IRUGO, entry,
+ &uid_stat_read_atomic_int_fops, &new_uid->tcp_snd);
+
+ proc_create_data("tcp_rcv", S_IRUGO, entry,
+ &uid_stat_read_atomic_int_fops, &new_uid->tcp_rcv);
+}
+
+static struct uid_stat *find_or_create_uid_stat(kuid_t uid)
+{
+ struct uid_stat *entry;
+ unsigned long flags;
+ uid_t proc_uid;
+
+ proc_uid = from_kuid(&init_user_ns, uid);
+ spin_lock_irqsave(&uid_lock, flags);
+ entry = find_uid_stat(proc_uid);
+ if (entry) {
+ spin_unlock_irqrestore(&uid_lock, flags);
+ return entry;
+ }
+ entry = create_stat(proc_uid);
+ spin_unlock_irqrestore(&uid_lock, flags);
+ if (entry)
+ create_stat_proc(entry);
+ return entry;
+}
+
+int uid_stat_tcp_snd(kuid_t uid, int size)
+{
+ struct uid_stat *entry;
+
+ entry = find_or_create_uid_stat(uid);
+ if (!entry)
+ return -1;
+ atomic_add(size, &entry->tcp_snd);
+ return 0;
+}
+
+int uid_stat_tcp_rcv(kuid_t uid, int size)
+{
+ struct uid_stat *entry;
+
+ entry = find_or_create_uid_stat(uid);
+ if (!entry)
+ return -1;
+ atomic_add(size, &entry->tcp_rcv);
+ return 0;
+}
+
+static int __init uid_stat_init(void)
+{
+ parent = proc_mkdir("uid_stat", NULL);
+ if (!parent) {
+ pr_err("uid_stat: failed to create proc entry\n");
+ return -1;
+ }
+ return 0;
+}
+
+device_initcall(uid_stat_init);
diff --git a/include/linux/uid_stat.h b/include/linux/uid_stat.h
new file mode 100644
index 0000000..68823d3
--- /dev/null
+++ b/include/linux/uid_stat.h
@@ -0,0 +1,33 @@
+/* include/linux/uid_stat.h
+ *
+ * Copyright (C) 2008-2009 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef __uid_stat_h
+#define __uid_stat_h
+
+/* Contains definitions for resource tracking per uid. */
+
+#ifdef CONFIG_UID_STAT
+int uid_stat_tcp_snd(kuid_t uid, int size);
+int uid_stat_tcp_rcv(kuid_t uid, int size);
+#else
+/* No-op stubs used when CONFIG_UID_STAT is disabled. */
+static inline int uid_stat_tcp_snd(kuid_t uid, int size)
+{ return 0; }
+
+static inline int uid_stat_tcp_rcv(kuid_t uid, int size)
+{ return 0; }
+#endif
+
+#endif /* __uid_stat_h */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3075723..00eb156 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -268,6 +268,7 @@
#include <linux/crypto.h>
#include <linux/time.h>
#include <linux/slab.h>
+#include <linux/uid_stat.h>
#include <net/icmp.h>
#include <net/inet_common.h>
@@ -1280,6 +1281,9 @@ out:
tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
out_nopush:
release_sock(sk);
+
+ if (copied + copied_syn)
+ uid_stat_tcp_snd(current_uid(), copied + copied_syn);
return copied + copied_syn;
do_fault:
@@ -1551,6 +1555,7 @@ int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
if (copied > 0) {
tcp_recv_skb(sk, seq, &offset);
tcp_cleanup_rbuf(sk, copied);
+ uid_stat_tcp_rcv(current_uid(), copied);
}
return copied;
}
@@ -1879,6 +1884,9 @@ skip_copy:
tcp_cleanup_rbuf(sk, copied);
release_sock(sk);
+
+ if (copied > 0)
+ uid_stat_tcp_rcv(current_uid(), copied);
return copied;
out:
@@ -1887,6 +1895,8 @@ out:
recv_urg:
err = tcp_recv_urg(sk, msg, len, flags);
+ if (err > 0)
+ uid_stat_tcp_rcv(current_uid(), err);
goto out;
recv_sndq:
--
1.8.2.1
^ permalink raw reply related
* [RFC 2/2] net: activity_stats: Add statistics for network transmission activity
From: Kiran Raparthy @ 2015-01-13 11:14 UTC (permalink / raw)
To: linux-kernel
Cc: Mike Chan, Arnd Bergmann, Greg Kroah-Hartman, David S. Miller,
netdev, Android Kernel Team, John Stultz, Sumit Semwal,
Arve Hjønnevåg, Kiran Raparthy
In-Reply-To: <1421147642-28360-1-git-send-email-kiran.kumar@linaro.org>
[-- Attachment #1: Type: text/plain; charset=UTF-8, Size: 7811 bytes --]
From: Mike Chan <mike@android.com>
net: activity_stats: Add statistics for network transmission activity
When enabled, tracks the frequency of network transmissions
(inbound and outbound) and buckets them accordingly.
Buckets are determined by time between network activity.
Each bucket represents the number of network transmissions that were
N sec or longer apart, where N is defined as 1 << bucket index.
This network pattern tracking is particularly useful for wireless
networks (e.g. 3G) where batching network activity closely together
is more power efficient than spreading it far apart.
New file: /proc/net/stat/activity
output:
Min Bucket(sec) Count
              1 7
              2 0
              4 1
              8 0
             16 0
             32 2
             64 1
            128 0
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Android Kernel Team <kernel-team@android.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Mike Chan <mike@android.com>
[Kiran: Added context to commit message.
Included build fix from Arve, continued to use uid_t
but used conversion from kuid_t wherever necessary]
Signed-off-by: Kiran Raparthy <kiran.kumar@linaro.org>
---
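The bucket selection described in the commit message can be sketched in
userspace as follows. This is an illustration under stated assumptions,
not the driver code itself: bucket_for_delta() is a hypothetical helper
that takes the gap since the last counted transmission in nanoseconds
and returns the bucket index that would be incremented, or -1 when the
gap is shorter than one second and nothing is counted.

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_MAX 10
#define MODEL_NSEC_PER_SEC 1000000000LL

/* Return the largest index i with delta >= (1 << i) seconds, or -1
 * if the gap is under one second and no bucket is incremented.
 */
static int bucket_for_delta(int64_t delta_ns)
{
	int i;

	for (i = BUCKET_MAX - 1; i >= 0; i--) {
		if (delta_ns >= (MODEL_NSEC_PER_SEC << i))
			return i;
	}
	return -1;
}

/* Usage example: a 3 second gap is "2 sec or longer", so index 1. */
static void bucket_example(void)
{
	assert(bucket_for_delta(3 * MODEL_NSEC_PER_SEC) == 1);
}
```

Scanning from the largest bucket downwards means each transmission is
counted once, in the widest bucket whose minimum gap it satisfies.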
drivers/misc/uid_stat.c | 3 ++
include/net/activity_stats.h | 25 ++++++++++
net/Kconfig | 8 +++
net/Makefile | 1 +
net/activity_stats.c | 116 +++++++++++++++++++++++++++++++++++++++++++
5 files changed, 153 insertions(+)
create mode 100644 include/net/activity_stats.h
create mode 100644 net/activity_stats.c
diff --git a/drivers/misc/uid_stat.c b/drivers/misc/uid_stat.c
index aaaa406..63065e2 100644
--- a/drivers/misc/uid_stat.c
+++ b/drivers/misc/uid_stat.c
@@ -24,6 +24,7 @@
#include <linux/spinlock.h>
#include <linux/stat.h>
#include <linux/uid_stat.h>
+#include <net/activity_stats.h>
static DEFINE_SPINLOCK(uid_lock);
static LIST_HEAD(uid_list);
@@ -126,6 +127,7 @@ int uid_stat_tcp_snd(kuid_t uid, int size)
{
struct uid_stat *entry;
+ activity_stats_update();
entry = find_or_create_uid_stat(uid);
if (!entry)
return -1;
@@ -137,6 +139,7 @@ int uid_stat_tcp_rcv(kuid_t uid, int size)
{
struct uid_stat *entry;
+ activity_stats_update();
entry = find_or_create_uid_stat(uid);
if (!entry)
return -1;
diff --git a/include/net/activity_stats.h b/include/net/activity_stats.h
new file mode 100644
index 0000000..10e4c15
--- /dev/null
+++ b/include/net/activity_stats.h
@@ -0,0 +1,25 @@
+/*
+ * Copyright (C) 2010 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * Author: Mike Chan (mike@android.com)
+ */
+
+#ifndef __activity_stats_h
+#define __activity_stats_h
+
+#ifdef CONFIG_NET_ACTIVITY_STATS
+void activity_stats_update(void);
+#else
+#define activity_stats_update(void) {}
+#endif
+
+#endif /* __activity_stats_h */
diff --git a/net/Kconfig b/net/Kconfig
index ff9ffc1..3159361 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -83,6 +83,14 @@ source "net/netlabel/Kconfig"
endif # if INET
+config NET_ACTIVITY_STATS
+ bool "Network activity statistics tracking"
+ default y
+ help
+ Network activity statistics are useful for tracking wireless
+ modem activity on 2G, 3G, 4G wireless networks. Counts number of
+ transmissions and groups them in specified time buckets.
+
config NETWORK_SECMARK
bool "Security Marking"
help
diff --git a/net/Makefile b/net/Makefile
index 95fc694..d81a969 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -73,6 +73,7 @@ obj-$(CONFIG_OPENVSWITCH) += openvswitch/
obj-$(CONFIG_VSOCKETS) += vmw_vsock/
obj-$(CONFIG_NET_MPLS_GSO) += mpls/
obj-$(CONFIG_HSR) += hsr/
+obj-$(CONFIG_NET_ACTIVITY_STATS) += activity_stats.o
ifneq ($(CONFIG_NET_SWITCHDEV),)
obj-y += switchdev/
endif
diff --git a/net/activity_stats.c b/net/activity_stats.c
new file mode 100644
index 0000000..cf0a09d
--- /dev/null
+++ b/net/activity_stats.c
@@ -0,0 +1,116 @@
+/* net/activity_stats.c
+ *
+ * Copyright (C) 2010 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * Author: Mike Chan (mike@android.com)
+ */
+
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/suspend.h>
+#include <net/net_namespace.h>
+
+/* Track transmission rates in buckets (power of 2).
+ * 1,2,4,8...512 seconds.
+ *
+ * Buckets represent the count of network transmissions at least
+ * N seconds apart, where N is 1 << bucket index.
+ */
+#define BUCKET_MAX 10
+
+/* Track network activity frequency */
+static unsigned long activity_stats[BUCKET_MAX];
+static ktime_t last_transmit;
+static ktime_t suspend_time;
+static DEFINE_SPINLOCK(activity_lock);
+
+void activity_stats_update(void)
+{
+ int i;
+ unsigned long flags;
+ ktime_t now;
+ s64 delta;
+
+ spin_lock_irqsave(&activity_lock, flags);
+ now = ktime_get();
+ delta = ktime_to_ns(ktime_sub(now, last_transmit));
+
+ for (i = BUCKET_MAX - 1; i >= 0; i--) {
+ /* Check if the time delta between network activity is within
+ * the minimum bucket range.
+ */
+ if (delta < (1000000000ULL << i))
+ continue;
+
+ activity_stats[i]++;
+ last_transmit = now;
+ break;
+ }
+ spin_unlock_irqrestore(&activity_lock, flags);
+}
+
+static int activity_stats_show(struct seq_file *m, void *v)
+{
+ int i;
+ int ret;
+
+ seq_printf(m, "Min Bucket(sec) Count\n");
+
+ for (i = 0; i < BUCKET_MAX; i++) {
+ ret = seq_printf(m, "%15d %lu\n", 1 << i, activity_stats[i]);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static int activity_stats_notifier(struct notifier_block *nb,
+ unsigned long event, void *dummy)
+{
+ switch (event) {
+ case PM_SUSPEND_PREPARE:
+ suspend_time = ktime_get_real();
+ break;
+
+ case PM_POST_SUSPEND:
+ suspend_time = ktime_sub(ktime_get_real(), suspend_time);
+ last_transmit = ktime_sub(last_transmit, suspend_time);
+ }
+
+ return 0;
+}
+
+static int activity_stats_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, activity_stats_show, PDE_DATA(inode));
+}
+
+static const struct file_operations activity_stats_fops = {
+ .open = activity_stats_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static struct notifier_block activity_stats_notifier_block = {
+ .notifier_call = activity_stats_notifier,
+};
+
+static int __init activity_stats_init(void)
+{
+ proc_create("activity", S_IRUGO,
+ init_net.proc_net_stat, &activity_stats_fops);
+ return register_pm_notifier(&activity_stats_notifier_block);
+}
+
+subsys_initcall(activity_stats_init);
--
1.8.2.1
^ permalink raw reply related
* Re: [PATCH net-next] rhashtable: Lower/upper bucket may map to same lock while shrinking
From: 'Thomas Graf' @ 2015-01-13 11:25 UTC (permalink / raw)
To: David Laight
Cc: davem@davemloft.net, Fengguang Wu, LKP,
linux-kernel@vger.kernel.org, netfilter-devel@vger.kernel.org,
coreteam@netfilter.org, netdev@vger.kernel.org
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6D1CAC6593@AcuExch.aculab.com>
On 01/13/15 at 09:49am, David Laight wrote:
> From: Thomas Graf
> > Each per bucket lock covers a configurable number of buckets. While
> > shrinking, two buckets in the old table contain entries for a single
> > bucket in the new table. We need to lock down both while linking.
> > Check if they are protected by different locks to avoid a recursive
> > lock.
>
> Thought, could the shrunk table use the same locks as the lower half
> of the old table?
No. A new bucket table and thus a new set of locks is allocated when the
table is shrunk or grown. We only have to check for overlapping locks
when holding multiple locks for the same table at the same time.
> I also wonder whether shrinking hash tables is ever actually worth
> the effort. Most likely they'll need to grow again very quickly.
Specifying a .shrink_decision function is optional so every rhashtable
user can decide whether it wants shrinking or not. The need for it was
expressed in past threads.
The case of multiple buckets mapping to the same lock is also
present in the expanding logic so removing the shrinking logic would
not remove the need for these types of checks.
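
A small userspace sketch of why the check is needed, assuming the lock
for a bucket is chosen by masking the bucket index with a power-of-two
locks_mask (an assumption made for this illustration; lock_index() and
old_buckets_use_distinct_locks() are hypothetical names):

```c
#include <assert.h>

/* Assumed mapping: lock index = bucket index masked by locks_mask. */
static unsigned int lock_index(unsigned int bucket, unsigned int locks_mask)
{
	return bucket & locks_mask;
}

/* While shrinking to new_size buckets, new bucket b receives the
 * entries of old buckets b and b + new_size. Return 1 if those two
 * old buckets are guarded by different locks, 0 if they share one.
 */
static int old_buckets_use_distinct_locks(unsigned int new_bucket,
					  unsigned int new_size,
					  unsigned int locks_mask)
{
	assert(new_bucket < new_size);
	return lock_index(new_bucket, locks_mask) !=
	       lock_index(new_bucket + new_size, locks_mask);
}
```

With new_size = 4 and locks_mask = 3, old buckets 1 and 5 both map to
lock 1, so taking "both" locks would lock the same spinlock twice. With
locks_mask = 7 they map to locks 1 and 5 and both must be taken.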
> > spin_lock_bh(old_bucket_lock1);
> > - spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> > - spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> > +
> > + /* Depending on the lock per buckets mapping, the bucket in
> > + * the lower and upper region may map to the same lock.
> > + */
> > + if (old_bucket_lock1 != old_bucket_lock2) {
> > + spin_lock_bh_nested(old_bucket_lock2, RHT_LOCK_NESTED);
> > + spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED2);
> > + } else {
> > + spin_lock_bh_nested(new_bucket_lock, RHT_LOCK_NESTED);
> > + }
>
> Acquiring 3 locks of much the same type looks like a locking hierarchy
> violation just waiting to happen.
I'm not claiming it's extremely pretty, lockless lookup with deferred
resizing doesn't come for free ;-) If you have a suggestion on how to
implement this differently I'm all ears. That said, it's well isolated
and the user of rhashtable does not have to deal with it. All code paths
which take multiple locks are mutually exclusive to each other (ht->mutex).
^ permalink raw reply
* Re: [PATCH net-next] rhashtable: unnecessary to use delayed work
From: Thomas Graf @ 2015-01-13 11:26 UTC (permalink / raw)
To: Ying Xue; +Cc: davem, netdev
In-Reply-To: <54B4EA06.8010507@windriver.com>
On 01/13/15 at 05:48pm, Ying Xue wrote:
> On 01/13/2015 05:35 PM, Thomas Graf wrote:
> > If you agree we can explain this briefly in the commit message and add:
> > Fixes: 97defe1 ("rhashtable: Per bucket locks & deferred expansion/shrinking")
>
> OK, I will deliver the next version.
>
> By the way, I think we should check the following condition before calling
> cancel_work_sync(); otherwise we may cancel an uninitialized work item.
>
> (ht->p.grow_decision || ht->p.shrink_decision)
>
> What do you think?
+1
^ permalink raw reply
* Re: [PATCH 2/6] vxlan: Group Policy extension
From: Thomas Graf @ 2015-01-13 11:32 UTC (permalink / raw)
To: Tom Herbert
Cc: David Miller, Jesse Gross, Stephen Hemminger, Pravin B Shelar,
Alexei Starovoitov, Linux Netdev List, dev@openvswitch.org
In-Reply-To: <CA+mtBx8-rMYSkj2KKXBq60KeF=V=iCG2K0HgVxO5JMVK1yhwCg@mail.gmail.com>
On 01/12/15 at 06:28pm, Tom Herbert wrote:
> On Mon, Jan 12, 2015 at 5:03 PM, Thomas Graf <tgraf@suug.ch> wrote:
> >>
> >> Creating a level of indirection for extensions seems overly
> >> complicated to me. Why not just define IFLA_VXLAN_GBP as just another
> >> enum above?
> >
> > I think it's cleaner to group them in a nested attribute.
> > It clearly separates the optional extensions from the base
> > attributes. RCO, GPE, GBP can all live in there.
>
> This is inconsistent with similar things in GRE and GUE. For instance,
> GRE keyid is set as its own attribute. It just seems like this adding
> more code to the driver than is necessary for the functionality
> needed.
The major difference here is that we have to consider backwards
compatibility specifically for VXLAN. Your initial feedback on GPE
actually led me to how I implemented GBP.
I think the axioms we want to establish are as follows:
1. Extensions need to be explicitly enabled by the user. A previously
dropped frame should only be processed if the user explicitly asks
for it.
2. As a consequence: only share a VXLAN UDP port if the enabled
extensions match (vxlan_sock_add), e.g. user A might want RCO
but user B might be unaware. They cannot share the same UDP port.
The second led me to introduce the 'exts' member to vxlan_sock so we can
compare it in vxlan_find_sock() and only share a UDP port if the
enabled extensions match.
Your patch currently implements (1) but not (2).
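
To illustrate (2), here is a minimal userspace sketch of a socket lookup
that only shares a UDP port when the enabled extensions match. It is a
model under stated assumptions, not the driver code: struct model_sock,
model_find_sock() and the MODEL_EXT_* flag values are hypothetical names
chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extension flags. */
#define MODEL_EXT_RCO (1u << 0)
#define MODEL_EXT_GBP (1u << 1)

/* Hypothetical stand-in for a per-port socket with its extensions. */
struct model_sock {
	uint16_t port;
	uint32_t exts;
};

/* Return an existing socket only if both the port and the exact set
 * of enabled extensions match; otherwise the port is not shared.
 */
static struct model_sock *model_find_sock(struct model_sock *socks, size_t n,
					  uint16_t port, uint32_t exts)
{
	size_t i;

	assert(socks != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (socks[i].port == port && socks[i].exts == exts)
			return &socks[i];
	}
	return NULL;
}
```

A user asking for port 4789 with GBP finds the existing GBP socket,
while a user asking for the same port with RCO, or with no extensions,
gets no match and therefore cannot silently share it.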
^ permalink raw reply
* [net-next 00/15][pull request] Intel Wired LAN Driver Updates 2015-01-13
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene
This series contains updates to i40e and i40evf.
Mitch provides a fix for i40e to move the call to pci_disable_sriov() so
that it is called earlier, ensuring that the PF driver won't free VF
resources before the VF remove routine can complete. He also cleans up
redundant and duplicate code in i40evf, and refactors the i40evf
shutdown code to let the watchdog take care of shutting things down.
He fixes a possible memory leak that occurs if we are using VLANs and
communication with the PF fails during shutdown. On some versions of the
firmware, the VF admin send queue may become stalled. In this case, the
easiest solution is to place another descriptor on the queue, and the
firmware will then process both requests.
Greg adds a warning when NPAR-enabled partitions detect a link speed
less than 10 Gbps.
Vasu removes redundant VN2VN MAC addresses that were already added by
the FCoE stack.
Shannon adds code to find how many partitions there are per port and
what the current partition_id is when in NPAR mode. In multifunction
mode, make sure we only allow SR-IOV on the master PF of a port and
only allow partition 1 to set WoL, speed and flow control.
Kamil adds code to read the PBA block from shadow RAM and return
the part number in a string format.
Catherine provides a fix to check whether link state and link speed have
changed before exiting the link event.
The following are changes since commit d2c60b1350c9a3eb7ed407c18f50306762365646:
Merge branch 'tuntap_queues'
and are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master
Catherine Sullivan (1):
i40e: Don't exit link event early if link speed has changed
Greg Rose (1):
i40e: Add warning for NPAR partitions with link speed less than 10Gbps
Kamil Krawczyk (1):
i40e: Adding function for reading PBA String
Mitch A Williams (8):
i40e: disable IOV before freeing resources
i40evf: remove redundant code
i40evf: Remove some scary log messages
i40evf: refactor shutdown code
i40evf: remove leftover VLAN filters
i40evf: don't fire traffic IRQs when the interface is down
i40evf: enable interrupt 0 appropriately
i40evf: kick a stalled admin queue
Shannon Nelson (3):
i40e/i40evf: find partition_id in npar mode
i40e: limit WoL and link settings to partition 1
i40e: limit sriov to partition 1 of NPAR configurations
Vasu Dev (1):
i40e: remove VN2VN related mac filters
drivers/net/ethernet/intel/i40e/i40e_common.c | 125 +++++++++++++++++++++
drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 43 ++++++-
drivers/net/ethernet/intel/i40e/i40e_fcoe.c | 2 -
drivers/net/ethernet/intel/i40e/i40e_main.c | 15 ++-
drivers/net/ethernet/intel/i40e/i40e_prototype.h | 5 +
drivers/net/ethernet/intel/i40e/i40e_type.h | 9 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 11 +-
drivers/net/ethernet/intel/i40evf/i40e_type.h | 7 +-
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 94 ++++++++--------
.../net/ethernet/intel/i40evf/i40evf_virtchnl.c | 6 +-
10 files changed, 256 insertions(+), 61 deletions(-)
--
1.9.3
^ permalink raw reply
* [net-next 02/15] i40evf: remove redundant code
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Mitch A Williams <mitch.a.williams@intel.com>
These functions are redundant and duplicate functionality found in
i40evf_free_all_[tx|rx]_resources.
Change-ID: Ia199908926d7a1a4b8247f75f89b5da24c9b149c
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 27 -------------------------
1 file changed, 27 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index cabaf59..ee0db59 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -947,30 +947,6 @@ static int i40evf_up_complete(struct i40evf_adapter *adapter)
}
/**
- * i40evf_clean_all_rx_rings - Free Rx Buffers for all queues
- * @adapter: board private structure
- **/
-static void i40evf_clean_all_rx_rings(struct i40evf_adapter *adapter)
-{
- int i;
-
- for (i = 0; i < adapter->num_active_queues; i++)
- i40evf_clean_rx_ring(adapter->rx_rings[i]);
-}
-
-/**
- * i40evf_clean_all_tx_rings - Free Tx Buffers for all queues
- * @adapter: board private structure
- **/
-static void i40evf_clean_all_tx_rings(struct i40evf_adapter *adapter)
-{
- int i;
-
- for (i = 0; i < adapter->num_active_queues; i++)
- i40evf_clean_tx_ring(adapter->tx_rings[i]);
-}
-
-/**
* i40e_down - Shutdown the connection processing
* @adapter: board private structure
**/
@@ -1008,9 +984,6 @@ void i40evf_down(struct i40evf_adapter *adapter)
i40evf_napi_disable_all(adapter);
netif_carrier_off(netdev);
-
- i40evf_clean_all_tx_rings(adapter);
- i40evf_clean_all_rx_rings(adapter);
}
/**
--
1.9.3
^ permalink raw reply related
* [net-next 03/15] i40evf: Remove some scary log messages
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Mitch A Williams <mitch.a.williams@intel.com>
These messages may be triggered during normal init of the driver if the
PF or FW takes a long time to respond. There's nothing really wrong, so
don't freak people out by logging messages.
If the communication channel really is dead, then we'll retry a few
times and give up. This will log a different, scarier message that
should cause consternation. This allows the user to more easily detect a
genuine failure.
Change-ID: I6e2b758d4234a3a09c1015c82c8f2442a697cbdb
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 ----
drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c | 3 ---
2 files changed, 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index ee0db59..f8f1d26 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2026,10 +2026,7 @@ static void i40evf_init_task(struct work_struct *work)
/* aq msg sent, awaiting reply */
err = i40evf_verify_api_ver(adapter);
if (err) {
- dev_info(&pdev->dev, "Unable to verify API version (%d), retrying\n",
- err);
if (err == I40E_ERR_ADMIN_QUEUE_NO_WORK) {
- dev_info(&pdev->dev, "Resending request\n");
err = i40evf_send_api_ver(adapter);
}
goto err;
@@ -2054,7 +2051,6 @@ static void i40evf_init_task(struct work_struct *work)
}
err = i40evf_get_vf_config(adapter);
if (err == I40E_ERR_ADMIN_QUEUE_NO_WORK) {
- dev_info(&pdev->dev, "Resending VF config request\n");
err = i40evf_send_vf_config_msg(adapter);
goto err;
}
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 5fde5a7..3aeb633 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -715,9 +715,6 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
}
return;
}
- if (v_opcode != adapter->current_op)
- dev_info(&adapter->pdev->dev, "Pending op is %d, received %d\n",
- adapter->current_op, v_opcode);
if (v_retval) {
dev_err(&adapter->pdev->dev, "%s: PF returned error %d to our request %d\n",
__func__, v_retval, v_opcode);
--
1.9.3
^ permalink raw reply related
* [net-next 01/15] i40e: disable IOV before freeing resources
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Mitch A Williams <mitch.a.williams@intel.com>
If VF drivers are loaded in the host OS, the call to pci_disable_sriov()
will cause these drivers' remove routines to be called. If the PF driver
has already freed VF resources before this happens, then the VF remove
routine can't properly communicate with the PF driver causing all sorts
of mayhem and error messages and hurt feelings.
To fix this, we move the call to pci_disable_sriov() up to the top of
the function and let it complete before freeing any VF resources.
Change-ID: I397c3997a00f6408e32b7735273911e499600236
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 5bae895..044019b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -791,10 +791,18 @@ void i40e_free_vfs(struct i40e_pf *pf)
if (!pf->vf)
return;
+ /* Disable IOV before freeing resources. This lets any VF drivers
+ * running in the host get themselves cleaned up before we yank
+ * the carpet out from underneath their feet.
+ */
+ if (!pci_vfs_assigned(pf->pdev))
+ pci_disable_sriov(pf->pdev);
+
+ msleep(20); /* let any messages in transit get finished up */
+
/* Disable interrupt 0 so we don't try to handle the VFLR. */
i40e_irq_dynamic_disable_icr0(pf);
- mdelay(10); /* let any messages in transit get finished up */
/* free up vf resources */
tmp = pf->num_alloc_vfs;
pf->num_alloc_vfs = 0;
@@ -813,7 +821,6 @@ void i40e_free_vfs(struct i40e_pf *pf)
* before this function ever gets called.
*/
if (!pci_vfs_assigned(pf->pdev)) {
- pci_disable_sriov(pf->pdev);
/* Acknowledge VFLR for all VFS. Without this, VFs will fail to
* work correctly when SR-IOV gets re-enabled.
*/
--
1.9.3
^ permalink raw reply related
* [net-next 05/15] i40evf: remove leftover VLAN filters
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Mitch A Williams <mitch.a.williams@intel.com>
If we're using VLANs and communications with the PF fail during
shutdown, we will leak memory because not all of the VLAN filters will
be removed. To eliminate this possibility, go through the list again
right before the module is removed and delete any leftover entries.
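As a rough userspace illustration of the cleanup described above, here
is a sketch that drains a list while freeing its nodes. It assumes a
plain singly linked list; demo_filter, demo_add_filter and
demo_drain_filters are hypothetical names, and the driver itself uses
the kernel's list_for_each_entry_safe() on adapter->vlan_filter_list:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a VLAN filter list entry. */
struct demo_filter {
	struct demo_filter *next;
	unsigned short vlan;
};

/* Push a new filter onto the list; returns 0 on success, -1 on OOM. */
int demo_add_filter(struct demo_filter **head, unsigned short vlan)
{
	struct demo_filter *f = malloc(sizeof(*f));

	if (!f)
		return -1;
	f->vlan = vlan;
	f->next = *head;
	*head = f;
	return 0;
}

/* Free every remaining entry and leave the list empty.
 * Returns the number of entries that were freed.
 */
int demo_drain_filters(struct demo_filter **head)
{
	struct demo_filter *f = *head;
	int freed = 0;

	while (f) {
		/* save the successor before freeing the current node */
		struct demo_filter *next = f->next;

		free(f);
		f = next;
		freed++;
	}
	*head = NULL;
	return freed;
}
```

Saving the next pointer before the free is the same idea as the "safe"
list iterator: the loop never reads a node after it has been released.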
Change-ID: Id3b5315c47ca0a61ae123a96ff345d010bc41aed
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 5a2ed90..a21fb88 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2463,6 +2463,10 @@ static void i40evf_remove(struct pci_dev *pdev)
list_del(&f->list);
kfree(f);
}
+ list_for_each_entry_safe(f, ftmp, &adapter->vlan_filter_list, list) {
+ list_del(&f->list);
+ kfree(f);
+ }
free_netdev(netdev);
--
1.9.3
^ permalink raw reply related
* [net-next 04/15] i40evf: refactor shutdown code
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Mitch A Williams <mitch.a.williams@intel.com>
If the VF driver is running in the host, the shutdown code is completely
broken. We cannot wait in our down routine for the PF to respond to our
requests, as its admin queue task will never run while we hold the lock.
Instead, we schedule operations, then let the watchdog take care of
shutting things down. If the driver is being removed, then wait in the
remove routine until the watchdog is done before continuing.
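The bounded wait in the remove routine can be sketched in userspace as
follows. This is only an illustration under stated assumptions:
demo_remove_ctx, demo_tick_fn and the two tick helpers are hypothetical,
and the tick callback stands in for msleep() plus whatever progress the
watchdog makes in the meantime. Unlike the patch, the sketch reports
success by checking the flag itself rather than the counter:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state polled during removal. */
struct demo_remove_ctx {
	unsigned int aq_required; /* bitmask of pending admin queue ops */
};

/* Hypothetical hook standing in for one sleep interval. */
typedef void (*demo_tick_fn)(struct demo_remove_ctx *);

/* Poll until aq_required clears or the budget of intervals is used up.
 * Returns true if all pending operations completed, false on timeout.
 */
bool demo_wait_for_watchdog(struct demo_remove_ctx *ctx, int budget,
			    demo_tick_fn tick)
{
	for (int i = 0; i < budget && ctx->aq_required; i++)
		tick(ctx);

	return ctx->aq_required == 0;
}

/* Example tick: the "watchdog" finishes one pending operation. */
void demo_tick_clear_one(struct demo_remove_ctx *ctx)
{
	ctx->aq_required &= ctx->aq_required - 1; /* clear lowest set bit */
}

/* Example tick: the PF never responds, so nothing changes. */
void demo_tick_stuck(struct demo_remove_ctx *ctx)
{
	(void)ctx;
}
```

The point of the budget is that removal always terminates, even when the
PF driver never answers.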
Change-ID: I93a58d17389e8d6b58f21e430b56ed7b4590b2c5
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 29 ++++++++++++++++++++-----
1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index f8f1d26..5a2ed90 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -958,6 +958,12 @@ void i40evf_down(struct i40evf_adapter *adapter)
if (adapter->state == __I40EVF_DOWN)
return;
+ while (test_and_set_bit(__I40EVF_IN_CRITICAL_TASK,
+ &adapter->crit_section))
+ usleep_range(500, 1000);
+
+ i40evf_irq_disable(adapter);
+
/* remove all MAC filters */
list_for_each_entry(f, &adapter->mac_filter_list, list) {
f->remove = true;
@@ -968,22 +974,27 @@ void i40evf_down(struct i40evf_adapter *adapter)
}
if (!(adapter->flags & I40EVF_FLAG_PF_COMMS_FAILED) &&
adapter->state != __I40EVF_RESETTING) {
- adapter->aq_required |= I40EVF_FLAG_AQ_DEL_MAC_FILTER;
+ /* cancel any current operation */
+ adapter->current_op = I40E_VIRTCHNL_OP_UNKNOWN;
+ adapter->aq_pending = 0;
+ /* Schedule operations to close down the HW. Don't wait
+ * here for this to complete. The watchdog is still running
+ * and it will take care of this.
+ */
+ adapter->aq_required = I40EVF_FLAG_AQ_DEL_MAC_FILTER;
adapter->aq_required |= I40EVF_FLAG_AQ_DEL_VLAN_FILTER;
- /* disable receives */
adapter->aq_required |= I40EVF_FLAG_AQ_DISABLE_QUEUES;
- mod_timer_pending(&adapter->watchdog_timer, jiffies + 1);
- msleep(20);
}
netif_tx_disable(netdev);
netif_tx_stop_all_queues(netdev);
- i40evf_irq_disable(adapter);
-
i40evf_napi_disable_all(adapter);
+ msleep(20);
+
netif_carrier_off(netdev);
+ clear_bit(__I40EVF_IN_CRITICAL_TASK, &adapter->crit_section);
}
/**
@@ -2409,6 +2420,7 @@ static void i40evf_remove(struct pci_dev *pdev)
struct i40evf_adapter *adapter = netdev_priv(netdev);
struct i40evf_mac_filter *f, *ftmp;
struct i40e_hw *hw = &adapter->hw;
+ int count = 50;
cancel_delayed_work_sync(&adapter->init_task);
cancel_work_sync(&adapter->reset_task);
@@ -2417,6 +2429,11 @@ static void i40evf_remove(struct pci_dev *pdev)
unregister_netdev(netdev);
adapter->netdev_registered = false;
}
+ while (count-- && adapter->aq_required)
+ msleep(50);
+
+ if (count < 0)
+ dev_err(&pdev->dev, "Timed out waiting for PF driver.\n");
adapter->state = __I40EVF_REMOVE;
if (adapter->msix_entries) {
--
1.9.3
^ permalink raw reply related
* [net-next 07/15] i40evf: enable interrupt 0 appropriately
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Mitch A Williams <mitch.a.williams@intel.com>
Don't enable vector 0 in the ISR, just schedule the adminq task and let
it enable the vector. This prevents the task from being called
reentrantly. Make sure that the vector is enabled on all exit paths of
the adminq task, including error exits.
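A small userspace sketch of the pattern, assuming a simplified adapter:
the interrupt handler leaves vector 0 masked, and the task re-enables it
through a single exit label so that early error returns cannot skip the
step. The names demo_vf, demo_isr and demo_adminq_task are hypothetical:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the VF adapter. */
struct demo_vf {
	bool comms_failed;  /* stand-in for I40EVF_FLAG_PF_COMMS_FAILED */
	bool irq0_enabled;  /* stand-in for the vector 0 enable state */
	int events_handled;
};

/* The ISR only schedules work; vector 0 stays masked until the task
 * has finished, so the task cannot be triggered again in between.
 */
void demo_isr(struct demo_vf *vf)
{
	vf->irq0_enabled = false;
}

/* Every exit path, including the error exits, goes through "out". */
void demo_adminq_task(struct demo_vf *vf, size_t buf_len)
{
	void *buf;

	if (vf->comms_failed)
		goto out;

	buf = calloc(1, buf_len);
	if (!buf)
		goto out;

	vf->events_handled++; /* placeholder for processing the event */
	free(buf);
out:
	/* re-enable the admin queue interrupt cause */
	vf->irq0_enabled = true;
}
```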
Change-ID: I53f3d14f91ed7a9e90291ea41c681122a5eca5b5
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index d55c4a9..6d5a0e6 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -313,10 +313,6 @@ static irqreturn_t i40evf_msix_aq(int irq, void *data)
val = val | I40E_PFINT_DYN_CTL0_CLEARPBA_MASK;
wr32(hw, I40E_VFINT_DYN_CTL01, val);
- /* re-enable interrupt causes */
- wr32(hw, I40E_VFINT_ICR0_ENA1, ena_mask);
- wr32(hw, I40E_VFINT_DYN_CTL01, I40E_VFINT_DYN_CTL01_INTENA_MASK);
-
/* schedule work on the private workqueue */
schedule_work(&adapter->adminq_task);
@@ -1620,12 +1616,12 @@ static void i40evf_adminq_task(struct work_struct *work)
u16 pending;
if (adapter->flags & I40EVF_FLAG_PF_COMMS_FAILED)
- return;
+ goto out;
event.buf_len = I40EVF_MAX_AQ_BUF_SIZE;
event.msg_buf = kzalloc(event.buf_len, GFP_KERNEL);
if (!event.msg_buf)
- return;
+ goto out;
v_msg = (struct i40e_virtchnl_msg *)&event.desc;
do {
@@ -1675,10 +1671,10 @@ static void i40evf_adminq_task(struct work_struct *work)
if (oldval != val)
wr32(hw, hw->aq.asq.len, val);
+ kfree(event.msg_buf);
+out:
/* re-enable Admin queue interrupt cause */
i40evf_misc_irq_enable(adapter);
-
- kfree(event.msg_buf);
}
/**
--
1.9.3
^ permalink raw reply related
* [net-next 06/15] i40evf: don't fire traffic IRQs when the interface is down
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Mitch A Williams <mitch.a.williams@intel.com>
There is always a possibility that MSI-X interrupts can get lost. To
keep this problem from stalling the driver, we fire all of our MSI-X
vectors during the watchdog routine. However, we should not fire the
traffic vectors when the interface is closed. In this case, just fire
vector 0, which is used for admin queue events.
As part of this change, we no longer enable the interrupt cause for
vector 0 here. Enabling it can cause the admin queue handler to be
called reentrantly, which causes a scary "critical section violation"
message to be logged, even though no real damage is done.
Change-ID: Ic43a5184708ab2cb9a23fca7dedd808a46717795
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index a21fb88..d55c4a9 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1385,11 +1385,14 @@ static void i40evf_watchdog_task(struct work_struct *work)
if (adapter->state == __I40EVF_RUNNING)
i40evf_request_stats(adapter);
-
- i40evf_irq_enable(adapter, true);
- i40evf_fire_sw_int(adapter, 0xFF);
-
watchdog_done:
+ if (adapter->state == __I40EVF_RUNNING) {
+ i40evf_irq_enable_queues(adapter, ~0);
+ i40evf_fire_sw_int(adapter, 0xFF);
+ } else {
+ i40evf_fire_sw_int(adapter, 0x1);
+ }
+
clear_bit(__I40EVF_IN_CRITICAL_TASK, &adapter->crit_section);
restart_watchdog:
if (adapter->state == __I40EVF_REMOVE)
--
1.9.3
^ permalink raw reply related
* [net-next 09/15] i40e: Add warning for NPAR partitions with link speed less than 10Gbps
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Greg Rose, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Greg Rose <gregory.v.rose@intel.com>
NPAR-enabled partitions should warn the user when the detected link
speed is less than 10 Gbps.
Change-ID: I7728bb8ce279bf0f4f755d78d7071074a4eb5f69
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index a5f2660..7c14973 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -4557,6 +4557,15 @@ static void i40e_print_link_message(struct i40e_vsi *vsi, bool isup)
return;
}
+ /* Warn user if link speed on NPAR enabled partition is not at
+ * least 10GB
+ */
+ if (vsi->back->hw.func_caps.npar_enable &&
+ (vsi->back->hw.phy.link_info.link_speed == I40E_LINK_SPEED_1GB ||
+ vsi->back->hw.phy.link_info.link_speed == I40E_LINK_SPEED_100MB))
+ netdev_warn(vsi->netdev,
+ "The partition detected link speed that is less than 10Gbps\n");
+
switch (vsi->back->hw.phy.link_info.link_speed) {
case I40E_LINK_SPEED_40GB:
strlcpy(speed, "40 Gbps", SPEED_SIZE);
--
1.9.3
^ permalink raw reply related
* [net-next 08/15] i40evf: kick a stalled admin queue
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Mitch A Williams, netdev, nhorman, sassmann, jogreene,
Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Mitch A Williams <mitch.a.williams@intel.com>
On some versions of the firmware, the VF admin send queue may become
stalled. In this case, the easiest solution is to just place another
descriptor on the queue; the firmware will then process both requests.
The early init code already accounts for this, but the runtime code does
not. In the watchdog task, check for the stall condition, and if it's
found, send our API version to the PF. When the PF replies, just ignore
the reply.
Change-ID: I380d78185a4f284d649c44d263e648afc9b4d50c
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40evf/i40evf_main.c | 7 ++++++-
drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c | 3 +++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 6d5a0e6..c663182 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1336,8 +1336,13 @@ static void i40evf_watchdog_task(struct work_struct *work)
/* Process admin queue tasks. After init, everything gets done
* here so we don't race on the admin queue.
*/
- if (adapter->aq_pending)
+ if (adapter->aq_pending) {
+ if (!i40evf_asq_done(hw)) {
+ dev_dbg(&adapter->pdev->dev, "Admin queue timeout\n");
+ i40evf_send_api_ver(adapter);
+ }
goto watchdog_done;
+ }
if (adapter->aq_required & I40EVF_FLAG_AQ_MAP_VECTORS) {
i40evf_map_queues(adapter);
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 3aeb633..3f0c85e 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -720,6 +720,9 @@ void i40evf_virtchnl_completion(struct i40evf_adapter *adapter,
__func__, v_retval, v_opcode);
}
switch (v_opcode) {
+ case I40E_VIRTCHNL_OP_VERSION:
+ /* no action, but also not an error */
+ break;
case I40E_VIRTCHNL_OP_GET_STATS: {
struct i40e_eth_stats *stats =
(struct i40e_eth_stats *)msg;
--
1.9.3
^ permalink raw reply related
* [net-next 11/15] i40e/i40evf: find partition_id in npar mode
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Shannon Nelson, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Shannon Nelson <shannon.nelson@intel.com>
When in NPAR mode, the driver instance might be controlling the base
partition or one of the other "fake" PFs. There are some things that
can only be done by the base partition, aka partition_id 1. This code
does a bit of work to find how many partitions there are per port and
what the current partition_id is.
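The arithmetic can be sketched standalone as below. This is an
illustration only: demo_partition_info and demo_find_partition are
hypothetical names, and the guard against a port count of zero is an
addition of the sketch, not something the patch does:

```c
#include <stdint.h>

/* Hypothetical result holder for the sketch. */
struct demo_partition_info {
	uint16_t partition_id;   /* 1-based partition of this function */
	uint16_t num_partitions; /* partitions per port */
};

/* Derive the partition layout from the valid-function bitmap, the PF
 * index and the number of enabled ports, assuming functions are spread
 * evenly across the ports. Returns 0 on success, -1 if num_ports is 0.
 */
int demo_find_partition(uint32_t valid_functions, unsigned int pf_id,
			unsigned int num_ports,
			struct demo_partition_info *out)
{
	unsigned int num_functions = 0;

	if (num_ports == 0)
		return -1;

	/* count the set bits, one per valid function */
	while (valid_functions) {
		if (valid_functions & 1)
			num_functions++;
		valid_functions >>= 1;
	}

	out->partition_id = (uint16_t)(pf_id / num_ports + 1);
	out->num_partitions = (uint16_t)(num_functions / num_ports);
	return 0;
}
```

For example, with eight valid functions on two ports, pf_id 5 lands in
partition 3 of 4, and pf_id 0 is the base partition.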
Change-ID: Iba427f020a1983d02147d86f121b3627e20ee21d
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_common.c | 66 ++++++++++++++++++++++++
drivers/net/ethernet/intel/i40e/i40e_prototype.h | 3 ++
drivers/net/ethernet/intel/i40e/i40e_type.h | 7 ++-
drivers/net/ethernet/intel/i40evf/i40e_type.h | 7 ++-
4 files changed, 81 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 3d741ee..b16fc03 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -2035,6 +2035,43 @@ i40e_status i40e_aq_send_msg_to_vf(struct i40e_hw *hw, u16 vfid,
}
/**
+ * i40e_aq_debug_read_register
+ * @hw: pointer to the hw struct
+ * @reg_addr: register address
+ * @reg_val: register value
+ * @cmd_details: pointer to command details structure or NULL
+ *
+ * Read the register using the admin queue commands
+ **/
+i40e_status i40e_aq_debug_read_register(struct i40e_hw *hw,
+ u32 reg_addr, u64 *reg_val,
+ struct i40e_asq_cmd_details *cmd_details)
+{
+ struct i40e_aq_desc desc;
+ struct i40e_aqc_debug_reg_read_write *cmd_resp =
+ (struct i40e_aqc_debug_reg_read_write *)&desc.params.raw;
+ i40e_status status;
+
+ if (reg_val == NULL)
+ return I40E_ERR_PARAM;
+
+ i40e_fill_default_direct_cmd_desc(&desc,
+ i40e_aqc_opc_debug_read_reg);
+
+ cmd_resp->address = cpu_to_le32(reg_addr);
+
+ status = i40e_asq_send_command(hw, &desc, NULL, 0, cmd_details);
+
+ if (!status) {
+ *reg_val = ((u64)cmd_resp->value_high << 32) |
+ (u64)cmd_resp->value_low;
+ *reg_val = le64_to_cpu(*reg_val);
+ }
+
+ return status;
+}
+
+/**
* i40e_aq_debug_write_register
* @hw: pointer to the hw struct
* @reg_addr: register address
@@ -2292,6 +2329,7 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
enum i40e_admin_queue_opc list_type_opc)
{
struct i40e_aqc_list_capabilities_element_resp *cap;
+ u32 valid_functions, num_functions;
u32 number, logical_id, phys_id;
struct i40e_hw_capabilities *p;
u32 i = 0;
@@ -2427,6 +2465,34 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
if (p->npar_enable || p->mfp_mode_1)
p->fcoe = false;
+ /* count the enabled ports (aka the "not disabled" ports) */
+ hw->num_ports = 0;
+ for (i = 0; i < 4; i++) {
+ u32 port_cfg_reg = I40E_PRTGEN_CNF + (4 * i);
+ u64 port_cfg = 0;
+
+ /* use AQ read to get the physical register offset instead
+ * of the port relative offset
+ */
+ i40e_aq_debug_read_register(hw, port_cfg_reg, &port_cfg, NULL);
+ if (!(port_cfg & I40E_PRTGEN_CNF_PORT_DIS_MASK))
+ hw->num_ports++;
+ }
+
+ valid_functions = p->valid_functions;
+ num_functions = 0;
+ while (valid_functions) {
+ if (valid_functions & 1)
+ num_functions++;
+ valid_functions >>= 1;
+ }
+
+ /* partition id is 1-based, and functions are evenly spread
+ * across the ports as partitions
+ */
+ hw->partition_id = (hw->pf_id / hw->num_ports) + 1;
+ hw->num_partitions = num_functions / hw->num_ports;
+
/* additional HW specific goodies that might
* someday be HW version specific
*/
diff --git a/drivers/net/ethernet/intel/i40e/i40e_prototype.h b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
index 2fb4306..d1c7d63 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_prototype.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
@@ -71,6 +71,9 @@ i40e_status i40e_aq_get_firmware_version(struct i40e_hw *hw,
i40e_status i40e_aq_debug_write_register(struct i40e_hw *hw,
u32 reg_addr, u64 reg_val,
struct i40e_asq_cmd_details *cmd_details);
+i40e_status i40e_aq_debug_read_register(struct i40e_hw *hw,
+ u32 reg_addr, u64 *reg_val,
+ struct i40e_asq_cmd_details *cmd_details);
i40e_status i40e_aq_set_phy_debug(struct i40e_hw *hw, u8 cmd_flags,
struct i40e_asq_cmd_details *cmd_details);
i40e_status i40e_aq_set_default_vsi(struct i40e_hw *hw, u16 vsi_id,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index c1f2eb9..611de3e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -431,7 +431,7 @@ struct i40e_hw {
u8 __iomem *hw_addr;
void *back;
- /* function pointer structs */
+ /* subsystem structs */
struct i40e_phy_info phy;
struct i40e_mac_info mac;
struct i40e_bus_info bus;
@@ -458,6 +458,11 @@ struct i40e_hw {
u8 pf_id;
u16 main_vsi_seid;
+ /* for multi-function MACs */
+ u16 partition_id;
+ u16 num_partitions;
+ u16 num_ports;
+
/* Closest numa node to the device */
u16 numa_node;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_type.h b/drivers/net/ethernet/intel/i40evf/i40e_type.h
index 68aec11..d1c2b5a 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_type.h
@@ -425,7 +425,7 @@ struct i40e_hw {
u8 __iomem *hw_addr;
void *back;
- /* function pointer structs */
+ /* subsystem structs */
struct i40e_phy_info phy;
struct i40e_mac_info mac;
struct i40e_bus_info bus;
@@ -452,6 +452,11 @@ struct i40e_hw {
u8 pf_id;
u16 main_vsi_seid;
+ /* for multi-function MACs */
+ u16 partition_id;
+ u16 num_partitions;
+ u16 num_ports;
+
/* Closest numa node to the device */
u16 numa_node;
--
1.9.3
^ permalink raw reply related
* [net-next 12/15] i40e: Adding function for reading PBA String
From: Jeff Kirsher @ 2015-01-13 11:33 UTC (permalink / raw)
To: davem; +Cc: Kamil Krawczyk, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1421148811-9763-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Kamil Krawczyk <kamil.krawczyk@intel.com>
This function reads the PBA block from Shadow RAM and returns it in string format.
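The unpacking step can be sketched without any hardware access. The
helper below is hypothetical (demo_pba_words_to_string is not a driver
function) and assumes the words have already been read from Shadow RAM;
it only shows the high-byte-first conversion and the buffer size check:

```c
#include <stddef.h>
#include <stdint.h>

/* Unpack 16-bit words into characters, high byte first, and
 * NUL-terminate the result.
 * Returns 0 on success, -1 if the buffer cannot hold the string.
 */
int demo_pba_words_to_string(const uint16_t *words, size_t nwords,
			     char *buf, size_t buf_size)
{
	/* two characters per word plus the terminating NUL */
	if (buf_size < nwords * 2 + 1)
		return -1;

	for (size_t i = 0; i < nwords; i++) {
		buf[i * 2] = (char)((words[i] >> 8) & 0xFF);
		buf[i * 2 + 1] = (char)(words[i] & 0xFF);
	}
	buf[nwords * 2] = '\0';

	return 0;
}
```

Two words 0x4142 and 0x4344 therefore need a buffer of at least five
bytes and yield the string "ABCD".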
Change-ID: I4ee7059f6e21bd0eba38687da15e772e0b4ab36e
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
drivers/net/ethernet/intel/i40e/i40e_common.c | 59 ++++++++++++++++++++++++
drivers/net/ethernet/intel/i40e/i40e_prototype.h | 2 +
drivers/net/ethernet/intel/i40e/i40e_type.h | 2 +
3 files changed, 63 insertions(+)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index b16fc03..4f4d9d1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -742,6 +742,65 @@ i40e_status i40e_get_san_mac_addr(struct i40e_hw *hw, u8 *mac_addr)
#endif
/**
+ * i40e_read_pba_string - Reads part number string from EEPROM
+ * @hw: pointer to hardware structure
+ * @pba_num: stores the part number string from the EEPROM
+ * @pba_num_size: part number string buffer length
+ *
+ * Reads the part number string from the EEPROM.
+ **/
+i40e_status i40e_read_pba_string(struct i40e_hw *hw, u8 *pba_num,
+ u32 pba_num_size)
+{
+ i40e_status status = 0;
+ u16 pba_word = 0;
+ u16 pba_size = 0;
+ u16 pba_ptr = 0;
+ u16 i = 0;
+
+ status = i40e_read_nvm_word(hw, I40E_SR_PBA_FLAGS, &pba_word);
+ if (status || (pba_word != 0xFAFA)) {
+ hw_dbg(hw, "Failed to read PBA flags or flag is invalid.\n");
+ return status;
+ }
+
+ status = i40e_read_nvm_word(hw, I40E_SR_PBA_BLOCK_PTR, &pba_ptr);
+ if (status) {
+ hw_dbg(hw, "Failed to read PBA Block pointer.\n");
+ return status;
+ }
+
+ status = i40e_read_nvm_word(hw, pba_ptr, &pba_size);
+ if (status) {
+ hw_dbg(hw, "Failed to read PBA Block size.\n");
+ return status;
+ }
+
+ /* Subtract one to get PBA word count (PBA Size word is included in
+ * total size)
+ */
+ pba_size--;
+ if (pba_num_size < (((u32)pba_size * 2) + 1)) {
+ hw_dbg(hw, "Buffer to small for PBA data.\n");
+ return I40E_ERR_PARAM;
+ }
+
+ for (i = 0; i < pba_size; i++) {
+ status = i40e_read_nvm_word(hw, (pba_ptr + 1) + i, &pba_word);
+ if (status) {
+ hw_dbg(hw, "Failed to read PBA Block word %d.\n", i);
+ return status;
+ }
+
+ pba_num[(i * 2)] = (pba_word >> 8) & 0xFF;
+ pba_num[(i * 2) + 1] = pba_word & 0xFF;
+ }
+ pba_num[(pba_size * 2)] = '\0';
+
+ return status;
+}
+
+/**
* i40e_get_media_type - Gets media type
* @hw: pointer to the hardware structure
**/
diff --git a/drivers/net/ethernet/intel/i40e/i40e_prototype.h b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
index d1c7d63..68e852a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_prototype.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
@@ -248,6 +248,8 @@ void i40e_clear_pxe_mode(struct i40e_hw *hw);
bool i40e_get_link_status(struct i40e_hw *hw);
i40e_status i40e_get_mac_addr(struct i40e_hw *hw, u8 *mac_addr);
i40e_status i40e_get_port_mac_addr(struct i40e_hw *hw, u8 *mac_addr);
+i40e_status i40e_read_pba_string(struct i40e_hw *hw, u8 *pba_num,
+ u32 pba_num_size);
i40e_status i40e_validate_mac_addr(u8 *mac_addr);
void i40e_pre_tx_queue_cfg(struct i40e_hw *hw, u32 queue, bool enable);
#ifdef I40E_FCOE
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 611de3e..ff121fe 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -1140,6 +1140,8 @@ struct i40e_hw_port_stats {
/* Checksum and Shadow RAM pointers */
#define I40E_SR_NVM_CONTROL_WORD 0x00
#define I40E_SR_EMP_MODULE_PTR 0x0F
+#define I40E_SR_PBA_FLAGS 0x15
+#define I40E_SR_PBA_BLOCK_PTR 0x16
#define I40E_SR_NVM_IMAGE_VERSION 0x18
#define I40E_SR_NVM_WAKE_ON_LAN 0x19
#define I40E_SR_ALTERNATE_SAN_MAC_ADDRESS_PTR 0x27
--
1.9.3
^ permalink raw reply related