* [iproute PATCH] ip-link: Support printing VF trust setting
From: Phil Sutter @ 2016-03-31 12:43 UTC
To: Stephen Hemminger; +Cc: netdev
This adds a new item to VF lines of a PF, stating whether the VF is
trusted or not.
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
ip/ipaddress.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 3998d8cec4ab2..2f1d55c115dde 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -380,6 +380,13 @@ static void print_vfinfo(FILE *fp, struct rtattr *vfinfo)
else
fprintf(fp, ", link-state disable");
}
+ if (vf[IFLA_VF_TRUST]) {
+ struct ifla_vf_trust *vf_trust = RTA_DATA(vf[IFLA_VF_TRUST]);
+
+ if (vf_trust->setting != -1)
+ fprintf(fp, ", trust %s",
+ vf_trust->setting ? "on" : "off");
+ }
if (vf[IFLA_VF_STATS] && show_stats)
print_vf_stats64(fp, vf[IFLA_VF_STATS]);
}
--
2.7.2
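The tri-state handling in the hunk above can be sketched in userspace as follows. This is only an illustration, assuming (as the patch's check implies) that a setting of -1 means no trust value was reported; the struct and the helper name format_vf_trust are hypothetical stand-ins, not the iproute2 or kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct ifla_vf_trust; illustration only. */
struct vf_trust_sketch {
	uint32_t vf;
	uint32_t setting;	/* (uint32_t)-1: not reported, 0: off, else: on */
};

/*
 * Writes ", trust on" or ", trust off" into buf, or an empty string when
 * the setting was not reported. Returns the number of characters written.
 */
static int format_vf_trust(char *buf, size_t len,
			   const struct vf_trust_sketch *t)
{
	if (t->setting == (uint32_t)-1) {
		if (len)
			buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, len, ", trust %s", t->setting ? "on" : "off");
}
```

Because the field is unsigned, the comparison against -1 in the patch relies on the usual integer conversion; the cast in the sketch makes that explicit.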
* [PATCH v2 net-next] net: hns: add support of pause frame ctrl for HNS V2
From: Yisen Zhuang @ 2016-03-31 13:00 UTC
To: davem, salil.mehta, liguozhu, huangdaode, arnd, andriy.shevchenko,
andrew, geliangtang, ivecera, lisheng011, fengguang.wu
Cc: charles.chenxin, haifeng.wei, netdev, linux-kernel,
linux-arm-kernel, linuxarm
From: Lisheng <lisheng011@huawei.com>
This patch adds pause frame control support for HNS V2, a feature that
HNS V1 lacks:
1) service ports can disable rx pause frames,
2) debug ports can enable tx/rx pause frames.
The patch also updates the pause control registers when the update
status function is called by an upper-layer routine.
Signed-off-by: Lisheng <lisheng011@huawei.com>
Signed-off-by: Yisen Zhuang <Yisen.Zhuang@huawei.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
change log:
PATCH V2:
- delete the useless code found by Andy Shevchenko
PATCH V1:
- initial submit
V1 Link: https://lkml.org/lkml/2016/3/29/77
---
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c | 20 +++++-
drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c | 30 ++-------
drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c | 75 +++++++++++++++++++---
drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h | 5 ++
drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c | 6 +-
drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h | 6 ++
6 files changed, 104 insertions(+), 38 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
index a1cb461..1591422 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c
@@ -399,11 +399,16 @@ static void hns_ae_get_ring_bdnum_limit(struct hnae_queue *queue,
static void hns_ae_get_pauseparam(struct hnae_handle *handle,
u32 *auto_neg, u32 *rx_en, u32 *tx_en)
{
- assert(handle);
+ struct hns_mac_cb *mac_cb = hns_get_mac_cb(handle);
+ struct dsaf_device *dsaf_dev = mac_cb->dsaf_dev;
- hns_mac_get_autoneg(hns_get_mac_cb(handle), auto_neg);
+ hns_mac_get_autoneg(mac_cb, auto_neg);
- hns_mac_get_pauseparam(hns_get_mac_cb(handle), rx_en, tx_en);
+ hns_mac_get_pauseparam(mac_cb, rx_en, tx_en);
+
+ /* Service port's pause feature is provided by DSAF, not mac */
+ if (handle->port_type == HNAE_PORT_SERVICE)
+ hns_dsaf_get_rx_mac_pause_en(dsaf_dev, mac_cb->mac_id, rx_en);
}
static int hns_ae_set_autoneg(struct hnae_handle *handle, u8 enable)
@@ -436,12 +441,21 @@ static int hns_ae_set_pauseparam(struct hnae_handle *handle,
u32 autoneg, u32 rx_en, u32 tx_en)
{
struct hns_mac_cb *mac_cb = hns_get_mac_cb(handle);
+ struct dsaf_device *dsaf_dev = mac_cb->dsaf_dev;
int ret;
ret = hns_mac_set_autoneg(mac_cb, autoneg);
if (ret)
return ret;
+ /* Service port's pause feature is provided by DSAF, not mac */
+ if (handle->port_type == HNAE_PORT_SERVICE) {
+ ret = hns_dsaf_set_rx_mac_pause_en(dsaf_dev,
+ mac_cb->mac_id, rx_en);
+ if (ret)
+ return ret;
+ rx_en = 0;
+ }
return hns_mac_set_pauseparam(mac_cb, rx_en, tx_en);
}
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
index a38084a..10c367d 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_mac.c
@@ -439,9 +439,8 @@ int hns_mac_vm_config_bc_en(struct hns_mac_cb *mac_cb, u32 vmid, bool enable)
void hns_mac_reset(struct hns_mac_cb *mac_cb)
{
- struct mac_driver *drv;
-
- drv = hns_mac_get_drv(mac_cb);
+ struct mac_driver *drv = hns_mac_get_drv(mac_cb);
+ bool is_ver1 = AE_IS_VER1(mac_cb->dsaf_dev->dsaf_ver);
drv->mac_init(drv);
@@ -456,7 +455,7 @@ void hns_mac_reset(struct hns_mac_cb *mac_cb)
if (drv->mac_pausefrm_cfg) {
if (mac_cb->mac_type == HNAE_PORT_DEBUG)
- drv->mac_pausefrm_cfg(drv, 0, 0);
+ drv->mac_pausefrm_cfg(drv, !is_ver1, !is_ver1);
else /* mac rx must disable, dsaf pfc close instead of it*/
drv->mac_pausefrm_cfg(drv, 0, 1);
}
@@ -561,14 +560,6 @@ void hns_mac_get_pauseparam(struct hns_mac_cb *mac_cb, u32 *rx_en, u32 *tx_en)
*rx_en = 0;
*tx_en = 0;
}
-
- /* Due to the chip defect, the service mac's rx pause CAN'T be enabled.
- * We set the rx pause frm always be true (1), because DSAF deals with
- * the rx pause frm instead of service mac. After all, we still support
- * rx pause frm.
- */
- if (mac_cb->mac_type == HNAE_PORT_SERVICE)
- *rx_en = 1;
}
/**
@@ -602,20 +593,13 @@ int hns_mac_set_autoneg(struct hns_mac_cb *mac_cb, u8 enable)
int hns_mac_set_pauseparam(struct hns_mac_cb *mac_cb, u32 rx_en, u32 tx_en)
{
struct mac_driver *mac_ctrl_drv = hns_mac_get_drv(mac_cb);
+ bool is_ver1 = AE_IS_VER1(mac_cb->dsaf_dev->dsaf_ver);
- if (mac_cb->mac_type == HNAE_PORT_SERVICE) {
- if (!rx_en) {
- dev_err(mac_cb->dev, "disable rx_pause is not allowed!");
+ if (mac_cb->mac_type == HNAE_PORT_DEBUG) {
+ if (is_ver1 && (tx_en || rx_en)) {
+ dev_err(mac_cb->dev, "macv1 cann't enable tx/rx_pause!");
return -EINVAL;
}
- } else if (mac_cb->mac_type == HNAE_PORT_DEBUG) {
- if (tx_en || rx_en) {
- dev_err(mac_cb->dev, "enable tx_pause or enable rx_pause are not allowed!");
- return -EINVAL;
- }
- } else {
- dev_err(mac_cb->dev, "Unsupport this operation!");
- return -EINVAL;
}
if (mac_ctrl_drv->mac_pausefrm_cfg)
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
index 5c1ac9b..5b05d31 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c
@@ -1022,12 +1022,52 @@ static void hns_dsaf_tbl_tcam_init(struct dsaf_device *dsaf_dev)
* @mac_cb: mac contrl block
*/
static void hns_dsaf_pfc_en_cfg(struct dsaf_device *dsaf_dev,
- int mac_id, int en)
+ int mac_id, int tc_en)
{
- if (!en)
- dsaf_write_dev(dsaf_dev, DSAF_PFC_EN_0_REG + mac_id * 4, 0);
+ dsaf_write_dev(dsaf_dev, DSAF_PFC_EN_0_REG + mac_id * 4, tc_en);
+}
+
+static void hns_dsaf_set_pfc_pause(struct dsaf_device *dsaf_dev,
+ int mac_id, int tx_en, int rx_en)
+{
+ if (AE_IS_VER1(dsaf_dev->dsaf_ver)) {
+ if (!tx_en || !rx_en)
+ dev_err(dsaf_dev->dev, "dsaf v1 can not close pfc!\n");
+
+ return;
+ }
+
+ dsaf_set_dev_bit(dsaf_dev, DSAF_PAUSE_CFG_REG + mac_id * 4,
+ DSAF_PFC_PAUSE_RX_EN_B, !!rx_en);
+ dsaf_set_dev_bit(dsaf_dev, DSAF_PAUSE_CFG_REG + mac_id * 4,
+ DSAF_PFC_PAUSE_TX_EN_B, !!tx_en);
+}
+
+int hns_dsaf_set_rx_mac_pause_en(struct dsaf_device *dsaf_dev, int mac_id,
+ u32 en)
+{
+ if (AE_IS_VER1(dsaf_dev->dsaf_ver)) {
+ if (!en)
+ dev_err(dsaf_dev->dev, "dsafv1 can't close rx_pause!\n");
+
+ return -EINVAL;
+ }
+
+ dsaf_set_dev_bit(dsaf_dev, DSAF_PAUSE_CFG_REG + mac_id * 4,
+ DSAF_MAC_PAUSE_RX_EN_B, !!en);
+
+ return 0;
+}
+
+void hns_dsaf_get_rx_mac_pause_en(struct dsaf_device *dsaf_dev, int mac_id,
+ u32 *en)
+{
+ if (AE_IS_VER1(dsaf_dev->dsaf_ver))
+ *en = 1;
else
- dsaf_write_dev(dsaf_dev, DSAF_PFC_EN_0_REG + mac_id * 4, 0xff);
+ *en = dsaf_get_dev_bit(dsaf_dev,
+ DSAF_PAUSE_CFG_REG + mac_id * 4,
+ DSAF_MAC_PAUSE_RX_EN_B);
}
/**
@@ -1039,6 +1079,7 @@ static void hns_dsaf_comm_init(struct dsaf_device *dsaf_dev)
{
u32 i;
u32 o_dsaf_cfg;
+ bool is_ver1 = AE_IS_VER1(dsaf_dev->dsaf_ver);
o_dsaf_cfg = dsaf_read_dev(dsaf_dev, DSAF_CFG_0_REG);
dsaf_set_bit(o_dsaf_cfg, DSAF_CFG_EN_S, dsaf_dev->dsaf_en);
@@ -1064,8 +1105,10 @@ static void hns_dsaf_comm_init(struct dsaf_device *dsaf_dev)
hns_dsaf_sw_port_type_cfg(dsaf_dev, DSAF_SW_PORT_TYPE_NON_VLAN);
/*set dsaf pfc to 0 for parseing rx pause*/
- for (i = 0; i < DSAF_COMM_CHN; i++)
+ for (i = 0; i < DSAF_COMM_CHN; i++) {
hns_dsaf_pfc_en_cfg(dsaf_dev, i, 0);
+ hns_dsaf_set_pfc_pause(dsaf_dev, i, is_ver1, is_ver1);
+ }
/*msk and clr exception irqs */
for (i = 0; i < DSAF_COMM_CHN; i++) {
@@ -2013,6 +2056,8 @@ void hns_dsaf_update_stats(struct dsaf_device *dsaf_dev, u32 node_num)
{
struct dsaf_hw_stats *hw_stats
= &dsaf_dev->hw_stats[node_num];
+ bool is_ver1 = AE_IS_VER1(dsaf_dev->dsaf_ver);
+ u32 reg_tmp;
hw_stats->pad_drop += dsaf_read_dev(dsaf_dev,
DSAF_INODE_PAD_DISCARD_NUM_0_REG + 0x80 * (u64)node_num);
@@ -2022,8 +2067,12 @@ void hns_dsaf_update_stats(struct dsaf_device *dsaf_dev, u32 node_num)
DSAF_INODE_FINAL_IN_PKT_NUM_0_REG + 0x80 * (u64)node_num);
hw_stats->rx_pkt_id += dsaf_read_dev(dsaf_dev,
DSAF_INODE_SBM_PID_NUM_0_REG + 0x80 * (u64)node_num);
- hw_stats->rx_pause_frame += dsaf_read_dev(dsaf_dev,
- DSAF_INODE_FINAL_IN_PAUSE_NUM_0_REG + 0x80 * (u64)node_num);
+
+ reg_tmp = is_ver1 ? DSAF_INODE_FINAL_IN_PAUSE_NUM_0_REG :
+ DSAFV2_INODE_FINAL_IN_PAUSE_NUM_0_REG;
+ hw_stats->rx_pause_frame +=
+ dsaf_read_dev(dsaf_dev, reg_tmp + 0x80 * (u64)node_num);
+
hw_stats->release_buf_num += dsaf_read_dev(dsaf_dev,
DSAF_INODE_SBM_RELS_NUM_0_REG + 0x80 * (u64)node_num);
hw_stats->sbm_drop += dsaf_read_dev(dsaf_dev,
@@ -2056,6 +2105,8 @@ void hns_dsaf_get_regs(struct dsaf_device *ddev, u32 port, void *data)
u32 i = 0;
u32 j;
u32 *p = data;
+ u32 reg_tmp;
+ bool is_ver1 = AE_IS_VER1(ddev->dsaf_ver);
/* dsaf common registers */
p[0] = dsaf_read_dev(ddev, DSAF_SRAM_INIT_OVER_0_REG);
@@ -2120,8 +2171,9 @@ void hns_dsaf_get_regs(struct dsaf_device *ddev, u32 port, void *data)
DSAF_INODE_FINAL_IN_PKT_NUM_0_REG + j * 0x80);
p[190 + i] = dsaf_read_dev(ddev,
DSAF_INODE_SBM_PID_NUM_0_REG + j * 0x80);
- p[193 + i] = dsaf_read_dev(ddev,
- DSAF_INODE_FINAL_IN_PAUSE_NUM_0_REG + j * 0x80);
+ reg_tmp = is_ver1 ? DSAF_INODE_FINAL_IN_PAUSE_NUM_0_REG :
+ DSAFV2_INODE_FINAL_IN_PAUSE_NUM_0_REG;
+ p[193 + i] = dsaf_read_dev(ddev, reg_tmp + j * 0x80);
p[196 + i] = dsaf_read_dev(ddev,
DSAF_INODE_SBM_RELS_NUM_0_REG + j * 0x80);
p[199 + i] = dsaf_read_dev(ddev,
@@ -2368,8 +2420,11 @@ void hns_dsaf_get_regs(struct dsaf_device *ddev, u32 port, void *data)
p[496] = dsaf_read_dev(ddev, DSAF_NETPORT_CTRL_SIG_0_REG + port * 0x4);
p[497] = dsaf_read_dev(ddev, DSAF_XGE_CTRL_SIG_CFG_0_REG + port * 0x4);
+ if (!is_ver1)
+ p[498] = dsaf_read_dev(ddev, DSAF_PAUSE_CFG_REG + port * 0x4);
+
/* mark end of dsaf regs */
- for (i = 498; i < 504; i++)
+ for (i = 499; i < 504; i++)
p[i] = 0xdddddddd;
}
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h
index 5fea226..e8eedc5 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.h
@@ -417,6 +417,11 @@ void hns_dsaf_get_strings(int stringset, u8 *data, int port);
void hns_dsaf_get_regs(struct dsaf_device *ddev, u32 port, void *data);
int hns_dsaf_get_regs_count(void);
void hns_dsaf_set_promisc_mode(struct dsaf_device *dsaf_dev, u32 en);
+
+void hns_dsaf_get_rx_mac_pause_en(struct dsaf_device *dsaf_dev, int mac_id,
+ u32 *en);
+int hns_dsaf_set_rx_mac_pause_en(struct dsaf_device *dsaf_dev, int mac_id,
+ u32 en);
void hns_dsaf_set_inner_lb(struct dsaf_device *dsaf_dev, u32 mac_id, u32 en);
#endif /* __HNS_DSAF_MAIN_H__ */
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
index 5b7ae5f..ab27b3b 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_ppe.c
@@ -332,10 +332,12 @@ static void hns_ppe_init_hw(struct hns_ppe_cb *ppe_cb)
/* clr and msk except irq*/
hns_ppe_exc_irq_en(ppe_cb, 0);
- if (ppe_common_cb->ppe_mode == PPE_COMMON_MODE_DEBUG)
+ if (ppe_common_cb->ppe_mode == PPE_COMMON_MODE_DEBUG) {
hns_ppe_set_port_mode(ppe_cb, PPE_MODE_GE);
- else
+ dsaf_write_dev(ppe_cb, PPE_CFG_PAUSE_IDLE_CNT_REG, 0);
+ } else {
hns_ppe_set_port_mode(ppe_cb, PPE_MODE_XGE);
+ }
hns_ppe_checksum_hw(ppe_cb, 0xffffffff);
hns_ppe_cnt_clr_ce(ppe_cb);
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h
index 018fa7d..e021890 100644
--- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h
+++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_reg.h
@@ -135,6 +135,7 @@
#define DSAF_PPE_INT_STS_0_REG 0x1E0
#define DSAF_ROCEE_INT_STS_0_REG 0x200
#define DSAFV2_SERDES_LBK_0_REG 0x220
+#define DSAF_PAUSE_CFG_REG 0x240
#define DSAF_PPE_QID_CFG_0_REG 0x300
#define DSAF_SW_PORT_TYPE_0_REG 0x320
#define DSAF_STP_PORT_TYPE_0_REG 0x340
@@ -153,6 +154,7 @@
#define DSAF_INODE_FINAL_IN_PKT_NUM_0_REG 0x1030
#define DSAF_INODE_SBM_PID_NUM_0_REG 0x1038
#define DSAF_INODE_FINAL_IN_PAUSE_NUM_0_REG 0x103C
+#define DSAFV2_INODE_FINAL_IN_PAUSE_NUM_0_REG 0x1024
#define DSAF_INODE_SBM_RELS_NUM_0_REG 0x104C
#define DSAF_INODE_SBM_DROP_NUM_0_REG 0x1050
#define DSAF_INODE_CRC_FALSE_NUM_0_REG 0x1054
@@ -709,6 +711,10 @@
#define DSAF_PFC_UNINT_CNT_M ((1ULL << 9) - 1)
#define DSAF_PFC_UNINT_CNT_S 0
+#define DSAF_MAC_PAUSE_RX_EN_B 2
+#define DSAF_PFC_PAUSE_RX_EN_B 1
+#define DSAF_PFC_PAUSE_TX_EN_B 0
+
#define DSAF_PPE_QID_CFG_M 0xFF
#define DSAF_PPE_QID_CFG_S 0
--
1.9.1
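The single-bit enable/disable pattern used on DSAF_PAUSE_CFG_REG in the patch above can be sketched as below. The bit positions are copied from the hns_dsaf_reg.h hunk; the helpers sketch_set_bit and sketch_get_bit are hypothetical and operate on a plain variable rather than on device registers, so this is not the driver's dsaf_set_dev_bit()/dsaf_get_dev_bit() implementation.

```c
#include <stdint.h>

/* Bit positions copied from the patch's hns_dsaf_reg.h hunk. */
#define DSAF_MAC_PAUSE_RX_EN_B 2
#define DSAF_PFC_PAUSE_RX_EN_B 1
#define DSAF_PFC_PAUSE_TX_EN_B 0

/* Hypothetical helper: set or clear one bit in a register image. */
static void sketch_set_bit(uint32_t *reg, unsigned int bit, int val)
{
	if (val)
		*reg |= UINT32_C(1) << bit;
	else
		*reg &= ~(UINT32_C(1) << bit);
}

/* Hypothetical helper: read one bit back as 0 or 1. */
static uint32_t sketch_get_bit(uint32_t reg, unsigned int bit)
{
	return (reg >> bit) & 1u;
}
```

Each pause control is a separate bit, so enabling MAC rx pause leaves the two PFC pause bits untouched, which is why the patch can update them independently.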
* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
From: Daniel Borkmann @ 2016-03-31 12:16 UTC
To: Eric Dumazet
Cc: Alexei Starovoitov, Michal Kubecek, davem, sasha.levin, jslaby,
mst, netdev
In-Reply-To: <1459425558.6473.229.camel@edumazet-glaptop3.roam.corp.google.com>
On 03/31/2016 01:59 PM, Eric Dumazet wrote:
> On Thu, 2016-03-31 at 13:35 +0200, Daniel Borkmann wrote:
>
>> +static inline bool sock_owned_externally(const struct sock *sk)
>> +{
>> + return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
>> +}
>> +
>
> Have you reinvented sock_flag(sk, SOCK_EXTERNAL_OWNER) ? ;)
>
> Anyway, using a flag for this purpose sounds overkill to me.
Right.
> Setting it is a way to 'fool' lockdep anyway...
Yep, correct, we'd be fooling the tun case, so this diff doesn't
really make it any better there.
Thanks,
Daniel
* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
From: Hannes Frederic Sowa @ 2016-03-31 12:12 UTC
To: Alexei Starovoitov, Michal Kubecek
Cc: Daniel Borkmann, davem, sasha.levin, jslaby, eric.dumazet, mst,
netdev
In-Reply-To: <20160331054301.GA57227@ast-mbp.thefacebook.com>
On 31.03.2016 07:43, Alexei Starovoitov wrote:
> On Thu, Mar 31, 2016 at 07:22:32AM +0200, Michal Kubecek wrote:
>> On Wed, Mar 30, 2016 at 10:08:10PM -0700, Alexei Starovoitov wrote:
>>> On Thu, Mar 31, 2016 at 07:01:15AM +0200, Michal Kubecek wrote:
>>>> On Wed, Mar 30, 2016 at 06:18:42PM -0700, Alexei Starovoitov wrote:
>>>>>
>>>>> kinda heavy patch to shut up lockdep.
>>>>> Can we do
>>>>> old_fp = rcu_dereference_protected(sk->sk_filter,
>>>>> sock_owned_by_user(sk) || lockdep_rtnl_is_held());
>>>>> and it always be correct?
>>>>> I think right now tun is the only such user, but if it's correct
>>>>> for tun, it's correct for future users too. If not correct then
>>>>> not correct for tun either.
>>>>> Or I'm missing something?
>>>>
>>>> Already discussed here:
>>>>
>>>> http://thread.gmane.org/gmane.linux.kernel/2158069/focus=405853
>>>
>>> I saw that. My point above was challenging 'less accurate' part.
>>>
>> Daniel's point was that lockdep_rtnl_is_held() does not mean "we hold
>> RTNL" but "someone holds RTNL" so that some other task holding RTNL at
>> the moment could make the check happy even when called by someone
>> supposed to own the socket.
>
> Of course... and that is the case for all rtnl_dereference() calls...
> yet we're not paranoid about it.
lockdep_rtnl_is_held() actually checks, via *current, whether the
currently running code holds the lock, no?
Bye,
Hannes
* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
From: Eric Dumazet @ 2016-03-31 11:59 UTC
To: Daniel Borkmann
Cc: Alexei Starovoitov, Michal Kubecek, davem, sasha.levin, jslaby,
mst, netdev
In-Reply-To: <56FD0B79.5020007@iogearbox.net>
On Thu, 2016-03-31 at 13:35 +0200, Daniel Borkmann wrote:
> +static inline bool sock_owned_externally(const struct sock *sk)
> +{
> + return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
> +}
> +
Have you reinvented sock_flag(sk, SOCK_EXTERNAL_OWNER) ? ;)
Anyway, using a flag for this purpose sounds overkill to me.
Setting it is a way to 'fool' lockdep anyway...
* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
From: Daniel Borkmann @ 2016-03-31 11:35 UTC
To: Alexei Starovoitov, Michal Kubecek
Cc: davem, sasha.levin, jslaby, eric.dumazet, mst, netdev
In-Reply-To: <20160331054301.GA57227@ast-mbp.thefacebook.com>
On 03/31/2016 07:43 AM, Alexei Starovoitov wrote:
> On Thu, Mar 31, 2016 at 07:22:32AM +0200, Michal Kubecek wrote:
>> On Wed, Mar 30, 2016 at 10:08:10PM -0700, Alexei Starovoitov wrote:
>>> On Thu, Mar 31, 2016 at 07:01:15AM +0200, Michal Kubecek wrote:
>>>> On Wed, Mar 30, 2016 at 06:18:42PM -0700, Alexei Starovoitov wrote:
>>>>>
>>>>> kinda heavy patch to shut up lockdep.
>>>>> Can we do
>>>>> old_fp = rcu_dereference_protected(sk->sk_filter,
>>>>> sock_owned_by_user(sk) || lockdep_rtnl_is_held());
>>>>> and it always be correct?
>>>>> I think right now tun is the only such user, but if it's correct
>>>>> for tun, it's correct for future users too. If not correct then
>>>>> not correct for tun either.
>>>>> Or I'm missing something?
>>>>
>>>> Already discussed here:
>>>>
>>>> http://thread.gmane.org/gmane.linux.kernel/2158069/focus=405853
>>>
>>> I saw that. My point above was challenging 'less accurate' part.
>>>
>> Daniel's point was that lockdep_rtnl_is_held() does not mean "we hold
>> RTNL" but "someone holds RTNL" so that some other task holding RTNL at
>> the moment could make the check happy even when called by someone
>> supposed to own the socket.
>
> Of course... and that is the case for all rtnl_dereference() calls...
> yet we're not paranoid about it.
Sure, but the rtnl case is a bit different, no? In the sense that there's
only one global mutex. So, imho, I don't think it's appropriate to relax
the current rcu_dereference_protected() check for the socket case _just_
in order to silence the tun case warning, if we _can_ actually do better
than this w/o much effort.
I thought about some alternatives if we really don't want to change the
API code like this: We could change the rcu_dereference_protected() just
into a rcu_dereference(), but with the trade-off of not having lockdep
which is probably not really what we want. We could hack the tun case to
create some 'fake' ownership by setting sk->sk_lock.owned, but this seems
very hacky imho, and messing around with sk_lock details that shouldn't
be messed with. Or, as mentioned in the other thread, we could add a flag
like below to mark the socket that it doesn't need to be locked in the
expected way. That diff works as well, is smaller, and the flag could
perhaps be reused in other cases, too. Downside is that we burn a socket
flag, but as it's not uapi, it's not set in stone and can still be changed
should we get into a shortage of bits in future. Have no strong opinion
whether this seems better or not.
Thanks,
Daniel
drivers/net/tun.c | 1 +
include/net/sock.h | 8 ++++++++
net/core/filter.c | 7 ++++---
3 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index afdf950..8dc7d3e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2252,6 +2252,7 @@ static int tun_chr_open(struct inode *inode, struct file * file)
INIT_LIST_HEAD(&tfile->next);
sock_set_flag(&tfile->sk, SOCK_ZEROCOPY);
+ sock_set_flag(&tfile->sk, SOCK_EXTERNAL_OWNER);
return 0;
}
diff --git a/include/net/sock.h b/include/net/sock.h
index 255d3e0..8d90673 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -720,6 +720,9 @@ enum sock_flags {
*/
SOCK_FILTER_LOCKED, /* Filter cannot be changed anymore */
SOCK_SELECT_ERR_QUEUE, /* Wake select on error queue */
+ SOCK_EXTERNAL_OWNER, /* External locking (e.g. RTNL) is used instead
+ * of sk_lock for control path.
+ */
};
#define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
@@ -1330,6 +1333,11 @@ static inline void sock_release_ownership(struct sock *sk)
sk->sk_lock.owned = 0;
}
+static inline bool sock_owned_externally(const struct sock *sk)
+{
+ return sk->sk_flags & (1UL << SOCK_EXTERNAL_OWNER);
+}
+
/*
* Macro so as to not evaluate some arguments when
* lockdep is not enabled.
diff --git a/net/core/filter.c b/net/core/filter.c
index 4b81b71..828274e 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1166,9 +1166,9 @@ static int __sk_attach_prog(struct bpf_prog *prog, struct sock *sk)
}
old_fp = rcu_dereference_protected(sk->sk_filter,
- sock_owned_by_user(sk));
+ sock_owned_by_user(sk) ||
+ sock_owned_externally(sk));
rcu_assign_pointer(sk->sk_filter, fp);
-
if (old_fp)
sk_filter_uncharge(sk, old_fp);
@@ -2259,7 +2259,8 @@ int sk_detach_filter(struct sock *sk)
return -EPERM;
filter = rcu_dereference_protected(sk->sk_filter,
- sock_owned_by_user(sk));
+ sock_owned_by_user(sk) ||
+ sock_owned_externally(sk));
if (filter) {
RCU_INIT_POINTER(sk->sk_filter, NULL);
sk_filter_uncharge(sk, filter);
--
1.9.3
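The condition that the diff above passes to rcu_dereference_protected() can be modeled in userspace roughly as follows. All names here (sketch_sock, sketch_flag, and so on) are hypothetical; the struct only models sk->sk_flags and sk->sk_lock.owned and is not the kernel's struct sock.

```c
#include <stdbool.h>

/* Hypothetical flag numbering; only the relative idea matters. */
enum sketch_sock_flags {
	SKETCH_SOCK_ZEROCOPY,
	SKETCH_SOCK_EXTERNAL_OWNER,
};

/* Hypothetical model of the two pieces of socket state involved. */
struct sketch_sock {
	unsigned long flags;	/* models sk->sk_flags */
	int owned;		/* models sk->sk_lock.owned */
};

static void sketch_set_flag(struct sketch_sock *sk, enum sketch_sock_flags f)
{
	sk->flags |= 1UL << f;
}

/* Generic flag test, in the spirit of the existing sock_flag() helper. */
static bool sketch_flag(const struct sketch_sock *sk, enum sketch_sock_flags f)
{
	return (sk->flags >> f) & 1UL;
}

/* The relaxed condition: owned by user OR marked as externally owned. */
static bool sketch_filter_update_allowed(const struct sketch_sock *sk)
{
	return sk->owned || sketch_flag(sk, SKETCH_SOCK_EXTERNAL_OWNER);
}
```

In this model the flag is set once and never cleared, so the check passes whether or not any lock is actually held at the time of the call. That is the sense in which the replies in this thread describe the flag as fooling lockdep.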
* RE: [PATCH] sctp: avoid refreshing heartbeat timer too often
From: David Laight @ 2016-03-31 11:16 UTC
To: 'Marcelo Ricardo Leitner', netdev@vger.kernel.org
Cc: Neil Horman, Vlad Yasevich, linux-sctp@vger.kernel.org
In-Reply-To: <56FBC2DE.3000207@gmail.com>
From: Marcelo Ricardo Leitner
> Sent: 30 March 2016 13:13
> Em 30-03-2016 06:37, David Laight escreveu:
> > From: Marcelo Ricardo Leitner
> >> Sent: 29 March 2016 14:42
> >>
> >> Currently on high rate SCTP streams the heartbeat timer refresh can
> >> consume quite a lot of resources as timer updates are costly and it
> >> contains a random factor, which a) is also costly and b) invalidates
> >> mod_timer() optimization for not editing a timer to the same value.
> >> It may even cause the timer to be slightly advanced, for no good reason.
> >
> > Interesting thoughts:
> > 1) Is it necessary to use a different 'random factor' until the timer actually
> > expires?
>
> I don't understand you fully here, but we have to have a random factor
> on timer expire. As noted by Daniel Borkmann on his commit 8f61059a96c2
> ("net: sctp: improve timer slack calculation for transport HBs"):
When a HEARTBEAT chunk is sent, determine the new interval and use that
interval until the timer actually expires, at which point a new interval
is calculated. So the random number is only generated once per heartbeat.
> RFC4960, section 8.3 says:
>
> On an idle destination address that is allowed to heartbeat,
> it is recommended that a HEARTBEAT chunk is sent once per RTO
> of that destination address plus the protocol parameter
> 'HB.interval', with jittering of +/- 50% of the RTO value,
> and exponential backoff of the RTO if the previous HEARTBEAT
> is unanswered.
>
> Previous to his commit, it was using a random factor based on jiffies.
>
> This patch then assumes that random_A+2 is just as random as random_B as
> long as it is within the allowed range, avoiding the unnecessary updates.
>
> > 2) It might be better to allow the heartbeat timer to expire, on expiry work
> > out the new interval based on when the last 'refresh' was done.
>
> Cool, I thought about this too. It would introduce some extra complexity
> that is not really worth it I think, especially because now we may be doing
> more timer updates even with this patch but it's not triggering any wake
> ups and we would need at least 2 wake ups then: one for the first
> timeout event, and then re-schedule the timer for the next updated one,
> and maybe again, and again.. less timer updates but more wake ups, one
> at every heartbeat interval even on a busy transport. Seems it's cheaper
> to just update the timer then.
One wakeup per heartbeat interval on a busy connection is probably noise.
Probably much less than the 1000s of timer updates that would otherwise happen.
A further optimisation would be to restart the timer if more than (say) 80%
of the way through the timeout period.
Similarly the HEARTBEAT could be sent if the 2nd wakeup would be almost immediate.
David
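The two ideas discussed above (pick the jittered interval once per heartbeat, and restart the timer only late in the period) can be sketched as below. This is a standalone illustration, not the SCTP code: the function names are hypothetical, times are assumed to be in milliseconds, the random value is supplied by the caller, and the 80% threshold is the example figure from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Interval following the RFC 4960 text quoted above: HB.interval plus RTO,
 * jittered by +/- 50% of RTO. Called once per heartbeat, so only one
 * random number is consumed per interval.
 */
static uint32_t sketch_hb_interval(uint32_t hb_interval, uint32_t rto,
				   uint32_t rnd)
{
	/* jitter lies in [0, rto]; subtracting rto / 2 recentres it */
	uint32_t jitter = (uint32_t)((uint64_t)rnd % ((uint64_t)rto + 1));

	return hb_interval + rto - rto / 2 + jitter;
}

/* Restart only when more than 80% of the period has already elapsed. */
static bool sketch_should_restart(uint32_t elapsed, uint32_t period)
{
	return (uint64_t)elapsed * 5 > (uint64_t)period * 4;
}
```

With hb_interval = 30000 and rto = 100, the result stays within 30050..30150 regardless of the random input, and a refresh arriving 10 ms into a 100 ms period would leave the timer alone.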
* Re: [PATCH] net: fec: stop the "rcv is not +last, " error messages
From: Fabio Estevam @ 2016-03-31 10:58 UTC
To: Greg Ungerer; +Cc: Troy Kisky, netdev@vger.kernel.org
In-Reply-To: <56FC7A96.9070002@uclinux.org>
Hi Greg,
On Wed, Mar 30, 2016 at 10:17 PM, Greg Ungerer <gerg@uclinux.org> wrote:
> Yes, that fixes it. Will you carry this change?
Thanks for confirming.
Yes, I will submit it later today.
* Re: [PATCH] net: fec: stop the "rcv is not +last, " error messages
From: Fabio Estevam @ 2016-03-31 10:56 UTC
To: Fugang Duan; +Cc: Greg Ungerer, Troy Kisky, netdev@vger.kernel.org
In-Reply-To: <VI1PR0401MB1855458E0B31502D1C112AF6FF990@VI1PR0401MB1855.eurprd04.prod.outlook.com>
Hi Andy,
On Wed, Mar 30, 2016 at 10:41 PM, Fugang Duan <fugang.duan@nxp.com> wrote:
>
> Fabio, we cannot do it like this, as it may cause confusion about the quirk flag "FEC_QUIRK_HAS_RACC".
We can treat FEC_QUIRK_HAS_RACC flag as "this is a non-Coldfire SoC".
>
>
> Hi, Greg,
>
> The header file fec.h defines FEC_FTRL as below. If the ColdFire SoC does not have this register, we may remove the define here and define the register according to SoC type. For example, for a ColdFire SoC, define it as 0xFFF. Is that feasible?
>
This is even worse IMHO. We should not write to a 'fake' register
offset of 0xFFF.
* Re: [PATCH net-next 1/6] net: skbuff: don't use union for napi_id and sender_cpu
From: Eric Dumazet @ 2016-03-31 10:32 UTC
To: Jason Wang; +Cc: davem, mst, netdev, linux-kernel
In-Reply-To: <1459403439-6011-2-git-send-email-jasowang@redhat.com>
On Thu, 2016-03-31 at 13:50 +0800, Jason Wang wrote:
> We use a union for napi_id and sender_cpu. This is ok in most cases,
> except when we want to support busy polling for tun, which needs
> napi_id to be stored and passed to the socket during tun_net_xmit(). In
> this case, napi_id is overwritten with sender_cpu before tun_net_xmit()
> is called if XPS is enabled. Fix this by not using a union for napi_id
> and sender_cpu.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> include/linux/skbuff.h | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 15d0df9..8aee891 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -743,11 +743,11 @@ struct sk_buff {
> __u32 hash;
> __be16 vlan_proto;
> __u16 vlan_tci;
> -#if defined(CONFIG_NET_RX_BUSY_POLL) || defined(CONFIG_XPS)
> - union {
> - unsigned int napi_id;
> - unsigned int sender_cpu;
> - };
> +#if defined(CONFIG_NET_RX_BUSY_POLL)
> + unsigned int napi_id;
> +#endif
> +#if defined(CONFIG_XPS)
> + unsigned int sender_cpu;
> #endif
> union {
> #ifdef CONFIG_NETWORK_SECMARK
Hmmm...
This is a serious problem.
Making skb bigger (8 bytes because of alignment) was not considered
valid for sender_cpu introduction. We worked quite hard to avoid this,
if you take a look at git history :(
Can you describe more precisely the problem and code path ?
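The size concern above can be illustrated with a toy layout. The structs below are hypothetical and much smaller than struct sk_buff; they only assume a 4-byte unsigned int and an 8-byte-aligned field following the two IDs, which is enough to show how splitting the union can cost more than the 4 bytes of the extra field.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy layout with the two IDs sharing storage, as before the patch. */
struct sketch_with_union {
	uint32_t hash;
	union {
		unsigned int napi_id;
		unsigned int sender_cpu;
	};
	_Alignas(8) uint64_t next_field;	/* stands in for later members */
};

/* Toy layout with the IDs split, as proposed in the patch. */
struct sketch_split {
	uint32_t hash;
	unsigned int napi_id;
	unsigned int sender_cpu;
	/* 4 bytes of padding are inserted here to reach 8-byte alignment */
	_Alignas(8) uint64_t next_field;
};
```

Under these assumptions the first struct occupies 16 bytes and the second 24: the extra 4-byte field pushes the aligned member to the next 8-byte boundary. The actual effect on sk_buff depends on its real neighbouring fields and config options.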
* [PATCH] netlink: use nla_get_in_addr and nla_put_in_addr for ipv4 address
From: Haishuang Yan @ 2016-03-31 10:21 UTC
To: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy
Cc: netdev, linux-kernel, Haishuang Yan
nla_get_in_addr and nla_put_in_addr are already implemented, so use
them for IPv4 addresses where appropriate.
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
---
net/ipv4/ip_tunnel_core.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 02dd990..47ea85d 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -247,10 +247,10 @@ static int ip_tun_build_state(struct net_device *dev, struct nlattr *attr,
tun_info->key.tun_id = nla_get_be64(tb[LWTUNNEL_IP_ID]);
if (tb[LWTUNNEL_IP_DST])
- tun_info->key.u.ipv4.dst = nla_get_be32(tb[LWTUNNEL_IP_DST]);
+ tun_info->key.u.ipv4.dst = nla_get_in_addr(tb[LWTUNNEL_IP_DST]);
if (tb[LWTUNNEL_IP_SRC])
- tun_info->key.u.ipv4.src = nla_get_be32(tb[LWTUNNEL_IP_SRC]);
+ tun_info->key.u.ipv4.src = nla_get_in_addr(tb[LWTUNNEL_IP_SRC]);
if (tb[LWTUNNEL_IP_TTL])
tun_info->key.ttl = nla_get_u8(tb[LWTUNNEL_IP_TTL]);
@@ -275,8 +275,8 @@ static int ip_tun_fill_encap_info(struct sk_buff *skb,
struct ip_tunnel_info *tun_info = lwt_tun_info(lwtstate);
if (nla_put_be64(skb, LWTUNNEL_IP_ID, tun_info->key.tun_id) ||
- nla_put_be32(skb, LWTUNNEL_IP_DST, tun_info->key.u.ipv4.dst) ||
- nla_put_be32(skb, LWTUNNEL_IP_SRC, tun_info->key.u.ipv4.src) ||
+ nla_put_in_addr(skb, LWTUNNEL_IP_DST, tun_info->key.u.ipv4.dst) ||
+ nla_put_in_addr(skb, LWTUNNEL_IP_SRC, tun_info->key.u.ipv4.src) ||
nla_put_u8(skb, LWTUNNEL_IP_TOS, tun_info->key.tos) ||
nla_put_u8(skb, LWTUNNEL_IP_TTL, tun_info->key.ttl) ||
nla_put_be16(skb, LWTUNNEL_IP_FLAGS, tun_info->key.tun_flags))
--
1.8.3.1
* Re: am335x: no multicast reception over VLAN
From: Yegor Yefremov @ 2016-03-31 10:16 UTC
To: Mugunthan V N
Cc: Peter Korsgaard, Grygorii Strashko, netdev,
linux-omap@vger.kernel.org, drivshin, ml
In-Reply-To: <56FCF5BB.2030000@ti.com>
On Thu, Mar 31, 2016 at 12:02 PM, Mugunthan V N <mugunthanvnm@ti.com> wrote:
> On Thursday 31 March 2016 01:22 PM, Yegor Yefremov wrote:
>> On Thu, Mar 31, 2016 at 8:37 AM, Mugunthan V N <mugunthanvnm@ti.com> wrote:
>>> On Thursday 31 March 2016 01:17 AM, Peter Korsgaard wrote:
>>>>>>>>> "Mugunthan" == Mugunthan V N <mugunthanvnm@ti.com> writes:
>>>>
>>>> Hi,
>>>>
>>>> > You had received these packets as tcpdump will enable promiscuous mode
>>>> > so that you receive all the packets from the wire.
>>>>
>>>> FYI, you can use the -p option to tcpdump to not put the interface into
>>>> promiscuous mode.
>>>>
>>>
>>> Thanks for the information Peter Korsgaard.
>>>
>>> Yegor, can you provide tcpdump using -p as well in Grygorii commands.
>>
>> Before VLAN configuration:
>>
>> # switch-config -d
>> cpsw hw version 1.12 (0)
>> 0 : type: vlan , vid = 1, untag_force = 0x3, reg_mcast = 0x3,
>> unreg_mcast = 0x0, member_list = 0x3
>> 1 : type: mcast, vid = 1, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
>> no super, port_mask = 0x3
>> 2 : type: ucast, vid = 1, addr = 74:6a:8f:00:16:12, ucast_type =
>> persistant, port_num = 0x0, Secure
>> 3 : type: vlan , vid = 0, untag_force = 0x7, reg_mcast = 0x0,
>> unreg_mcast = 0x0, member_list = 0x7
>> 4 : type: mcast, vid = 1, addr = 01:00:5e:00:00:01, mcast_state = f,
>> no super, port_mask = 0x3
>> 5 : type: vlan , vid = 2, untag_force = 0x5, reg_mcast = 0x5,
>> unreg_mcast = 0x0, member_list = 0x5
>> 6 : type: mcast, vid = 2, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
>> no super, port_mask = 0x5
>> 7 : type: ucast, vid = 2, addr = 74:6a:8f:00:16:13, ucast_type =
>> persistant, port_num = 0x0, Secure
>> 8 : type: mcast, vid = 2, addr = 01:00:5e:00:00:01, mcast_state = f,
>> no super, port_mask = 0x5
>>
>> After VLAN configuration:
>>
>> # switch-config -d
>> cpsw hw version 1.12 (0)
>> 0 : type: vlan , vid = 1, untag_force = 0x3, reg_mcast = 0x3,
>> unreg_mcast = 0x0, member_list = 0x3
>> 1 : type: mcast, vid = 1, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
>> no super, port_mask = 0x3
>> 2 : type: ucast, vid = 1, addr = 74:6a:8f:00:16:12, ucast_type =
>> persistant, port_num = 0x0, Secure
>> 3 : type: vlan , vid = 0, untag_force = 0x7, reg_mcast = 0x0,
>> unreg_mcast = 0x0, member_list = 0x7
>> 4 : type: mcast, vid = 1, addr = 01:00:5e:00:00:01, mcast_state = f,
>> no super, port_mask = 0x3
>> 5 : type: vlan , vid = 2, untag_force = 0x5, reg_mcast = 0x5,
>> unreg_mcast = 0x0, member_list = 0x5
>> 6 : type: mcast, vid = 2, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
>> no super, port_mask = 0x5
>> 7 : type: ucast, vid = 2, addr = 74:6a:8f:00:16:13, ucast_type =
>> persistant, port_num = 0x0, Secure
>> 8 : type: mcast, vid = 2, addr = 01:00:5e:00:00:01, mcast_state = f,
>> no super, port_mask = 0x5
>> 9 : type: vlan , vid = 100, untag_force = 0x0, reg_mcast = 0x5,
>> unreg_mcast = 0x0, member_list = 0x5
>> 10 : type: ucast, vid = 100, addr = 74:6a:8f:00:16:13, ucast_type =
>> persistant, port_num = 0x0
>> 11 : type: mcast, vid = 100, addr = ff:ff:ff:ff:ff:ff, mcast_state =
>> f, no super, port_mask = 0x5
>> 12 : type: mcast, vid = 2, addr = 01:80:c2:00:00:21, mcast_state = f,
>> no super, port_mask = 0x5
>>
>> During multicast receive:
>>
>> # switch-config -d
>> cpsw hw version 1.12 (0)
>> 0 : type: vlan , vid = 1, untag_force = 0x3, reg_mcast = 0x3,
>> unreg_mcast = 0x0, member_list = 0x3
>> 1 : type: mcast, vid = 1, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
>> no super, port_mask = 0x3
>> 2 : type: ucast, vid = 1, addr = 74:6a:8f:00:16:12, ucast_type =
>> persistant, port_num = 0x0, Secure
>> 3 : type: vlan , vid = 0, untag_force = 0x7, reg_mcast = 0x0,
>> unreg_mcast = 0x0, member_list = 0x7
>> 4 : type: mcast, vid = 1, addr = 01:00:5e:00:00:01, mcast_state = f,
>> no super, port_mask = 0x3
>> 5 : type: vlan , vid = 2, untag_force = 0x5, reg_mcast = 0x5,
>> unreg_mcast = 0x0, member_list = 0x5
>> 6 : type: mcast, vid = 2, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
>> no super, port_mask = 0x5
>> 7 : type: ucast, vid = 2, addr = 74:6a:8f:00:16:13, ucast_type =
>> persistant, port_num = 0x0, Secure
>> 8 : type: mcast, vid = 2, addr = 01:00:5e:00:00:01, mcast_state = f,
>> no super, port_mask = 0x5
>> 9 : type: vlan , vid = 100, untag_force = 0x0, reg_mcast = 0x5,
>> unreg_mcast = 0x0, member_list = 0x5
>> 10 : type: ucast, vid = 100, addr = 74:6a:8f:00:16:13, ucast_type =
>> persistant, port_num = 0x0
>> 11 : type: mcast, vid = 100, addr = ff:ff:ff:ff:ff:ff, mcast_state =
>> f, no super, port_mask = 0x5
>> 12 : type: mcast, vid = 2, addr = 01:80:c2:00:00:21, mcast_state = f,
>> no super, port_mask = 0x5
>> 13 : type: mcast, vid = 2, addr = 01:00:5e:03:1d:47, mcast_state = f,
>> no super, port_mask = 0x5
>> 14 : type: ucast, vid = 100, addr = 66:22:04:bc:90:26, ucast_type =
>> untouched , port_num = 0x2
>
> I can see that multicast address 01:00:5e:03:1d:47 was added to the ALE
> table at index 13, but it was added for VLAN id 2, i.e. eth1. Did you run
> the test for eth1 or eth1.100?
My routing table looks as follows:
# route add -net 224.0.0.0 netmask 224.0.0.0 eth1.100
# ip route show
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.233
192.168.100.0/24 dev eth1.100 proto kernel scope link src 192.168.100.2
192.168.254.0/24 dev eth0 proto kernel scope link src 192.168.254.254
224.0.0.0/3 dev eth1.100 scope link
so multicast packets should go via eth1.100
Yegor
^ permalink raw reply
* Re: am335x: no multicast reception over VLAN
From: Mugunthan V N @ 2016-03-31 10:02 UTC (permalink / raw)
To: Yegor Yefremov
Cc: Peter Korsgaard, Grygorii Strashko, netdev,
linux-omap@vger.kernel.org, drivshin, ml
In-Reply-To: <CAGm1_kutVn+jNf58QwKyO8j5NmRhaOCUsXmpTiLbX75AczCm6w@mail.gmail.com>
On Thursday 31 March 2016 01:22 PM, Yegor Yefremov wrote:
> On Thu, Mar 31, 2016 at 8:37 AM, Mugunthan V N <mugunthanvnm@ti.com> wrote:
>> On Thursday 31 March 2016 01:17 AM, Peter Korsgaard wrote:
>>>>>>>> "Mugunthan" == Mugunthan V N <mugunthanvnm@ti.com> writes:
>>>
>>> Hi,
>>>
>>> > You had received these packets as tcpdump will enable promiscuous mode
>>> > so that you receive all the packets from the wire.
>>>
>>> FYI, you can use the -p option to tcpdump to not put the interface into
>>> promiscuous mode.
>>>
>>
>> Thanks for the information Peter Korsgaard.
>>
>> Yegor, can you provide tcpdump using -p as well in Grygorii commands.
>
> Before VLAN configuration:
>
> # switch-config -d
> cpsw hw version 1.12 (0)
> 0 : type: vlan , vid = 1, untag_force = 0x3, reg_mcast = 0x3,
> unreg_mcast = 0x0, member_list = 0x3
> 1 : type: mcast, vid = 1, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
> no super, port_mask = 0x3
> 2 : type: ucast, vid = 1, addr = 74:6a:8f:00:16:12, ucast_type =
> persistant, port_num = 0x0, Secure
> 3 : type: vlan , vid = 0, untag_force = 0x7, reg_mcast = 0x0,
> unreg_mcast = 0x0, member_list = 0x7
> 4 : type: mcast, vid = 1, addr = 01:00:5e:00:00:01, mcast_state = f,
> no super, port_mask = 0x3
> 5 : type: vlan , vid = 2, untag_force = 0x5, reg_mcast = 0x5,
> unreg_mcast = 0x0, member_list = 0x5
> 6 : type: mcast, vid = 2, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
> no super, port_mask = 0x5
> 7 : type: ucast, vid = 2, addr = 74:6a:8f:00:16:13, ucast_type =
> persistant, port_num = 0x0, Secure
> 8 : type: mcast, vid = 2, addr = 01:00:5e:00:00:01, mcast_state = f,
> no super, port_mask = 0x5
>
> After VLAN configuration:
>
> # switch-config -d
> cpsw hw version 1.12 (0)
> 0 : type: vlan , vid = 1, untag_force = 0x3, reg_mcast = 0x3,
> unreg_mcast = 0x0, member_list = 0x3
> 1 : type: mcast, vid = 1, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
> no super, port_mask = 0x3
> 2 : type: ucast, vid = 1, addr = 74:6a:8f:00:16:12, ucast_type =
> persistant, port_num = 0x0, Secure
> 3 : type: vlan , vid = 0, untag_force = 0x7, reg_mcast = 0x0,
> unreg_mcast = 0x0, member_list = 0x7
> 4 : type: mcast, vid = 1, addr = 01:00:5e:00:00:01, mcast_state = f,
> no super, port_mask = 0x3
> 5 : type: vlan , vid = 2, untag_force = 0x5, reg_mcast = 0x5,
> unreg_mcast = 0x0, member_list = 0x5
> 6 : type: mcast, vid = 2, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
> no super, port_mask = 0x5
> 7 : type: ucast, vid = 2, addr = 74:6a:8f:00:16:13, ucast_type =
> persistant, port_num = 0x0, Secure
> 8 : type: mcast, vid = 2, addr = 01:00:5e:00:00:01, mcast_state = f,
> no super, port_mask = 0x5
> 9 : type: vlan , vid = 100, untag_force = 0x0, reg_mcast = 0x5,
> unreg_mcast = 0x0, member_list = 0x5
> 10 : type: ucast, vid = 100, addr = 74:6a:8f:00:16:13, ucast_type =
> persistant, port_num = 0x0
> 11 : type: mcast, vid = 100, addr = ff:ff:ff:ff:ff:ff, mcast_state =
> f, no super, port_mask = 0x5
> 12 : type: mcast, vid = 2, addr = 01:80:c2:00:00:21, mcast_state = f,
> no super, port_mask = 0x5
>
> During multicast receive:
>
> # switch-config -d
> cpsw hw version 1.12 (0)
> 0 : type: vlan , vid = 1, untag_force = 0x3, reg_mcast = 0x3,
> unreg_mcast = 0x0, member_list = 0x3
> 1 : type: mcast, vid = 1, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
> no super, port_mask = 0x3
> 2 : type: ucast, vid = 1, addr = 74:6a:8f:00:16:12, ucast_type =
> persistant, port_num = 0x0, Secure
> 3 : type: vlan , vid = 0, untag_force = 0x7, reg_mcast = 0x0,
> unreg_mcast = 0x0, member_list = 0x7
> 4 : type: mcast, vid = 1, addr = 01:00:5e:00:00:01, mcast_state = f,
> no super, port_mask = 0x3
> 5 : type: vlan , vid = 2, untag_force = 0x5, reg_mcast = 0x5,
> unreg_mcast = 0x0, member_list = 0x5
> 6 : type: mcast, vid = 2, addr = ff:ff:ff:ff:ff:ff, mcast_state = f,
> no super, port_mask = 0x5
> 7 : type: ucast, vid = 2, addr = 74:6a:8f:00:16:13, ucast_type =
> persistant, port_num = 0x0, Secure
> 8 : type: mcast, vid = 2, addr = 01:00:5e:00:00:01, mcast_state = f,
> no super, port_mask = 0x5
> 9 : type: vlan , vid = 100, untag_force = 0x0, reg_mcast = 0x5,
> unreg_mcast = 0x0, member_list = 0x5
> 10 : type: ucast, vid = 100, addr = 74:6a:8f:00:16:13, ucast_type =
> persistant, port_num = 0x0
> 11 : type: mcast, vid = 100, addr = ff:ff:ff:ff:ff:ff, mcast_state =
> f, no super, port_mask = 0x5
> 12 : type: mcast, vid = 2, addr = 01:80:c2:00:00:21, mcast_state = f,
> no super, port_mask = 0x5
> 13 : type: mcast, vid = 2, addr = 01:00:5e:03:1d:47, mcast_state = f,
> no super, port_mask = 0x5
> 14 : type: ucast, vid = 100, addr = 66:22:04:bc:90:26, ucast_type =
> untouched , port_num = 0x2
I can see that multicast address 01:00:5e:03:1d:47 was added to the ALE
table at index 13, but it was added for VLAN id 2, i.e. eth1. Did you run
the test for eth1 or eth1.100?
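For anyone matching ALE entries against group addresses, here is a minimal sketch of the standard IPv4 multicast to Ethernet mapping (01:00:5e followed by the low 23 bits of the group). The helper name is hypothetical and is not part of cpsw or switch-config. The group used in the test is not stated in this thread; any group whose low 23 bits are 0x031d47 (for example 224.3.29.71) would produce the MAC seen at index 13.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of cpsw or switch-config: derive the
 * Ethernet multicast MAC for an IPv4 group (host byte order) using the
 * standard mapping: 01:00:5e followed by the low 23 bits of the group.
 */
static void ipv4_mcast_to_mac(uint32_t group, uint8_t mac[6])
{
	/* 224.0.0.0/4 is the IPv4 multicast range */
	assert((group >> 28) == 0xe);

	mac[0] = 0x01;
	mac[1] = 0x00;
	mac[2] = 0x5e;
	mac[3] = (group >> 16) & 0x7f;	/* top bit of this octet is dropped */
	mac[4] = (group >> 8) & 0xff;
	mac[5] = group & 0xff;
}
```

Note that the mapping only gives the MAC; which VLAN id the entry is installed for is decided by the driver, and that is the part in question here.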
Regards
Mugunthan V N
^ permalink raw reply
* IPv6: routing/forwarding/masking link-local addresses
From: Michal Kazior @ 2016-03-31 9:44 UTC (permalink / raw)
To: Network Development, linux-wireless
Hi,
The most commonly used framing/addressing in 802.11 is 3addr. This is
how most Access Points work and Clients follow suit.
However it is impossible to put a 3addr Client interface into a bridge
because it's not possible to properly distinguish RA/TA/DA/SA
addresses.
There is 4addr framing in 802.11 (often referred to as WDS or
Repeater) but it's a loose standard and various vendors implement it
differently. For this kind of framing to work both AP and Client must
support the same flavor of the framing. Many APs don't even allow the
user to enable WDS. This makes it troublesome to extend wireless
networks without replacing AP equipment (upgrading AP software isn't
always an option).
There is already a userspace tool which solves this problem for IPv4
called relayd [1]. It forwards ARP/DHCP packets (and replaces MAC
addresses accordingly) between non-bridged interfaces and adds
additional route entries to the firewall to, effectively, act as a
bridge transparently via routing. Compared to ARP Proxy it doesn't
require IP address configuration on local interfaces.
I'm trying to add IPv6 support for relayd. The problem is I can't get
link-local addresses to route properly.
I can easily get things like 2000::/32 to route perfectly fine but
fe80:: just won't work. Upon inspection I've noticed:
; ip link add veth0 type veth peer name veth1
; ip link set veth0 up
; ip link set veth1 up
; ip route add fe80::1 dev veth0
; ip route add 2000::1 dev veth0
; ip route get iif veth1 fe80::1
fe80::1 from :: dev veth1 metric 0
cache iif veth1
; ip route get iif veth1 2000::1
2000::1 from :: dev veth0 metric 0
cache iif veth1
; ip link del veth0
If I remove default fe80::/64 routes I get:
; ip link del veth0
; ip link add veth0 type veth peer name veth1
; ip link set veth0 up
; ip link set veth1 up
; ip route del fe80::/64 dev veth0
; ip route del fe80::/64 dev veth1
; ip route add fe80::1 dev veth0
; ip route add 2000::1 dev veth0
; ip route get iif veth1 fe80::1
unreachable fe80::1 from :: dev lo table unspec proto kernel metric
4294967295 error -101 iif veth1
; ip route get iif veth1 2000::1
2000::1 from :: dev veth0 metric 0
cache iif veth1
The fe80:: route gets ignored completely in both cases.
My question is: can I make it work (a kernel knob I'm not aware of) or
does it require patching the kernel? Thoughts/ideas/hints?
Michał
[1]: http://git.openwrt.org/?p=project/relayd.git;a=summary
^ permalink raw reply
* Re: [PATCH net] tun, bpf: fix suspicious RCU usage in tun_{attach,detach}_filter
From: Jiri Slaby @ 2016-03-31 9:15 UTC (permalink / raw)
To: Daniel Borkmann, davem
Cc: alexei.starovoitov, mkubecek, sasha.levin, eric.dumazet, mst,
netdev
In-Reply-To: <755ee9ec1f6d2229be41806964b372548e4b7586.1459382574.git.daniel@iogearbox.net>
On 03/31/2016, 02:13 AM, Daniel Borkmann wrote:
> Sasha Levin reported a suspicious rcu_dereference_protected() warning
> found while fuzzing with trinity that is similar to this one:
>
> [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
> [ 52.765688] other info that might help us debug this:
> [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1
> [ 52.765701] 1 lock held by a.out/1525:
> [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20
> [ 52.765721] stack backtrace:
> [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
> [...]
> [ 52.765768] Call Trace:
> [ 52.765775] [<ffffffff813e488d>] dump_stack+0x85/0xc8
> [ 52.765784] [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110
> [ 52.765792] [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90
> [ 52.765801] [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun]
> [ 52.765810] [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun]
> [ 52.765818] [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210
> [ 52.765827] [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun]
> [ 52.765834] [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690
> [ 52.765843] [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60
> [ 52.765850] [<ffffffff81261519>] SyS_ioctl+0x79/0x90
> [ 52.765858] [<ffffffff81003ba2>] do_syscall_64+0x62/0x140
> [ 52.765866] [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25
>
> Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
> from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
> DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.
>
> Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
> fixes") sk_attach_filter()/sk_detach_filter() now dereferences the
> filter with rcu_dereference_protected(), checking whether socket lock
> is held in control path.
>
> Since its introduction in 994051625981 ("tun: socket filter support"),
> tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
> sock_owned_by_user(sk) doesn't apply in this specific case and therefore
> triggers the false positive.
>
> Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
> that is used by tap filters and pass in lockdep_rtnl_is_held() for the
> rcu_dereference_protected() checks instead.
It seems to be gone with this patch here.
thanks,
--
js
suse labs
^ permalink raw reply
* [PATCH v2] net: mvpp2: replace MVPP2_CPU_D_CACHE_LINE_SIZE with cache_line_size
From: Jisheng Zhang @ 2016-03-31 9:03 UTC (permalink / raw)
To: davem, mw; +Cc: netdev, linux-kernel, linux-arm-kernel, Jisheng Zhang
The mvpp2 IP may be used in SoCs with different cacheline sizes, so
replace MVPP2_CPU_D_CACHE_LINE_SIZE with cache_line_size().
Since memory returned by dma_alloc_coherent() is always cacheline size
aligned, remove the alignment checks.
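To illustrate why the constant matters, here is a userspace sketch of the rounding done by MVPP2_RX_PKT_SIZE(). Both function names are hypothetical stand-ins: align_up() mimics the kernel's ALIGN() for power-of-two alignments, and the line_size argument plays the role of the value cache_line_size() would return at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's ALIGN(): round x up to a multiple
 * of a, where a must be a power of two.
 */
static size_t align_up(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Hypothetical analogue of MVPP2_RX_PKT_SIZE(mtu): hdr_len stands in for
 * the header and FCS bytes added by the macro, and line_size for the
 * value cache_line_size() would return at run time.
 */
static size_t rx_pkt_size(size_t mtu, size_t hdr_len, size_t line_size)
{
	return align_up(mtu + hdr_len, line_size);
}
```

With a hardcoded 32, a 130-byte length rounds to 160, while a platform with 64-byte lines needs 192, so the hardcoded value would under-align there.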
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
---
Since v1:
- use cache_line_size() suggested by Marcin
drivers/net/ethernet/marvell/mvpp2.c | 14 +-------------
1 file changed, 1 insertion(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index e9aa8d9..868a957 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -321,7 +321,6 @@
/* Lbtd 802.3 type */
#define MVPP2_IP_LBDT_TYPE 0xfffa
-#define MVPP2_CPU_D_CACHE_LINE_SIZE 32
#define MVPP2_TX_CSUM_MAX_SIZE 9800
/* Timeout constants */
@@ -377,7 +376,7 @@
#define MVPP2_RX_PKT_SIZE(mtu) \
ALIGN((mtu) + MVPP2_MH_SIZE + MVPP2_VLAN_TAG_LEN + \
- ETH_HLEN + ETH_FCS_LEN, MVPP2_CPU_D_CACHE_LINE_SIZE)
+ ETH_HLEN + ETH_FCS_LEN, cache_line_size())
#define MVPP2_RX_BUF_SIZE(pkt_size) ((pkt_size) + NET_SKB_PAD)
#define MVPP2_RX_TOTAL_SIZE(buf_size) ((buf_size) + MVPP2_SKB_SHINFO_SIZE)
@@ -4493,10 +4492,6 @@ static int mvpp2_aggr_txq_init(struct platform_device *pdev,
if (!aggr_txq->descs)
return -ENOMEM;
- /* Make sure descriptor address is cache line size aligned */
- BUG_ON(aggr_txq->descs !=
- PTR_ALIGN(aggr_txq->descs, MVPP2_CPU_D_CACHE_LINE_SIZE));
-
aggr_txq->last_desc = aggr_txq->size - 1;
/* Aggr TXQ no reset WA */
@@ -4526,9 +4521,6 @@ static int mvpp2_rxq_init(struct mvpp2_port *port,
if (!rxq->descs)
return -ENOMEM;
- BUG_ON(rxq->descs !=
- PTR_ALIGN(rxq->descs, MVPP2_CPU_D_CACHE_LINE_SIZE));
-
rxq->last_desc = rxq->size - 1;
/* Zero occupied and non-occupied counters - direct access */
@@ -4616,10 +4608,6 @@ static int mvpp2_txq_init(struct mvpp2_port *port,
if (!txq->descs)
return -ENOMEM;
- /* Make sure descriptor address is cache line size aligned */
- BUG_ON(txq->descs !=
- PTR_ALIGN(txq->descs, MVPP2_CPU_D_CACHE_LINE_SIZE));
-
txq->last_desc = txq->size - 1;
/* Set Tx descriptors queue starting address - indirect access */
--
2.8.0.rc3
^ permalink raw reply related
* [PATCH] net: mvpp2: fix maybe-uninitialized warning
From: Jisheng Zhang @ 2016-03-31 9:01 UTC (permalink / raw)
To: davem, mw; +Cc: netdev, linux-kernel, linux-arm-kernel, Jisheng Zhang
This is to fix the following maybe-uninitialized warning:
drivers/net/ethernet/marvell/mvpp2.c:6007:18: warning: 'err' may be
used uninitialized in this function [-Wmaybe-uninitialized]
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
---
drivers/net/ethernet/marvell/mvpp2.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index c797971a..e9aa8d9 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -6059,8 +6059,10 @@ static int mvpp2_port_init(struct mvpp2_port *port)
/* Map physical Rx queue to port's logical Rx queue */
rxq = devm_kzalloc(dev, sizeof(*rxq), GFP_KERNEL);
- if (!rxq)
+ if (!rxq) {
+ err = -ENOMEM;
goto err_free_percpu;
+ }
/* Map this Rx queue to a physical queue */
rxq->id = port->first_rxq + queue;
rxq->port = port->id;
--
2.8.0.rc3
^ permalink raw reply related
* Re: [PATCH 0/5] wireless: ti: Convert specialized logging macros to kernel style
From: Eliad Peller @ 2016-03-31 8:59 UTC (permalink / raw)
To: Joe Perches
Cc: Kalle Valo, LKML, linux-wireless@vger.kernel.org,
open list:NETWORKING DRIVERS, Guy Mishol, Uri Mashiach,
Johannes Berg
In-Reply-To: <1459411668.1744.6.camel@perches.com>
On Thu, Mar 31, 2016 at 11:07 AM, Joe Perches <joe@perches.com> wrote:
> On Thu, 2016-03-31 at 10:39 +0300, Kalle Valo wrote:
>> Joe Perches <joe@perches.com> writes:
>> > On Wed, 2016-03-30 at 14:51 +0300, Kalle Valo wrote:
>> > > Joe Perches <joe@perches.com> writes:
>> > > >
>> > > > Using the normal kernel logging mechanisms makes this code
>> > > > a bit more like other wireless drivers.
>> > > Personally I don't see the point but I don't have any strong opinions. A
>> > > bigger problem is that TI drivers are not really in active development
>> > > and that's I'm not thrilled to take big patches like this for dormant
>> > > drivers.
>> > Not very dormant.
>> >
>> > 35 patches in the last year, most of them adding functionality.
>> Oh, I didn't realise it had that many patches. But the driver is
>> orphaned and doesn't have a maintainer so could I then have an ack from
>> one of the active contributors that this ok?
>
> Fine by me.
>
> $ ./scripts/get_maintainer.pl -f --git drivers/net/wireless/ti/
>
> Kalle Valo <kvalo@codeaurora.org> (maintainer:NETWORKING DRIVERS (WIRELESS),commit_signer:27/35=77%)
> Eliad Peller <eliad@wizery.com> (commit_signer:9/35=26%,authored:7/35=20%)
> Guy Mishol <guym@ti.com> (commit_signer:6/35=17%,authored:5/35=14%)
> Johannes Berg <johannes.berg@intel.com> (commit_signer:6/35=17%,authored:3/35=9%)
> Uri Mashiach <uri.mashiach@compulab.co.il> (commit_signer:4/35=11%,authored:4/35=11%)
>
> For those people now added to the cc list,
> here's the original patch thread:
>
> https://lkml.org/lkml/2016/3/7/1099
I don't have a strong opinion here either.
(I do like the trailing newline being added automatically, but that's
hardly an issue...)
Eliad.
^ permalink raw reply
* [PATCH v2] net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with cache_line_size
From: Jisheng Zhang @ 2016-03-31 8:54 UTC (permalink / raw)
To: thomas.petazzoni, mw, gregory.clement, davem
Cc: netdev, linux-kernel, linux-arm-kernel, Jisheng Zhang
The mvneta is also used in some Marvell berlin family SoCs which may
have different cacheline size. Replace the MVNETA_CPU_D_CACHE_LINE_SIZE
usage with cache_line_size().
Since memory returned by dma_alloc_coherent() is always cacheline size
aligned, remove the alignment checks.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
---
Since v1:
- use cache_line_size() suggested by Marcin
drivers/net/ethernet/marvell/mvneta.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 577f7ca..b1db000 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -260,7 +260,6 @@
#define MVNETA_VLAN_TAG_LEN 4
-#define MVNETA_CPU_D_CACHE_LINE_SIZE 32
#define MVNETA_TX_CSUM_DEF_SIZE 1600
#define MVNETA_TX_CSUM_MAX_SIZE 9800
#define MVNETA_ACC_MODE_EXT1 1
@@ -300,7 +299,7 @@
#define MVNETA_RX_PKT_SIZE(mtu) \
ALIGN((mtu) + MVNETA_MH_SIZE + MVNETA_VLAN_TAG_LEN + \
ETH_HLEN + ETH_FCS_LEN, \
- MVNETA_CPU_D_CACHE_LINE_SIZE)
+ cache_line_size())
#define IS_TSO_HEADER(txq, addr) \
((addr >= txq->tso_hdrs_phys) && \
@@ -2764,9 +2763,6 @@ static int mvneta_rxq_init(struct mvneta_port *pp,
if (rxq->descs == NULL)
return -ENOMEM;
- BUG_ON(rxq->descs !=
- PTR_ALIGN(rxq->descs, MVNETA_CPU_D_CACHE_LINE_SIZE));
-
rxq->last_desc = rxq->size - 1;
/* Set Rx descriptors queue starting address */
@@ -2837,10 +2833,6 @@ static int mvneta_txq_init(struct mvneta_port *pp,
if (txq->descs == NULL)
return -ENOMEM;
- /* Make sure descriptor address is cache line size aligned */
- BUG_ON(txq->descs !=
- PTR_ALIGN(txq->descs, MVNETA_CPU_D_CACHE_LINE_SIZE));
-
txq->last_desc = txq->size - 1;
/* Set maximum bandwidth for enabled TXQs */
--
2.8.0.rc3
^ permalink raw reply related
* [PATCH] rds: rds-stress show all zeros after few minutes
From: shamir rabinovitch @ 2016-03-31 0:50 UTC (permalink / raw)
To: rds-devel, netdev; +Cc: davem, shamir.rabinovitch
The issue can be seen on platforms that use a page size of 8K or more
while the rds fragment size is 4K. On those platforms a single page is
shared between 2 or more rds fragments. Each fragment has its own
offset, and the rds cong map code needs to take this offset into account.
Not taking this offset into account leads to reading a data fragment
as a congestion map fragment, and the rds transmit hangs due to far
cong map corruption.
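The address arithmetic can be sketched in userspace as follows. The struct and function are a hypothetical model, not the rds data structures; they only show that the source pointer has to include the fragment's own offset within the shared page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one receive fragment: several fragments may
 * share a page, each starting at its own offset within it.
 */
struct frag {
	uint8_t *page;	/* start of the (mapped) page */
	size_t offset;	/* where this fragment starts in the page */
	size_t len;	/* fragment length */
};

/* Address of byte frag_off within the fragment. Leaving out f->offset
 * would always read from the first fragment on the page.
 */
static const uint8_t *frag_src(const struct frag *f, size_t frag_off)
{
	assert(frag_off < f->len);
	return f->page + f->offset + frag_off;
}
```

For a second 4K fragment on an 8K page, offset is 4096, so byte 5 of the fragment lives at page + 4101 rather than page + 5.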
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Tested-by: Anand Bibhuti <anand.bibhuti@oracle.com>
Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
---
net/rds/ib_recv.c | 2 +-
net/rds/iw_recv.c | 2 +-
net/rds/page.c | 5 +++--
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index 977fb86..abc8cc8 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -796,7 +796,7 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
addr = kmap_atomic(sg_page(&frag->f_sg));
- src = addr + frag_off;
+ src = addr + frag->f_sg.offset + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
diff --git a/net/rds/iw_recv.c b/net/rds/iw_recv.c
index a66d179..62a1738 100644
--- a/net/rds/iw_recv.c
+++ b/net/rds/iw_recv.c
@@ -585,7 +585,7 @@ static void rds_iw_cong_recv(struct rds_connection *conn,
addr = kmap_atomic(frag->f_page);
- src = addr + frag_off;
+ src = addr + frag->f_offset + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
diff --git a/net/rds/page.c b/net/rds/page.c
index 5a14e6d..715cbaa 100644
--- a/net/rds/page.c
+++ b/net/rds/page.c
@@ -135,8 +135,9 @@ int rds_page_remainder_alloc(struct scatterlist *scat, unsigned long bytes,
if (rem->r_offset != 0)
rds_stats_inc(s_page_remainder_hit);
- rem->r_offset += bytes;
- if (rem->r_offset == PAGE_SIZE) {
+ /* some hw (e.g. sparc) require aligned memory */
+ rem->r_offset += ALIGN(bytes, 8);
+ if (rem->r_offset >= PAGE_SIZE) {
__free_page(rem->r_page);
rem->r_page = NULL;
}
--
1.7.1
^ permalink raw reply related
* Re: [PATCH] net: mvneta: explicitly disable BM on 64bit platform
From: Jisheng Zhang @ 2016-03-31 8:13 UTC (permalink / raw)
To: Marcin Wojtas
Cc: Gregory CLEMENT, David S. Miller, Thomas Petazzoni, netdev,
linux-kernel, linux-arm-kernel@lists.infradead.org
In-Reply-To: <CAPv3WKcR+JgoguUUEW0_uPoJL_CtxgKz-ZG_rZhGAAB5utzqMw@mail.gmail.com>
Hi Marcin,
On Thu, 31 Mar 2016 08:49:19 +0200 Marcin Wojtas wrote:
> Hi Jisheng,
>
> 2016-03-31 7:53 GMT+02:00 Jisheng Zhang <jszhang@marvell.com>:
> > Hi Gregory,
> >
> > On Wed, 30 Mar 2016 17:11:41 +0200 Gregory CLEMENT wrote:
> >
> >> Hi Jisheng,
> >>
> >> On mer., mars 30 2016, Jisheng Zhang <jszhang@marvell.com> wrote:
> >>
> >> > The mvneta BM can't work on 64bit platform, as the BM hardware expects
> >> > buf virtual address to be placed in the first four bytes of mapped
> >> > buffer, but obviously the virtual address on 64bit platform can't be
> >> > stored in 4 bytes. So we have to explicitly disable BM on 64bit
> >> > platform.
> >>
> >> Actually mvneta is used on Armada 3700 which is a 64bits platform.
> >> Is it true that the driver needs some change to use BM in 64 bits, but
> >> we don't have to disable it.
> >>
> >> Here is the 64 bits part of the patch we have currently on the hardware
> >> prototype. We have more things which are really related to the way the
> >> mvneta is connected to the Armada 3700 SoC. This code was not ready for
> >
> > Thanks for the sharing.
> >
> > I think we could commit easy parts firstly, for example: the cacheline size
> > hardcoding, either piece of your diff or my version:
> >
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2016-March/418513.html
>
> Since the commit:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/arch/arm64/include/asm/cache.h?id=97303480753e48fb313dc0e15daaf11b0451cdb8
> detached L1_CACHE_BYTES from real cache size, I suggest, the macro should be:
> #define MVNETA_CPU_D_CACHE_LINE_SIZE cache_line_size()
Thanks for the hint. I'll send out an updated version to address the
cacheline size issue.
>
> Regarding check after dma_alloc_coherent, I agree it's not necessary.
>
> >
> >> mainline but I prefer share it now instead of having the HWBM blindly
> >
> > I have looked through the diff, it is for the driver itself on 64bit platforms,
> > and it doesn't touch BM. The BM itself need to be disabled for 64bit, I'm not
> > sure the BM could work on 64bit even with your diff. Per my understanding, the BM
> > can't work on 64 bit, let's have a look at some piece of the mvneta_bm_construct()
> >
> > *(u32 *)buf = (u32)buf;
>
> Indeed this particular part is different and unclear, I tried
> different options - with no success. I'm checking with design team
> now. Anyway, I managed to enable operation for HWBM on A3700 with one
> work-around in mvneta_hwbm_rx():
> data = phys_to_virt(rx_desc->buf_phys_addr);
Oh yes! This seems like a good idea. If we replace all
data = (void *)rx_desc->buf_cookie
with
data = phys_to_virt(rx_desc->buf_phys_addr);
we also resolve the buf_cookie issue on 64bit platforms! There is no need to
introduce data_high or use an existing reserved member to store the higher
32 bits of the virtual address.
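A small userspace sketch of the problem and of the idea behind the work-around, under simplifying assumptions: the helper names are hypothetical, and phys_to_virt_model() assumes a plain linear mapping at a made-up base, which is only a rough model of what the kernel's phys_to_virt() does.

```c
#include <assert.h>
#include <stdint.h>

/* What a 32-bit cookie keeps of a virtual address: the low 32 bits only. */
static uint32_t cookie_from_vaddr(uint64_t vaddr)
{
	return (uint32_t)vaddr;
}

/* Nonzero when the address survives a round trip through the cookie. */
static int cookie_is_lossless(uint64_t vaddr)
{
	return (uint64_t)cookie_from_vaddr(vaddr) == vaddr;
}

/* Simplified model of a linear mapping: virtual = base + physical.
 * linear_base is an assumed value chosen for illustration.
 */
static uint64_t phys_to_virt_model(uint64_t phys, uint64_t linear_base)
{
	assert(phys <= UINT64_MAX - linear_base);
	return linear_base + phys;
}
```

A 64-bit address such as 0xffffffc012345678 does not survive the 32-bit cookie, but it can be rebuilt from the physical address that the descriptor already carries.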
>
> Of course mvneta_bm, due to some silicone differences needed also a rework.
>
> Actually I'd wait with updating 64-bit parts of mvneta, until real
> support for such machine's controller is introduced. Basing on my
> experience with enabling neta on A3700, it turns out to be more
> changes.
I agree with you. And I need one more rework: berlin SoCs don't have mbus
concept at all ;)
Thanks for your hints,
Jisheng
^ permalink raw reply
* Re: [PATCH 0/5] wireless: ti: Convert specialized logging macros to kernel style
From: Joe Perches @ 2016-03-31 8:07 UTC (permalink / raw)
To: Kalle Valo
Cc: linux-kernel, linux-wireless, netdev, Eliad Peller, Guy Mishol,
Uri Mashiach, Johannes Berg
In-Reply-To: <87y48z3xkz.fsf@purkki.adurom.net>
On Thu, 2016-03-31 at 10:39 +0300, Kalle Valo wrote:
> Joe Perches <joe@perches.com> writes:
> > On Wed, 2016-03-30 at 14:51 +0300, Kalle Valo wrote:
> > > Joe Perches <joe@perches.com> writes:
> > > >
> > > > Using the normal kernel logging mechanisms makes this code
> > > > a bit more like other wireless drivers.
> > > Personally I don't see the point but I don't have any strong opinions. A
> > > bigger problem is that TI drivers are not really in active development
> > > and that's I'm not thrilled to take big patches like this for dormant
> > > drivers.
> > Not very dormant.
> >
> > 35 patches in the last year, most of them adding functionality.
> Oh, I didn't realise it had that many patches. But the driver is
> orphaned and doesn't have a maintainer so could I then have an ack from
> one of the active contributors that this ok?
Fine by me.
$ ./scripts/get_maintainer.pl -f --git drivers/net/wireless/ti/
Kalle Valo <kvalo@codeaurora.org> (maintainer:NETWORKING DRIVERS (WIRELESS),commit_signer:27/35=77%)
Eliad Peller <eliad@wizery.com> (commit_signer:9/35=26%,authored:7/35=20%)
Guy Mishol <guym@ti.com> (commit_signer:6/35=17%,authored:5/35=14%)
Johannes Berg <johannes.berg@intel.com> (commit_signer:6/35=17%,authored:3/35=9%)
Uri Mashiach <uri.mashiach@compulab.co.il> (commit_signer:4/35=11%,authored:4/35=11%)
For those people now added to the cc list,
here's the original patch thread:
https://lkml.org/lkml/2016/3/7/1099
^ permalink raw reply
* RE: [PATCH iproute2 v1 1/1] lib/utils: fix get_addr() and get_prefix() error messages
From: Varlese, Marco @ 2016-03-31 8:04 UTC (permalink / raw)
To: Stephen Hemminger
Cc: netdev@vger.kernel.org, davem@davemloft.net, Jiri Pirko,
John Fastabend, jhs@mojatatu.com, Szczerbik, PrzemyslawX
In-Reply-To: <20160330165420.36f98115@xeon-e3>
[-- Attachment #1: Type: text/plain, Size: 1571 bytes --]
> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, March 31, 2016 12:54 AM
> To: Varlese, Marco <marco.varlese@intel.com>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; Jiri Pirko <jiri@resnulli.us>;
> John Fastabend <john.fastabend@gmail.com>; jhs@mojatatu.com; Szczerbik,
> PrzemyslawX <przemyslawx.szczerbik@intel.com>
> Subject: Re: [PATCH iproute2 v1 1/1] lib/utils: fix get_addr() and get_prefix()
> error messages
>
> On Tue, 22 Mar 2016 13:02:02 +0000
> "Varlese, Marco" <marco.varlese@intel.com> wrote:
>
> > An attempt to add invalid address to interface would print "???" string
> > instead of the address family name.
> >
> > For example:
> > $ ip address add 256.10.166.1/24 dev ens8
> > Error: ??? prefix is expected rather than "256.10.166.1/24".
> >
> > $ ip neighbor add proxy 2001:db8::g dev ens8
> > Error: ??? address is expected rather than "2001:db8::g".
> >
> > With this patch the output will look like:
> > $ ip address add 256.10.166.1/24 dev ens8
> > Error: inet prefix is expected rather than "256.10.166.1/24".
> >
> > $ ip neighbor add proxy 2001:db8::g dev ens8
> > Error: inet6 address is expected rather than "2001:db8::g".
> >
> > Signed-off-by: Przemyslaw Szczerbik <przemyslawx.szczerbik@intel.com>
> > Signed-off-by: Marco Varlese <marco.varlese@intel.com>
> > ---
> > lib/utils.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
>
> If you look at git, I already applied this by manual fix.
Thanks, I didn't see your other email. Thank you for your help!
^ permalink raw reply
* Re: [PATCH v3 0/8] arm64: rockchip: Initial GeekBox enablement
From: Giuseppe CAVALLARO @ 2016-03-31 7:53 UTC (permalink / raw)
To: Dinh Nguyen
Cc: devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Heiko Stübner, netdev-u79uwXL29TY76Z2rM5mHXA,
Gabriel Fernandez, LKML, Frank Schäfer,
open list:ARM/Rockchip SoC..., LAKML, Fabrice GASNIER,
Andreas Färber, Tomeu Vizoso, Alexandre TORGUE
In-Reply-To: <CADhT+wdXXp322vmgFmWTUiiRZTsCWeJSSdU=BMEEbSyb_bnrUw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 3/30/2016 6:44 PM, Dinh Nguyen wrote:
> On Tue, Mar 15, 2016 at 7:36 AM, Giuseppe CAVALLARO
> <peppe.cavallaro-qxv4g6HH51o@public.gmane.org> wrote:
>> Hello Tomeu
>>
>> On 3/15/2016 8:23 AM, Tomeu Vizoso wrote:
>>>
>>> Thanks.
>>>
>>> Btw, I have rebased on top of 4.5 this morning and I have noticed that
>>> 88f8b1bb41c6 ("stmmac: Fix 'eth0: No PHY found' regression") got in
>>> there, so I guess we have now a bunch of boards with broken network on
>>> that release:(
>>
>>
>>
>> This is the status on my side: I am testing on an HW that has the
>> Enhanced descriptors and all works fine.
>>
>> On this HW, if I force the driver to use the normal descriptor
>> layout, I meet problems but using both net.git and net-next.
>> So I suspect I cannot play with this HW forcing the normal descriptors.
>> But! That is helping me to check if, on net-next, the stmmac is
>> actually programming fine the normal desc case.
>> I have just found another fix so I kindly ask you to apply the temp
>> patch attached and let me know.
>> In details, I have noticed that the OWN bit was not set in the right
>> TDES0.
>>
>> I also ask you to give me a log of the kernel where the stmmac was
>> running fine. I would like to see which configuration it is selected
>> at runtime by the driver on your box.
>> From your previous logs (where the stmmac failed), it seems that
>> the problem is on normal desc but, to be honest, this is the first
>> case I see a 3.50a with HW capability register and w/o Enhanced
>> descriptors.
>>
>
> Are you still working on a fix for:
>
> [ 1.196110] libphy: PHY stmmac-0:ffffffff not found
> [ 1.200972] eth0: Could not attach to PHY
> [ 1.204991] stmmac_open: Cannot attach to PHY (error: -19)
>
> I see the error still there as of linux-next 20160330.
This could be because the fixes have not been applied to net-next.
I will check and resend them all ASAP.
peppe
>
> Dinh
>
^ permalink raw reply