* Re: [README] bug fixes only...
From: Cong Wang @ 2014-01-24 1:16 UTC (permalink / raw)
To: David Miller
Cc: netdev, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netfilter-devel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20140123.162408.681541949851852312.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Hi, David
On Thu, Jan 23, 2014 at 4:24 PM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>
> I want to see zero feature and/or cleanup patches submitted at this
> time.
>
Are the patches submitted before your email still allowed to update?
Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* RE: Freescale FEC packet loss
From: fugang.duan @ 2014-01-24 1:28 UTC (permalink / raw)
To: Marek Vasut
Cc: netdev@vger.kernel.org, Frank.Li@freescale.com,
Fabio.Estevam@freescale.com, Hector Palacios,
linux-arm-kernel@lists.infradead.org, Detlev Zundel, Eric Nelson
In-Reply-To: <201401231520.56990.marex@denx.de>
From: Marek Vasut <marex@denx.de>
Data: Thursday, January 23, 2014 10:21 PM
>To: Duan Fugang-B38611
>Cc: netdev@vger.kernel.org; Li Frank-B20596; Estevam Fabio-R49496; Hector
>Palacios; linux-arm-kernel@lists.infradead.org; Detlev Zundel; Eric Nelson
>Subject: Re: Freescale FEC packet loss
>
>On Thursday, January 23, 2014 at 02:49:48 AM, fugang.duan@freescale.com wrote:
>[...]
>
>> >[ 3] 71.0-72.0 sec 23.4 MBytes 196 Mbits/sec
>> >[ 3] 72.0-73.0 sec 12.2 MBytes 103 Mbits/sec
>> >[ 3] 73.0-74.0 sec 0.00 Bytes 0.00 bits/sec [ 3] 74.0-75.0 sec
>> >0.00 Bytes 0.00 bits/sec [ 3] 75.0-76.0 sec 10.9 MBytes 91.2 Mbits/sec
>> >[ 3] 76.0-77.0 sec 22.4 MBytes 188 Mbits/sec
>> >[ 3] 77.0-78.0 sec 23.0 MBytes 193 Mbits/sec
>>
>> I will debug the issue when I am free, and then report the result to you.
>> Thanks for your reporting the issue.
>
>Hi Andy,
>
>Thanks for looking into this. Is there any way I can help you with figuring out
>the issue ? Do you need any more feedback or anything please ?
>
No, the information is enough. Thanks for your testing.
Best Regards,
Andy
^ permalink raw reply
* Re: [README] bug fixes only...
From: David Miller @ 2014-01-24 1:56 UTC (permalink / raw)
To: cwang; +Cc: netdev, linux-wireless, netfilter-devel
In-Reply-To: <CAHA+R7MtKK-YS8O808a_d-Qm7od+beEvsRjMpWpWZozV1sS1QA@mail.gmail.com>
From: Cong Wang <cwang@twopensource.com>
Date: Thu, 23 Jan 2014 17:16:58 -0800
> Are the patches submitted before your email still allowed to update?
If the feature/cleanup/optimization patch wasn't ready by now, it will
have to wait until net-next opens back up again.
^ permalink raw reply
* critic on documentation of the network stack
From: Hannes Frederic Sowa @ 2014-01-24 3:23 UTC (permalink / raw)
To: netdev
Hello!
After net-next is closed I wanted to put the following link here:
<http://linux.slashdot.org/comments.pl?sid=4356053&cid=45184693>
I don't want to start a flamefest or come too close to someone but I
fear some of the critic is reasonable. Maybe we can do better (I have
to admit, I also hate writing documentation, e.g. have not yet send the
IP_PMTUDISC_INTERFACE man-page patches).
I try to start with some constructive discussion:
There are some great features in the network stack that some people miss
because of lack documentation. One possible solution is documentation
directly in the kernel, but mostly this is just written as a reference
and the real wonderful stuff is only achieved by putting lots of those
features correclty together.
Maybe this is the second or third time this was proposed but I'll try
again: Would it make sense to just start slow and setup a wiki where we
just throw in the various snippets we use for testing while developing
patches, maybe with a bit of background information? This may well attract
interested people outside of netdev@ which could start helping cleaning
up the wiki or add more useful documentation on their own. We could
check from time to time what could be fed back into Documentation/? The
reason why I would definitely help to improve the wiki is because I
am sure I can learn from other setups and testing methodologies, too,
and definitely still have not yet seen everything what is possible with
the linux network stack.
Thanks && Hopes for a constructive dicussion,
Hannes
^ permalink raw reply
* [Xen-devel] [PATCH net-next v4] xen-netfront: clean up code in xennet_release_rx_bufs
From: Annie Li @ 2014-01-24 3:28 UTC (permalink / raw)
To: xen-devel, netdev
Cc: konrad.wilk, ian.campbell, wei.liu2, david.vrabel, annie.li
From: Annie Li <annie.li@oracle.com>
This patch removes grant transfer releasing code from netfront, and uses
gnttab_end_foreign_access to end grant access since
gnttab_end_foreign_access_ref may fail when the grant entry is
currently used for reading or writing.
* clean up grant transfer code kept from old netfront(2.6.18) which grants
pages for access/map and transfer. But grant transfer is deprecated in current
netfront, so remove corresponding release code for transfer.
* release grant access (through gnttab_end_foreign_access) and skb for tx/rx path,
use get_page to ensure page is released when grant access is completed successfully.
Xen-blkfront/xen-tpmfront/xen-pcifront also have similar issue, but patches
for them will be created separately.
V4: Revert put_page in gnttab_end_foreign_access, and keep netfront change in
single patch
V3: Changes as suggestion from David Vrabel, ensure pages are not freed untill
grant acess is ended.
V2: Improve patch comments.
Signed-off-by: Annie Li <annie.li@oracle.com>
---
drivers/net/xen-netfront.c | 93 ++++++++++++++-----------------------------
1 files changed, 30 insertions(+), 63 deletions(-)
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index d7bee8a..a22adaa 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -117,6 +117,7 @@ struct netfront_info {
} tx_skbs[NET_TX_RING_SIZE];
grant_ref_t gref_tx_head;
grant_ref_t grant_tx_ref[NET_TX_RING_SIZE];
+ struct page *grant_tx_page[NET_TX_RING_SIZE];
unsigned tx_skb_freelist;
spinlock_t rx_lock ____cacheline_aligned_in_smp;
@@ -396,6 +397,7 @@ static void xennet_tx_buf_gc(struct net_device *dev)
gnttab_release_grant_reference(
&np->gref_tx_head, np->grant_tx_ref[id]);
np->grant_tx_ref[id] = GRANT_INVALID_REF;
+ np->grant_tx_page[id] = NULL;
add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, id);
dev_kfree_skb_irq(skb);
}
@@ -452,6 +454,7 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
mfn, GNTMAP_readonly);
+ np->grant_tx_page[id] = virt_to_page(data);
tx->gref = np->grant_tx_ref[id] = ref;
tx->offset = offset;
tx->size = len;
@@ -497,6 +500,7 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
np->xbdev->otherend_id,
mfn, GNTMAP_readonly);
+ np->grant_tx_page[id] = page;
tx->gref = np->grant_tx_ref[id] = ref;
tx->offset = offset;
tx->size = bytes;
@@ -596,6 +600,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
mfn = virt_to_mfn(data);
gnttab_grant_foreign_access_ref(
ref, np->xbdev->otherend_id, mfn, GNTMAP_readonly);
+ np->grant_tx_page[id] = virt_to_page(data);
tx->gref = np->grant_tx_ref[id] = ref;
tx->offset = offset;
tx->size = len;
@@ -1085,10 +1090,11 @@ static void xennet_release_tx_bufs(struct netfront_info *np)
continue;
skb = np->tx_skbs[i].skb;
- gnttab_end_foreign_access_ref(np->grant_tx_ref[i],
- GNTMAP_readonly);
- gnttab_release_grant_reference(&np->gref_tx_head,
- np->grant_tx_ref[i]);
+ get_page(np->grant_tx_page[i]);
+ gnttab_end_foreign_access(np->grant_tx_ref[i],
+ GNTMAP_readonly,
+ (unsigned long)page_address(np->grant_tx_page[i]));
+ np->grant_tx_page[i] = NULL;
np->grant_tx_ref[i] = GRANT_INVALID_REF;
add_id_to_freelist(&np->tx_skb_freelist, np->tx_skbs, i);
dev_kfree_skb_irq(skb);
@@ -1097,78 +1103,35 @@ static void xennet_release_tx_bufs(struct netfront_info *np)
static void xennet_release_rx_bufs(struct netfront_info *np)
{
- struct mmu_update *mmu = np->rx_mmu;
- struct multicall_entry *mcl = np->rx_mcl;
- struct sk_buff_head free_list;
- struct sk_buff *skb;
- unsigned long mfn;
- int xfer = 0, noxfer = 0, unused = 0;
int id, ref;
- dev_warn(&np->netdev->dev, "%s: fix me for copying receiver.\n",
- __func__);
- return;
-
- skb_queue_head_init(&free_list);
-
spin_lock_bh(&np->rx_lock);
for (id = 0; id < NET_RX_RING_SIZE; id++) {
- ref = np->grant_rx_ref[id];
- if (ref == GRANT_INVALID_REF) {
- unused++;
- continue;
- }
+ struct sk_buff *skb;
+ struct page *page;
skb = np->rx_skbs[id];
- mfn = gnttab_end_foreign_transfer_ref(ref);
- gnttab_release_grant_reference(&np->gref_rx_head, ref);
- np->grant_rx_ref[id] = GRANT_INVALID_REF;
-
- if (0 == mfn) {
- skb_shinfo(skb)->nr_frags = 0;
- dev_kfree_skb(skb);
- noxfer++;
+ if (!skb)
continue;
- }
-
- if (!xen_feature(XENFEAT_auto_translated_physmap)) {
- /* Remap the page. */
- const struct page *page =
- skb_frag_page(&skb_shinfo(skb)->frags[0]);
- unsigned long pfn = page_to_pfn(page);
- void *vaddr = page_address(page);
- MULTI_update_va_mapping(mcl, (unsigned long)vaddr,
- mfn_pte(mfn, PAGE_KERNEL),
- 0);
- mcl++;
- mmu->ptr = ((u64)mfn << PAGE_SHIFT)
- | MMU_MACHPHYS_UPDATE;
- mmu->val = pfn;
- mmu++;
+ ref = np->grant_rx_ref[id];
+ if (ref == GRANT_INVALID_REF)
+ continue;
- set_phys_to_machine(pfn, mfn);
- }
- __skb_queue_tail(&free_list, skb);
- xfer++;
- }
+ page = skb_frag_page(&skb_shinfo(skb)->frags[0]);
- dev_info(&np->netdev->dev, "%s: %d xfer, %d noxfer, %d unused\n",
- __func__, xfer, noxfer, unused);
+ /* gnttab_end_foreign_access() needs a page ref until
+ * foreign access is ended (which may be deferred).
+ */
+ get_page(page);
+ gnttab_end_foreign_access(ref, 0,
+ (unsigned long)page_address(page));
+ np->grant_rx_ref[id] = GRANT_INVALID_REF;
- if (xfer) {
- if (!xen_feature(XENFEAT_auto_translated_physmap)) {
- /* Do all the remapping work and M2P updates. */
- MULTI_mmu_update(mcl, np->rx_mmu, mmu - np->rx_mmu,
- NULL, DOMID_SELF);
- mcl++;
- HYPERVISOR_multicall(np->rx_mcl, mcl - np->rx_mcl);
- }
+ kfree_skb(skb);
}
- __skb_queue_purge(&free_list);
-
spin_unlock_bh(&np->rx_lock);
}
@@ -1339,6 +1302,7 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
for (i = 0; i < NET_RX_RING_SIZE; i++) {
np->rx_skbs[i] = NULL;
np->grant_rx_ref[i] = GRANT_INVALID_REF;
+ np->grant_tx_page[i] = NULL;
}
/* A grant for every tx ring slot */
@@ -1439,8 +1403,11 @@ static int netfront_probe(struct xenbus_device *dev,
static void xennet_end_access(int ref, void *page)
{
/* This frees the page as a side-effect */
- if (ref != GRANT_INVALID_REF)
+ if (ref != GRANT_INVALID_REF) {
+ get_page(virt_to_page(page));
gnttab_end_foreign_access(ref, 0, (unsigned long)page);
+ free_page((unsigned long)page);
+ }
}
static void xennet_disconnect_backend(struct netfront_info *info)
--
1.7.3.4
^ permalink raw reply related
* BUSINESS PROPOSAL OF $44.5MILLI?ON.USD.
From: BusinessProposal21 @ 2014-01-24 3:44 UTC (permalink / raw)
To: Recipients
Hello, It is understandable that you might be a little bit apprehensive
because you do not know me but I have a lucrative business proposal of mutual
interest to share with you.
I got your reference in my search for someone who suits my proposed business relationship.
I am Mr.ching Kwok bank officer with the International bank of Taipei.
I will need you to assist me in executing a business project from Taiwan to your country.
It involves the transfer of money.
Everything concerning this transaction shall be legally done without hitch.
Once the funds have been successfully transferred into your account,
we shall share in the ratio to be agreed by both of us.Contact me for further details.
I shall furnish you with more information about this operation immediately
I receive a positive response from you.
You Contact me for more details
Email: mrchingkwok091@gmail.com
Kind Regards,
Mr.ching Kwok
^ permalink raw reply
* [PATCH net-next v3 1/2] bonding: fail_over_mac should only affect AB mode at enslave and removal processing
From: Ding Tianhong @ 2014-01-24 4:27 UTC (permalink / raw)
To: Jay Vosburgh, Veaceslav Falico, David S. Miller, Netdev,
Andy Gospodarek
According to bonding.txt, the fail_over_ma should only affect active-backup mode,
but I fould that the fail_over_mac could be set to active or follow in every
mode, this will cause new slave could not be set to bond's MAC address at
enslave processing and restore its own MAC address at removal processing.
The correct way to fix the problem is that we should not add restrictions when
setting options, just need to modify the bond enslave and removal processing
to check the mode in addition to fail_over_mac when setting a slave's MAC during
enslavement. The change active slave processing already only calls the fail_over_mac
function when in active-backup mode.
Thanks for Jay's suggestion.
The patch also modify the pr_warning() to pr_warn().
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
drivers/net/bonding/bond_main.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2ca949f..4409aa8 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1270,9 +1270,13 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
if (slave_ops->ndo_set_mac_address == NULL) {
if (!bond_has_slaves(bond)) {
- pr_warning("%s: Warning: The first slave device specified does not support setting the MAC address. Setting fail_over_mac to active.",
- bond_dev->name);
- bond->params.fail_over_mac = BOND_FOM_ACTIVE;
+ pr_warn("%s: Warning: The first slave device specified does not support setting the MAC address.\n",
+ bond_dev->name);
+ if (bond->params.mode == BOND_MODE_ACTIVEBACKUP) {
+ bond->params.fail_over_mac = BOND_FOM_ACTIVE;
+ pr_warn("%s: Setting fail_over_mac to active for active-backup mode.\n",
+ bond_dev->name);
+ }
} else if (bond->params.fail_over_mac != BOND_FOM_ACTIVE) {
pr_err("%s: Error: The slave device specified does not support setting the MAC address, but fail_over_mac is not set to active.\n",
bond_dev->name);
@@ -1315,7 +1319,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
*/
memcpy(new_slave->perm_hwaddr, slave_dev->dev_addr, ETH_ALEN);
- if (!bond->params.fail_over_mac) {
+ if (!bond->params.fail_over_mac ||
+ bond->params.mode != BOND_MODE_ACTIVEBACKUP) {
/*
* Set slave to master's mac address. The application already
* set the master's mac address to that of the first slave
@@ -1579,7 +1584,8 @@ err_close:
dev_close(slave_dev);
err_restore_mac:
- if (!bond->params.fail_over_mac) {
+ if (!bond->params.fail_over_mac ||
+ bond->params.mode != BOND_MODE_ACTIVEBACKUP) {
/* XXX TODO - fom follow mode needs to change master's
* MAC if this slave's MAC is in use by the bond, or at
* least print a warning.
@@ -1672,7 +1678,8 @@ static int __bond_release_one(struct net_device *bond_dev,
bond->current_arp_slave = NULL;
- if (!all && !bond->params.fail_over_mac) {
+ if (!all && (!bond->params.fail_over_mac ||
+ bond->params.mode != BOND_MODE_ACTIVEBACKUP)) {
if (ether_addr_equal_64bits(bond_dev->dev_addr, slave->perm_hwaddr) &&
bond_has_slaves(bond))
pr_warn("%s: Warning: the permanent HWaddr of %s - %pM - is still in use by %s. Set the HWaddr of %s to a different address to avoid conflicts.\n",
@@ -1769,7 +1776,8 @@ static int __bond_release_one(struct net_device *bond_dev,
/* close slave before restoring its mac address */
dev_close(slave_dev);
- if (bond->params.fail_over_mac != BOND_FOM_ACTIVE) {
+ if (bond->params.fail_over_mac != BOND_FOM_ACTIVE ||
+ bond->params.mode != BOND_MODE_ACTIVEBACKUP) {
/* restore original ("permanent") mac address */
memcpy(addr.sa_data, slave->perm_hwaddr, ETH_ALEN);
addr.sa_family = slave_dev->type;
--
1.8.0
^ permalink raw reply related
* [PATCH net-next v3 0/2] bonding: Fix some issues for fail_over_mac
From: Ding Tianhong @ 2014-01-24 4:27 UTC (permalink / raw)
To: Jay Vosburgh, Veaceslav Falico, David S. Miller, Netdev,
Andy Gospodarek
The parameter fail_over_mac only affect active-backup mode, if it was
set to active or follow and works with other modes, just like RR or XOR
mode, the bonding could not set all slaves to the master's address, it
will cause the slave could not work well with master.
So set the fail_over_mac to none if the mode is not active-backup and
slight optimization for bond_set_mac_address().
v1->v2: According Jay's suggestion, that we should permit setting an option
at any time, but only have it take effect in active-backup mode, so
I add mode checking together with fail_over_mac during enslavement and
rebuild the patches.
v2->v3: The correct way to fix the problem is that we should not add restrictions when
setting options, just need to modify the bond enslave and removal processing
to check the mode in addition to fail_over_mac when setting a slave's MAC during
enslavement. The change active slave processing already only calls the fail_over_mac
function when in active-backup mode.
Remove the cleanup patch because the net-next is frozen now.
Regards
Ding
Ding Tianhong (2):
bonding: bonding: fail_over_mac should only affect AB mode at
enslave and removal processing
bonding: fail_over_mac should only affect AB mode in
bond_set_mac_address()
drivers/net/bonding/bond_main.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)
--
1.8.0
^ permalink raw reply
* [PATCH net-next v3 2/2] bonding: fail_over_mac should only affect AB mode in bond_set_mac_address()
From: Ding Tianhong @ 2014-01-24 4:27 UTC (permalink / raw)
To: Jay Vosburgh, Veaceslav Falico, David S. Miller, Netdev,
Andy Gospodarek
The fail_over_mac could be set to active or follow in any time for every mode,
so if the fail_over_mac is not none when the mode is not in active-backup,
the bond_set_moc_address() could not change the master and slave's MAC address.
In bond_set_mac_address(), the fail_over_mac should only affect AB mode, so modify
to check the mode in addition to fail_over_mac when setting bond's MAC address.
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
drivers/net/bonding/bond_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 4409aa8..4a79585 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3423,7 +3423,8 @@ static int bond_set_mac_address(struct net_device *bond_dev, void *addr)
/* If fail_over_mac is enabled, do nothing and return success.
* Returning an error causes ifenslave to fail.
*/
- if (bond->params.fail_over_mac)
+ if (bond->params.fail_over_mac &&
+ bond->params.mode == BOND_MODE_ACTIVEBACKUP)
return 0;
if (!is_valid_ether_addr(sa->sa_data))
--
1.8.0
^ permalink raw reply related
* Re: [PATCH 2/2] netxen: Add sanity checks for Rx buffers returning from hardware
From: David Gibson @ 2014-01-24 5:21 UTC (permalink / raw)
To: David Miller
Cc: manish.chopra, sony.chacko, rajesh.borundia, netdev, snagarka,
tcamuso, vdasgupt
In-Reply-To: <20131219.150527.1753716895684393783.davem@davemloft.net>
[-- Attachment #1: Type: text/plain, Size: 1315 bytes --]
On Thu, Dec 19, 2013 at 03:05:27PM -0500, David Miller wrote:
> From: David Gibson <david@gibson.dropbear.id.au>
> Date: Tue, 17 Dec 2013 16:22:33 +1100
>
> > +static int netxen_check_rx_buf(int index, struct netxen_rx_buffer *rxbuf)
> > +{
> > + int ret = 0;
> > +
> > + if ((rxbuf->list.next != LIST_POISON1)
> > + || (rxbuf->list.prev != LIST_POISON2)) {
> > + printk(KERN_ERR "netxen: Rx buffer is already on free list "
> > + "(next=%p, prev=%p)\n", rxbuf->list.next,
> > + rxbuf->list.prev);
> > + ret = -1;
> > + }
>
> First, you should be using netdev_err() or similar.
Well, I was just matching the form used in the rest of the driver.
> Second, doing this unconditionally for every receive packet is
> not really reasonable.
Yeah, trouble is I don't know what the alternative is. Whatever is
causing this problem, it's triggered very rarely - I'm only seeing a
pattern by looking across everything reported by Red Hat customers.
Just adding diagnostics for one customer is unlikely to be useful,
because the chances are they'll never hit the bug again.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
* Re: [PATCH v6] xen/grant-table: Avoid m2p_override during mapping
From: Matt Wilson @ 2014-01-24 5:48 UTC (permalink / raw)
To: Zoltan Kiss
Cc: ian.campbell, wei.liu2, xen-devel, netdev, linux-kernel,
jonathan.davies, Anthony Liguori, Matt Wilson
In-Reply-To: <1390512224-27296-1-git-send-email-zoltan.kiss@citrix.com>
On Thu, Jan 23, 2014 at 09:23:44PM +0000, Zoltan Kiss wrote:
> The grant mapping API does m2p_override unnecessarily: only gntdev needs it,
> for blkback and future netback patches it just cause a lock contention, as
> those pages never go to userspace. Therefore this series does the following:
> - the original functions were renamed to __gnttab_[un]map_refs, with a new
> parameter m2p_override
> - based on m2p_override either they follow the original behaviour, or just set
> the private flag and call set_phys_to_machine
> - gnttab_[un]map_refs are now a wrapper to call __gnttab_[un]map_refs with
> m2p_override false
> - a new function gnttab_[un]map_refs_userspace provides the old behaviour
>
> It also removes a stray space from page.h and change ret to 0 if
> XENFEAT_auto_translated_physmap, as that is the only possible return value
> there.
>
> v2:
> - move the storing of the old mfn in page->index to gnttab_map_refs
> - move the function header update to a separate patch
>
> v3:
> - a new approach to retain old behaviour where it needed
> - squash the patches into one
>
> v4:
> - move out the common bits from m2p* functions, and pass pfn/mfn as parameter
> - clear page->private before doing anything with the page, so m2p_find_override
> won't race with this
>
> v5:
> - change return value handling in __gnttab_[un]map_refs
> - remove a stray space in page.h
> - add detail why ret = 0 now at some places
>
> v6:
> - don't pass pfn to m2p* functions, just get it locally
>
> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
> Suggested-by: David Vrabel <david.vrabel@citrix.com>
Apologies for coming in late on this thread. I'm quite behind on
xen-devel mail that isn't CC: to me.
It seems to have been forgotten that Anthony and I proposed a similar
change last November.
https://lkml.kernel.org/r/1384307336-5328-1-git-send-email-anthony@codemonkey.ws
Or am I misunderstanding the change?
--msw
^ permalink raw reply
* Re: [0/2] netxen: bug fix and diagnostics for possible (hardware?) bug
From: David Gibson @ 2014-01-24 6:44 UTC (permalink / raw)
To: Manish Chopra
Cc: Sony Chacko, Rajesh Borundia, netdev, snagarka@redhat.com,
tcamuso@redhat.com, vdasgupt@redhat.com
In-Reply-To: <31AFFC7280259C4184970ABA9AFE8B938CF868E9@avmb3.qlogic.org>
[-- Attachment #1: Type: text/plain, Size: 6205 bytes --]
On Thu, Dec 19, 2013 at 09:11:33AM +0000, Manish Chopra wrote:
> >> >From: David Gibson [mailto:david@gibson.dropbear.id.au]
> >> >Sent: Tuesday, December 17, 2013 10:53 AM
> >> >To: Manish Chopra; Sony Chacko; Rajesh Borundia
> >> >Cc: netdev; snagarka@redhat.com; tcamuso@redhat.com;
> >> >vdasgupt@redhat.com
> >> >Subject: [0/2] netxen: bug fix and diagnostics for possible
> >> >(hardware?) bug
> >> >
> >> >At Red Hat, we've hit a couple of customer cases with crashes in the
> >> >netxen driver due to list corruption. This seems to be very rarely
> >> >triggered, and unfortunately the dumps we have don't have enough
> >> >information to be certain of the cause, although we have a possible theory.
> >> >
> >> >I'm suggesting, therefore a patch to add some sanity checking which
> >> >should help to at least localize and mitigate the problem when someone hits it
> >in future.
> >> >Please let me know if there's a better approach to doing this.
> >> >
> >> >That's 2/2. 1/2 is a fix for a clear bug I spotted along the way,
> >> >but not one that could cause the symptoms we've seen.
> >>
> >> David,
> >>
> >> Having these checks in data path(Rx path) may have some performance
> >> impact. It's better to root cause it instead of putting some sanity
> >> checks.
> >
> >Obviously, but this was the best way I could think of to try narrowing down the
> >root cause (at least trying to eliminate driver vs. firmware bug).
>
> David, Instead of making permanent changes in driver, can you please
> run your modified driver in selective customer environment where
> this issues is seen?
Yeah, the problem with that is that the problem has never triggered
twice for a single customer. Well, technically there is one customer
that's hit it twice, but I'm pretty sure it's on entirely unrelated
systems in different sections of a large customer. The only reason I
can see enough cases to suspect a pattern to these problems is from
looking across Red Hat's whole case history.
> Which may give some data point that what's the issue exactly and then we go by that.
>
> >
> >> We will get back to you on this.
> >
> >If you have a better idea for locating the root cause, please let me know. I have
> >access to a vmcore which I can poke around in.
>
> We will also try to reproduce the problem in our environment and debug this.
> Can you please give some details?
Apologies for the long delay, I'd been hoping for some more
confirmation of things, but it hasn't happened. I'll give you what I
can.
> 1) what's the driver and firmware version used?
I'm not sure what the most useful way ot giving a driver version.
I've given kernel version below, but it's an RH kernel, so I'm not
sure how much has been backported.
As to firmware, the driver reports:
netxen_nic 0000:04:00.0: Gen2 strapping detected
netxen_nic 0000:04:00.0: using 64-bit dma mask
netxen_nic: NX3031 Gigabit Ethernet Board S/N
<FF><FF><FF><FF><FF><FF><FF><FF><FF>
<FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF>
<FF><FF><FF>NX3031 Gigabit Ethernet Chip rev 0x42
netxen_nic 0000:04:00.0: firmware v4.0.585 [legacy]
> 2) which operating system and kernel version?
RHEL5,
Linux hostname 2.6.18-308.el5 #1 SMP Fri Jan 27 17:17:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux
> 3) please send the vmcore also with backtrace if available which can
> give some idea what can trigger this issue.
I can't send the vmcore itself, since it will include customer data.
I can give you the backtrace below, and look up specific things if you
can give me an idea of what you need:
crash> bt
PID: 0 TASK: ffff81207f8bd7e0 CPU: 44 COMMAND: "swapper"
#0 [ffff81107fd33b40] crash_kexec at ffffffff800b0938
#1 [ffff81107fd33c00] __die at ffffffff80065137
#2 [ffff81107fd33c40] die at ffffffff8006c789
#3 [ffff81107fd33c70] do_invalid_op at ffffffff8006cd49
#4 [ffff81107fd33d30] error_exit at ffffffff8005dde9
[exception RIP: list_del+71]
RIP: ffffffff8015a793 RSP: ffff81107fd33de0 RFLAGS: 00010286
RAX: 0000000000000058 RBX: 0000000000000427 RCX: ffffffff80323028
RDX: ffffffff80323028 RSI: 0000000000000000 RDI: ffffffff80323020
RBP: ffff81407f4e8680 R8: ffffffff80323028 R9: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffffc200104494a0
R13: 0000000000000002 R14: ffff81107a2cf500 R15: ffff81407e1bf400
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#5 [ffff81107fd33de8] netxen_process_rcv_ring at ffffffff8830b050 [netxen_nic]
#6 [ffff81107fd33eb8] netxen_nic_poll at ffffffff88306e71 [netxen_nic]
#7 [ffff81107fd33ef8] net_rx_action at ffffffff8000c9b9
#8 [ffff81107fd33f38] __do_softirq at ffffffff80012551
#9 [ffff81107fd33f68] call_softirq at ffffffff8005e2fc
#10 [ffff81107fd33f80] do_softirq at ffffffff8006d646
#11 [ffff81107fd33f90] do_IRQ at ffffffff8006d4d6
--- <IRQ stack> ---
#12 [ffff81307fe2fe38] ret_from_intr at ffffffff8005d615
[exception RIP: mwait_idle_with_hints+102]
RIP: ffffffff8006b9cf RSP: ffff81307fe2fee8 RFLAGS: 00000246
RAX: 0000000000000000 RBX: 00000000000000ff RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00007f319bfb6f27 R8: ffff81307fe2e000 R9: 0000000000000013
R10: ffff8110b8288510 R11: 00000000ffffffff R12: ffff81306273d040
R13: ffff81207f8bd7e0 R14: 0000000000000001 R15: 0000000000000000
ORIG_RAX: ffffffffffffff64 CS: 0010 SS: 0018
#13 [ffff81307fe2fee8] mwait_idle at ffffffff80056c65
#14 [ffff81307fe2fef0] cpu_idle at ffffffff80048f92
> 4) Test case details:- what type of test is running on the system?
> Just to make sure we also try the same test cases in our
> environment.
No particular type of test, it's an Oracle server in production.
> 5) Server details (Number of CPus, memory etc.) if available.
64 x Xeon X7550 CPUs, 256G RAM
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply
* Contact Email : msellenmore8@rogers.com
From: Gaffney, Vicki @ 2014-01-24 7:15 UTC (permalink / raw)
Account Holder, We would like to inform you that 3,000,000.86 Great British Pounds has been completed. To find instruction, kindly send your Names and Reference Number to EMAIL : msellenmore8@rogers.com<mailto:msellenmore8@rogers.com> . Regards, Mr.Abdelgalil Abdelgader
^ permalink raw reply
* [patch net-next] rtnetlink: remove IFLA_BOND_SLAVE definition
From: Jiri Pirko @ 2014-01-24 7:39 UTC (permalink / raw)
To: netdev; +Cc: davem, sfeldma, stephen, vyasevich
This is in net-next only, for couple of days. Not used anymore, and never
should have been. So just remove it and pretend it was never there.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
include/uapi/linux/if_link.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 206f651..16410b6 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -144,7 +144,6 @@ enum {
IFLA_NUM_RX_QUEUES,
IFLA_CARRIER,
IFLA_PHYS_PORT_ID,
- IFLA_BOND_SLAVE,
__IFLA_MAX
};
--
1.8.3.1
^ permalink raw reply related
* [PATCH net v2] ipv6: reallocate addrconf router for ipv6 address when lo device up
From: Gao feng @ 2014-01-24 8:29 UTC (permalink / raw)
To: netdev; +Cc: davem, Gao feng, Sabrina Dubroca, Hannes Frederic Sowa,
Weilong Chen
commit 25fb6ca4ed9cad72f14f61629b68dc03c0d9713f
"net IPv6 : Fix broken IPv6 routing table after loopback down-up"
allocates addrconf router for ipv6 address when lo device up.
but commit a881ae1f625c599b460cc8f8a7fcb1c438f699ad
"ipv6:don't call addrconf_dst_alloc again when enable lo" breaks
this behavior.
Since the addrconf router is moved to the garbage list when
lo device down, we should release this router and rellocate
a new one for ipv6 address when lo device up.
This patch solves bug 67951 on bugzilla
https://bugzilla.kernel.org/show_bug.cgi?id=67951
change from v1:
use ip6_rt_put to repleace ip6_del_rt, thanks Hannes!
change code style, suggested by Sergei.
CC: Sabrina Dubroca <sd@queasysnail.net>
CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
net/ipv6/addrconf.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 4b6b720..dee84d6 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2611,8 +2611,18 @@ static void init_loopback(struct net_device *dev)
if (sp_ifa->flags & (IFA_F_DADFAILED | IFA_F_TENTATIVE))
continue;
- if (sp_ifa->rt)
- continue;
+ if (sp_ifa->rt) {
+ /* This dst has been added to garbage list when
+ * lo device down, release this obsolete dst and
+ * reallocate a new router for ifa.
+ */
+ if (sp_ifa->rt->dst.obsolete > 0) {
+ ip6_rt_put(sp_ifa->rt);
+ sp_ifa->rt = NULL;
+ } else {
+ continue;
+ }
+ }
sp_rt = addrconf_dst_alloc(idev, &sp_ifa->addr, false);
--
1.8.4.2
^ permalink raw reply related
* Re: [patch net-next] rtnetlink: remove IFLA_BOND_SLAVE definition
From: David Miller @ 2014-01-24 8:37 UTC (permalink / raw)
To: jiri; +Cc: netdev, sfeldma, stephen, vyasevich
In-Reply-To: <1390549156-1697-1-git-send-email-jiri@resnulli.us>
From: Jiri Pirko <jiri@resnulli.us>
Date: Fri, 24 Jan 2014 08:39:16 +0100
> This is in net-next only, for couple of days. Not used anymore, and never
> should have been. So just remove it and pretend it was never there.
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Applied, thanks Jiri.
^ permalink raw reply
* Re: [README] bug fixes only...
From: Jiri Pirko @ 2014-01-24 8:39 UTC (permalink / raw)
To: David Miller
Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netfilter-devel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20140123.162408.681541949851852312.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Fri, Jan 24, 2014 at 01:24:08AM CET, davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org wrote:
>
>I want to see zero feature and/or cleanup patches submitted at this
>time.
>
>net-next is now frozen and I'm going to let it sit for a day before
>I try to merge it to Linus.
>
>Thanks for your understanding.
Hi Dave.
Please do not forget about the patch named "rtnetlink: remove
IFLA_BOND_SLAVE definition" which is currently in patchwork.
It would not be good if the definition got from net-next into linus's tree.
Thanks a lot.
Jiri
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* [PATCH] net/cxgb4: Fix referencing freed adapter
From: Gavin Shan @ 2014-01-24 9:12 UTC (permalink / raw)
To: netdev; +Cc: davem, Gavin Shan
The adapter is freed before we check its flags. It was caused
by commit 144be3d ("net/cxgb4: Avoid disabling PCI device for
towice"). The problem was reported by Intel's "0-day" tool.
The patch fixes it to avoid reverting commit 144be3d.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 139a704..43ab35f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6163,13 +6163,13 @@ static void remove_one(struct pci_dev *pdev)
iounmap(adapter->regs);
if (!is_t4(adapter->params.chip))
iounmap(adapter->bar2);
- kfree(adapter);
pci_disable_pcie_error_reporting(pdev);
if ((adapter->flags & DEV_ENABLED)) {
pci_disable_device(pdev);
adapter->flags &= ~DEV_ENABLED;
}
pci_release_regions(pdev);
+ kfree(adapter);
} else
pci_release_regions(pdev);
}
--
1.7.10.4
^ permalink raw reply related
* Re: [README] bug fixes only...
From: David Miller @ 2014-01-24 9:18 UTC (permalink / raw)
To: jiri; +Cc: netdev, linux-wireless, netfilter-devel
In-Reply-To: <20140124083927.GA3153@minipsycho.orion>
From: Jiri Pirko <jiri@resnulli.us>
Date: Fri, 24 Jan 2014 09:39:27 +0100
> Fri, Jan 24, 2014 at 01:24:08AM CET, davem@davemloft.net wrote:
>>
>>I want to see zero feature and/or cleanup patches submitted at this
>>time.
>>
>>net-next is now frozen and I'm going to let it sit for a day before
>>I try to merge it to Linus.
>>
>>Thanks for your understanding.
>
> Hi Dave.
>
> Please do not forget about the patch named "rtnetlink: remove
> IFLA_BOND_SLAVE definition" which is currently in patchwork.
> It would not be good if the definition got from net-next into linus's tree.
I applied it before you wrote this. :-)
^ permalink raw reply
* Re: [README] bug fixes only...
From: Jiri Pirko @ 2014-01-24 9:26 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-wireless, netfilter-devel
In-Reply-To: <20140124.011800.1437880947731667846.davem@davemloft.net>
Fri, Jan 24, 2014 at 10:18:00AM CET, davem@davemloft.net wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Fri, 24 Jan 2014 09:39:27 +0100
>
>> Fri, Jan 24, 2014 at 01:24:08AM CET, davem@davemloft.net wrote:
>>>
>>>I want to see zero feature and/or cleanup patches submitted at this
>>>time.
>>>
>>>net-next is now frozen and I'm going to let it sit for a day before
>>>I try to merge it to Linus.
>>>
>>>Thanks for your understanding.
>>
>> Hi Dave.
>>
>> Please do not forget about the patch named "rtnetlink: remove
>> IFLA_BOND_SLAVE definition" which is currently in patchwork.
>> It would not be good if the definition got from net-next into linus's tree.
>
>I applied it before you wrote this. :-)
Yep, your "applied" message appeared once I sent this :)
^ permalink raw reply
* Re: IPV6 routing problem
From: Emmanuel Thierry @ 2014-01-24 9:36 UTC (permalink / raw)
To: Sharat Masetty; +Cc: netdev
In-Reply-To: <CAJzFV34FhudmwOPY1q9SUtagb8rUuQanrAqt_+cOyETFsLcxqw@mail.gmail.com>
Hello,
Le 23 janv. 2014 à 08:15, Sharat Masetty a écrit :
> Unfortunately toggling this config setting CONFIG_IPV6_ROUTER_PREF did
> not help, I see the same problem in both cases.
> The kernel version I am using is 3.10.0.
>
> Thanks for confirming that this is a non issue on 3.13 kernel version.
> Would you happen to know which patch would have potentially fixed this
> issue?
Your problem looks like the following one :
http://marc.info/?l=linux-netdev&m=137280268407054&w=2
Which has been fixed in 3.10.4, and 3.11+
> On Wed, Jan 22, 2014 at 12:11 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> On Wed, Jan 22, 2014 at 08:07:59AM +0100, Hannes Frederic Sowa wrote:
>>> Hi!
>>>
>>> On Tue, Jan 21, 2014 at 06:41:58PM -0700, Sharat Masetty wrote:
>>>> I have an IPV6 routing problem that has only surfaced on a 3.10
>>>> kernel version. This problem is not seen on 3.4 kernel. I will keep
>>>> the problem statement as brief as possible.
>>>
>>> Could you do me a favor and test this on a recent 3.13 kernel? Thanks!
>>> Please also state the specific kernel version (I guess you use one).
>>
>> +stable.
>>
>> A quick test worked for me. Maybe you could try with CONFIG_IPV6_ROUTER_PREF
>> enabled and one time with it disabled.
>>
>> We should make this knob a runtime setting soon...
>>
>> Thanks,
>>
>> Hannes
Best regards.
Emmanuel Thierry
^ permalink raw reply
* [PATCH net-next] bonding: do't permit slaves to change their mtu ndependently
From: Ding Tianhong @ 2014-01-24 9:43 UTC (permalink / raw)
To: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
Netdev
I have come to a conclusion by testing every mode for mtu changing:
1). If the slaves support changing mtu and no need to restart the device,
just like virtual nic, the master will not lost any packages for every
mode.
2). If the slaves support changing mtu and need to restart the device,
just like Intel 82599, the AB, 802.3, ALB and TLB mode may lost
packages, but other modes could work well.
The reason is that when the slave's mtu has been changed, the slave will
restart, if the slave is current active slave, the master may set the
slave to backup state and reselect a new slave, after the reselect processing,
the master could work again, but if in load-balance mode, the master could
select another active slave to send and recv packages.
So the best way to fix the problem is don't permit slave to change their
mtu independently.
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
drivers/net/bonding/bond_main.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2ca949f..e127031c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2808,20 +2808,12 @@ static int bond_slave_netdev_event(unsigned long event,
* ... Or is it this?
*/
break;
- case NETDEV_CHANGEMTU:
- /*
- * TODO: Should slaves be allowed to
- * independently alter their MTU? For
- * an active-backup bond, slaves need
- * not be the same type of device, so
- * MTUs may vary. For other modes,
- * slaves arguably should have the
- * same MTUs. To do this, we'd need to
- * take over the slave's change_mtu
- * function for the duration of their
- * servitude.
+ case NETDEV_PRECHANGEMTU:
+ /* The master and slaves should have the
+ * same mtu, so don't permit slaves to
+ * change their mtu independently.
*/
- break;
+ return NOTIFY_BAD;
case NETDEV_CHANGENAME:
/* we don't care if we don't have primary set */
if (!USES_PRIMARY(bond->params.mode) ||
--
1.8.0
^ permalink raw reply related
* Re: [PATCH net-next] bonding: do't permit slaves to change their mtu ndependently
From: David Miller @ 2014-01-24 9:55 UTC (permalink / raw)
To: dingtianhong; +Cc: fubar, vfalico, andy, netdev
In-Reply-To: <52E235DC.1020709@huawei.com>
Diag, your commit messages and subject lines chronically have
many typos.
Please take the time out necessary to correct such errors before
sending your patches.
This subject line shows a complete lack of care.
Subject: Re: [PATCH net-next] bonding: do't permit slaves to change their mtu ndependently
^ permalink raw reply
* Re: [PATCH 4/4] ipv4: mark nexthop as dead when it's subnet becomes unreachable
From: Sergey Popovich @ 2014-01-24 10:12 UTC (permalink / raw)
To: netdev
In-Reply-To: <alpine.LFD.2.11.1401232221010.1667@ja.home.ssi.bg>
> Hello,
Hello, thanks for detailed explanation.
>
>
> My idea was that without calling fib_lookup() as
> done in fib_check_nh() you can not mark NH as dead because
> you are not sure that nh_gw is still reachable in different
> subnet. GW can be part of many subnets! Your patch assumes
> that GW can be part only from one subnet.
Agree, you are completely right.
>
> Lets extend your example in this way:
>
> ip -4 addr add 10.0.0.200/8 dev dummy1
>
> ip route get 10.0.10.5 should return the longest
> match route, 10.0.10.0/24 in our case. If 10.0.10.1 is
> removed ip route get 10.0.10.5 will hit 10.0.0.0/8.
>
> OK, lets delete the last 10.0.10.1 address on dummy1.
>
> Before your patch only fib_sync_down_addr() was called.
> You now call fib_sync_down_dev() with force=0 and ifa, with the
> goal to mark 172.16.0.0/12 as dead (it is unipath route,
> so it will be removed). inet_ifa_match() checks that nh_gw
> 10.0.10.5 is part of the removed subnet: 10.0.10.0/24. Yes, it is.
> Who will check that 10.0.10.5 is still reachable as part of
> another subnet 10.0.0.0/8 ? At this point
> ip -4 route add 172.16.0.0/12 proto static via 10.0.10.5
> should succeed again because fib_check_nh() will see with
> fib_lookup() that 10.0.10.5 is part of 10.0.0.0/8. So, the
> patch wrongly marked the NH as dead.
Really, this true thank you for pointing me on the problem. V3 of patch
makes an attempt to address this. Please review it and tell what do you
thing about.
>
> If last 10.0.10.5 is deleted we can mark NHs with
> nh_gw=10.0.10.5 as dead. It is again expensive (your
> new fib_sync_down_dev call) but this time fib_lookup() is
> not needed because this is a local GW. That is what I mean
> above.
>
> > When deleting address and removing its subnet, instead of removing route
> > from the FIB, resolve new NH with fib_lookup() if possible, as this done
> > in fib_check_nh(), and leave route with modified NH?
>
> I don't say to do it. But it is the only way to
> check if nh_gw is no more reachable.
>
> > Well, sounds good, but what to do with multipath routes?
> > Is this correct at all?
>
> fib_lookup() call is correct but expensive. That is
> why it was not done before.
fib_lookup() do its job well, but I dont think we need it at all because:
1. We known what address is removed (if any).
2. We known device where address is removed.
3. Each route, regargless if this unipath or multipath, has device assigned
for each NH entry (unless NH entry is dead or route is RTN_BLACKHOLE and
friends). fib_check_nh() checks for this.
I think we could simply iterate over interface address list to find if there
are other subnet to which NH gateway resolve, beside removed address if any?
I did tests with v3 patch and seems described problem is gone.
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
--
SP5474-RIPE
Sergey Popovich
^ permalink raw reply
* [PATCH 4/4 v3] ipv4: mark nexthop as dead when it's subnet becomes
From: Sergey Popovich @ 2014-01-24 10:15 UTC (permalink / raw)
To: netdev
In-Reply-To: <alpine.LFD.2.11.1401232221010.1667@ja.home.ssi.bg>
Removing ip address and it's subnet route using fib_del_ifaddr() does
not purge routes with nexthop in such subnet.
This could be easily reproduced with the following config:
ip link add dev dummy1 type dummy
ip link set up dev dummy1
ip -4 addr add 10.0.10.1/24 dev dummy1
ip -4 addr add 10.0.20.1/24 dev dummy1
ip -4 route add 172.16.0.0/12 proto static via 10.0.10.5
ip -4 route show exact 172.16.0.0/12
172.16.0.0/12 via 10.0.10.5 dev dummy1 proto static
ip -4 addr del 10.0.10.1/24 dev dummy1
ip -4 route show exact 172.16.0.0/12
172.16.0.0/12 via 10.0.10.5 dev dummy1 proto static
Add interface address (ifa) parameter to fib_sync_down_dev()
and use it to match nexthop against it's subnet.
Use fib_sync_down_dev() in fib_del_ifaddr() among with fib_sync_down_addr()
to mark as dead routes with nexthop in ifa.
v3. Fix NH marking as dead when NH gateway subnet is still on
interface (e.g. 10.0.10.1/24 and 10.0.30.1/16 and NH is 10.0.10.5).
Thanks to Julian Anastasov.
v2. Fix NH marking as dead when NH created with onlink option.
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru>
---
include/net/ip_fib.h | 3 ++-
net/ipv4/fib_frontend.c | 5 +++--
net/ipv4/fib_semantics.c | 29 +++++++++++++++++++++++++++--
3 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 9922093..0405fc9 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -287,8 +287,9 @@ static inline int fib_num_tclassid_users(struct net *net)
#endif
/* Exported by fib_semantics.c */
+struct in_ifaddr;
int ip_fib_check_default(__be32 gw, struct net_device *dev);
-int fib_sync_down_dev(struct net_device *dev, int force);
+int fib_sync_down_dev(struct net_device *dev, struct in_ifaddr *ifa, int
force);
int fib_sync_down_addr(struct net *net, __be32 local);
int fib_sync_up(struct net_device *dev);
void fib_select_multipath(struct fib_result *res);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index ae5f35f..fd3445e 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -907,7 +907,8 @@ void fib_del_ifaddr(struct in_ifaddr *ifa, struct
in_ifaddr *iprim)
* First of all, we scan fib_info list searching
* for stray nexthop entries, then ignite fib_flush.
*/
- if (fib_sync_down_addr(dev_net(dev), ifa->ifa_local))
+ if (fib_sync_down_addr(dev_net(dev), ifa->ifa_local) +
+ fib_sync_down_dev(dev, ifa, 0))
fib_flush(dev_net(dev));
}
}
@@ -997,7 +998,7 @@ static void nl_fib_lookup_exit(struct net *net)
static void fib_disable_ip(struct net_device *dev, int force)
{
- if (fib_sync_down_dev(dev, force))
+ if (fib_sync_down_dev(dev, NULL, force))
fib_flush(dev_net(dev));
rt_cache_flush(dev_net(dev));
arp_ifdown(dev);
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 9d43468..fbebba5 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1112,7 +1112,29 @@ int fib_sync_down_addr(struct net *net, __be32 local)
return ret;
}
-int fib_sync_down_dev(struct net_device *dev, int force)
+static inline bool fib_sync_down_gw(struct fib_nh *nh,
+ struct in_ifaddr *ifr)
+{
+ if (!ifr)
+ return true;
+
+ if (nh->nh_flags & RTNH_F_ONLINK)
+ return false;
+
+ if (!inet_ifa_match(nh->nh_gw, ifr))
+ return false;
+
+ for_ifa(ifr->ifa_dev) {
+ if (unlikely(ifa == ifr))
+ continue;
+ if (inet_ifa_match(nh->nh_gw, ifa))
+ return false;
+ } endfor_ifa(ifr->ifa_dev);
+
+ return true;
+}
+
+int fib_sync_down_dev(struct net_device *dev, struct in_ifaddr *ifa, int
force)
{
int ret = 0;
int scope = RT_SCOPE_NOWHERE;
@@ -1124,6 +1146,8 @@ int fib_sync_down_dev(struct net_device *dev, int force)
if (force)
scope = -1;
+ BUG_ON(ifa && ifa->ifa_dev->dev != dev);
+
hlist_for_each_entry(nh, head, nh_hash) {
struct fib_info *fi = nh->nh_parent;
int dead;
@@ -1137,7 +1161,8 @@ int fib_sync_down_dev(struct net_device *dev, int force)
if (nexthop_nh->nh_flags & RTNH_F_DEAD)
dead++;
else if (nexthop_nh->nh_dev == dev &&
- nexthop_nh->nh_scope != scope) {
+ nexthop_nh->nh_scope != scope &&
+ fib_sync_down_gw(nexthop_nh, ifa)) {
nexthop_nh->nh_flags |= RTNH_F_DEAD;
#ifdef CONFIG_IP_ROUTE_MULTIPATH
spin_lock_bh(&fib_multipath_lock);
--
1.8.3.4
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox