Netdev List
* Re: [bisected] Re: WARNING: at net/ipv4/devinet.c:1599
From: Geert Uytterhoeven @ 2014-02-04 19:20 UTC
  To: Cong Wang
  Cc: David S. Miller, Jiri Pirko, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAHA+R7Omv++NCg=nocVsW+tM=Zv_qCNhX1JjtFTuCzfz9B3Cig@mail.gmail.com>

On Tue, Feb 4, 2014 at 7:08 PM, Cong Wang <cwang@twopensource.com> wrote:
> On Tue, Feb 4, 2014 at 6:19 AM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>>
>> Anyone with a clue?
>>
>
> Looks like we need:
>
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index ac2dff3..bdbf68b 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -1443,7 +1443,8 @@ static size_t inet_nlmsg_size(void)
>                + nla_total_size(4) /* IFA_LOCAL */
>                + nla_total_size(4) /* IFA_BROADCAST */
>                + nla_total_size(IFNAMSIZ) /* IFA_LABEL */
> -              + nla_total_size(4);  /* IFA_FLAGS */
> +              + nla_total_size(4)  /* IFA_FLAGS */
> +              + nla_total_size(sizeof(struct ifa_cacheinfo)); /* IFA_CACHEINFO */
>  }

Thanks for your suggestion, but it doesn't help :-(

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v2 0/5] net: phy: Ethernet PHY powerdown optimization
From: Sebastian Hesselbarth @ 2014-02-04 19:38 UTC
  To: David Miller
  Cc: f.fainelli, mugunthanvnm, netdev, linux-arm-kernel, linux-kernel,
	Andrew Lunn
In-Reply-To: <20131217.144313.1563516273987934449.davem@davemloft.net>

On 12/17/2013 08:43 PM, David Miller wrote:
> From: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> Date: Fri, 13 Dec 2013 10:20:24 +0100
>
>> This is v2 of the ethernet PHY power optimization patches to reduce
>> power consumption of network PHYs with link that are either unused or
>> the corresponding netdev is down.
>>
>> Compared to the last version, this patch set drops a patch to disable
>> unused PHYs after late initcall, as it is not compatible with a modular
>> mdio bus [1]. I'll investigate different ways to have a modular mdio bus
>> driver get notified when driver loading is done.
>>
>> Again, a branch with v2 applied to v3.13-rc2 can also be found at
>> https://github.com/shesselba/linux-dove.git topic/ethphy-power-v2
>>
>> [1] http://www.spinics.net/lists/arm-kernel/msg293028.html
>
> Series applied, thanks.
>

David, Mugunthan, Florian,

as expected the above patches create a Linux to bootloader dependency
that surfaces dumb bootloaders not initializing PHYs correctly.

Andrew has a Kirkwood based board that does not power-up and restart
auto-negotiation on the powered down PHY after a warm restart. While
this specific bootloader allows a soft-workaround by issuing the
required PHY writes before accessing the interface, others may not.

I think we should allow the user to soft-disable the automatic
power-down of PHYs, e.g. via a kernel parameter.

Do you have any preference for naming it? My call would be something
like libphy.suspend_halted = [0,1] with 1 being the default.

Sebastian


* Re: igb and bnx2: "NETDEV WATCHDOG: transmit queue timed out" when skb has huge linear buffer
From: Michael Chan @ 2014-02-04 19:47 UTC
  To: Zoltan Kiss
  Cc: Zoltan Kiss, Jeff Kirsher, Jesse Brandeburg, Bruce Allan,
	Carolyn Wyborny, Don Skidmore, Greg Rose, Peter P Waskiewicz Jr,
	Alex Duyck, John Ronciak, Tushar Dave, Akeem G Abodunrin,
	David S. Miller, e1000-devel, netdev@vger.kernel.org,
	linux-kernel, xen-devel@lists.xenproject.org
In-Reply-To: <52EBA51E.808@citrix.com>

On Fri, 2014-01-31 at 14:29 +0100, Zoltan Kiss wrote: 
> [ 5417.275472] WARNING: at net/sched/sch_generic.c:255 
> dev_watchdog+0x156/0x1f0()
> [ 5417.275474] NETDEV WATCHDOG: eth1 (bnx2): transmit queue 2 timed out 

The dump shows an internal IRQ pending on MSIX vector 2 which matches
the queue number that is timing out.  I don't know what happened to
the MSIX and why the driver is not seeing it.  Do you see an IRQ error
message from the kernel a few seconds before the tx timeout message?


* [PATCH net v3] xen-netback: Fix Rx stall due to race condition
From: Zoltan Kiss @ 2014-02-04 19:54 UTC
  To: ian.campbell, wei.liu2, xen-devel, netdev, linux-kernel,
	jonathan.davies
  Cc: Zoltan Kiss

The recent patch to fix receive side flow control
(11b57f90257c1d6a91cee720151b69e0c2020cf6: xen-netback: stop vif thread
spinning if frontend is unresponsive) solved the spinning thread problem
but introduced another one. The receive side can stall if:
- [THREAD] xenvif_rx_action sets rx_queue_stopped to true
- [INTERRUPT] interrupt happens, and sets rx_event to true
- [THREAD] then xenvif_kthread sets rx_event to false
- [THREAD] rx_work_todo doesn't return true anymore

Also, if an interrupt is sent but there is still no room in the ring, it takes
quite a long time until xenvif_rx_action realizes it. This patch ditches those
two variables and reworks rx_work_todo. If the thread finds it can't fit more
skbs into the ring, it saves the last slot estimation into rx_last_skb_slots;
otherwise it's kept as 0. Then rx_work_todo will check if:
- there is something to send to the ring (like before)
- there is space for the topmost packet in the queue

I think that's a more natural and optimal thing to test than two bools which
are set somewhere else.

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
---
v2:
- Improve description of the problem

v3:
- Change subject and resend against net to emphasize this is a bugfix

 drivers/net/xen-netback/common.h    |    6 +-----
 drivers/net/xen-netback/interface.c |    1 -
 drivers/net/xen-netback/netback.c   |   16 ++++++----------
 3 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 4c76bcb..ae413a2 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -143,11 +143,7 @@ struct xenvif {
 	char rx_irq_name[IFNAMSIZ+4]; /* DEVNAME-rx */
 	struct xen_netif_rx_back_ring rx;
 	struct sk_buff_head rx_queue;
-	bool rx_queue_stopped;
-	/* Set when the RX interrupt is triggered by the frontend.
-	 * The worker thread may need to wake the queue.
-	 */
-	bool rx_event;
+	RING_IDX rx_last_skb_slots;
 
 	/* This array is allocated seperately as it is large */
 	struct gnttab_copy *grant_copy_op;
diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index b9de31e..7669d49 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -100,7 +100,6 @@ static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id)
 {
 	struct xenvif *vif = dev_id;
 
-	vif->rx_event = true;
 	xenvif_kick_thread(vif);
 
 	return IRQ_HANDLED;
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 2738563..bb241d0 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -477,7 +477,6 @@ static void xenvif_rx_action(struct xenvif *vif)
 	unsigned long offset;
 	struct skb_cb_overlay *sco;
 	bool need_to_notify = false;
-	bool ring_full = false;
 
 	struct netrx_pending_operations npo = {
 		.copy  = vif->grant_copy_op,
@@ -487,7 +486,7 @@ static void xenvif_rx_action(struct xenvif *vif)
 	skb_queue_head_init(&rxq);
 
 	while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
-		int max_slots_needed;
+		RING_IDX max_slots_needed;
 		int i;
 
 		/* We need a cheap worse case estimate for the number of
@@ -510,9 +509,10 @@ static void xenvif_rx_action(struct xenvif *vif)
 		if (!xenvif_rx_ring_slots_available(vif, max_slots_needed)) {
 			skb_queue_head(&vif->rx_queue, skb);
 			need_to_notify = true;
-			ring_full = true;
+			vif->rx_last_skb_slots = max_slots_needed;
 			break;
-		}
+		} else
+			vif->rx_last_skb_slots = 0;
 
 		sco = (struct skb_cb_overlay *)skb->cb;
 		sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
@@ -523,8 +523,6 @@ static void xenvif_rx_action(struct xenvif *vif)
 
 	BUG_ON(npo.meta_prod > ARRAY_SIZE(vif->meta));
 
-	vif->rx_queue_stopped = !npo.copy_prod && ring_full;
-
 	if (!npo.copy_prod)
 		goto done;
 
@@ -1727,8 +1725,8 @@ static struct xen_netif_rx_response *make_rx_response(struct xenvif *vif,
 
 static inline int rx_work_todo(struct xenvif *vif)
 {
-	return (!skb_queue_empty(&vif->rx_queue) && !vif->rx_queue_stopped) ||
-		vif->rx_event;
+	return !skb_queue_empty(&vif->rx_queue) &&
+	       xenvif_rx_ring_slots_available(vif, vif->rx_last_skb_slots);
 }
 
 static inline int tx_work_todo(struct xenvif *vif)
@@ -1814,8 +1812,6 @@ int xenvif_kthread(void *data)
 		if (!skb_queue_empty(&vif->rx_queue))
 			xenvif_rx_action(vif);
 
-		vif->rx_event = false;
-
 		if (skb_queue_empty(&vif->rx_queue) &&
 		    netif_queue_stopped(vif->dev))
 			xenvif_start_queue(vif);
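
For readers following the commit message above, the stall can be replayed in a
small user-space model. This is only a sketch under stated assumptions: struct
vif_model, replay_stall_sequence() and the "free slots >= needed" comparison
are hypothetical stand-ins for struct xenvif, the thread/interrupt interleaving
and xenvif_rx_ring_slots_available(); none of this is the driver code itself.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the relevant parts of struct xenvif. */
struct vif_model {
	unsigned int queued_skbs;       /* packets waiting in rx_queue */
	unsigned int ring_free_slots;   /* free slots in the shared ring */
	bool rx_queue_stopped;          /* old scheme */
	bool rx_event;                  /* old scheme */
	unsigned int rx_last_skb_slots; /* new scheme */
};

/* Old predicate, as removed by the patch. */
static bool old_rx_work_todo(const struct vif_model *vif)
{
	return (vif->queued_skbs > 0 && !vif->rx_queue_stopped) ||
	       vif->rx_event;
}

/* New predicate; the slot check is modelled as a plain comparison. */
static bool new_rx_work_todo(const struct vif_model *vif)
{
	return vif->queued_skbs > 0 &&
	       vif->ring_free_slots >= vif->rx_last_skb_slots;
}

/* Replays the four steps listed in the commit message. */
static void replay_stall_sequence(struct vif_model *vif,
				  unsigned int needed, unsigned int freed)
{
	/* [THREAD] xenvif_rx_action: the head packet does not fit. */
	vif->rx_queue_stopped = true;
	vif->rx_last_skb_slots = needed;

	/* [INTERRUPT] the frontend frees slots and raises the RX interrupt. */
	vif->ring_free_slots += freed;
	vif->rx_event = true;

	/* [THREAD] xenvif_kthread clears the flag at the end of its pass. */
	vif->rx_event = false;
}
```

With one queued packet needing 3 slots and 4 slots freed by the frontend, the
old predicate stays false after the sequence (the stall), while the new one
becomes true. If only 1 slot is freed, the new predicate stays false, so the
thread does not spin either.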


* Re: [PATCH net-next v2] xen-netback: Rework rx_work_todo
From: Zoltan Kiss @ 2014-02-04 19:55 UTC
  To: Wei Liu
  Cc: ian.campbell, xen-devel, netdev, linux-kernel, jonathan.davies,
	David Miller
In-Reply-To: <52F13D24.5090405@citrix.com>

On 04/02/14 19:19, Zoltan Kiss wrote:
> On 20/01/14 16:38, Wei Liu wrote:
>> On Wed, Jan 15, 2014 at 05:11:07PM +0000, Zoltan Kiss wrote:
>>> The recent patch to fix receive side flow control (11b57f) solved the
>>> spinning
>>> thread problem, however caused an another one. The receive side can
>>> stall, if:
>>> - [THREAD] xenvif_rx_action sets rx_queue_stopped to true
>>> - [INTERRUPT] interrupt happens, and sets rx_event to true
>>> - [THREAD] then xenvif_kthread sets rx_event to false
>>> - [THREAD] rx_work_todo doesn't return true anymore
>>>
>>> Also, if interrupt sent but there is still no room in the ring, it
>>> take quite a
>>> long time until xenvif_rx_action realize it. This patch ditch that
>>> two variable,
>>> and rework rx_work_todo. If the thread finds it can't fit more skb's
>>> into the
>>> ring, it saves the last slot estimation into rx_last_skb_slots,
>>> otherwise it's
>>> kept as 0. Then rx_work_todo will check if:
>>> - there is something to send to the ring (like before)
>>> - there is space for the topmost packet in the queue
>>>
>>> I think that's more natural and optimal thing to test than two bool
>>> which are
>>> set somewhere else.
>>>
>>> Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
>>
>> Sorry for the delay.
>>
>> Paul, thanks for reviewing.
>>
>> Acked-by: Wei Liu <wei.liu2@citrix.com>
>
> Hi,
>
> This patch haven't made it to net-next yet, maybe because the subject
> doesn't suggest that this is a bugfix. I suggest to apply it as soon as
> possible, otherwise netback will be quite broken.

I've reposted it with a clearer subject; sorry for being too vague.

Zoli


* large degradation in ip netns add/exec performance in 3.13?
From: Rick Jones @ 2014-02-04 19:57 UTC
  To: netdev

Hi -

I have a dinky little script which creates what I've been calling "fake
routers."  It is far from a complete fake router, but it shows what
appears to be a very large degradation in performance in 3.13 compared
to 3.12.9, which itself is slow compared to a Canonical 3.5.0-44 kernel
with some upstream commits included:


Start/End    Average Rate of Creation per Second
"Router" Count  3.5.0-44+  3.12.9  3.13.0
------------------------------------------------------
0 to 250          7.58      5.56    2.55
250 to 500        7.14      5.81    2.55
500 to 750        6.41      5.56    2.55
750 to 1000       6.10      4.90    2.55
1000 to 1250      5.68      4.39    2.50
1250 to 1500      5.21      4.24    2.36
1500 to 1750      5.00      3.85    2.23
1750 to 2000      4.81      3.62    2.21
2000 to 2250      4.55      3.47    2.21
2250 to 2500      4.31      3.29    2.14
2500 to 2750      4.03      3.09    2.05
2750 to 3000      3.73      3.09    2.02
3000 to 3250      3.62      2.81    2.02
3250 to 3500      3.38      2.72    1.97
3500 to 3750      3.21      2.55    1.92
3750 to 4000      3.01      2.48    1.87

Here is what the script is doing each time it is called:

#Assumed to be called as add_fake_router <sudo>
SUDO=$1
j=`uuidgen`
$SUDO ip netns add bar-${j}
$SUDO ip netns exec bar-${j} ip link set lo up
$SUDO ip netns exec bar-${j} sysctl -w net.ipv4.ip_forward=1 > /dev/null
k=`echo $j | cut -b -11`
$SUDO /home/rjones2/iproute2_tot/ip/ip link add ro-${k} type veth peer \
    name ri-${k} netns bar-${j}
$SUDO /home/rjones2/iproute2_tot/ip/ip link add go-${k} type veth peer \
    name gi-${k} netns bar-${j}

There is a calling script which provides a timestamp every 250 fake routers.

I am using a top of trunk ip command to get commit
f0124b0f0aa0e5b9288114eb8e6ff9b4f8c33ec8 which removed an unnecessary
ll_init_map call.  The sudo I am using is one which is also top of its
trunk, with an option enabled to not grab the list of interfaces on
the system each time it is called.

This is all a single stream of creations.  The added commits to the 
Canonical 3.5.0-44 kernel (apart from whatever they've put into -44) are 
665e205c16c1f902ac6763b8ce8a0a3a1dcefe59 and 
32263dd1b43378b4f7d7796ed713f77e95f27e8a plus 
http://marc.info/?l=linux-netdev&m=138546807821170&w=2

The system on which I can reproduce this is mine for the duration, so I 
can experiment with things and gather other information as requested.  I 
have some timings of the individual commands of the script at 0 and 4000 
"fake routers" present which I can provide if desired.  My look at it
thus far suggests it may be in the exec but I'm not certain.

happy benchmarking,

rick jones


* Re: [PATCH RESEND net-next v3 0/2] bonding: Fix some issues for fail_over_mac
From: Jay Vosburgh @ 2014-02-04 20:00 UTC
  To: Ding Tianhong; +Cc: Veaceslav Falico, David S. Miller, Netdev, Andy Gospodarek
In-Reply-To: <52E3447B.6050206@huawei.com>

Ding Tianhong <dingtianhong@huawei.com> wrote:

>The parameter fail_over_mac only affect active-backup mode, if it was
>set to active or follow and works with other modes, just like RR or XOR
>mode, the bonding could not set all slaves to the master's address, it
>will cause the slave could not work well with master.
>
>v1->v2: According Jay's suggestion, that we should permit setting an option
>	at any time, but only have it take effect in active-backup mode, so
>	I add mode checking together with fail_over_mac during enslavement and
>	rebuild the patches.
>
>v2->v3: The correct way to fix the problem is that we should not add restrictions when
>    	setting options, just need to modify the bond enslave and removal processing
>    	to check the mode in addition to fail_over_mac when setting a slave's MAC during
>    	enslavement. The change active slave processing already only calls the fail_over_mac
>    	function when in active-backup mode.
>
>	Remove the cleanup patch because the net-next is frozen now.
>
>Regards
>Ding

	Both patches look good to me.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

	-J
	
---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com


* Re: [bisected] Re: WARNING: at net/ipv4/devinet.c:1599
From: Geert Uytterhoeven @ 2014-02-04 20:03 UTC
  To: Cong Wang
  Cc: David S. Miller, Jiri Pirko, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAMuHMdWUR-x=fwG26xg8AsUQwL+FXLaQAnkaWHbrVaep0QzHJQ@mail.gmail.com>

On Tue, Feb 4, 2014 at 8:20 PM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> On Tue, Feb 4, 2014 at 7:08 PM, Cong Wang <cwang@twopensource.com> wrote:
>> On Tue, Feb 4, 2014 at 6:19 AM, Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>>>
>>> Anyone with a clue?
>>>
>>
>> Looks like we need:
>>
>> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
>> index ac2dff3..bdbf68b 100644
>> --- a/net/ipv4/devinet.c
>> +++ b/net/ipv4/devinet.c
>> @@ -1443,7 +1443,8 @@ static size_t inet_nlmsg_size(void)
>>                + nla_total_size(4) /* IFA_LOCAL */
>>                + nla_total_size(4) /* IFA_BROADCAST */
>>                + nla_total_size(IFNAMSIZ) /* IFA_LABEL */
>> -              + nla_total_size(4);  /* IFA_FLAGS */
>> +              + nla_total_size(4)  /* IFA_FLAGS */
>> +              + nla_total_size(sizeof(struct ifa_cacheinfo)); /* IFA_CACHEINFO */
>>  }
>
> Thanks for your suggestion, but it doesn't help :-(

Bummer, I applied it to the wrong tree.

Yes, it works, thanks a lot!

David, Jiri, is this the right fix, or just a band-aid?
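
As background to the question, here is a user-space sketch of the size
arithmetic behind the quoted hunk. The sketch_* names are hypothetical; the
alignment rule (4 bytes) and attribute header length (4 bytes) follow the
netlink attribute format, and the struct mirrors the four 32-bit fields of
struct ifa_cacheinfo. Only the attributes visible in the hunk are counted,
so the totals are not the full inet_nlmsg_size() result.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NLA_HDRLEN 4u   /* aligned size of struct nlattr */
#define SKETCH_IFNAMSIZ   16u  /* value of IFNAMSIZ */

/* Same layout as struct ifa_cacheinfo: four 32-bit fields. */
struct sketch_ifa_cacheinfo {
	uint32_t ifa_prefered;
	uint32_t ifa_valid;
	uint32_t cstamp;
	uint32_t tstamp;
};

/* Round up to the 4-byte netlink attribute alignment. */
static size_t sketch_nla_align(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Space one attribute occupies: header plus payload, aligned. */
static size_t sketch_nla_total_size(size_t payload)
{
	return sketch_nla_align(SKETCH_NLA_HDRLEN + payload);
}

/* Sum of the attributes visible in the hunk, before or after the patch. */
static size_t sketch_visible_attrs_size(int with_cacheinfo)
{
	size_t size = sketch_nla_total_size(4)                /* IFA_LOCAL */
		    + sketch_nla_total_size(4)                /* IFA_BROADCAST */
		    + sketch_nla_total_size(SKETCH_IFNAMSIZ)  /* IFA_LABEL */
		    + sketch_nla_total_size(4);               /* IFA_FLAGS */

	if (with_cacheinfo)
		size += sketch_nla_total_size(sizeof(struct sketch_ifa_cacheinfo));
	return size;
}
```

Under these assumptions the estimate without IFA_CACHEINFO is 20 bytes short
(44 versus 64 bytes for the visible attributes), which would be consistent
with a message-fill failure whenever the cacheinfo attribute is emitted.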

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* [PATCH net-next] bnx2[x]: Make module parameters readable
From: James M Leddy @ 2014-02-04 20:10 UTC
  To: netdev; +Cc: Michael Chan, Ariel Elior

Occasionally users want to know what parameters their Broadcom drivers
are running with. For example, a user may want to know if MSI is
disabled.

This patch has been compile tested.

Signed-off-by: James M Leddy <james.leddy@redhat.com>
---
 drivers/net/ethernet/broadcom/bnx2.c             |    2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c |   12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 9d2deda..cda25ac 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -85,7 +85,7 @@ MODULE_FIRMWARE(FW_RV2P_FILE_09_Ax);
 
 static int disable_msi = 0;
 
-module_param(disable_msi, int, 0);
+module_param(disable_msi, int, S_IRUGO);
 MODULE_PARM_DESC(disable_msi, "Disable Message Signaled Interrupt (MSI)");
 
 typedef enum {
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index e118a3e..639e287 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -95,29 +95,29 @@ MODULE_FIRMWARE(FW_FILE_NAME_E1H);
 MODULE_FIRMWARE(FW_FILE_NAME_E2);
 
 int bnx2x_num_queues;
-module_param_named(num_queues, bnx2x_num_queues, int, 0);
+module_param_named(num_queues, bnx2x_num_queues, int, S_IRUGO);
 MODULE_PARM_DESC(num_queues,
 		 " Set number of queues (default is as a number of CPUs)");
 
 static int disable_tpa;
-module_param(disable_tpa, int, 0);
+module_param(disable_tpa, int, S_IRUGO);
 MODULE_PARM_DESC(disable_tpa, " Disable the TPA (LRO) feature");
 
 static int int_mode;
-module_param(int_mode, int, 0);
+module_param(int_mode, int, S_IRUGO);
 MODULE_PARM_DESC(int_mode, " Force interrupt mode other than MSI-X "
 				"(1 INT#x; 2 MSI)");
 
 static int dropless_fc;
-module_param(dropless_fc, int, 0);
+module_param(dropless_fc, int, S_IRUGO);
 MODULE_PARM_DESC(dropless_fc, " Pause on exhausted host ring");
 
 static int mrrs = -1;
-module_param(mrrs, int, 0);
+module_param(mrrs, int, S_IRUGO);
 MODULE_PARM_DESC(mrrs, " Force Max Read Req Size (0..3) (for debug)");
 
 static int debug;
-module_param(debug, int, 0);
+module_param(debug, int, S_IRUGO);
 MODULE_PARM_DESC(debug, " Default debug msglevel");
 
 struct workqueue_struct *bnx2x_wq;
-- 
1.7.1
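
A possible way to consume the now-readable parameters from user space is
sketched below. The helper names are hypothetical; the only things assumed
about the kernel are the usual /sys/module/<module>/parameters/<param> layout
and that S_IRUGO means read permission for user, group and other (0444).

```c
#include <stdio.h>
#include <sys/stat.h>

/* User-space spelling of the kernel's S_IRUGO. */
#define SKETCH_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

/* Build the sysfs path for a module parameter; returns snprintf()'s result. */
static int sketch_param_path(char *buf, size_t len,
			     const char *module, const char *param)
{
	return snprintf(buf, len, "/sys/module/%s/parameters/%s",
			module, param);
}

/* Read an int parameter; returns 0 on success, -1 if absent or malformed. */
static int sketch_read_param_int(const char *module, const char *param,
				 int *out)
{
	char path[256];
	FILE *f;
	int n, ok;

	n = sketch_param_path(path, sizeof(path), module, param);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1; /* e.g. the parameter was registered with perm 0 */

	ok = fscanf(f, "%d", out) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

For example, sketch_read_param_int("bnx2", "disable_msi", &v) would be
expected to fail on a kernel without this patch and succeed with it, provided
the bnx2 module is loaded.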


* Re: [PATCH V3] net/dt: Add support for overriding phy configuration from device tree
From: Florian Fainelli @ 2014-02-04 20:39 UTC
  To: Matthew Garrett
  Cc: netdev, devicetree@vger.kernel.org, linux-kernel@vger.kernel.org,
	Kishon Vijay Abraham I
In-Reply-To: <1389999459-9483-1-git-send-email-matthew.garrett@nebula.com>

2014-01-17 Matthew Garrett <matthew.garrett@nebula.com>:
> Some hardware may be broken in interesting and board-specific ways, such
> that various bits of functionality don't work. This patch provides a
> mechanism for overriding mii registers during init based on the contents of
> the device tree data, allowing board-specific fixups without having to
> pollute generic code.

It would be good to explain exactly how your hardware is broken. I
really do not think that such a fine-grained setting, where you could
disable e.g. 100BaseT_Full but allow 100BaseT_Half to remain usable,
makes that much sense. In general, Gigabit might be badly broken, but
100 and 10Mbits/sec should work fine. How about the MASTER-SLAVE bit,
is overriding it really required?

Isn't a PHY fixup registered for a specific OUI the solution you are
looking for? I am also concerned that this makes PHY troubleshooting
much harder than before, as we may have no idea how much information
has been put in the Device Tree to override the defaults.

Finally, how about making this more general just like the BCM87xx PHY
driver, which is supplied value/reg pairs directly? There are 16
common MII registers, and 16 others for vendor specific registers,
this is just covering for about 2% of the possible changes.
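
For reference, the masked read-modify-write that the quoted patch performs in
phy_override_from_of() can be sketched in user space as follows. The sk_*
names and struct sk_override are hypothetical; the three bit values are the
MII_CTRL1000 bits from <linux/mii.h>, copied here so the sketch stands alone.

```c
#include <stdint.h>

/* MII_CTRL1000 bits, copied from <linux/mii.h>. */
#define SK_ADVERTISE_1000FULL    0x0200
#define SK_CTL1000_AS_MASTER     0x0800
#define SK_CTL1000_ENABLE_MASTER 0x1000

/* One device tree property: which bit it controls and what it asks for. */
struct sk_override {
	uint16_t bit;
	int present; /* property exists in the device tree */
	int enable;  /* property value is non-zero */
};

/* Fold a list of properties into a value/mask pair for one register. */
static void sk_collect(const struct sk_override *list, int count,
		       uint16_t *val, uint16_t *mask)
{
	int i;

	*val = 0;
	*mask = 0;
	for (i = 0; i < count; i++) {
		if (!list[i].present)
			continue; /* untouched bits keep the hardware default */
		if (list[i].enable)
			*val |= list[i].bit;
		*mask |= list[i].bit;
	}
}

/* Clear every overridden bit, then set the requested ones. */
static uint16_t sk_apply(uint16_t regval, uint16_t val, uint16_t mask)
{
	return (uint16_t)((regval & ~mask) | val);
}
```

With the properties from the binding example (1000full = 0, manual-master = 1,
as-master = 0), a register that initially advertises 1000 half and full
(0x0300) ends up as 0x1100: full duplex is cleared, half duplex is left alone,
and manual slave configuration is forced.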

>
> Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
> ---
>
> V3: Break the main function out into some helper functions and store the
> values in some structs.
>
>  Documentation/devicetree/bindings/net/phy.txt | 21 +++++++
>  drivers/net/phy/phy_device.c                  | 29 ++++++++-
>  drivers/of/of_net.c                           | 87 +++++++++++++++++++++++++++
>  include/linux/of_net.h                        | 12 ++++
>  4 files changed, 148 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/net/phy.txt b/Documentation/devicetree/bindings/net/phy.txt
> index 7cd18fb..fc50f02 100644
> --- a/Documentation/devicetree/bindings/net/phy.txt
> +++ b/Documentation/devicetree/bindings/net/phy.txt
> @@ -23,6 +23,21 @@ Optional Properties:
>    assume clause 22. The compatible list may also contain other
>    elements.
>
> +The following properties may be added to either the phy node or the parent
> +ethernet device. If not present, the hardware defaults will be used.
> +
> +- phy-mii-advertise-10half: 1 to advertise half-duplex 10MBit, 0 to disable
> +- phy-mii-advertise-10full: 1 to advertise full-duplex 10MBit, 0 to disable
> +- phy-mii-advertise-100half: 1 to advertise half-duplex 100MBit, 0 to disable
> +- phy-mii-advertise-100full: 1 to advertise full-duplex 100MBit, 0 to disable
> +- phy-mii-advertise-100base4: 1 to advertise 100base4, 0 to disable
> +- phy-mii-advertise-1000half: 1 to advertise half-duplex 1000MBit, 0 to disable
> +- phy-mii-advertise-1000full: 1 to advertise full-duplex 1000MBit, 0 to disable
> +- phy-mii-manual-master: 1 to enable manual master/slave configuration, 0
> +  to disable manual master/slave configuration
> +- phy-mii-as-master: 1 to configure phy to act as master, 0 to configure phy
> +  to act as slave. Ignored if manual master/slave configuration is not enabled.
> +
>  Example:
>
>  ethernet-phy@0 {
> @@ -32,4 +47,10 @@ ethernet-phy@0 {
>         interrupts = <35 1>;
>         reg = <0>;
>         device_type = "ethernet-phy";
> +       // Disable advertising of full duplex 1000MBit
> +       phy-mii-advertise-1000full = <0>;
> +       // Force manual phy master/slave configuration
> +       phy-mii-manual-master = <1>;
> +       // Forcibly configure phy as slave
> +       phy-mii-as-master = <0>;
>  };
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index d6447b3..91793bc 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -33,6 +33,7 @@
>  #include <linux/mii.h>
>  #include <linux/ethtool.h>
>  #include <linux/phy.h>
> +#include <linux/of_net.h>
>
>  #include <asm/io.h>
>  #include <asm/irq.h>
> @@ -497,6 +498,28 @@ void phy_disconnect(struct phy_device *phydev)
>  }
>  EXPORT_SYMBOL(phy_disconnect);
>
> +int phy_override_from_of(struct phy_device *phydev)
> +{
> +       int reg, regval;
> +       u16 val, mask;
> +
> +       /* Check for phy register overrides from OF */
> +       for (reg = 0; reg < 16; reg++) {
> +               if (!of_get_mii_register(phydev, reg, &val, &mask)) {
> +                       if (!mask)
> +                               continue;
> +                       regval = phy_read(phydev, reg);
> +                       if (regval < 0)
> +                               continue;
> +                       regval &= ~mask;
> +                       regval |= val;
> +                       phy_write(phydev, reg, regval);
> +               }
> +       }
> +
> +       return 0;
> +}
> +
>  int phy_init_hw(struct phy_device *phydev)
>  {
>         int ret;
> @@ -508,7 +531,11 @@ int phy_init_hw(struct phy_device *phydev)
>         if (ret < 0)
>                 return ret;
>
> -       return phydev->drv->config_init(phydev);
> +       ret = phydev->drv->config_init(phydev);
> +       if (ret < 0)
> +               return ret;
> +
> +       return phy_override_from_of(phydev);
>  }
>
>  /**
> diff --git a/drivers/of/of_net.c b/drivers/of/of_net.c
> index 8f9be2e..75751b7 100644
> --- a/drivers/of/of_net.c
> +++ b/drivers/of/of_net.c
> @@ -93,3 +93,90 @@ const void *of_get_mac_address(struct device_node *np)
>         return NULL;
>  }
>  EXPORT_SYMBOL(of_get_mac_address);
> +
> +struct mii_override {
> +       char *prop;
> +       u32 supported;
> +       u16 value;
> +};
> +
> +static struct mii_override mii_advertise_override[] = {
> +       { "phy-mii-advertise-10half", SUPPORTED_10baseT_Half,
> +         ADVERTISE_10HALF },
> +       { "phy-mii-advertise-10full", SUPPORTED_10baseT_Full,
> +         ADVERTISE_10FULL },
> +       { "phy-mii-advertise-100half", SUPPORTED_100baseT_Half,
> +         ADVERTISE_100HALF },
> +       { "phy-mii-advertise-100full", SUPPORTED_100baseT_Full,
> +         ADVERTISE_100FULL },
> +       { "phy-mii-advertise-100base4", 0, ADVERTISE_100BASE4 },
> +       { NULL },
> +};
> +
> +static struct mii_override mii_gigabit_override[] = {
> +       { "phy-mii-advertise-1000half", SUPPORTED_1000baseT_Half,
> +         ADVERTISE_1000HALF },
> +       { "phy-mii-advertise-1000full", SUPPORTED_1000baseT_Full,
> +         ADVERTISE_1000FULL },
> +       { "phy-mii-as-master", 0, CTL1000_AS_MASTER },
> +       { "phy-mii-manual-master", 0, CTL1000_ENABLE_MASTER },
> +       { NULL },
> +};
> +
> +static void mii_handle_override(struct mii_override *override_list,
> +                               struct phy_device *phydev, u16 *val, u16 *mask)
> +{
> +       struct device *dev = &phydev->dev;
> +       struct device_node *np = dev->of_node;
> +       struct mii_override *override;
> +       u32 tmp;
> +
> +       if (!np && dev->parent->of_node)
> +               np = dev->parent->of_node;
> +
> +       if (!np)
> +               return;
> +
> +       for (override = &override_list[0]; override->prop != NULL; override++) {
> +               if (!of_property_read_u32(np, override->prop, &tmp)) {
> +                       if (tmp) {
> +                               *val |= override->value;
> +                               phydev->advertising |= override->supported;
> +                       } else {
> +                               phydev->advertising &= ~(override->supported);
> +                       }
> +
> +                       *mask |= override->value;
> +               }
> +       }
> +
> +       return;
> +}
> +
> +/**
> + * Provide phy register overrides from the device tree. Some hardware may
> + * be broken in interesting and board-specific ways, so we want a mechanism
> + * for the board data to provide overrides for default values. This should be
> + * called during phy init.
> + */
> +int of_get_mii_register(struct phy_device *phydev, int reg, u16 *val,
> +                       u16 *mask)
> +{
> +       *val = 0;
> +       *mask = 0;
> +
> +       switch (reg) {
> +       case MII_ADVERTISE:
> +               mii_handle_override(mii_advertise_override, phydev, val, mask);
> +               break;
> +
> +       case MII_CTRL1000:
> +               mii_handle_override(mii_gigabit_override, phydev, val, mask);
> +               break;
> +
> +       default:
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
> +EXPORT_SYMBOL(of_get_mii_register);
> diff --git a/include/linux/of_net.h b/include/linux/of_net.h
> index 34597c8..2e478bc 100644
> --- a/include/linux/of_net.h
> +++ b/include/linux/of_net.h
> @@ -7,10 +7,14 @@
>  #ifndef __LINUX_OF_NET_H
>  #define __LINUX_OF_NET_H
>
> +#include <linux/phy.h>
> +
>  #ifdef CONFIG_OF_NET
>  #include <linux/of.h>
>  extern int of_get_phy_mode(struct device_node *np);
>  extern const void *of_get_mac_address(struct device_node *np);
> +extern int of_get_mii_register(struct phy_device *np, int reg, u16 *val,
> +                              u16 *mask);
>  #else
>  static inline int of_get_phy_mode(struct device_node *np)
>  {
> @@ -21,6 +25,14 @@ static inline const void *of_get_mac_address(struct device_node *np)
>  {
>         return NULL;
>  }
> +static inline int of_get_mii_register(struct phy_device *np, int reg, u16 *val,
> +                                     u16 *mask)
> +{
> +       *val = 0;
> +       *mask = 0;
> +
> +       return -EINVAL;
> +}
>  #endif
>
>  #endif /* __LINUX_OF_NET_H */
> --
> 1.8.4.2
>



-- 
Florian


* [GIT PULL] tree-wide: clean up no longer required #include <linux/init.h>
From: Paul Gortmaker @ 2014-02-04 20:51 UTC
  To: torvalds
  Cc: Paul Gortmaker, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linuxppc-dev, linux-s390,
	sparclinux, x86, netdev, kvm, sfr, rusty, gregkh, akpm

We've had this in linux-next for 2+ weeks (thanks Stephen!) as a
linux-stable like queue of patches, and as can be seen here:

  https://git.kernel.org/pub/scm/linux/kernel/git/paulg/init.git

most of the changes in the last week have been trivial: adding acks
or dropping patches that maintainers decided to take themselves.

With -rc1 now containing what was in linux-next, the queue applies
to that baseline w/o issue, and I've redone comprehensive multi
arch build testing on the -rc1 baseline as a final sanity check.

Original RFC discussion and patch posting is here, if needed:

  https://lkml.org/lkml/2014/1/21/434

Suggested merge text follows:

   ----------------------------8<-----------------------------
Summary - We removed cpuinit and devinit, which left ~2000 instances
of include <linux/init.h> that were no longer needed.  To fully enable
this removal/cleanup, we relocate module_init() from init.h into
module.h.  Multi arch/multi config build testing on linux-next has
been used to find and fix any implicit header dependencies prior to
deploying the actual init.h --> module.h move, to preserve bisection.

Additional details:

module_init/module_exit and friends moved to module.h
-----------------------------------------------------
Aside from enabling this init.h cleanup to extend into modular files,
it actually does make sense.  For all modules will use some form of
our initfunc processing/categorization, but not all initfunc users
will be necessarily using modular functionality.  So we move these
module related macros to module.h and ensure module.h sources init.h


module_init in non modular code:
--------------------------------
This series uncovered that we are enabling people to use module_init
in non-modular code.  While that works fine, there are at least four
reasons why it probably should not be encouraged:

 1) it makes a casual reader of the code assume the code is modular
    even though it is obj-y (builtin) or controlled by a bool Kconfig.

 2) it makes it too easy to add dead code in a function that is handed
    to module_exit() -- [more on that below]

 3) it breaks our ability to use priority sorted initcalls properly
    [more on that below.]

 4) on some files, the use of module.h vs. init.h can cost a ~10%
    increase in the number of lines output from CPP.

After this change, a new coder who tries to make use of module_init in
non modular code would find themselves also needing to include the
module.h header.  At which point the odds are greater that they would
ask themselves "Am I doing this right?  I shouldn't need this."

Note that existing non-modular code that already includes module.h and
uses module_init doesn't get fixed here, since they already build w/o
errors triggered by this change; we'll have to hunt them down later.


module_init and initcall ordering:
----------------------------------
We have a group of about ten priority sorted initcalls, that are
called in init/main.c after most of the hard coded/direct calls
have been processed.  These serve the purpose of avoiding everyone
poking at init/main.c to hook in their init sequence.  The bins are:

        pure_initcall               0
        core_initcall               1
        postcore_initcall           2
        arch_initcall               3
        subsys_initcall             4
        fs_initcall                 5
        device_initcall             6
        late_initcall               7

These are meant to eventually replace users of the non specific
priority "__initcall" which currently maps onto device_initcall.
This is of interest, because in non-modular code, cpp does this:

    module_init -->  __initcall --> device_initcall

So all module_init() calls land in the device_initcall bucket, rather
late in the sequence.  That makes sense, since if it was a module, the
init could be real late (days, weeks after boot).  But now imagine you
add support for some non-modular bus/arch/infrastructure (say, PCI)
and you use module_init for it.  That means anybody else who wants
to use your subsystem can't do so if they use an initcall of 0 --> 5
priority.  For a real world example of this, see patch #1 in this series:

	https://lkml.org/lkml/2014/1/14/809

We don't want to force code that is clearly arch or subsys or fs
specific to have to use the device_initcall just because something
else has been mistakenly put (or left) in that bucket.  So a couple of
changes do actually change the initcall level where it is clearly
appropriate to do so.  Those are called out explicitly in their
respective commit logs.
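
As a rough illustration of the ordering problem described above, here is
a small userspace model (not kernel code).  The level names mirror the
bins listed earlier; everything else (register_initcall, boot_model,
bus_init, client_init, the log strings) is hypothetical and exists only
for this sketch.  It assumes a "bus" that a client at fs_initcall level
depends on, and shows what happens when the bus is registered at the
device level (where module_init lands in built-in code) versus at the
subsys level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace model only: the levels mirror the bins listed in the mail. */
enum initcall_level {
	LEVEL_PURE = 0, LEVEL_CORE, LEVEL_POSTCORE, LEVEL_ARCH,
	LEVEL_SUBSYS, LEVEL_FS, LEVEL_DEVICE, LEVEL_LATE, LEVEL_COUNT
};

typedef int (*initcall_t)(void);

struct initcall_entry {
	enum initcall_level level;
	initcall_t fn;
};

#define MAX_INITCALLS 16
static struct initcall_entry initcall_table[MAX_INITCALLS];
static size_t initcall_count;

static char boot_log[64];	/* records the order in which calls ran */
static int bus_ready;

/* Hypothetical registration helper; the kernel uses linker sections. */
static void register_initcall(enum initcall_level level, initcall_t fn)
{
	assert(initcall_count < MAX_INITCALLS);
	initcall_table[initcall_count].level = level;
	initcall_table[initcall_count].fn = fn;
	initcall_count++;
}

/* Hypothetical non-modular bus/infrastructure init. */
static int bus_init(void)
{
	bus_ready = 1;
	strcat(boot_log, "bus ");
	return 0;
}

/* Hypothetical user of the bus; fails if the bus is not up yet. */
static int client_init(void)
{
	strcat(boot_log, bus_ready ? "client-ok " : "client-nobus ");
	return bus_ready ? 0 : -1;
}

/* Run every registered call, lowest level first. */
static void run_initcalls(void)
{
	for (int level = 0; level < LEVEL_COUNT; level++)
		for (size_t i = 0; i < initcall_count; i++)
			if (initcall_table[i].level == (enum initcall_level)level)
				initcall_table[i].fn();
}

/*
 * Boot the model with the bus registered at bus_level and the client
 * registered as an fs-level call; returns the resulting log.
 */
const char *boot_model(enum initcall_level bus_level)
{
	initcall_count = 0;
	bus_ready = 0;
	boot_log[0] = '\0';

	register_initcall(bus_level, bus_init);
	register_initcall(LEVEL_FS, client_init);
	run_initcalls();
	return boot_log;
}
```

In this model, boot_model(LEVEL_DEVICE) leaves the log as
"client-nobus bus ", since the fs-level client runs before the bus;
boot_model(LEVEL_SUBSYS) gives "bus client-ok ".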


module_exit and dead code
-------------------------
Built-in code will never have an opportunity to call functions that
are registered with module_exit(), so this series deletes any such dead
code it uncovered.  Note that any built-in code that
was already including module.h and using module_exit won't have shown
up as breakage on the build coverage of this series, so we'll have to
find those independently later.  It looks like there may be quite a
few that are invisibly created via module_platform_driver -- a macro
that creates module_init and module_exit automatically.  We may want
to consider relocating module_platform_driver into module.h later...


cpuinit
-------
To finalize the removal of cpuinit, which was done several releases
ago, we remove the remaining stub functions from init.h in this
series.  We've seen one or two "users" try to creep back in, so this
will close the door on that chapter and prevent creep.

   ----------------------------8<-----------------------------

Thanks,
Paul.
---

Cc: linux-alpha@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: x86@kernel.org
Cc: netdev@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: sfr@canb.auug.org.au
Cc: rusty@rustcorp.com.au
Cc: gregkh@linuxfoundation.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org


The following changes since commit 38dbfb59d1175ef458d006556061adeaa8751b72:

  Linus 3.14-rc1 (2014-02-02 16:42:13 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux.git tags/init-cleanup

for you to fetch changes up to a830e2e2777c893e5bfdaa370d6375023e8cd2a5:

  include: remove needless instances of <linux/init.h> (2014-02-03 16:39:14 -0500)

----------------------------------------------------------------
Cleanup of <linux/init.h> for 3.14-rc1

----------------------------------------------------------------
Paul Gortmaker (77):
      init: delete the __cpuinit related stubs
      kernel: audit/fix non-modular users of module_init in core code
      mm: replace module_init usages with subsys_initcall in nommu.c
      fs/notify: don't use module_init for non-modular inotify_user code
      netfilter: don't use module_init/exit in core IPV4 code
      x86: don't use module_init in non-modular intel_mid_vrtc.c
      x86: don't use module_init for non-modular core bootflag code
      x86: replace __init_or_module with __init in non-modular vsmp_64.c
      x86: don't use module_init in non-modular devicetree.c code
      drivers/tty/hvc: don't use module_init in non-modular hyp. console code
      staging: don't use module_init in non-modular ion_dummy_driver.c
      powerpc: use device_initcall for registering rtc devices
      powerpc: use subsys_initcall for Freescale Local Bus
      powerpc: don't use module_init for non-modular core hugetlb code
      powerpc: don't use module_init in non-modular 83xx suspend code
      arm: include module.h in drivers/bus/omap_l3_smx.c
      arm: fix implicit module.h use in mach-at91 gpio.h
      arm: fix implicit #include <linux/init.h> in entry asm.
      arm: mach-s3c64xx mach-crag6410-module.c is not modular
      arm: use subsys_initcall in non-modular pl320 IPC code
      arm: don't use module_init in non-modular mach-vexpress/spc.c code
      alpha: don't use module_init for non-modular core code
      m68k: don't use module_init in non-modular mvme16x/rtc.c code
      ia64: don't use module_init for non-modular core kernel/mca.c code
      ia64: don't use module_init in non-modular sim/simscsi.c code
      mips: make loongsoon serial driver explicitly modular
      mips: don't use module_init in non-modular sead3-mtd.c code
      cris: don't use module_init for non-modular core intmem.c code
      parisc: don't use module_init for non-modular core pdc_cons code
      parisc64: don't use module_init for non-modular core perf code
      mn10300: don't use module_init in non-modular flash.c code
      sh: don't use module_init in non-modular psw.c code
      sh: mach-highlander/psw.c is tristate and should use module.h
      xtensa: don't use module_init for non-modular core network.c code
      drivers/clk: don't use module_init in clk-nomadik.c which is non-modular
      cpuidle: don't use modular platform register in non-modular ARM drivers
      drivers/platform: don't use modular register in non-modular pdev_bus.c
      module: relocate module_init from init.h to module.h
      logo: emit "#include <linux/init.h> in autogenerated C file
      arm: delete non-required instances of include <linux/init.h>
      mips: restore init.h usage to arch/mips/ar7/time.c
      s390: delete non-required instances of include <linux/init.h>
      alpha: delete non-required instances of <linux/init.h>
      powerpc: delete another unrequired instance of <linux/init.h>
      arm64: delete non-required instances of <linux/init.h>
      watchdog: delete non-required instances of include <linux/init.h>
      video: delete non-required instances of include <linux/init.h>
      rtc: delete non-required instances of include <linux/init.h>
      scsi: delete non-required instances of include <linux/init.h>
      spi: delete non-required instances of include <linux/init.h>
      acpi: delete non-required instances of include <linux/init.h>
      drivers/power: delete non-required instances of include <linux/init.h>
      drivers/media: delete non-required instances of include <linux/init.h>
      drivers/ata: delete non-required instances of include <linux/init.h>
      drivers/hwmon: delete non-required instances of include <linux/init.h>
      drivers/pinctrl: delete non-required instances of include <linux/init.h>
      drivers/isdn: delete non-required instances of include <linux/init.h>
      drivers/leds: delete non-required instances of include <linux/init.h>
      drivers/pcmcia: delete non-required instances of include <linux/init.h>
      drivers/char: delete non-required instances of include <linux/init.h>
      drivers/infiniband: delete non-required instances of include <linux/init.h>
      drivers/mfd: delete non-required instances of include <linux/init.h>
      drivers/gpio: delete non-required instances of include <linux/init.h>
      drivers/bluetooth: delete non-required instances of include <linux/init.h>
      drivers/mmc: delete non-required instances of include <linux/init.h>
      drivers/crypto: delete non-required instances of include <linux/init.h>
      drivers/platform: delete non-required instances of include <linux/init.h>
      drivers/misc: delete non-required instances of include <linux/init.h>
      drivers/edac: delete non-required instances of include <linux/init.h>
      drivers/macintosh: delete non-required instances of include <linux/init.h>
      drivers/base: delete non-required instances of include <linux/init.h>
      drivers/cpufreq: delete non-required instances of <linux/init.h>
      drivers/pci: delete non-required instances of <linux/init.h>
      drivers/dma: delete non-required instances of <linux/init.h>
      drivers/gpu: delete non-required instances of <linux/init.h>
      drivers: delete remaining non-required instances of <linux/init.h>
      include: remove needless instances of <linux/init.h>

 arch/alpha/kernel/err_ev6.c                        |  1 -
 arch/alpha/kernel/irq.c                            |  1 -
 arch/alpha/kernel/srmcons.c                        |  3 +-
 arch/alpha/kernel/traps.c                          |  1 -
 arch/alpha/oprofile/op_model_ev4.c                 |  1 -
 arch/alpha/oprofile/op_model_ev5.c                 |  1 -
 arch/alpha/oprofile/op_model_ev6.c                 |  1 -
 arch/alpha/oprofile/op_model_ev67.c                |  1 -
 arch/arm/common/dmabounce.c                        |  1 -
 arch/arm/firmware/trusted_foundations.c            |  1 -
 arch/arm/include/asm/arch_timer.h                  |  1 -
 arch/arm/kernel/entry-armv.S                       |  2 +
 arch/arm/kernel/entry-header.S                     |  1 -
 arch/arm/kernel/hyp-stub.S                         |  1 -
 arch/arm/kernel/suspend.c                          |  1 -
 arch/arm/kernel/unwind.c                           |  1 -
 arch/arm/mach-at91/include/mach/gpio.h             |  1 +
 arch/arm/mach-cns3xxx/pm.c                         |  1 -
 arch/arm/mach-exynos/headsmp.S                     |  1 -
 arch/arm/mach-footbridge/personal.c                |  1 -
 arch/arm/mach-imx/headsmp.S                        |  1 -
 arch/arm/mach-imx/iomux-v3.c                       |  1 -
 arch/arm/mach-iop33x/uart.c                        |  1 -
 arch/arm/mach-msm/headsmp.S                        |  1 -
 arch/arm/mach-msm/proc_comm.h                      |  1 -
 arch/arm/mach-mvebu/headsmp.S                      |  1 -
 arch/arm/mach-netx/fb.c                            |  1 -
 arch/arm/mach-netx/pfifo.c                         |  1 -
 arch/arm/mach-netx/xc.c                            |  1 -
 arch/arm/mach-nspire/clcd.c                        |  1 -
 arch/arm/mach-omap1/fpga.c                         |  1 -
 arch/arm/mach-omap1/include/mach/serial.h          |  1 -
 arch/arm/mach-omap2/omap-headsmp.S                 |  1 -
 arch/arm/mach-omap2/omap3-restart.c                |  1 -
 arch/arm/mach-omap2/vc3xxx_data.c                  |  1 -
 arch/arm/mach-omap2/vc44xx_data.c                  |  1 -
 arch/arm/mach-omap2/vp3xxx_data.c                  |  1 -
 arch/arm/mach-omap2/vp44xx_data.c                  |  1 -
 arch/arm/mach-prima2/headsmp.S                     |  1 -
 arch/arm/mach-pxa/clock-pxa2xx.c                   |  1 -
 arch/arm/mach-pxa/clock-pxa3xx.c                   |  1 -
 arch/arm/mach-pxa/corgi_pm.c                       |  1 -
 arch/arm/mach-pxa/mfp-pxa3xx.c                     |  1 -
 arch/arm/mach-pxa/spitz_pm.c                       |  1 -
 arch/arm/mach-s3c24xx/clock-s3c244x.c              |  1 -
 arch/arm/mach-s3c24xx/iotiming-s3c2410.c           |  1 -
 arch/arm/mach-s3c24xx/iotiming-s3c2412.c           |  1 -
 arch/arm/mach-s3c24xx/irq-pm.c                     |  1 -
 arch/arm/mach-s3c24xx/pm.c                         |  1 -
 arch/arm/mach-s3c64xx/mach-crag6410-module.c       |  2 +-
 arch/arm/mach-s5p64x0/clock.c                      |  1 -
 arch/arm/mach-sa1100/ssp.c                         |  1 -
 arch/arm/mach-shmobile/headsmp-scu.S               |  1 -
 arch/arm/mach-shmobile/headsmp.S                   |  1 -
 arch/arm/mach-shmobile/platsmp.c                   |  1 -
 arch/arm/mach-shmobile/sleep-sh7372.S              |  1 -
 arch/arm/mach-socfpga/headsmp.S                    |  1 -
 arch/arm/mach-sti/headsmp.S                        |  1 -
 arch/arm/mach-sunxi/headsmp.S                      |  1 -
 arch/arm/mach-tegra/flowctrl.c                     |  1 -
 arch/arm/mach-tegra/headsmp.S                      |  1 -
 arch/arm/mach-tegra/reset-handler.S                |  1 -
 arch/arm/mach-u300/dummyspichip.c                  |  1 -
 arch/arm/mach-ux500/board-mop500-audio.c           |  1 -
 arch/arm/mach-ux500/headsmp.S                      |  1 -
 arch/arm/mach-versatile/versatile_ab.c             |  1 -
 arch/arm/mach-vexpress/spc.c                       |  2 +-
 arch/arm/mach-zynq/headsmp.S                       |  1 -
 arch/arm/mm/hugetlbpage.c                          |  1 -
 arch/arm/plat-iop/i2c.c                            |  1 -
 arch/arm/plat-samsung/pm-check.c                   |  1 -
 arch/arm/plat-samsung/pm-gpio.c                    |  1 -
 arch/arm/plat-samsung/s5p-irq-pm.c                 |  1 -
 arch/arm/plat-versatile/headsmp.S                  |  1 -
 arch/arm/plat-versatile/platsmp.c                  |  1 -
 arch/arm/vfp/entry.S                               |  2 +
 arch/arm64/include/asm/arch_timer.h                |  1 -
 arch/arm64/kernel/cputable.c                       |  2 -
 arch/arm64/kernel/entry.S                          |  1 -
 arch/arm64/kernel/hyp-stub.S                       |  1 -
 arch/arm64/kernel/process.c                        |  1 -
 arch/arm64/kernel/ptrace.c                         |  1 -
 arch/arm64/kernel/smp_spin_table.c                 |  1 -
 arch/arm64/kernel/vdso/vdso.S                      |  1 -
 arch/arm64/lib/delay.c                             |  1 -
 arch/arm64/mm/cache.S                              |  1 -
 arch/arm64/mm/proc.S                               |  1 -
 arch/cris/arch-v32/mm/intmem.c                     |  3 +-
 arch/ia64/hp/sim/simscsi.c                         | 11 +---
 arch/ia64/sn/kernel/mca.c                          |  3 +-
 arch/m68k/mvme16x/rtc.c                            |  2 +-
 arch/mips/ar7/time.c                               |  1 +
 arch/mips/loongson/common/serial.c                 |  9 ++-
 arch/mips/mti-sead3/sead3-mtd.c                    |  3 +-
 arch/mn10300/unit-asb2303/flash.c                  |  3 +-
 arch/parisc/kernel/pdc_cons.c                      |  3 +-
 arch/parisc/kernel/perf.c                          |  3 +-
 arch/powerpc/kernel/time.c                         |  2 +-
 arch/powerpc/mm/hugetlbpage.c                      |  2 +-
 arch/powerpc/platforms/83xx/suspend.c              |  3 +-
 arch/powerpc/platforms/ps3/time.c                  |  3 +-
 arch/powerpc/sysdev/fsl_lbc.c                      |  2 +-
 arch/powerpc/sysdev/indirect_pci.c                 |  1 -
 arch/sh/boards/mach-highlander/psw.c               |  2 +-
 arch/sh/boards/mach-landisk/psw.c                  |  2 +-
 arch/x86/kernel/bootflag.c                         |  2 +-
 arch/x86/kernel/devicetree.c                       |  2 +-
 arch/x86/kernel/vsmp_64.c                          |  2 +-
 arch/x86/platform/intel-mid/intel_mid_vrtc.c       |  3 +-
 arch/xtensa/platforms/iss/network.c                |  4 +-
 drivers/acpi/apei/apei-base.c                      |  1 -
 drivers/acpi/button.c                              |  1 -

 [ ... snip ~1000 lines of trivial driver diffstat ... ]

 drivers/watchdog/wdt_pci.c                         |  1 -
 drivers/xen/xen-stub.c                             |  1 -
 fs/notify/inotify/inotify_user.c                   |  4 +-
 include/drm/drmP.h                                 |  1 -
 include/linux/fb.h                                 |  1 -
 include/linux/ide.h                                |  1 -
 include/linux/init.h                               | 77 ----------------------
 include/linux/kdb.h                                |  1 -
 include/linux/linux_logo.h                         |  3 -
 include/linux/lsm_audit.h                          |  1 -
 include/linux/module.h                             | 72 ++++++++++++++++++++
 include/linux/moduleparam.h                        |  1 -
 include/linux/netfilter.h                          |  1 -
 include/linux/nls.h                                |  2 +-
 include/linux/percpu_ida.h                         |  1 -
 include/linux/profile.h                            |  1 -
 include/linux/pstore_ram.h                         |  1 -
 include/linux/usb/gadget.h                         |  1 -
 include/xen/xenbus.h                               |  1 -
 kernel/hung_task.c                                 |  3 +-
 kernel/kexec.c                                     |  4 +-
 kernel/profile.c                                   |  2 +-
 kernel/sched/stats.c                               |  2 +-
 kernel/user.c                                      |  3 +-
 kernel/user_namespace.c                            |  2 +-
 mm/nommu.c                                         |  4 +-
 net/ipv4/netfilter.c                               |  9 +--
 scripts/pnmtologo.c                                |  1 +
 scripts/tags.sh                                    |  2 +-
 1115 files changed, 148 insertions(+), 1273 deletions(-)

^ permalink raw reply

* Re: [PATCH] fdtable: Avoid triggering OOMs from alloc_fdmem
From: David Rientjes @ 2014-02-04 21:27 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Eric Dumazet, Andrew Morton, linux-kernel, linux-fsdevel, netdev,
	linux-mm
In-Reply-To: <871tzirdwf.fsf@xmission.com>

On Tue, 4 Feb 2014, Eric W. Biederman wrote:

> My gut feel says if there is a code path that has __GFP_NOWARN and
> because of PAGE_ALLOC_COSTLY_ORDER we loop forever then there is
> something fishy going on.
> 

The __GFP_NOWARN without __GFP_NORETRY in alloc_fdmem() is pointless 
because we already know that the allocation is PAGE_ALLOC_COSTLY_ORDER or 
smaller.  That function encodes specific knowledge of the page allocator's 
implementation so it leads me to believe that __GFP_NOWARN was intended to 
be __GFP_NORETRY from the start.  Otherwise, it's just set pointlessly and 
specifically allows for the oom killing that you're now reporting.  Since 
it can fall back to vmalloc() after exhausting all of the page allocator's 
capabilities, the __GFP_NOWARN|__GFP_NORETRY seems entirely appropriate.

The vmalloc() has never been called in this function because of the 
infinite loop in kmalloc() because of its allocation context, but it 
definitely seems better than oom killing something.
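
To make the flag discussion concrete, here is a deliberately simplified
userspace decision model, not the kernel's allocator.  The names
model_alloc_fdmem, MODEL_PAGE_SIZE, MODEL_COSTLY_ORDER and the
contiguous_available parameter are all hypothetical; the only behaviour
it encodes is what is described above: for requests at or below the
costly-order size, a failed attempt without __GFP_NORETRY can end in the
oom killer, while with __GFP_NORETRY it returns NULL and the caller can
fall back to vmalloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE    4096u	/* assumed page size for the model */
#define MODEL_COSTLY_ORDER 3u	/* stands in for PAGE_ALLOC_COSTLY_ORDER */

enum alloc_path {
	PATH_KMALLOC,	/* physically contiguous allocation succeeded */
	PATH_VMALLOC,	/* fell back to the vmalloc() path */
	PATH_OOM_KILL	/* allocator kept trying and resorted to the oom killer */
};

/*
 * Which path a request takes in this model.
 * noretry models passing __GFP_NORETRY; contiguous_available models
 * whether the page allocator can satisfy the request without reclaim.
 */
enum alloc_path model_alloc_fdmem(size_t size, bool noretry,
				  bool contiguous_available)
{
	assert(size > 0);	/* the model only covers non-empty requests */

	if (size <= ((size_t)MODEL_PAGE_SIZE << MODEL_COSTLY_ORDER)) {
		if (contiguous_available)
			return PATH_KMALLOC;
		if (!noretry)
			return PATH_OOM_KILL;
		/* with noretry the attempt fails quietly; fall through */
	}
	return PATH_VMALLOC;
}
```

With these assumed constants, a 32768-byte request under memory pressure
takes PATH_OOM_KILL without noretry and PATH_VMALLOC with it; anything
larger skips the contiguous attempt entirely.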

Acked-by: David Rientjes <rientjes@google.com>

> I would love to hear some people who are more current on the mm
> subsystem than I am chime in.  It might be that the darn fix is going to
> be to teach __alloc_pages_slowpath to not loop forever, unless order == 0.

It doesn't loop forever, it will either return NULL because of its 
allocation context or the oom killer will kill something, even for order-3 
allocations.  In the case that you've modified, you have sane fallback 
behavior that can be utilized rather than the oom killer and __GFP_NORETRY 
was reasonable from the start.

The question is simple enough: do we want to change 
PAGE_ALLOC_COSTLY_ORDER to be smaller so that order-3 does return NULL 
without oom killing?  Perhaps there's an argument to be made for doing 
exactly that, but by not setting __GFP_NORETRY you are really demanding 
order-3 memory at the time you allocate it and are willing to accept the 
consequences to free that memory.  Should we make everything except for 
order-0 inherently __GFP_NORETRY and introduce a replacement __GFP_RETRY?  
That's doable as well, but it would be a massive effort.


^ permalink raw reply

* Re: [GIT PULL] tree-wide: clean up no longer required #include <linux/init.h>
From: Paul Gortmaker @ 2014-02-04 21:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: linux-arch, linux-mips, linux-m68k, rusty, linux-ia64, kvm, sfr,
	gregkh, x86, netdev, linux-alpha, sparclinux, akpm, linuxppc-dev,
	linux-arm-kernel, linux-s390
In-Reply-To: <1391547118-21967-1-git-send-email-paul.gortmaker@windriver.com>


On Feb 4, 2014 3:52 PM, "Paul Gortmaker" <paul.gortmaker@windriver.com>
wrote:
>
> We've had this in linux-next for 2+ weeks (thanks Stephen!) as a
> linux-stable like queue of patches, and as can be seen here:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/paulg/init.git

Argh, above link is meant for cloning, not viewing.

This should be better...

https://git.kernel.org/cgit/linux/kernel/git/paulg/init.git/

Thanks,
Paul.
--

> most of the changes in the last week have been trivial adding acks
> or dropping patches that maintainers decided to take themselves.
>
> With -rc1 now containing what was in linux-next, the queue applies
> to that baseline w/o issue, and I've redone comprehensive multi
> arch build testing on the -rc1 baseline as a final sanity check.
>
> Original RFC discussion and patch posting is here, if needed:
>
>   https://lkml.org/lkml/2014/1/21/434
>
> Suggested merge text follows:
>
>    ----------------------------8<-----------------------------
> Summary - We removed cpuinit and devinit, which left ~2000 instances
> of include <linux/init.h> that were no longer needed.  To fully enable
> this removal/cleanup, we relocate module_init() from init.h into
> module.h.  Multi arch/multi config build testing on linux-next has
> been used to find and fix any implicit header dependencies prior to
> deploying the actual init.h --> module.h move, to preserve bisection.
>
> Additional details:
>
> module_init/module_exit and friends moved to module.h
> -----------------------------------------------------
> Aside from enabling this init.h cleanup to extend into modular files,
> it actually does make sense.  For all modules will use some form of
> our initfunc processing/categorization, but not all initfunc users
> will be necessarily using modular functionality.  So we move these
> module related macros to module.h and ensure module.h sources init.h
>
>
> module_init in non modular code:
> --------------------------------
> This series uncovered that we are enabling people to use module_init
> in non-modular code.  While that works fine, there are at least three
> reasons why it probably should not be encouraged:
>
>  1) it makes a casual reader of the code assume the code is modular
>     even though it is obj-y (builtin) or controlled by a bool Kconfig.
>
>  2) it makes it too easy to add dead code in a function that is handed
>     to module_exit() -- [more on that below]
>
>  3) it breaks our ability to use priority sorted initcalls properly
>     [more on that below.]
>
>  4) on some files, the use of module.h vs. init.h can cost a ~10%
>     increase in the number of lines output from CPP.
>
> After this change, a new coder who tries to make use of module_init in
> non modular code would find themselves also needing to include the
> module.h header.  At which point the odds are greater that they would
> ask themselves "Am I doing this right?  I shouldn't need this."
>
> Note that existing non-modular code that already includes module.h and
> uses module_init doesn't get fixed here, since they already build w/o
> errors triggered by this change; we'll have to hunt them down later.
>
>
> module_init and initcall ordering:
> ----------------------------------
> We have a group of about ten priority sorted initcalls, that are
> called in init/main.c after most of the hard coded/direct calls
> have been processed.  These serve the purpose of avoiding everyone
> poking at init/main.c to hook in their init sequence.  The bins are:
>
>         pure_initcall               0
>         core_initcall               1
>         postcore_initcall           2
>         arch_initcall               3
>         subsys_initcall             4
>         fs_initcall                 5
>         device_initcall             6
>         late_initcall               7
>
> These are meant to eventually replace users of the non specific
> priority "__initcall" which currently maps onto device_initcall.
> This is of interest, because in non-modular code, cpp does this:
>
>     module_init -->  __initcall --> device_initcall
>
> So all module_init() land in the device_initcall bucket, rather late
> in the sequence.  That makes sense, since if it was a module, the init
> could be real late (days, weeks after boot).  But now imagine you add
> support for some non-modular bus/arch/infrastructure (say for e.g. PCI)
> and you use module_init for it.  That means anybody else who wants
> to use your subsystem can't do so if they use an initcall of 0 --> 5
> priority.  For a real world example of this, see patch #1 in this series:
>
>         https://lkml.org/lkml/2014/1/14/809
>
> We don't want to force code that is clearly arch or subsys or fs
> specific to have to use the device_initcall just because something
> else has been mistakenly put (or left) in that bucket.  So a couple of
> changes do actually change the initcall level where it is inevitably
> appropriate to do so.  Those are called out explicitly in their
> respective commit logs.
>
>
> module_exit and dead code
> -------------------------
> Built in code will never have an opportunity to call functions that
> are registered with module_exit(), so any cases of that uncovered in
> this series delete that dead code.  Note that any built-in code that
> was already including module.h and using module_exit won't have shown
> up as breakage on the build coverage of this series, so we'll have to
> find those independently later.  It looks like there may be quite a
> few that are invisibly created via module_platform_driver -- a macro
> that creates module_init and module_exit automatically.  We may want
> to consider relocating module_platform_driver into module.h later...
>
>
> cpuinit
> -------
> To finalize the removal of cpuinit, which was done several releases
> ago, we remove the remaining stub functions from init.h in this
> series.  We've seen one or two "users" try to creep back in, so this
> will close the door on that chapter and prevent creep.
>
>    ----------------------------8<-----------------------------
>
> Thanks,
> Paul.
> ---
>
> Cc: linux-alpha@vger.kernel.org
> Cc: linux-arch@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-ia64@vger.kernel.org
> Cc: linux-m68k@lists.linux-m68k.org
> Cc: linux-mips@linux-mips.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-s390@vger.kernel.org
> Cc: sparclinux@vger.kernel.org
> Cc: x86@kernel.org
> Cc: netdev@vger.kernel.org
> Cc: kvm@vger.kernel.org
> Cc: sfr@canb.auug.org.au
> Cc: rusty@rustcorp.com.au
> Cc: gregkh@linuxfoundation.org
> Cc: akpm@linux-foundation.org
> Cc: torvalds@linux-foundation.org
>
>
> The following changes since commit 38dbfb59d1175ef458d006556061adeaa8751b72:
>
>   Linus 3.14-rc1 (2014-02-02 16:42:13 -0800)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux.git tags/init-cleanup
>
> for you to fetch changes up to a830e2e2777c893e5bfdaa370d6375023e8cd2a5:
>
>   include: remove needless instances of <linux/init.h> (2014-02-03 16:39:14 -0500)
>
> ----------------------------------------------------------------
> Cleanup of <linux/init.h> for 3.14-rc1
>
> ----------------------------------------------------------------
> Paul Gortmaker (77):
>       init: delete the __cpuinit related stubs
>       kernel: audit/fix non-modular users of module_init in core code
>       mm: replace module_init usages with subsys_initcall in nommu.c
>       fs/notify: don't use module_init for non-modular inotify_user code
>       netfilter: don't use module_init/exit in core IPV4 code
>       x86: don't use module_init in non-modular intel_mid_vrtc.c
>       x86: don't use module_init for non-modular core bootflag code
>       x86: replace __init_or_module with __init in non-modular vsmp_64.c
>       x86: don't use module_init in non-modular devicetree.c code
>       drivers/tty/hvc: don't use module_init in non-modular hyp. console code
>       staging: don't use module_init in non-modular ion_dummy_driver.c
>       powerpc: use device_initcall for registering rtc devices
>       powerpc: use subsys_initcall for Freescale Local Bus
>       powerpc: don't use module_init for non-modular core hugetlb code
>       powerpc: don't use module_init in non-modular 83xx suspend code
>       arm: include module.h in drivers/bus/omap_l3_smx.c
>       arm: fix implicit module.h use in mach-at91 gpio.h
>       arm: fix implicit #include <linux/init.h> in entry asm.
>       arm: mach-s3c64xx mach-crag6410-module.c is not modular
>       arm: use subsys_initcall in non-modular pl320 IPC code
>       arm: don't use module_init in non-modular mach-vexpress/spc.c code
>       alpha: don't use module_init for non-modular core code
>       m68k: don't use module_init in non-modular mvme16x/rtc.c code
>       ia64: don't use module_init for non-modular core kernel/mca.c code
>       ia64: don't use module_init in non-modular sim/simscsi.c code
>       mips: make loongsoon serial driver explicitly modular
>       mips: don't use module_init in non-modular sead3-mtd.c code
>       cris: don't use module_init for non-modular core intmem.c code
>       parisc: don't use module_init for non-modular core pdc_cons code
>       parisc64: don't use module_init for non-modular core perf code
>       mn10300: don't use module_init in non-modular flash.c code
>       sh: don't use module_init in non-modular psw.c code
>       sh: mach-highlander/psw.c is tristate and should use module.h
>       xtensa: don't use module_init for non-modular core network.c code
>       drivers/clk: don't use module_init in clk-nomadik.c which is non-modular
>       cpuidle: don't use modular platform register in non-modular ARM drivers
>       drivers/platform: don't use modular register in non-modular pdev_bus.c
>       module: relocate module_init from init.h to module.h
>       logo: emit "#include <linux/init.h>" in autogenerated C file
>       arm: delete non-required instances of include <linux/init.h>
>       mips: restore init.h usage to arch/mips/ar7/time.c
>       s390: delete non-required instances of include <linux/init.h>
>       alpha: delete non-required instances of <linux/init.h>
>       powerpc: delete another unrequired instance of <linux/init.h>
>       arm64: delete non-required instances of <linux/init.h>
>       watchdog: delete non-required instances of include <linux/init.h>
>       video: delete non-required instances of include <linux/init.h>
>       rtc: delete non-required instances of include <linux/init.h>
>       scsi: delete non-required instances of include <linux/init.h>
>       spi: delete non-required instances of include <linux/init.h>
>       acpi: delete non-required instances of include <linux/init.h>
>       drivers/power: delete non-required instances of include <linux/init.h>
>       drivers/media: delete non-required instances of include <linux/init.h>
>       drivers/ata: delete non-required instances of include <linux/init.h>
>       drivers/hwmon: delete non-required instances of include <linux/init.h>
>       drivers/pinctrl: delete non-required instances of include <linux/init.h>
>       drivers/isdn: delete non-required instances of include <linux/init.h>
>       drivers/leds: delete non-required instances of include <linux/init.h>
>       drivers/pcmcia: delete non-required instances of include <linux/init.h>
>       drivers/char: delete non-required instances of include <linux/init.h>
>       drivers/infiniband: delete non-required instances of include <linux/init.h>
>       drivers/mfd: delete non-required instances of include <linux/init.h>
>       drivers/gpio: delete non-required instances of include <linux/init.h>
>       drivers/bluetooth: delete non-required instances of include <linux/init.h>
>       drivers/mmc: delete non-required instances of include <linux/init.h>
>       drivers/crypto: delete non-required instances of include <linux/init.h>
>       drivers/platform: delete non-required instances of include <linux/init.h>
>       drivers/misc: delete non-required instances of include <linux/init.h>
>       drivers/edac: delete non-required instances of include <linux/init.h>
>       drivers/macintosh: delete non-required instances of include <linux/init.h>
>       drivers/base: delete non-required instances of include <linux/init.h>
>       drivers/cpufreq: delete non-required instances of <linux/init.h>
>       drivers/pci: delete non-required instances of <linux/init.h>
>       drivers/dma: delete non-required instances of <linux/init.h>
>       drivers/gpu: delete non-required instances of <linux/init.h>
>       drivers: delete remaining non-required instances of <linux/init.h>
>       include: remove needless instances of <linux/init.h>
>
>  arch/alpha/kernel/err_ev6.c                        |  1 -
>  arch/alpha/kernel/irq.c                            |  1 -
>  arch/alpha/kernel/srmcons.c                        |  3 +-
>  arch/alpha/kernel/traps.c                          |  1 -
>  arch/alpha/oprofile/op_model_ev4.c                 |  1 -
>  arch/alpha/oprofile/op_model_ev5.c                 |  1 -
>  arch/alpha/oprofile/op_model_ev6.c                 |  1 -
>  arch/alpha/oprofile/op_model_ev67.c                |  1 -
>  arch/arm/common/dmabounce.c                        |  1 -
>  arch/arm/firmware/trusted_foundations.c            |  1 -
>  arch/arm/include/asm/arch_timer.h                  |  1 -
>  arch/arm/kernel/entry-armv.S                       |  2 +
>  arch/arm/kernel/entry-header.S                     |  1 -
>  arch/arm/kernel/hyp-stub.S                         |  1 -
>  arch/arm/kernel/suspend.c                          |  1 -
>  arch/arm/kernel/unwind.c                           |  1 -
>  arch/arm/mach-at91/include/mach/gpio.h             |  1 +
>  arch/arm/mach-cns3xxx/pm.c                         |  1 -
>  arch/arm/mach-exynos/headsmp.S                     |  1 -
>  arch/arm/mach-footbridge/personal.c                |  1 -
>  arch/arm/mach-imx/headsmp.S                        |  1 -
>  arch/arm/mach-imx/iomux-v3.c                       |  1 -
>  arch/arm/mach-iop33x/uart.c                        |  1 -
>  arch/arm/mach-msm/headsmp.S                        |  1 -
>  arch/arm/mach-msm/proc_comm.h                      |  1 -
>  arch/arm/mach-mvebu/headsmp.S                      |  1 -
>  arch/arm/mach-netx/fb.c                            |  1 -
>  arch/arm/mach-netx/pfifo.c                         |  1 -
>  arch/arm/mach-netx/xc.c                            |  1 -
>  arch/arm/mach-nspire/clcd.c                        |  1 -
>  arch/arm/mach-omap1/fpga.c                         |  1 -
>  arch/arm/mach-omap1/include/mach/serial.h          |  1 -
>  arch/arm/mach-omap2/omap-headsmp.S                 |  1 -
>  arch/arm/mach-omap2/omap3-restart.c                |  1 -
>  arch/arm/mach-omap2/vc3xxx_data.c                  |  1 -
>  arch/arm/mach-omap2/vc44xx_data.c                  |  1 -
>  arch/arm/mach-omap2/vp3xxx_data.c                  |  1 -
>  arch/arm/mach-omap2/vp44xx_data.c                  |  1 -
>  arch/arm/mach-prima2/headsmp.S                     |  1 -
>  arch/arm/mach-pxa/clock-pxa2xx.c                   |  1 -
>  arch/arm/mach-pxa/clock-pxa3xx.c                   |  1 -
>  arch/arm/mach-pxa/corgi_pm.c                       |  1 -
>  arch/arm/mach-pxa/mfp-pxa3xx.c                     |  1 -
>  arch/arm/mach-pxa/spitz_pm.c                       |  1 -
>  arch/arm/mach-s3c24xx/clock-s3c244x.c              |  1 -
>  arch/arm/mach-s3c24xx/iotiming-s3c2410.c           |  1 -
>  arch/arm/mach-s3c24xx/iotiming-s3c2412.c           |  1 -
>  arch/arm/mach-s3c24xx/irq-pm.c                     |  1 -
>  arch/arm/mach-s3c24xx/pm.c                         |  1 -
>  arch/arm/mach-s3c64xx/mach-crag6410-module.c       |  2 +-
>  arch/arm/mach-s5p64x0/clock.c                      |  1 -
>  arch/arm/mach-sa1100/ssp.c                         |  1 -
>  arch/arm/mach-shmobile/headsmp-scu.S               |  1 -
>  arch/arm/mach-shmobile/headsmp.S                   |  1 -
>  arch/arm/mach-shmobile/platsmp.c                   |  1 -
>  arch/arm/mach-shmobile/sleep-sh7372.S              |  1 -
>  arch/arm/mach-socfpga/headsmp.S                    |  1 -
>  arch/arm/mach-sti/headsmp.S                        |  1 -
>  arch/arm/mach-sunxi/headsmp.S                      |  1 -
>  arch/arm/mach-tegra/flowctrl.c                     |  1 -
>  arch/arm/mach-tegra/headsmp.S                      |  1 -
>  arch/arm/mach-tegra/reset-handler.S                |  1 -
>  arch/arm/mach-u300/dummyspichip.c                  |  1 -
>  arch/arm/mach-ux500/board-mop500-audio.c           |  1 -
>  arch/arm/mach-ux500/headsmp.S                      |  1 -
>  arch/arm/mach-versatile/versatile_ab.c             |  1 -
>  arch/arm/mach-vexpress/spc.c                       |  2 +-
>  arch/arm/mach-zynq/headsmp.S                       |  1 -
>  arch/arm/mm/hugetlbpage.c                          |  1 -
>  arch/arm/plat-iop/i2c.c                            |  1 -
>  arch/arm/plat-samsung/pm-check.c                   |  1 -
>  arch/arm/plat-samsung/pm-gpio.c                    |  1 -
>  arch/arm/plat-samsung/s5p-irq-pm.c                 |  1 -
>  arch/arm/plat-versatile/headsmp.S                  |  1 -
>  arch/arm/plat-versatile/platsmp.c                  |  1 -
>  arch/arm/vfp/entry.S                               |  2 +
>  arch/arm64/include/asm/arch_timer.h                |  1 -
>  arch/arm64/kernel/cputable.c                       |  2 -
>  arch/arm64/kernel/entry.S                          |  1 -
>  arch/arm64/kernel/hyp-stub.S                       |  1 -
>  arch/arm64/kernel/process.c                        |  1 -
>  arch/arm64/kernel/ptrace.c                         |  1 -
>  arch/arm64/kernel/smp_spin_table.c                 |  1 -
>  arch/arm64/kernel/vdso/vdso.S                      |  1 -
>  arch/arm64/lib/delay.c                             |  1 -
>  arch/arm64/mm/cache.S                              |  1 -
>  arch/arm64/mm/proc.S                               |  1 -
>  arch/cris/arch-v32/mm/intmem.c                     |  3 +-
>  arch/ia64/hp/sim/simscsi.c                         | 11 +---
>  arch/ia64/sn/kernel/mca.c                          |  3 +-
>  arch/m68k/mvme16x/rtc.c                            |  2 +-
>  arch/mips/ar7/time.c                               |  1 +
>  arch/mips/loongson/common/serial.c                 |  9 ++-
>  arch/mips/mti-sead3/sead3-mtd.c                    |  3 +-
>  arch/mn10300/unit-asb2303/flash.c                  |  3 +-
>  arch/parisc/kernel/pdc_cons.c                      |  3 +-
>  arch/parisc/kernel/perf.c                          |  3 +-
>  arch/powerpc/kernel/time.c                         |  2 +-
>  arch/powerpc/mm/hugetlbpage.c                      |  2 +-
>  arch/powerpc/platforms/83xx/suspend.c              |  3 +-
>  arch/powerpc/platforms/ps3/time.c                  |  3 +-
>  arch/powerpc/sysdev/fsl_lbc.c                      |  2 +-
>  arch/powerpc/sysdev/indirect_pci.c                 |  1 -
>  arch/sh/boards/mach-highlander/psw.c               |  2 +-
>  arch/sh/boards/mach-landisk/psw.c                  |  2 +-
>  arch/x86/kernel/bootflag.c                         |  2 +-
>  arch/x86/kernel/devicetree.c                       |  2 +-
>  arch/x86/kernel/vsmp_64.c                          |  2 +-
>  arch/x86/platform/intel-mid/intel_mid_vrtc.c       |  3 +-
>  arch/xtensa/platforms/iss/network.c                |  4 +-
>  drivers/acpi/apei/apei-base.c                      |  1 -
>  drivers/acpi/button.c                              |  1 -
>
>  [ ... snip ~1000 lines of trivial driver diffstat ... ]
>
>  drivers/watchdog/wdt_pci.c                         |  1 -
>  drivers/xen/xen-stub.c                             |  1 -
>  fs/notify/inotify/inotify_user.c                   |  4 +-
>  include/drm/drmP.h                                 |  1 -
>  include/linux/fb.h                                 |  1 -
>  include/linux/ide.h                                |  1 -
>  include/linux/init.h                               | 77 ----------------------
>  include/linux/kdb.h                                |  1 -
>  include/linux/linux_logo.h                         |  3 -
>  include/linux/lsm_audit.h                          |  1 -
>  include/linux/module.h                             | 72 ++++++++++++++++++++
>  include/linux/moduleparam.h                        |  1 -
>  include/linux/netfilter.h                          |  1 -
>  include/linux/nls.h                                |  2 +-
>  include/linux/percpu_ida.h                         |  1 -
>  include/linux/profile.h                            |  1 -
>  include/linux/pstore_ram.h                         |  1 -
>  include/linux/usb/gadget.h                         |  1 -
>  include/xen/xenbus.h                               |  1 -
>  kernel/hung_task.c                                 |  3 +-
>  kernel/kexec.c                                     |  4 +-
>  kernel/profile.c                                   |  2 +-
>  kernel/sched/stats.c                               |  2 +-
>  kernel/user.c                                      |  3 +-
>  kernel/user_namespace.c                            |  2 +-
>  mm/nommu.c                                         |  4 +-
>  net/ipv4/netfilter.c                               |  9 +--
>  scripts/pnmtologo.c                                |  1 +
>  scripts/tags.sh                                    |  2 +-
>  1115 files changed, 148 insertions(+), 1273 deletions(-)
>

[-- Attachment #1.2: Type: text/html, Size: 26478 bytes --]


^ permalink raw reply

* Re: igb and bnx2: "NETDEV WATCHDOG: transmit queue timed out" when skb has huge linear buffer
From: Zoltan Kiss @ 2014-02-04 21:32 UTC (permalink / raw)
  To: Wei Liu, Zoltan Kiss
  Cc: linux-kernel, Carolyn, Tushar, e1000-devel, Michael Chan,
	Bruce Allan, Jesse Brandeburg, David S. Miller, John Ronciak,
	netdev@vger.kernel.org, xen-devel@lists.xenproject.org, Peter
In-Reply-To: <20140131185619.GB27553@zion.uk.xensource.com>

On 31/01/14 18:56, Wei Liu wrote:
> On Thu, Jan 30, 2014 at 07:08:11PM +0000, Zoltan Kiss wrote:
>> Hi,
>>
>> I've experienced some queue timeout problems mentioned in the
>> subject with igb and bnx2 cards. I haven't seen them on other cards
>> so far. I'm using XenServer with a 3.10 Dom0 kernel (however, igb was
>> already updated to the latest version), and there are Windows guests
>> sending data through these cards. I noticed these problems in XenRT
>> test runs, and I know that they usually mean some lost interrupt
>> problem or other hardware error, but in my case they started to
>> appear more often, and they are likely connected to my netback grant
>> mapping patches. These patches cause skbs with huge (~64KB)
>> linear buffers to appear more often.
>> The reason for that is an old problem in the ring protocol:
>> originally the maximum number of slots was linked to MAX_SKB_FRAGS,
>> as every slot ended up as a frag of the skb. When this value was
>> changed, netback had to cope with the situation by coalescing the
>> packets into fewer frags.
>> My patch series takes a different approach: the leftover slots
>> (pages) are assigned to a new skb's frags, and that skb is
>> stashed on the frag_list of the first one. Then, before sending it
>> off to the stack, it calls skb = skb_copy_expand(skb, 0, 0,
>> GFP_ATOMIC | __GFP_NOWARN), which basically creates a new skb and
>> copies all the data into it. As far as I understand, it puts
>> everything into the linear buffer, which can amount to 64KB at most.
>> The original skb is then freed, and the new one is sent to the
>> stack.
>
> Just my two cents, if it is this case, you can try to call
> skb_copy_expand on every SKB netback receives to manually create SKBs
> with ~64KB linear buffer to see how it goes...

I've tried it, and it did break everything in a similar way, so that's a
strong clue that the problem lies here. I've rewritten that part of my
patches to make fewer modifications, based on Malcolm's idea: netback
pulls the first frag into the linear buffer, then moves a frag from the
frag_list skb into the first one. That seems to help, but so far I have
only one relevant test result; I'm waiting for more.

Zoli



^ permalink raw reply

* Re: [PATCH V3] net/dt: Add support for overriding phy configuration from device tree
From: Ben Hutchings @ 2014-02-04 21:40 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Matthew Garrett, netdev, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, Kishon Vijay Abraham I
In-Reply-To: <CAGVrzcZ4TFd=9KP+aoG47QbmqDJ1i23WBcEWDbzNRUfGmPvZHQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1232 bytes --]

On Tue, 2014-02-04 at 12:39 -0800, Florian Fainelli wrote:
> 2014-01-17 Matthew Garrett <matthew.garrett@nebula.com>:
> > Some hardware may be broken in interesting and board-specific ways, such
> > that various bits of functionality don't work. This patch provides a
> > mechanism for overriding mii registers during init based on the contents of
> > the device tree data, allowing board-specific fixups without having to
> > pollute generic code.
> 
> It would be good to explain exactly how your hardware is broken.
> I really do not think that such a fine-grained setting where
> you could disable, e.g: 100BaseT_Full, but allow 100BaseT_Half to
> remain usable makes that much sense. In general, Gigabit might be
> badly broken, but 100 and 10Mbits/sec should work fine. How about the
> MASTER-SLAVE bit, is overriding it really required?

Yes, it is entirely possible that one or other of the clock modes
(locally generated vs recovered) is not reliable.

> Is not a PHY fixup registered for a specific OUI the solution you are
> looking for?
[...]

The fault is in the board, not the PHY.

Ben.

-- 
Ben Hutchings
One of the nice things about standards is that there are so many of them.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply

* Re: [PATCH] net:cpsw: Pass unhandled ioctl's on to generic phy ioctl
From: Ben Hutchings @ 2014-02-04 21:51 UTC (permalink / raw)
  To: Sørensen, Stefan
  Cc: davem@davemloft.net, netdev@vger.kernel.org, mugunthanvnm@ti.com
In-Reply-To: <1391526492.7871.8.camel@e37108.spectralink.com>

[-- Attachment #1: Type: text/plain, Size: 1210 bytes --]

On Tue, 2014-02-04 at 15:08 +0000, Sørensen, Stefan wrote:
> On Tue, 2014-02-04 at 10:50 +0000, Ben Hutchings wrote:
> > > This patch allows the use of a generic timestamping phy connected
> > > to the cpsw if CPTS support is not enabled.
> > 
> > What if CPTS support is enabled in the driver, but this particular
> > machine doesn't have it and uses a timestamping PHY instead?
> 
> That would not work, the CPTS will grab the SIOC{G,S}HWTSTAMP. I'm not
> sure how that could be configured at runtime, other than a private
> ethtool flag.

Do all versions of CPSW include hardware timestamping?  Because the
condition at the top of cpsw_hwtstamp_ioctl() suggested to me that there
are some that don't.

> Also it seems that a timestamping MAC combined with a timestamping
> PHY could deliver bogus ethtool timestamping info, as it will come from
> the PHY if available, but the timestamping will be handled by the MAC.
[...]

Right.  If all versions of CPSW include hardware timestamping then
why bother with PHY timestamping at all?  And why make CONFIG_TI_CPTS
configurable?

Ben.

-- 
Ben Hutchings
One of the nice things about standards is that there are so many of them.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply

* Re: IGMP joins come from the wrong SA/interface
From: Steinar H. Gunderson @ 2014-02-04 22:08 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20140130224411.GG25336@order.stressinduktion.org>

On Thu, Jan 30, 2014 at 11:44:11PM +0100, Hannes Frederic Sowa wrote:
> The routing lookup is done at IP_ADD_MEMBERSHIP time. I really wonder why you
> have routed the 239.0.0.0/8 range to eth0.11. It seems to me that the kernel
> does what you told it to do. ;)
> 
> multicast flag on ip route is just used for multicast forwarding and does not
> matter for local multicast. Also if we find unicast route first (more
> specific) kernel does not do backtracking if destination is in multicast
> scope.

Hah, you're right. The issue was a combination of:

 1. mediatomb's initscript on Debian at some point started to add a bogus
    239.0.0.0/8 route (and I didn't notice this because I earlier tested with
    addresses outside this range).
 2. I didn't properly understand that the multicast flag on the route did not
    matter (although it really should!).
 3. rp_filter ate the data packets when they actually arrived. (I don't know
    why I never had this problem before, but I certainly didn't.)

So in various debugging rounds, I managed to fix #1 and #2 to various
degrees, but then #3 would come and make it appear like nothing actually
happened. I didn't see this before tracking it all the way up to the upstream
routers and observing that they actually _did_ send out packets...
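
For anyone hitting the same thing from the application side: since the
route lookup happens at IP_ADD_MEMBERSHIP time, one way to sidestep a stray
route is to name the interface in the join itself. A minimal user-space
sketch, assuming Linux and glibc's struct ip_mreqn; the helper names
fill_mreqn() and join_group_on() are made up for illustration and do not
come from this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill an ip_mreqn so that the join is bound to
 * an explicit interface index instead of being left to the route
 * lookup. Returns 0 on success, -1 if `group` is not an IPv4
 * multicast address. */
int fill_mreqn(struct ip_mreqn *mreq, const char *group, int ifindex)
{
	assert(mreq != NULL && group != NULL);

	memset(mreq, 0, sizeof(*mreq));
	if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
		return -1;	/* not a valid IPv4 address */
	if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr)))
		return -1;	/* outside 224.0.0.0/4 */
	mreq->imr_address.s_addr = htonl(INADDR_ANY);
	mreq->imr_ifindex = ifindex;	/* 0 leaves the choice to the kernel */
	return 0;
}

/* Hypothetical helper: join `group` on interface `ifindex` using the
 * already-open UDP socket `fd`. Returns setsockopt()'s result. */
int join_group_on(int fd, const char *group, int ifindex)
{
	struct ip_mreqn mreq;

	if (fill_mreqn(&mreq, group, ifindex) < 0)
		return -1;
	return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
			  &mreq, sizeof(mreq));
}
```

The interface index would typically come from if_nametoindex(), e.g. for
"eth0.11". This only affects where the join is sent; it does nothing about
rp_filter dropping the data packets afterwards.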

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply

* Re: linux-3.14-rc1 & PACKET_QDISC_BYPASS : slow_path warning
From: Daniel Borkmann @ 2014-02-04 22:17 UTC (permalink / raw)
  To: Felix Fietkau
  Cc: Mathias Kretschmer, netdev, linux-wireless@vger.kernel.org,
	Jesper Dangaard Brouer
In-Reply-To: <52F0FDD9.6060901@openwrt.org>

On 02/04/2014 03:48 PM, Felix Fietkau wrote:
> On 2014-02-04 15:35, Mathias Kretschmer wrote:
>> On 02/04/2014 03:25 PM, Daniel Borkmann wrote:
>>> On 02/04/2014 02:13 PM, Mathias Kretschmer wrote:
>>>> On 02/04/2014 01:56 PM, Daniel Borkmann wrote:
>>>>> On 02/03/2014 11:47 PM, Mathias Kretschmer wrote:
>>> ...
>>>>>> we are developing a wired/wireless MPLS switch. Currently the data plane runs in
>>>>>> user space using PF_PACKET sockets via RX_RING/TX_RING.
>>>>>>
>>>>>> We had hoped to test the PACKET_QDISC_BYPASS option since this seems to be the
>>>>>> proper optimization for our purposes.
>>>>>>
>>>>>> Unfortunately, we're seeing a 'slow path' warning for every packet that is being
>>>>>> sent out. With PACKET_QDISC_BYPASS disabled, no warnings are dumped. Hardware is
>>>>>> an older AMD Geode LX embedded board (ALiX).
>>>>>>
>>>>>> BTW, this happens while sending via a wireless (802.11) adhoc interface. Hence, it
>>>>>> might be an interaction with the ieee80211 sub system.
>>>>>
>>>>> Hm, so the WARN_ON() is triggered inside ath9k driver in relation to 802.11 QoS,
>>>>> and came in from commit 066dae93bdf ("ath9k: rework tx queue selection and fix
>>>>> queue stopping/waking"). We did the stress testing of that option for PF_PACKET
>>>>> on 10Gbit/s NICs. Seems to me you might be running into the same issue when using
>>>>> pktgen as it randomly or per round-robin selects tx queues as well? Not entirely
>>>>> sure how necessary this WARN_ON() is though, Felix? I think QDISC_BYPASS might not
>>>>> be the best option in your case, perhaps you will run into increased power usage
>>>>> in your NIC as a side-effect?
>>>>
>>>> I'm not familiar with the exact implementation details, but from the description
>>>> of this option, it seems to me that this is exactly what one would want to use if
>>>> the goal is to send an Ethernet frame out on a particular interface without any
>>>> further processing by the kernel.
>>>>
>>>> Why would this increase the power usage on the NIC ?  Due to a higher achievable
>>>> packet rate ?  That would be acceptable :)
>>>
>>> I'm not too familiar with the ieee80211 sub system, so I let Felix answer side
>>> effects and if actually the WARN_ON() is needed. ;) PACKET_QDISC_BYPASS is, as
>>> documented, designed for advanced pktgen resp. traffic generator like scenarios
>>> where you just sort of "brute force" packets to your NIC to stress test a remote
>>> machine for further analysis. I don't think it's very useful in your scenario
>>> when you have a wired/wireless MPLS switch, you rather might want to buffer/queue
>>> and therefore use qdisc layer instead.
>>
>> Hm, I was hoping/assuming that we still get to use hardware queues, if provided by
>> the driver. The main goal was to avoid any further PF_PACKET framework overhead.
>>
>> If the WARN_ON() issue gets solved, we will revisit this option and evaluate its
>> applicability.
> The reason for the WARN_ON is probably either the .ndo_select_queue call
> is not run, or its queue selection result is changed before the frame
> hits the driver's tx call.
> This call sets both the queue and the TID (similar to 802.1d tag), which
> makes it into the packet via 802.11e (WMM, QoS).
> It is important to the driver that the TID is in sync with the queue
> selection, if that is not the case, then pending frame counters can get
> messed up.
> If you really want to bypass qdisc, make sure that at least
> ndo_select_queue is called before passing the frame to the netdev.

Ok, thanks for the input, we'll look further into it and eventually come up
with something.

> - Felix
>
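
For reference, the option under discussion is a plain socket option on the
PF_PACKET socket. A minimal sketch of enabling it, assuming a kernel and
headers that already know PACKET_QDISC_BYPASS (3.14 or later); the helper
name set_qdisc_bypass() is hypothetical.

```c
#include <assert.h>
#include <linux/if_packet.h>
#include <sys/socket.h>

/* Hypothetical helper: switch PACKET_QDISC_BYPASS on (1) or off (0)
 * for an existing AF_PACKET socket. Returns 0 on success, or -1 with
 * errno set, e.g. ENOPROTOOPT on kernels that predate the option. */
int set_qdisc_bypass(int fd, int enable)
{
	int val = enable;

	assert(enable == 0 || enable == 1);
	return setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS,
			  &val, sizeof(val));
}
```

As Felix points out above, setting the option is not sufficient by itself
on a driver like ath9k, which still relies on the queue/TID selection done
in ndo_select_queue.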

^ permalink raw reply

* Re: [PATCH V3] net/dt: Add support for overriding phy configuration from device tree
From: Florian Fainelli @ 2014-02-04 22:48 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Matthew Garrett, netdev, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, Kishon Vijay Abraham I
In-Reply-To: <1391550040.3003.28.camel@deadeye.wl.decadent.org.uk>

2014-02-04 Ben Hutchings <ben@decadent.org.uk>:
> On Tue, 2014-02-04 at 12:39 -0800, Florian Fainelli wrote:
>> 2014-01-17 Matthew Garrett <matthew.garrett@nebula.com>:
>> > Some hardware may be broken in interesting and board-specific ways, such
>> > that various bits of functionality don't work. This patch provides a
>> > mechanism for overriding mii registers during init based on the contents of
>> > the device tree data, allowing board-specific fixups without having to
>> > pollute generic code.
>>
>> It would be good to explain exactly how your hardware is broken.
>> I really do not think that such a fine-grained setting where
>> you could disable, e.g: 100BaseT_Full, but allow 100BaseT_Half to
>> remain usable makes that much sense. In general, Gigabit might be
>> badly broken, but 100 and 10Mbits/sec should work fine. How about the
>> MASTER-SLAVE bit, is overriding it really required?
>
> Yes, it is entirely possible that one or other of the clock modes
> (locally generated vs recovered) is not reliable.

That one is not covered by the existing Ethernet PHY binding, so I am
okay with handling it.

>
>> Is not a PHY fixup registered for a specific OUI the solution you are
>> looking for?
> [...]
>
> The fault is in the board, not the PHY.

What kind of fault at the board level are we talking about? Lack of
specific twisted pair wiring to the RJ-45 jack? Out-of-spec RXC/TXC on
a (R)GMII path? If the latter, this is going to be via vendor-specific
MII registers, and should be a good enough reason for registering a
PHY fixup. What about pad control, and Ethernet MAC-specific registers
affecting the internal delays and such?
-- 
Florian

^ permalink raw reply

* Re: [PATCH] fdtable: Avoid triggering OOMs from alloc_fdmem
From: Eric Dumazet @ 2014-02-04 22:48 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, linux-kernel, linux-fsdevel, netdev, linux-mm,
	David Rientjes
In-Reply-To: <87r47ik8ou.fsf@xmission.com>

On Tue, 2014-02-04 at 10:57 -0800, Eric W. Biederman wrote:

> As I have heard it described, there was one TCP connection per small
> request, and someone goofed and started creating new connections when
> the server was bogged down.  But since all of the requests and replies
> were small I don't expect even TCP would allocate more than a 4KiB
> page in that workload.

Right, small writes use regular skbs (no page fragments).

> 
> I had oodles of 4KiB and 8KiB pages.  What size of memory allocation did
> you see failing?  

We got some reports of order-3 allocations failing.
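
For scale, assuming the 4KiB pages mentioned above: an order-n allocation
is 2^n physically contiguous pages, so order-3 means eight pages, i.e.
32KiB. A trivial sketch of the arithmetic (order_to_bytes() is a made-up
name, not a kernel function):

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of a page allocation of the given order, i.e.
 * 2^order pages of page_size bytes each. */
size_t order_to_bytes(unsigned int order, size_t page_size)
{
	assert(order < 8 * sizeof(size_t));
	return page_size << order;
}
```

So order_to_bytes(3, 4096) is 32768, well above the single pages that
were plentiful in the report being answered.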



^ permalink raw reply

* Re: [PATCH v2 0/5] net: phy: Ethernet PHY powerdown optimization
From: Florian Fainelli @ 2014-02-04 22:51 UTC (permalink / raw)
  To: Sebastian Hesselbarth
  Cc: David Miller, mugunthanvnm, netdev,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Andrew Lunn
In-Reply-To: <52F141AE.8010402@gmail.com>

Hi Sebastian,

2014-02-04 Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>:
> On 12/17/2013 08:43 PM, David Miller wrote:
>>
>> From: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
>> Date: Fri, 13 Dec 2013 10:20:24 +0100
>>
>>> This is v2 of the ethernet PHY power optimization patches to reduce
>>> power consumption of network PHYs with link that are either unused or
>>> the corresponding netdev is down.
>>>
>>> Compared to the last version, this patch set drops a patch to disable
>>> unused PHYs after late initcall, as it is not compatible with a modular
>>> mdio bus [1]. I'll investigate different ways to have a modular mdio bus
>>> driver get notified when driver loading is done.
>>>
>>> Again, a branch with v2 applied to v3.13-rc2 can also be found at
>>> https://github.com/shesselba/linux-dove.git topic/ethphy-power-v2
>>>
>>> [1] http://www.spinics.net/lists/arm-kernel/msg293028.html
>>
>>
>> Series applied, thanks.
>>
>
> David, Mugunthan, Florian,
>
> as expected the above patches create a Linux to bootloader dependency
> that surfaces dumb bootloaders not initializing PHYs correctly.
>
> Andrew has a Kirkwood based board that does not power-up and restart
> auto-negotiation on the powered down PHY after a warm restart. While
> this specific bootloader allows a soft-workaround by issuing the
> required PHY writes before accessing the interface, others may not.
>
> I think we should allow the user to soft-disable the automatic
> power-down of PHYs, i.e. by exploiting a kernel parameter.
>
> Do you have any preference for naming it? My call would be something
> like libphy.suspend_halted = [0,1] with 1 being the default.

The name looks good to me, and it would avoid having to clear the
BMCR_PWRDOWN bit during the MDIO bus remove callback to work around
such bootloaders.

BTW, it looks like we are omitting a 0.5 second delay after clearing
the BMCR_PWRDOWN bit:

"
22.2.4.1.5 Power Down
...
A PHY is not required to meet the RX_CLK and TX_CLK signal functional
requirements when either bit 0.11 or bit 0.10 is set to a logic one.
A PHY shall meet the RX_CLK and TX_CLK signal functional requirements
defined in 22.2.2 within 0.5 s after both bit 0.11 and 0.10 are
cleared to zero."
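
To make the requirement concrete, here is a small user-space model of what
leaving power-down would involve. This is only a sketch: the bit values
match <linux/mii.h> (where the power-down bit is called BMCR_PDOWN), but
PHY_POWERUP_SETTLE_MS and bmcr_leave_powerdown() are hypothetical names,
and nothing here touches a real MDIO bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as in <linux/mii.h>, repeated so the sketch builds
 * without kernel headers. */
#define BMCR_ISOLATE 0x0400	/* bit 0.10 */
#define BMCR_PDOWN   0x0800	/* bit 0.11 */

/* 22.2.4.1.5: clocks must be valid within 0.5 s of clearing the bits.
 * Hypothetical name, chosen for this example only. */
#define PHY_POWERUP_SETTLE_MS 500

/* Returns the BMCR value to write back in order to leave power-down.
 * *wait_ms is set to the worst-case settle time the caller should
 * allow afterwards, or 0 if the PHY was not powered down. */
uint16_t bmcr_leave_powerdown(uint16_t bmcr, unsigned int *wait_ms)
{
	assert(wait_ms != NULL);

	*wait_ms = (bmcr & BMCR_PDOWN) ? PHY_POWERUP_SETTLE_MS : 0;
	return (uint16_t)(bmcr & ~BMCR_PDOWN);
}
```

In a driver the wait would follow the MDIO write that clears the bit;
where exactly that belongs in libphy is a separate question.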
-- 
Florian

^ permalink raw reply

* Re: IGMP joins come from the wrong SA/interface
From: Hannes Frederic Sowa @ 2014-02-04 23:32 UTC (permalink / raw)
  To: Steinar H. Gunderson; +Cc: netdev
In-Reply-To: <20140204220809.GB7526@sesse.net>

On Tue, Feb 04, 2014 at 11:08:09PM +0100, Steinar H. Gunderson wrote:
> On Thu, Jan 30, 2014 at 11:44:11PM +0100, Hannes Frederic Sowa wrote:
> > The routing lookup is done at IP_ADD_MEMBERSHIP time. I really wonder why you
> > have routed the 239.0.0.0/8 range to eth0.11. It seems to me that the kernel
> > does what you told it to do. ;)
> > 
> > multicast flag on ip route is just used for multicast forwarding and does not
> > matter for local multicast. Also if we find unicast route first (more
> > specific) kernel does not do backtracking if destination is in multicast
> > scope.
> 
> Hah, you're right. The issue was a combination of:

Thanks for letting me know!

>  1. mediatomb's initscript on Debian at some point started to add a bogus
>     239.0.0.0/8 route (and I didn't notice this because I earlier tested with
>     addresses outside this range).
>  2. I didn't properly understand that the multicast flag on the route did not
>     matter (although it really should!).

Hmm, maybe that would be good, but I am not sure whether changing it now
could break existing setups. It seems it has been handled like that since
Alexey implemented it that way.

Greetings,

  Hannes

^ permalink raw reply

* Re: IGMP joins come from the wrong SA/interface
From: Steinar H. Gunderson @ 2014-02-04 23:34 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20140204233206.GA16198@order.stressinduktion.org>

On Wed, Feb 05, 2014 at 12:32:06AM +0100, Hannes Frederic Sowa wrote:
>>  2. I didn't properly understand that the multicast flag on the route did not
>>     matter (although it really should!).
> Hmm, maybe that would be good but I am not sure if that could break existing
> setups if we change that now. It seems it is handled like that since Alexey
> implemented it in that way.

Thinking of it, traditionally (and by “traditionally” I mean in IOS and the
likes, not Linux) the multicast routing table has been used to look up the
rendezvous point (RP), to know where to send the PIM join. (This leads to the
confusing situation where the multicast routing table contains unicast
addresses.) In this situation, we have IGMP and not PIM, which means we do
not know anything about the RP, so maybe it's the right decision after all.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

^ permalink raw reply
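
A minimal userspace sketch of the point made in the thread above: the routing
lookup at IP_ADD_MEMBERSHIP time only decides the interface when the
application does not name one. Filling in imr_ifindex of struct ip_mreqn pins
the join to a given device, so a stray 239.0.0.0/8 route cannot redirect it.
The helper names build_mreqn() and join_group_on() are hypothetical and exist
only for illustration; neither mediatomb nor the posters' setups are known to
work this way.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fill an ip_mreqn that names the interface explicitly.
 * Returns 0 on success, -1 if `group` is not an IPv4 multicast address.
 */
int build_mreqn(struct ip_mreqn *req, const char *group, unsigned int ifindex)
{
	assert(req != NULL);
	memset(req, 0, sizeof(*req));

	if (inet_pton(AF_INET, group, &req->imr_multiaddr) != 1)
		return -1;
	/* IN_MULTICAST() expects the address in host byte order. */
	if (!IN_MULTICAST(ntohl(req->imr_multiaddr.s_addr)))
		return -1;

	/* Leave the local address unset; the interface index selects the device. */
	req->imr_address.s_addr = htonl(INADDR_ANY);
	req->imr_ifindex = (int)ifindex;
	return 0;
}

/*
 * Hypothetical helper: join `group` on the interface called `ifname`
 * (for example "eth0") using an already created AF_INET datagram socket.
 * Returns 0 on success, -1 on failure.
 */
int join_group_on(int fd, const char *group, const char *ifname)
{
	struct ip_mreqn req;
	unsigned int ifindex = if_nametoindex(ifname);

	if (ifindex == 0)
		return -1;	/* no such interface */
	if (build_mreqn(&req, group, ifindex) != 0)
		return -1;
	return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &req, sizeof(req));
}
```

With imr_ifindex left at 0 and imr_address at INADDR_ANY, the kernel picks the
interface from the routing table instead, which is the behaviour discussed in
this thread.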

* Re: [PATCH RESEND net-next v3 0/2] bonding: Fix some issues for fail_over_mac
From: David Miller @ 2014-02-05  3:48 UTC (permalink / raw)
  To: fubar; +Cc: dingtianhong, vfalico, netdev, andy
In-Reply-To: <15005.1391544031@death.nxdomain>

From: Jay Vosburgh <fubar@us.ibm.com>
Date: Tue, 04 Feb 2014 12:00:31 -0800

> Ding Tianhong <dingtianhong@huawei.com> wrote:
> 
>>The parameter fail_over_mac only affect active-backup mode, if it was
>>set to active or follow and works with other modes, just like RR or XOR
>>mode, the bonding could not set all slaves to the master's address, it
>>will cause the slave could not work well with master.
>>
>>v1->v2: According Jay's suggestion, that we should permit setting an option
>>	at any time, but only have it take effect in active-backup mode, so
>>	I add mode checking together with fail_over_mac during enslavement and
>>	rebuild the patches.
>>
>>v2->v3: The correct way to fix the problem is that we should not add restrictions when
>>    	setting options, just need to modify the bond enslave and removal processing
>>    	to check the mode in addition to fail_over_mac when setting a slave's MAC during
>>    	enslavement. The change active slave processing already only calls the fail_over_mac
>>    	function when in active-backup mode.
>>
>>	Remove the cleanup patch because the net-next is frozen now.
>>
>>Regards
>>Ding
> 
> 	Both patches look good to me.
> 
> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net V1] net/ipv4: Use proper RCU APIs for writer-side in udp_offload.c
From: David Miller @ 2014-02-05  4:03 UTC (permalink / raw)
  To: ogerlitz; +Cc: netdev, shlomop, edumazet
In-Reply-To: <1391348530-31643-1-git-send-email-ogerlitz@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Sun,  2 Feb 2014 15:42:10 +0200

> From: Shlomo Pongratz <shlomop@mellanox.com>
> 
> RCU writer side should use rcu_dereference_protected() and not
> rcu_dereference(), fix that. This also removes the "suspicious RCU usage"
> warning seen when running with CONFIG_PROVE_RCU.
> 
> Also, don't use rcu_assign_pointer/rcu_dereference for pointers
> which are invisible beyond the udp offload code.
> 
> Fixes: b582ef0 ('net: Add GRO support for UDP encapsulating protocols')
> Reported-by: Eric Dumazet <edumazet@google.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>

Applied, thank you.

^ permalink raw reply

