Netdev List


* Re: [net-next RFC V5 0/5] Multiqueue virtio-net
From: Jason Wang @ 2012-07-09  5:35 UTC (permalink / raw)
  To: Ronen Hod
  Cc: krkumar2, habanero, mashirle, kvm, mst, netdev, linux-kernel,
	virtualization, edumazet, tahm, jwhan, davem, sri
In-Reply-To: <4FF9429A.8020508@redhat.com>

On 07/08/2012 04:19 PM, Ronen Hod wrote:
> On 07/05/2012 01:29 PM, Jason Wang wrote:
>> Hello All:
>>
>> This series is an updated version of the multiqueue virtio-net driver,
>> based on Krishna Kumar's work, which lets virtio-net use multiple rx/tx
>> queues for packet reception and transmission. Please review and comment.
>>
>> Test Environment:
>> - Intel(R) Xeon(R) CPU E5620 @ 2.40GHz, 8 cores 2 numa nodes
>> - Two directly connected 82599
>>
>> Test Summary:
>>
>> - Highlights: huge improvements on TCP_RR test
>
> Hi Jason,
>
> It might be that the good TCP_RR results are due to the large number 
> of sessions (50-250). Can you test it also with small number of sessions?

Sure, I will test them.
>
>> - Lowlights: regression on small packet transmission, higher cpu
>>   utilization than single queue; needs further optimization
>>
>> Analysis of the performance result:
>>
>> - I counted the number of packets sent and received during the test;
>>    multiqueue handles many more packets per second.
>>
>> - For the tx regression, multiqueue sends about 1-2 times more packets
>>    than single queue, and the packets are much smaller than with single
>>    queue. I suspect tcp does less batching in multiqueue, so I hacked
>>    tcp_write_xmit() to force more batching; with that, multiqueue works
>>    as well as single queue for both small transmission and throughput.
>
> Could it be that since the CPUs are not busy they are available for 
> immediate handling of the packets (little batching)? In such scenario 
> the CPU utilization is not really interesting. What will happen on a 
> busy machine?
>

The regression happens when testing guest transmission in the stream test;
the cpu utilization is 100% in this situation.
> Ronen.
>
>>
>> - I didn't pack the accelerated RFS for virtio-net in this series as it
>>    still needs further shaping; for those interested in it, please see:
>>    http://www.mail-archive.com/kvm@vger.kernel.org/msg64111.html
>>
>> Changes from V4:
>> - Add ability to negotiate the number of queues through control virtqueue
>> - Ethtool -{L|l} support and default the tx/rx queue number to 1
>> - Expose the API to set irq affinity instead of irq itself
>>
>> Changes from V3:
>>
>> - Rebase to the net-next
>> - Let queue 2 be the control virtqueue to obey the spec
>> - Provide irq affinity
>> - Choose txq based on processor id
>>
>> References:
>>
>> - V4: https://lkml.org/lkml/2012/6/25/120
>> - V3: http://lwn.net/Articles/467283/
>>
>> Test result:
>>
>> 1) 1 vm 2 vcpu 1q vs 2q, 1 - 1q, 2 - 2q, no pinning
>>
>> - Guest to External Host TCP STREAM
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 1 64 650.55 655.61 100% 24.88 24.86 99%
>> 2 64 1446.81 1309.44 90% 30.49 27.16 89%
>> 4 64 1430.52 1305.59 91% 30.78 26.80 87%
>> 8 64 1450.89 1270.82 87% 30.83 25.95 84%
>> 1 256 1699.45 1779.58 104% 56.75 59.08 104%
>> 2 256 4902.71 3446.59 70% 98.53 62.78 63%
>> 4 256 4803.76 2980.76 62% 97.44 54.68 56%
>> 8 256 5128.88 3158.74 61% 104.68 58.61 55%
>> 1 512 2837.98 2838.42 100% 89.76 90.41 100%
>> 2 512 6742.59 5495.83 81% 155.03 99.07 63%
>> 4 512 9193.70 5900.17 64% 202.84 106.44 52%
>> 8 512 9287.51 7107.79 76% 202.18 129.08 63%
>> 1 1024 4166.42 4224.98 101% 128.55 129.86 101%
>> 2 1024 6196.94 7823.08 126% 181.80 168.81 92%
>> 4 1024 9113.62 9219.49 101% 235.15 190.93 81%
>> 8 1024 9324.25 9402.66 100% 239.10 179.99 75%
>> 1 2048 7441.63 6534.04 87% 248.01 215.63 86%
>> 2 2048 7024.61 7414.90 105% 225.79 219.62 97%
>> 4 2048 8971.49 9269.00 103% 278.94 220.84 79%
>> 8 2048 9314.20 9359.96 100% 268.36 192.23 71%
>> 1 4096 8282.60 8990.08 108% 277.45 320.05 115%
>> 2 4096 9194.80 9293.78 101% 317.02 248.76 78%
>> 4 4096 9340.73 9313.19 99% 300.34 230.35 76%
>> 8 4096 9148.23 9347.95 102% 279.49 199.43 71%
>> 1 16384 8787.89 8766.31 99% 312.38 316.53 101%
>> 2 16384 9306.35 9156.14 98% 319.53 279.83 87%
>> 4 16384 9177.81 9307.50 101% 312.69 230.07 73%
>> 8 16384 9035.82 9188.00 101% 298.32 199.17 66%
>> - TCP RR
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 50 1 54695.41 84164.98 153% 1957.33 1901.31 97%
>> 100 1 60141.88 88598.94 147% 2157.90 2000.45 92%
>> 250 1 74763.56 135584.22 181% 2541.94 2628.59 103%
>> 50 64 51628.38 82867.50 160% 1872.55 1812.16 96%
>> 100 64 60367.73 84080.60 139% 2215.69 1867.69 84%
>> 250 64 68502.70 124910.59 182% 2321.43 2495.76 107%
>> 50 128 53477.08 77625.07 145% 1905.10 1870.99 98%
>> 100 128 59697.56 74902.37 125% 2230.66 1751.03 78%
>> 250 128 71248.74 133963.55 188% 2453.12 2711.72 110%
>> 50 256 47663.86 67742.63 142% 1880.45 1735.30 92%
>> 100 256 54051.84 68738.57 127% 2123.03 1778.59 83%
>> 250 256 68250.06 124487.90 182% 2321.89 2598.60 111%
>> - External Host to Guest TCP STREAM
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 1 64 847.71 864.83 102% 57.99 57.93 99%
>> 2 64 1690.82 1544.94 91% 80.13 55.09 68%
>> 4 64 3434.98 3455.53 100% 127.17 89.00 69%
>> 8 64 5890.19 6557.35 111% 194.70 146.52 75%
>> 1 256 2094.04 2109.14 100% 130.73 127.14 97%
>> 2 256 5218.13 3731.97 71% 219.15 114.02 52%
>> 4 256 6734.51 9213.47 136% 227.87 208.31 91%
>> 8 256 6452.86 9402.78 145% 224.83 207.77 92%
>> 1 512 3945.07 4203.68 106% 279.72 273.30 97%
>> 2 512 7878.96 8122.55 103% 278.25 231.71 83%
>> 4 512 7645.89 9402.13 122% 252.10 217.42 86%
>> 8 512 6657.06 9403.71 141% 239.81 214.89 89%
>> 1 1024 5729.06 5111.21 89% 289.38 303.09 104%
>> 2 1024 8097.27 8159.67 100% 269.29 242.97 90%
>> 4 1024 7778.93 8919.02 114% 261.28 205.50 78%
>> 8 1024 6458.02 9360.02 144% 221.26 208.09 94%
>> 1 2048 6426.94 5195.59 80% 292.52 307.47 105%
>> 2 2048 8221.90 9025.66 109% 283.80 242.25 85%
>> 4 2048 7364.72 8527.79 115% 248.10 198.36 79%
>> 8 2048 6760.63 9161.07 135% 230.53 205.12 88%
>> 1 4096 7247.02 6874.21 94% 276.23 287.68 104%
>> 2 4096 8346.04 8818.65 105% 281.49 254.81 90%
>> 4 4096 6710.00 9354.59 139% 216.41 210.13 97%
>> 8 4096 6265.69 9406.87 150% 206.69 210.92 102%
>> 1 16384 8159.50 8048.79 98% 266.94 283.11 106%
>> 2 16384 8525.66 8552.41 100% 294.36 239.27 81%
>> 4 16384 6042.24 8447.86 139% 200.21 196.40 98%
>> 8 16384 6432.63 9403.49 146% 211.48 206.13 97%
>>
>> 2) 1 vm 4 vcpu 1q vs 4q, 1 - 1q, 2 - 4q, no pinning
>>
>> - Guest to External Host TCP STREAM
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 1 64 636.93 657.69 103% 23.55 24.42 103%
>> 2 64 1457.46 1268.78 87% 30.97 26.02 84%
>> 4 64 3062.86 2302.43 75% 41.00 29.64 72%
>> 8 64 3107.68 2308.32 74% 41.62 29.07 69%
>> 1 256 1743.50 1750.11 100% 59.00 56.63 95%
>> 2 256 4582.61 2870.31 62% 92.47 51.97 56%
>> 4 256 8440.96 4795.37 56% 135.10 56.39 41%
>> 8 256 9240.31 6654.82 72% 144.76 74.89 51%
>> 1 512 2918.25 2735.26 93% 91.08 86.47 94%
>> 2 512 8978.32 5107.95 56% 200.00 94.97 47%
>> 4 512 8850.39 6864.37 77% 190.32 101.09 53%
>> 8 512 9270.30 8483.01 91% 193.44 118.73 61%
>> 1 1024 4416.10 3679.70 83% 135.54 110.63 81%
>> 2 1024 9085.20 8770.48 96% 242.23 175.59 72%
>> 4 1024 9158.57 9011.56 98% 234.39 159.17 67%
>> 8 1024 9345.89 9067.43 97% 233.35 138.73 59%
>> 1 2048 8455.19 6077.94 71% 338.52 190.16 56%
>> 2 2048 9223.32 8237.73 89% 270.00 198.27 73%
>> 4 2048 9080.75 9257.63 101% 261.30 172.80 66%
>> 8 2048 9177.39 8977.10 97% 256.89 147.50 57%
>> 1 4096 8665.35 8394.78 96% 289.63 289.85 100%
>> 2 4096 7850.73 8857.86 112% 253.33 252.62 99%
>> 4 4096 9332.55 8508.37 91% 289.19 151.29 52%
>> 8 4096 8482.30 9146.80 107% 255.41 156.02 61%
>> 1 16384 8825.72 8778.26 99% 314.60 308.89 98%
>> 2 16384 9283.85 8927.40 96% 316.48 246.98 78%
>> 4 16384 7766.95 8708.06 112% 265.25 155.59 58%
>> 8 16384 8945.55 8940.23 99% 298.45 151.32 50%
>> - TCP_RR
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 50 1 60848.70 81719.39 134% 2196.86 1551.05 70%
>> 100 1 61886.19 81425.02 131% 2215.76 1517.52 68%
>> 250 1 72058.41 162597.84 225% 2441.84 2278.14 93%
>> 50 64 51646.93 74160.10 143% 1861.07 1322.22 71%
>> 100 64 57574.86 83488.26 145% 2076.54 1479.79 71%
>> 250 64 67583.35 138482.15 204% 2314.46 2022.83 87%
>> 50 128 59931.51 71633.03 119% 2244.60 1309.18 58%
>> 100 128 58329.80 73104.90 125% 2202.98 1329.52 60%
>> 250 128 71021.55 161067.73 226% 2469.11 2205.28 89%
>> 50 256 47509.24 64330.24 135% 1915.75 1269.90 66%
>> 100 256 49293.03 68507.94 138% 1939.75 1263.64 65%
>> 250 256 63169.07 138390.68 219% 2255.47 2098.13 93%
>> - External Host to Guest TCP STREAM
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 1 64 850.18 854.96 100% 56.94 58.25 102%
>> 2 64 1659.12 1730.25 104% 81.65 67.57 82%
>> 4 64 3254.70 3397.17 104% 118.57 76.21 64%
>> 8 64 6251.97 6389.29 102% 207.68 104.21 50%
>> 1 256 2029.14 2105.18 103% 116.45 119.69 102%
>> 2 256 5412.02 4260.32 78% 240.87 139.73 58%
>> 4 256 7777.28 8743.12 112% 263.20 174.65 66%
>> 8 256 6459.51 9388.93 145% 218.94 158.37 72%
>> 1 512 4566.31 4269.30 93% 274.74 289.83 105%
>> 2 512 7444.52 8240.64 110% 286.24 243.74 85%
>> 4 512 7722.29 9391.16 121% 261.96 180.36 68%
>> 8 512 6228.50 9134.52 146% 209.17 161.00 76%
>> 1 1024 4965.50 4953.68 99% 307.64 280.48 91%
>> 2 1024 8270.08 7733.71 93% 288.32 197.04 68%
>> 4 1024 7551.04 9394.58 124% 268.41 206.62 76%
>> 8 1024 6307.78 9179.03 145% 216.67 159.63 73%
>> 1 2048 5741.12 5948.80 103% 290.34 268.66 92%
>> 2 2048 7932.79 8766.05 110% 262.96 215.90 82%
>> 4 2048 6907.55 9255.97 133% 233.56 203.96 87%
>> 8 2048 6037.22 9399.41 155% 197.14 164.09 83%
>> 1 4096 7131.70 7535.10 105% 279.43 275.12 98%
>> 2 4096 8109.17 9348.04 115% 274.29 211.49 77%
>> 4 4096 6878.92 9319.13 135% 244.21 192.06 78%
>> 8 4096 6265.92 9408.35 150% 211.85 159.26 75%
>> 1 16384 8288.01 8596.39 103% 272.85 290.22 106%
>> 2 16384 8166.29 9280.12 113% 277.04 236.61 85%
>> 4 16384 6446.97 9382.22 145% 222.91 187.24 83%
>> 8 16384 6066.98 9405.51 155% 198.98 157.09 78%
>>
>> 3) 2 vms each with 2 vcpus, 1q vs 2q - pin vhost/vcpu in the same node
>>
>> - 2 Guests to External Hosts TCP STREAM
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 1 64 1442.07 1475.11 102% 30.82 31.21 101%
>> 2 64 3124.87 2900.93 92% 40.29 35.95 89%
>> 4 64 3166.52 2864.04 90% 40.70 35.47 87%
>> 8 64 3141.45 2848.94 90% 40.38 35.34 87%
>> 1 256 3628.54 3711.73 102% 68.47 70.22 102%
>> 2 256 7806.95 7586.69 97% 111.23 84.38 75%
>> 4 256 8823.65 7612.74 86% 132.92 85.04 63%
>> 8 256 9194.89 9373.41 101% 135.98 119.62 87%
>> 1 512 7106.67 7128.00 100% 124.79 124.30 99%
>> 2 512 9190.22 9397.33 102% 180.84 149.34 82%
>> 4 512 9401.01 9376.67 99% 173.00 140.15 81%
>> 8 512 8572.84 9032.90 105% 150.49 127.58 84%
>> 1 1024 9361.93 9379.24 100% 205.81 202.94 98%
>> 2 1024 9386.69 9389.04 100% 201.78 165.75 82%
>> 4 1024 9403.43 9378.54 99% 195.33 152.06 77%
>> 8 1024 9213.63 9180.64 99% 178.99 141.51 79%
>> 1 2048 9338.95 9384.67 100% 223.22 227.86 102%
>> 2 2048 9389.28 9389.45 100% 202.37 170.08 84%
>> 4 2048 9405.86 9388.71 99% 193.76 161.54 83%
>> 8 2048 9352.40 9384.06 100% 189.16 157.06 83%
>> 1 4096 9380.74 9384.90 100% 239.37 241.56 100%
>> 2 4096 9393.47 9376.74 99% 213.84 195.61 91%
>> 4 4096 9393.85 9381.50 99% 198.06 170.18 85%
>> 8 4096 9400.41 9232.31 98% 192.87 163.56 84%
>> 1 16384 9348.18 9335.55 99% 253.02 254.86 100%
>> 2 16384 9384.97 9359.53 99% 218.56 208.59 95%
>> 4 16384 9326.60 9382.15 100% 206.24 179.72 87%
>> 8 16384 9355.82 9392.85 100% 198.22 172.89 87%
>> - TCP RR
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 50 1 200340.33 261750.19 130% 2935.27 3018.59 102%
>> 100 1 236141.58 266304.49 112% 3452.16 3071.74 88%
>> 250 1 361574.59 320825.08 88% 4972.98 3705.70 74%
>> 50 64 225748.53 242671.12 107% 3011.48 2869.07 95%
>> 100 64 249885.37 260453.72 104% 3240.21 3063.67 94%
>> 250 64 360341.12 310775.60 86% 4682.42 3657.91 78%
>> 50 128 227995.27 289320.38 126% 2950.92 3479.37 117%
>> 100 128 239491.11 291135.77 121% 3099.55 3508.75 113%
>> 250 128 390390.68 362484.35 92% 5042.30 4368.52 86%
>> 50 256 222604.51 317140.97 142% 3058.08 3839.39 125%
>> 100 256 254770.92 335606.03 131% 3326.16 4046.65 121%
>> 250 256 400584.52 436749.22 109% 5220.79 5278.86 101%
>> - External Host to 2 Guests
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 1 64 1667.99 1684.50 100% 59.66 60.77 101%
>> 2 64 3338.83 3379.97 101% 83.61 64.82 77%
>> 4 64 6613.65 6619.11 100% 131.00 97.19 74%
>> 8 64 6553.07 6418.31 97% 141.35 98.27 69%
>> 1 256 3938.40 4068.52 103% 125.21 123.76 98%
>> 2 256 9215.57 9210.88 99% 185.31 154.27 83%
>> 4 256 9407.29 9008.13 95% 186.72 150.01 80%
>> 8 256 9377.17 9385.57 100% 190.28 137.59 72%
>> 1 512 7360.19 6984.80 94% 214.09 211.66 98%
>> 2 512 9392.91 9401.88 100% 193.92 173.11 89%
>> 4 512 9382.64 9394.34 100% 189.27 145.80 77%
>> 8 512 9308.60 9094.08 97% 189.70 141.26 74%
>> 1 1024 9153.26 9066.06 99% 223.07 219.95 98%
>> 2 1024 9393.38 9398.43 100% 194.02 173.82 89%
>> 4 1024 9395.92 8960.73 95% 192.61 145.82 75%
>> 8 1024 9388.92 9399.08 100% 191.18 143.87 75%
>> 1 2048 9355.32 9240.63 98% 221.50 223.03 100%
>> 2 2048 9395.68 9399.62 100% 193.31 177.21 91%
>> 4 2048 9397.67 9399.56 100% 195.25 157.53 80%
>> 8 2048 9397.89 9401.70 100% 197.57 146.96 74%
>> 1 4096 9375.84 9381.72 100% 223.06 225.06 100%
>> 2 4096 9389.47 9396.00 100% 193.91 197.13 101%
>> 4 4096 9397.45 9400.11 100% 192.33 163.60 85%
>> 8 4096 9105.40 9415.76 103% 192.71 140.41 72%
>> 1 16384 9381.53 9381.40 99% 223.53 225.66 100%
>> 2 16384 9387.90 9395.44 100% 193.34 177.03 91%
>> 4 16384 9397.92 9410.98 100% 195.04 151.14 77%
>> 8 16384 9259.00 9419.48 101% 194.91 153.48 78%
>>
>> 4) Local vm to vm 2 vcpu 1q vs 2q - pin vcpu/thread in the same numa node
>>
>> - VM to VM TCP STREAM
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 1 64 576.05 576.14 100% 12.25 12.32 100%
>> 2 64 1266.75 1160.04 91% 19.10 16.05 84%
>> 4 64 1267.34 1123.70 88% 19.08 15.51 81%
>> 8 64 1230.88 1174.70 95% 18.53 15.58 84%
>> 1 256 1311.00 1303.02 99% 25.34 25.35 100%
>> 2 256 5400.26 2794.00 51% 75.92 36.43 47%
>> 4 256 5200.67 2818.88 54% 72.81 33.92 46%
>> 8 256 5234.55 2893.74 55% 73.10 34.97 47%
>> 1 512 3244.09 3263.72 100% 56.48 56.65 100%
>> 2 512 8172.16 4661.15 57% 119.05 67.89 57%
>> 4 512 10567.44 7063.25 66% 147.76 77.27 52%
>> 8 512 10477.87 8471.33 80% 145.94 102.91 70%
>> 1 1024 5432.54 5333.99 98% 93.69 92.38 98%
>> 2 1024 12590.24 9259.97 73% 185.37 135.28 72%
>> 4 1024 15600.53 10731.93 68% 222.20 123.60 55%
>> 8 1024 16222.87 10704.85 65% 227.05 113.81 50%
>> 1 2048 6667.61 7484.37 112% 116.75 129.72 111%
>> 2 2048 8180.43 11500.88 140% 137.84 156.64 113%
>> 4 2048 15127.93 14416.16 95% 227.60 154.59 67%
>> 8 2048 16381.79 14794.10 90% 244.29 158.45 64%
>> 1 4096 7375.63 8948.90 121% 131.97 156.57 118%
>> 2 4096 9321.16 14443.21 154% 161.24 163.74 101%
>> 4 4096 13028.45 15984.94 122% 212.78 171.26 80%
>> 8 4096 15611.28 18810.54 120% 245.15 198.65 81%
>> 1 16384 15304.38 14202.08 92% 259.94 244.04 93%
>> 2 16384 15508.97 15913.09 102% 261.30 244.26 93%
>> 4 16384 14859.98 20164.34 135% 248.29 214.26 86%
>> 8 16384 15594.59 19960.99 127% 253.79 211.27 83%
>> - TCP RR
>> sessions size throughput1 throughput2 % norm1 norm2 %
>> 50 1 54972.51 69820.99 127% 1133.58 1063.58 93%
>> 100 1 55847.16 72407.93 129% 1155.73 1024.35 88%
>> 250 1 60066.23 108266.50 180% 1114.30 1323.55 118%
>> 50 64 48727.63 62378.32 128% 1014.29 888.78 87%
>> 100 64 51804.65 69250.51 133% 1077.78 986.97 91%
>> 250 64 61278.68 100015.78 163% 1076.93 1243.18 115%
>> 50 256 51593.29 62046.22 120% 1069.14 871.08 81%
>> 100 256 51647.00 68197.43 132% 1071.66 958.51 89%
>> 250 256 60433.88 99072.59 163% 1072.41 1199.10 111%
>> 50 512 52177.79 66483.77 127% 1082.65 960.82 88%
>> 100 512 50351.67 62537.63 124% 1041.61 876.41 84%
>> 250 512 60510.14 103856.79 171% 1055.21 1245.17 118%
>>
>>
>> Jason Wang (4):
>>    virtio_ring: move queue_index to vring_virtqueue
>>    virtio: introduce an API to set affinity for a virtqueue
>>    virtio_net: multiqueue support
>>    virtio_net: support negotiating the number of queues through ctrl vq
>>
>> Krishna Kumar (1):
>>    virtio_net: Introduce VIRTIO_NET_F_MULTIQUEUE
>>
>>   drivers/net/virtio_net.c      |  792 +++++++++++++++++++++++++++++------------
>>   drivers/virtio/virtio_mmio.c  |    5 +-
>>   drivers/virtio/virtio_pci.c   |   58 +++-
>>   drivers/virtio/virtio_ring.c  |   17 +
>>   include/linux/virtio.h        |    4 +
>>   include/linux/virtio_config.h |   21 ++
>>   include/linux/virtio_net.h    |   10 +
>>   7 files changed, 677 insertions(+), 230 deletions(-)
>>
>
>

^ permalink raw reply

* Re: [PATCH 2/2] ksz884x: fix Endian
From: Joe Perches @ 2012-07-09  5:44 UTC (permalink / raw)
  To: RongQing Li; +Cc: Ben Hutchings, netdev, Tristram.Ha
In-Reply-To: <CAJFZqHzm7=-PpsiNZJ9TgkDY2bt5WW7XwY6nBOa_E4eerRh1pg@mail.gmail.com>

On Mon, 2012-07-09 at 13:26 +0800, RongQing Li wrote:
> 2012/7/7, Ben Hutchings <bhutchings@solarflare.com>:
> > On Thu, 2012-07-05 at 10:06 +0800, roy.qing.li@gmail.com wrote:
> >> ETH_P_IP is host Endian, skb->protocol is big Endian, when
> >> compare them, we should change skb->protocol from big endian
> >> to host endian, ntohs, not htons.
[]
> >> diff --git a/drivers/net/ethernet/micrel/ksz884x.c
[]
> >> @@ -4882,7 +4882,7 @@ static netdev_tx_t netdev_tx(struct sk_buff *skb,
> >> struct net_device *dev)
> >>  	if (left) {
> >>  		if (left < num ||
> >>  				((CHECKSUM_PARTIAL == skb->ip_summed) &&
> >> -				(ETH_P_IPV6 == htons(skb->protocol)))) {
> >> +				(ETH_P_IPV6 == ntohs(skb->protocol)))) {
> >
> > This should really be changed to the idiomatic 'skb->protocol ==
> > htons(ETH_P_IPV6)'.  For the current code, the compiler will probably
> > generate a run-time byte-swap for little-endian systems.

True.  Perhaps this would be better written as:

	if (left) {
		if (left < num ||
		    (skb->ip_summed == CHECKSUM_PARTIAL &&
		     skb->protocol == htons(ETH_P_IPV6))) {

			etc...
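
For what it's worth, a small userspace sketch of the point about which side
gets the conversion. Both helpers below return the same answer; the second
applies the byte swap to a constant, which the compiler can usually fold at
build time. ETH_P_IPV6_VAL and the two function names are made up for this
illustration; in the driver it would simply be ETH_P_IPV6 and skb->protocol.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* EtherType for IPv6 in host byte order (0x86DD). Named differently from
 * the kernel's ETH_P_IPV6 to make clear this is a userspace stand-in. */
#define ETH_P_IPV6_VAL 0x86DD

/* Converts the variable: may cost a run-time byte swap on little-endian. */
static bool is_ipv6_swap_variable(uint16_t proto_be)
{
	return ntohs(proto_be) == ETH_P_IPV6_VAL;
}

/* Converts the constant: the swap can be done once, at compile time. */
static bool is_ipv6_swap_constant(uint16_t proto_be)
{
	return proto_be == htons(ETH_P_IPV6_VAL);
}
```

Both forms are correct on any endianness; the second is just the cheaper and
more conventional spelling.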

^ permalink raw reply

* [PATCH] netns: correctly use per-netns ipv4 sysctl_tcp_mem
From: Huang Qiang @ 2012-07-09  6:05 UTC (permalink / raw)
  To: davem, glommer; +Cc: netdev, containers, yangzhenzhang

From: Yang Zhenzhang <yangzhenzhang@huawei.com>

The kernel now allows each net namespace to independently set its own
TCP memory pressure thresholds.

But there appears to be a bug, which can be reproduced with the following steps:

[root@host socket]# lxc-start -n test -f config /bin/bash
[root@net-test socket]# ip route add default via 192.168.58.2
[root@net-test socket]# echo 0 0 0 > /proc/sys/net/ipv4/tcp_mem
[root@net-test socket]# scp root@192.168.58.174:/home/tcp_mem_test .

and it can still transfer the "tcp_mem_test" file, which we expected
to fail.

This is because inet_init() (net/ipv4/af_inet.c) initializes
tcp_prot.sysctl_mem:
tcp_prot.sysctl_mem = init_net.ipv4.sysctl_tcp_mem;

So when the protocol is TCP, sk->sk_prot->sysctl_mem (in the following code)
always uses the ipv4 sysctl_tcp_mem of the init_net namespace rather than
that of its own net namespace.
This patch simply sets "prot" to net->ipv4.sysctl_tcp_mem when
the protocol type is TCP.

Signed-off-by: Yang Zhenzhang <yangzhenzhang@huawei.com>
---
 include/net/sock.h |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 4a45216..b62a8d9 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -59,6 +59,7 @@
 #include <linux/static_key.h>
 #include <linux/aio.h>
 #include <linux/sched.h>
+#include <linux/in.h>

 #include <linux/filter.h>
 #include <linux/rculist_nulls.h>
@@ -1062,7 +1063,12 @@ static inline void sk_enter_memory_pressure(struct sock *sk)

 static inline long sk_prot_mem_limits(const struct sock *sk, int index)
 {
+	struct net *net = sock_net(sk);
 	long *prot = sk->sk_prot->sysctl_mem;
+
+	if (sk->sk_protocol == IPPROTO_TCP)
+		prot = net->ipv4.sysctl_tcp_mem;
+
 	if (mem_cgroup_sockets_enabled && sk->sk_cgrp)
 		prot = sk->sk_cgrp->sysctl_mem;
 	return prot[index];
-- 
1.7.1
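
The lookup order the patch aims for can be modelled in userspace roughly as
below. This is only a sketch: every model_* name and MODEL_IPPROTO_TCP are
invented for the illustration and do not exist in the kernel, and the real
function is additionally gated on mem_cgroup_sockets_enabled.

```c
#include <stddef.h>

#define MODEL_IPPROTO_TCP 6	/* same numeric value as IPPROTO_TCP */

struct model_net   { long tcp_mem[3]; };	/* per-namespace thresholds */
struct model_cgrp  { long *sysctl_mem; };	/* cgroup thresholds */
struct model_proto { long *sysctl_mem; };	/* global, init_net-derived */

struct model_sock {
	int protocol;
	const struct model_proto *prot;
	struct model_net *net;
	const struct model_cgrp *cgrp;	/* NULL when no cgroup limit applies */
};

static long model_prot_mem_limit(const struct model_sock *sk, int index)
{
	long *limits = sk->prot->sysctl_mem;	/* protocol-wide default */

	if (sk->protocol == MODEL_IPPROTO_TCP)
		limits = sk->net->tcp_mem;	/* namespace-local values */
	if (sk->cgrp)
		limits = sk->cgrp->sysctl_mem;	/* cgroup still overrides */
	return limits[index];
}
```

With this ordering, a TCP socket in a namespace that wrote "0 0 0" to tcp_mem
sees those zeros instead of the init_net values, while non-TCP sockets keep
the protocol-wide table.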

^ permalink raw reply related

* Re: [net-next PATCH 02/02] net/ipv4: VTI support new module for ip_vti.
From: David Miller @ 2012-07-09  6:47 UTC (permalink / raw)
  To: saurabh.mohan; +Cc: netdev
In-Reply-To: <20120629013017.GA4649@debian-saurabh-64.vyatta.com>

From: Saurabh <saurabh.mohan@vyatta.com>
Date: Thu, 28 Jun 2012 18:30:17 -0700

> +#define HASH_SIZE  16
> +#define HASH(addr) (((__force u32)addr^((__force u32)addr>>4))&0xF)

Define HASH such that it masks with (HASH_SIZE - 1) instead of
0xf, so that if HASH_SIZE is changed everything automatically
still works without having to remember to update the value in
HASH()'s definition too.
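
Something along these lines, shown here as a userspace sketch. The kernel
version would keep the (__force u32) casts; uint32_t is substituted so this
compiles on its own, and the static assertion is an extra that was not asked
for in the review.

```c
#include <stdint.h>

#define HASH_SIZE 16	/* must stay a power of two for the mask to work */
#define HASH(addr) \
	(((uint32_t)(addr) ^ ((uint32_t)(addr) >> 4)) & (HASH_SIZE - 1))

/* Catch a non-power-of-two HASH_SIZE at compile time (C11). */
_Static_assert((HASH_SIZE & (HASH_SIZE - 1)) == 0,
	       "HASH_SIZE must be a power of two");
```

Changing HASH_SIZE to 32 or 64 then needs no second edit, since the mask
follows the table size.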

> +	if (skb->protocol != htons(ETH_P_IP))
> +		goto tx_error;

We are really past the point where we can add major inet protocol
features without supporting ipv6 as well.

> +	if (IS_ERR(rt)) {
> +		dev->stats.tx_carrier_errors++;
> +		goto tx_error_icmp;
> +	}
> +#ifdef CONFIG_XFRM
> +		/* if there is no transform then this tunnel is not functional.
> +		 * Or if the xfrm is not mode tunnel.
> +		 */
> +		if (!rt->dst.xfrm ||
> +		    rt->dst.xfrm->props.mode != XFRM_MODE_TUNNEL) {
> +			stats->tx_carrier_errors++;
> +			goto tx_error_icmp;
> +		}
> +#endif

This code in the CONFIG_XFRM block is not indented properly.

And this is a pointless CONFIG_* check, you can't even register
this tunnel outside of the XFRM code.  In fact the code already
depends upon INET_XFRM_MODE_TUNNEL which therefore automatically
means that CONFIG_XFRM must be set for this code.

> +	}
> +
> +
> +	if (tunnel->err_count > 0) {

Get rid of these extra blank lines.

> +	}
> +
> +
> +	IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED |

Again.

The reason there are long periods of time between my attempts to
review your code (and probably the reason I'm the only person still
reviewing your work at all) is that I know there are going to be so
many problems to let you know about.  It's really painful to review
your work and I've spent so much time on the coding style and the
simpler issues that I really haven't considered the high level issues
of what your code is trying to do.

^ permalink raw reply

* Re: [PATCH net-next 1/2] r8169: support RTL8106E
From: David Miller @ 2012-07-09  6:48 UTC (permalink / raw)
  To: hayeswang; +Cc: romieu, netdev, linux-kernel, hayes
In-Reply-To: <1340966060-2749-1-git-send-email-hayeswang@realtek.com>


Francois, what would you like me to do with these two patches?  I
haven't seen full ACKs from you yet.

Thanks.

^ permalink raw reply

* Re: [PATCH v3] ieee802154: verify packet size before trying to allocate it
From: David Miller @ 2012-07-09  6:50 UTC (permalink / raw)
  To: levinsasha928
  Cc: dbaryshkov, slapin, linux-zigbee-devel, netdev, linux-kernel
In-Reply-To: <1341228595-9883-1-git-send-email-levinsasha928@gmail.com>

From: Sasha Levin <levinsasha928@gmail.com>
Date: Mon,  2 Jul 2012 13:29:55 +0200

> Currently when sending data over datagram, the send function will attempt to
> allocate any size passed on from the userspace.
> 
> We should make sure that this size is checked and limited. We'll limit it
> to the MTU of the device, which is checked later anyway.
> 
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>

Applied.
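
The pattern described in the changelog, checking the requested size against
the device MTU before allocating, looks roughly like this in a userspace
sketch. alloc_dgram_buf() is a hypothetical name, and returning -EMSGSIZE is
an assumption; the thread does not say which error code the patch uses.

```c
#include <errno.h>
#include <stdlib.h>

/* Refuse oversized requests before any allocation is attempted. */
static int alloc_dgram_buf(size_t size, size_t mtu, void **out)
{
	if (size > mtu)
		return -EMSGSIZE;	/* assumed error code, see above */

	*out = malloc(size ? size : 1);	/* avoid a zero-byte malloc */
	return *out ? 0 : -ENOMEM;
}
```

The point is only the ordering: the size from userspace is validated first,
so an arbitrary length can no longer drive the allocator.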

^ permalink raw reply

* Re: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for strings
From: David Miller @ 2012-07-09  6:51 UTC (permalink / raw)
  To: mchan
  Cc: dan.carpenter, David.Laight, JBottomley, barak, eddie.wai,
	linux-scsi, netdev
In-Reply-To: <1341242018.7472.5.camel@LTIRV-MCHAN1.corp.ad.broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Mon, 2 Jul 2012 08:13:38 -0700

> This came from the net-next tree, so David is the right person to apply
> this.  Thanks.
> 
> Acked-by: Michael Chan <mchan@broadcom.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] net: dont use __netdev_alloc_skb for bounce buffer
From: David Miller @ 2012-07-09  6:52 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, stefan.bader
In-Reply-To: <1341254172.22621.456.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 02 Jul 2012 20:36:12 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> commit a1c7fff7e1 (net: netdev_alloc_skb() use build_skb()) broke b44 on
> some 64bit machines.
> 
> It appears b44 and b43 use __netdev_alloc_skb() instead of alloc_skb()
> for their bounce buffers.
> 
> There is no need to add an extra NET_SKB_PAD reservation for bounce
> buffers :
> 
> - In TX path, NET_SKB_PAD is useless
> 
> - In RX path in b44, we force a copy of incoming frames if
>   GFP_DMA allocations were needed.
> 
> Reported-and-bisected-by: Stefan Bader <stefan.bader@canonical.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.

^ permalink raw reply

* Re: [PATCH] sctp: refactor sctp_packet_append_chunk and cleanup some memory leaks
From: David Miller @ 2012-07-09  6:54 UTC (permalink / raw)
  To: vyasevich; +Cc: nhorman, netdev, linux-sctp
In-Reply-To: <4FF30428.6070403@gmail.com>

From: Vlad Yasevich <vyasevich@gmail.com>
Date: Tue, 03 Jul 2012 10:39:36 -0400

> On 07/02/2012 03:59 PM, Neil Horman wrote:
>> While doing some recent work on sctp sack bundling I noted that
>> sctp_packet_append_chunk was pretty inefficient.  Specifically, it was
>> called
>> recursively while trying to bundle auth and sack chunks.  Because of
>> that we
>> call sctp_packet_bundle_sack and sctp_packet_bundle_auth a total of 4
>> times for
>> every call to sctp_packet_append_chunk, knowing that at least 3 of
>> those calls
>> will do nothing.
>>
>> So let's refactor sctp_packet_bundle_auth to have an outer part that
>> does the
>> attempted bundling, and an inner part that just does the chunk
>> appends.  This
>> saves us several calls per iteration that we just don't need.
>>
>> Also, noticed that the auth and sack bundling fail to free the chunks
>> they
>> allocate if the append fails, so make sure we add that in
>>
>> Signed-off-by: Neil Horman<nhorman@tuxdriver.com>
>> CC: Vlad Yasevich<vyasevich@gmail.com>
> 
> Acked-by: Vlad Yasevich <vyasevich@gmail.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH] etherdevice: introduce broadcast_ether_addr
From: David Miller @ 2012-07-09  6:58 UTC (permalink / raw)
  To: johannes-cdvu00un1VgdHxzADdlk8Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1341310587.5131.2.camel-8upI4CBIZJIJvtFkdXX2HixXY32XiHfO@public.gmane.org>

From: Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>
Date: Tue, 03 Jul 2012 12:16:27 +0200

> From: Johannes Berg <johannes.berg-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> 
> A lot of code has either the memset or an
> inefficient copy from a static array that
> contains the all-ones broadcast address.
> Introduce broadcast_ether_addr() to fill
> an address with all ones, making the code
> clearer and allowing us to get rid of the
> various constant arrays.
> 
> Signed-off-by: Johannes Berg <johannes.berg-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

I would prefer if this were named "eth_something()", thanks.
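
For illustration, a helper with an eth_ prefix could look like the userspace
sketch below. The name eth_broadcast_addr_sketch() and ETH_ALEN_VAL are
placeholders chosen for this example; this message does not settle on a name.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN_VAL 6	/* octets in an Ethernet address */

/* Fill an Ethernet address with all ones (ff:ff:ff:ff:ff:ff). */
static inline void eth_broadcast_addr_sketch(uint8_t *addr)
{
	memset(addr, 0xff, ETH_ALEN_VAL);
}
```

Callers would then no longer need their own static all-ones arrays.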

^ permalink raw reply

* Re: [PATCH] net/fsl_pq_mdio: use spin_event_timeout() to poll the indicator register
From: David Miller @ 2012-07-09  6:59 UTC (permalink / raw)
  To: timur; +Cc: afleming, netdev
In-Reply-To: <1341357381-10861-1-git-send-email-timur@freescale.com>

From: Timur Tabi <timur@freescale.com>
Date: Tue, 3 Jul 2012 18:16:21 -0500

> Macro spin_event_timeout() was designed for simple polling of hardware
> registers with a timeout, so use it when we poll the MIIMIND register.
> This allows us to return an error code instead of polling indefinitely.
> 
> Note that PHY_INIT_TIMEOUT is a count of loop iterations, so we can't use
> it for spin_event_timeout(), which asks for microseconds.
> 
> Signed-off-by: Timur Tabi <timur@freescale.com>

Define a macro for the timeout value rather than use an arbitrary
constant.

> +	status = spin_event_timeout(!(in_be32(&regs->miimind) &	MIIMIND_BUSY),
> +		1000, 0);

This indentation is absolutely terrible.

> +	status = spin_event_timeout(!(in_be32(&regs->miimind) &
> +		(MIIMIND_NOTVALID | MIIMIND_BUSY)), 1000, 0);

Same here.

> +	status = spin_event_timeout(!(in_be32(&regs->miimind) &	MIIMIND_BUSY),
> +		1000, 0);

And here too.
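
To make both requests concrete (a named timeout and readable continuation
lines), here is a userspace sketch of polling a busy bit with a deadline in
microseconds. MII_TIMEOUT_US, MIIMIND_BUSY_BIT and wait_while_busy() are
hypothetical names, the bit value is illustrative, and the errno-style return
is a choice made for this sketch rather than the behaviour of
spin_event_timeout() itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define MII_TIMEOUT_US   1000		/* hypothetical name for the timeout */
#define MIIMIND_BUSY_BIT 0x00000001u	/* illustrative bit value */

/* Monotonic time in microseconds. */
static int64_t now_us(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Poll *reg until the masked bits clear or timeout_us elapses. */
static int wait_while_busy(const volatile uint32_t *reg, uint32_t mask,
			   unsigned int timeout_us)
{
	int64_t deadline = now_us() + timeout_us;

	do {
		if (!(*reg & mask))
			return 0;
	} while (now_us() < deadline);

	/* One last look, in case the bit cleared right at the deadline. */
	return (*reg & mask) ? -ETIMEDOUT : 0;
}
```

A call site then reads wait_while_busy(&reg, MIIMIND_BUSY_BIT,
MII_TIMEOUT_US), with the arguments aligned under the opening parenthesis if
the call has to wrap.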

^ permalink raw reply

* Re: [PATCH 1/1] atl1c: fix issue of transmit queue 0 timed out
From: David Miller @ 2012-07-09  7:00 UTC (permalink / raw)
  To: cjren; +Cc: netdev, linux-kernel, qca-linux-team, nic-devel
In-Reply-To: <1341370308-23233-1-git-send-email-cjren@qca.qualcomm.com>

From: <cjren@qca.qualcomm.com>
Date: Wed, 4 Jul 2012 10:51:48 +0800

> Some people report that atl1c can cause a system hang with the following
> kernel trace info:
> ---------------------------------------
> WARNING: at.../net/sched/sch_generic.c:258 dev_watchdog+0x1db/0x1d0()
> ...
> NETDEV WATCHDOG: eth0 (atl1c): transmit queue 0 timed out
> ...
> ---------------------------------------
> This is caused by calling netif_stop_queue when the cable link is down.
> So remove netif_stop_queue, because link_watch will take over.
> 
> Signed-off-by: xiong <xiong@qca.qualcomm.com>
> Cc: stable <stable@vger.kernel.org>
> Signed-off-by: Cloud Ren <cjren@qca.qualcomm.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] netem: add limitation to reordered packets
From: David Miller @ 2012-07-09  7:02 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, hagen, msg, aterzis, ycheng
In-Reply-To: <1341384921.2583.1462.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 04 Jul 2012 08:55:21 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Fix two netem bugs :
> 
> 1) When a frame was dropped by tfifo_enqueue(), drop counter
>    was incremented twice.
> 
> 2) When reordering is triggered, we enqueue a packet without
>    checking queue limit. This can OOM pretty fast when this
>    is repeated enough, since skbs are orphaned, no socket limit
>    can help in this situation.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied, thanks Eric.
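
The two fixes in the changelog can be pictured with a toy queue model like
the one below. It is a sketch only: the patch itself is not quoted in this
message, and model_queue and model_enqueue() are invented names that do not
mirror the netem code.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_queue {
	size_t len;		/* packets currently queued */
	size_t limit;		/* maximum queue length */
	unsigned long drops;	/* drop counter */
	unsigned long head_inserts;	/* reordered packets, put at the head */
	unsigned long tail_inserts;	/* normally delayed packets */
};

/* Returns true if the packet was queued, false if it was dropped. */
static bool model_enqueue(struct model_queue *q, bool reorder)
{
	/* The limit applies to both paths, including reordering. */
	if (q->len >= q->limit) {
		q->drops++;	/* counted once, and only here */
		return false;
	}

	if (reorder)
		q->head_inserts++;
	else
		q->tail_inserts++;
	q->len++;
	return true;
}
```

Because the reorder path goes through the same length check, repeated
reordering can no longer grow the queue without bound, and a drop bumps the
counter exactly once.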

^ permalink raw reply

* Re: [PATCH] net/macb: manage carrier state with call to netif_carrier_{on|off}()
From: David Miller @ 2012-07-09  7:03 UTC (permalink / raw)
  To: nicolas.ferre
  Cc: netdev, bhutchings, Arvid.Brodin, kuznet, shemminger,
	linux-arm-kernel
In-Reply-To: <1341393253-6531-1-git-send-email-nicolas.ferre@atmel.com>

From: Nicolas Ferre <nicolas.ferre@atmel.com>
Date: Wed, 4 Jul 2012 11:14:13 +0200

> The carrier OFF state is set up in the probe(), open() and suspend() functions.
> The carrier ON state is managed in macb_handle_link_change().
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH 0/2] [net-next] Marvell sky2 updates
From: David Miller @ 2012-07-09  7:06 UTC (permalink / raw)
  To: mlindner; +Cc: shemminger, netdev
In-Reply-To: <1341394709.14972.39.camel@mlindner-lin.skd.de>

Applied, but you must put a "sky2: " prefix in the subject lines
of future patches.

Otherwise someone scanning the commit log summary has no idea what
driver your changes are for.

^ permalink raw reply

* Re: [RFC PATCH] tcp: limit data skbs in qdisc layer
From: David Miller @ 2012-07-09  7:08 UTC (permalink / raw)
  To: eric.dumazet
  Cc: ycheng, dave.taht, netdev, codel, therbert, mattmathis, nanditad,
	ncardwell, andrewmcgr
In-Reply-To: <1341396687.2583.1757.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 04 Jul 2012 12:11:27 +0200

> sk->sk_wmem_alloc not allowed to grow above a given limit,
> allowing no more than ~4 segments [1] per tcp socket in qdisc layer at a
> given time. (if TSO is enabled, then a single TSO packet hits the limit)

I'm suspicious and anticipate that 10G will need more queueing than
you are able to get away with tg3 at 1G speeds.  But it is an exciting
idea nonetheless :-)

^ permalink raw reply

* Re: [PATCH] bcm87xx: disable autonegotiation by default
From: David Miller @ 2012-07-09  7:09 UTC (permalink / raw)
  To: jacmet; +Cc: netdev, david.daney
In-Reply-To: <1341398037-7591-1-git-send-email-jacmet@sunsite.dk>

From: Peter Korsgaard <jacmet@sunsite.dk>
Date: Wed,  4 Jul 2012 12:33:57 +0200

> The bcm87xx phys don't support autonegotiation, so don't use it by
> default, as otherwise phy_state_machine() will try to enable it (using
> c22 requests, which also don't make any sense for the bcm87xx).
> 
> Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [net] ixgbe: DCB and SR-IOV can not co-exist and will cause hangs
From: David Miller @ 2012-07-09  7:10 UTC (permalink / raw)
  To: jeffrey.t.kirsher; +Cc: alexander.h.duyck, netdev, gospo, sassmann
In-Reply-To: <1341403225-1326-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed,  4 Jul 2012 05:00:25 -0700

> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> DCB and SR-IOV cannot currently be enabled at the same time as the queueing
> schemes are incompatible.  If they are both enabled it will result in Tx
> hangs since only the first Tx queue will be able to transmit any traffic.
> 
> The simple fix for this is to block us from enabling TCs in ixgbe_setup_tc
> if SR-IOV is enabled.  This change will be reverted once we can support
> SR-IOV and DCB coexistence.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> Acked-by: John Fastabend <john.r.fastabend@intel.com>
> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
> Tested-by: Ross Brattain <ross.b.brattain@intel.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] phylib: Support registering a bunch of drivers
From: David Miller @ 2012-07-09  7:11 UTC (permalink / raw)
  To: chohnstaedt; +Cc: netdev
In-Reply-To: <20120704154434.GZ19422@elara.bln.innominate.local>

From: Christian Hohnstaedt <chohnstaedt@innominate.com>
Date: Wed, 4 Jul 2012 17:44:34 +0200

> If registering of one of them fails, all already registered drivers
> of this module will be unregistered.
> 
> Use the new register/unregister functions in all drivers
> registering more than one driver.
> 
> amd.c, realtek.c: Simplify: directly return registration result.
> 
> Tested with broadcom.c
> All others compile-tested.
> 
> Signed-off-by: Christian Hohnstaedt <chohnstaedt@innominate.com>

Applied, thanks.
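
The all-or-nothing behaviour described in the commit message (if one
registration fails, the drivers already registered are unregistered again)
can be sketched in plain C as follows. This is an illustration only, not the
phylib code: `example_driver`, `example_register`, `example_unregister` and
`example_drivers_register` are hypothetical names, and the `fail_on_register`
field exists only to simulate a failure.

```c
#include <assert.h>
#include <stddef.h>

struct example_driver {
	const char *name;
	int fail_on_register;	/* simulate a registration failure */
	int registered;
};

static int example_register(struct example_driver *drv)
{
	if (drv->fail_on_register)
		return -1;
	drv->registered = 1;
	return 0;
}

static void example_unregister(struct example_driver *drv)
{
	drv->registered = 0;
}

/*
 * Register n drivers. If any registration fails, unregister the ones
 * that already succeeded (in reverse order) and return the error.
 */
int example_drivers_register(struct example_driver *drivers, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret = example_register(&drivers[i]);

		if (ret) {
			while (i-- > 0)
				example_unregister(&drivers[i]);
			return ret;
		}
	}
	return 0;
}
```

A module that registers several drivers can then return the result of the
single call directly from its init function, which is the kind of
simplification the commit message mentions for amd.c and realtek.c.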

^ permalink raw reply

* Re: [PATCH] bcm87xx: fix reg-init comment typo
From: David Miller @ 2012-07-09  7:12 UTC (permalink / raw)
  To: ddaney.cavm; +Cc: jacmet, netdev, david.daney
In-Reply-To: <4FF47BDE.3010002@gmail.com>

From: David Daney <ddaney.cavm@gmail.com>
Date: Wed, 04 Jul 2012 10:22:38 -0700

> On 07/04/2012 08:05 AM, Peter Korsgaard wrote:
>> broadcom, not marvell.
>>
>> Signed-off-by: Peter Korsgaard<jacmet@sunsite.dk>
> 
> Indeed, it was a cut-and-paste error.  Thanks for fixing it...
> 
> Acked-by: David Daney <david.daney@cavium.com>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH] netdev/phy: Fixup lockdep warnings in mdio-mux.c
From: David Miller @ 2012-07-09  7:13 UTC (permalink / raw)
  To: ddaney.cavm; +Cc: netdev, linux-kernel, david.daney
In-Reply-To: <1341439576-1413-1-git-send-email-ddaney.cavm@gmail.com>

From: David Daney <ddaney.cavm@gmail.com>
Date: Wed,  4 Jul 2012 15:06:16 -0700

> From: David Daney <david.daney@cavium.com>
> 
> With lockdep enabled we get:
 ...
> This is a false positive, since we are indeed using 'nested' locking,
> we need to use mutex_lock_nested().
> 
> Now in theory we can stack multiple MDIO multiplexers, but that would
> require passing the nesting level (which is difficult to know) to
> mutex_lock_nested().  Instead we assume the simple case of a single
> level of nesting.  Since these are only warning messages, it isn't so
> important to solve the general case.
> 
> Signed-off-by: David Daney <david.daney@cavium.com>

Applied to 'net', thanks.

^ permalink raw reply

* [PATCH 0/2] stmmac: nfs reboot crash & jumbo frame handling fix
From: Deepak Sikri @ 2012-07-09  7:14 UTC (permalink / raw)
  To: peppe.cavallaro; +Cc: spear--sw-devel, netdev, Deepak Sikri

This patch set fixes the following bugs that were observed during
testing.
1. On multiple reboot operations using nfs, system crashes were observed
with inconsistent status of the dma descriptors.
2. Data losses were observed whenever jumbo frames were used for data
transfers.

Deepak Sikri (2):
  stmmac: Fix for nfs hang on multiple reboot
  stmmac: Fix for higher mtu size handling

 drivers/net/ethernet/stmicro/stmmac/ring_mode.c   |    3 ++-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |    3 +++
 2 files changed, 5 insertions(+), 1 deletions(-)

-- 
1.7.2.2

^ permalink raw reply

* [PATCH 1/2] stmmac: Fix for nfs hang on multiple reboot
From: Deepak Sikri @ 2012-07-09  7:14 UTC (permalink / raw)
  To: peppe.cavallaro; +Cc: spear--sw-devel, netdev, Deepak Sikri
In-Reply-To: <1341818086-28897-1-git-send-email-deepak.sikri@st.com>

It was observed that during multiple reboots nfs hangs. The status of
receive descriptors shows that all the descriptors were in control of
CPU, and none were assigned to DMA.
Also the DMA status register confirmed that the Rx buffer is
unavailable.

This patch fixes that by adding memory barriers to ensure that all
instructions before enabling the Rx or Tx DMA are completed, including
the proper setting of the ownership bit in the DMA descriptors.

Signed-off-by: Deepak Sikri <deepak.sikri@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 51b3b68..ea3003e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1212,6 +1212,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion);
 		wmb();
 		priv->hw->desc->set_tx_owner(desc);
+		wmb();
 	}
 
 	/* Interrupt on completition only for the latest segment */
@@ -1227,6 +1228,7 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	/* To avoid raise condition */
 	priv->hw->desc->set_tx_owner(first);
+	wmb();
 
 	priv->cur_tx++;
 
@@ -1290,6 +1292,7 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
 		}
 		wmb();
 		priv->hw->desc->set_rx_owner(p + entry);
+		wmb();
 	}
 }
 
-- 
1.7.2.2
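
The ordering the patch is after (fill in the descriptor, then set the
ownership bit, then let later stores proceed) can be sketched in userspace
with C11 fences. This is only an analogy and rests on loud assumptions:
C11 fences order memory accesses between threads, whereas the kernel's
wmb() is used here to order stores as seen by a DMA device, so this code
does not replace wmb(). All names (`example_desc`, `EXAMPLE_OWN_BIT`,
`example_give_to_dma`, ...) are hypothetical and not taken from stmmac.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define EXAMPLE_OWN_BIT 0x80000000u	/* hypothetical "owned by DMA" flag */

struct example_desc {
	uint32_t buf_addr;
	uint32_t buf_len;
	_Atomic uint32_t status;
};

void example_desc_init(struct example_desc *d)
{
	d->buf_addr = 0;
	d->buf_len = 0;
	atomic_init(&d->status, 0);
}

void example_give_to_dma(struct example_desc *d, uint32_t addr, uint32_t len)
{
	/* Step 1: prepare the descriptor contents. */
	d->buf_addr = addr;
	d->buf_len = len;

	/* Step 2: order the stores above before the ownership store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->status, EXAMPLE_OWN_BIT,
			      memory_order_relaxed);

	/*
	 * Step 3: second fence, mirroring the wmb() the patch adds after
	 * set_tx_owner()/set_rx_owner(): the ownership store is ordered
	 * before any store that follows (for example one that starts DMA).
	 */
	atomic_thread_fence(memory_order_release);
}

int example_owned_by_dma(struct example_desc *d)
{
	uint32_t s = atomic_load_explicit(&d->status, memory_order_acquire);

	return (s & EXAMPLE_OWN_BIT) != 0;
}
```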

^ permalink raw reply related

* Re: [PATCH 1/2] be2net: Fix Endian
From: David Miller @ 2012-07-09  7:14 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev, somnath.kotur
In-Reply-To: <1341453942-4198-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Thu,  5 Jul 2012 10:05:42 +0800

> From: Li RongQing <roy.qing.li@gmail.com>
> 
> ETH_P_IP is host Endian, skb->protocol is big Endian, when
> compare them, we should change ETH_P_IP from host endian
> to big endian, htons, not ntohs.
> 
> CC: Somnath Kotur <somnath.kotur@emulex.com>
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Applied to net-next since this actually doesn't cause any real
problems since htons() and ntohs() are implemented identically
and perform the same operation.

Thanks.
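
As a quick userspace check of the point above, the following sketch compares
htons() and ntohs() over every 16-bit value and shows the comparison in the
direction the patch uses (host-order constant converted with htons()).
`EXAMPLE_ETH_P_IP`, `byte_swaps_agree` and `is_ipv4` are names made up for
this example; 0x0800 is the value of ETH_P_IP.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_ETH_P_IP 0x0800	/* same value as ETH_P_IP, host order */

/* Returns 1 if htons() and ntohs() agree for every 16-bit value. */
int byte_swaps_agree(void)
{
	for (uint32_t v = 0; v <= 0xFFFF; v++) {
		if (htons((uint16_t)v) != ntohs((uint16_t)v))
			return 0;
	}
	return 1;
}

/*
 * Compare a big-endian (wire order) protocol field against the
 * host-order constant, converting the constant with htons().
 */
int is_ipv4(uint16_t protocol_be)
{
	return protocol_be == htons(EXAMPLE_ETH_P_IP);
}
```

Both functions either swap the two bytes or do nothing, depending on the
host, so using the "wrong" one gives the same result. htons() is still the
right name for documenting the direction of the conversion.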

^ permalink raw reply

* [PATCH 2/2] stmmac: Fix for higher mtu size handling
From: Deepak Sikri @ 2012-07-09  7:14 UTC (permalink / raw)
  To: peppe.cavallaro; +Cc: spear--sw-devel, netdev, Deepak Sikri
In-Reply-To: <1341818086-28897-2-git-send-email-deepak.sikri@st.com>

For the higher mtu sizes requiring a buffer size greater than 8192,
the buffers are sent or received using either multiple dma descriptors
or a single descriptor with the multi-buffer handling option.
It was observed during tests that the driver was losing data packets
during normal ping operations if the data buffers being used catered
to jumbo frame handling.

The memory barriers are added in between preparation of dma descriptors
in the jumbo frame handling path to ensure all instructions before
enabling the dma are complete.

Signed-off-by: Deepak Sikri <deepak.sikri@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/ring_mode.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
index fb8377d..4b785e1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
+++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c
@@ -51,7 +51,7 @@ static unsigned int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 1, bmax,
 						csum);
-
+		wmb();
 		entry = (++priv->cur_tx) % txsize;
 		desc = priv->dma_tx + entry;
 
@@ -59,6 +59,7 @@ static unsigned int stmmac_jumbo_frm(void *p, struct sk_buff *skb, int csum)
 					    len, DMA_TO_DEVICE);
 		desc->des3 = desc->des2 + BUF_SIZE_4KiB;
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum);
+		wmb();
 		priv->hw->desc->set_tx_owner(desc);
 		priv->tx_skbuff[entry] = NULL;
 	} else {
-- 
1.7.2.2

^ permalink raw reply related
