public inbox for linux-rdma@vger.kernel.org
* intel iommu causing performance drop in mlx4 ipoib
@ 2016-05-16 10:39 Nikolay Borisov
       [not found] ` <5739A347.20807-6AxghH7DbtA@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolay Borisov @ 2016-05-16 10:39 UTC (permalink / raw)
  To: matanb-VPRAkNaXOzVWk0Htik3J/w
  Cc: guysh-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	markb-VPRAkNaXOzVWk0Htik3J/w, erezsh-VPRAkNaXOzVWk0Htik3J/w,
	dwmw2-wEGCiKHe2LqWVfeAwA7xHQ,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

Hello, 

I've been testing various infiniband cards for performance, and one of 
them is a ConnectX-3: Mellanox Technologies MT27500 Family [ConnectX-3]. 

I've observed a strange performance pathology with it when running ipoib 
and using a naive iperf test. My setup has multiple machines with a mix 
of qlogic/mellanox cards, connected via a QLogic 12300 switch. All of 
the nodes are running at 4x 10Gbps. When I run a performance test and 
the mellanox card is the server, i.e. it is receiving data, I get very bad 
performance. By this I mean I cannot get more than 4 gigabits per 
second - very low. 'perf top' clearly shows that the culprit is 
intel_map_page, which is being called from the receive path 
of the mellanox adapter: 

84.26%     0.04%  ksoftirqd/0  [kernel.kallsyms]  [k] intel_map_page                            
            |
            --- intel_map_page
               |          
               |--98.38%-- ipoib_cm_alloc_rx_skb
               |          ipoib_cm_handle_rx_wc
               |          ipoib_poll
               |          net_rx_action
               |          __do_softirq
               |          run_ksoftirqd
               |          smpboot_thread_fn
               |          kthread
               |          ret_from_fork


When I disable intel_iommu support (by default the IOMMU is not
turned on, just compiled in; for this performance profile I have
compiled the code out altogether), things look very different:

86.76%     0.16%  ksoftirqd/0  [kernel.kallsyms]  [k] ipoib_poll
            |
            --- ipoib_poll
                net_rx_action
                __do_softirq



Essentially the majority of time is spent just receiving the packets and the 
sustained rate is 26Gbps. So the question is why merely compiling in 
intel_iommu (without enabling it via intel_iommu=on) kills performance, and 
only on the receive side: if the machine which exhibits poor performance with 
the mlx card is a client, that is, the mellanox driver is sending data, 
performance is not affected. So far the only workaround is to remove Intel 
IOMMU support from the kernel altogether. 
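(For reference, here is a quick way to double-check which mode a box actually booted in; this is my own sketch, and the classification below assumes the usual x86 semantics of the intel_iommu= and iommu= parameters.)

```shell
# Sketch: classify the IOMMU mode from a kernel command line string.
# Assumes standard x86 semantics: iommu=pt selects passthrough even
# when intel_iommu=on is also present.
iommu_mode() {
    case " $1 " in
        *" iommu=pt "*)       echo "passthrough" ;;
        *" intel_iommu=on "*) echo "translated" ;;
        *)                    echo "off-or-default" ;;
    esac
}

# Check the running kernel:
iommu_mode "$(cat /proc/cmdline)"
```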

Regards, 
Nikolay
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: intel iommu causing performance drop in mlx4 ipoib
       [not found] ` <5739A347.20807-6AxghH7DbtA@public.gmane.org>
@ 2016-05-16 10:54   ` David Woodhouse
       [not found]     ` <1463396094.2484.194.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
  2016-05-16 12:05   ` intel iommu causing performance drop in mlx4 ipoib Or Gerlitz
  1 sibling, 1 reply; 7+ messages in thread
From: David Woodhouse @ 2016-05-16 10:54 UTC (permalink / raw)
  To: Nikolay Borisov, matanb-VPRAkNaXOzVWk0Htik3J/w
  Cc: guysh-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	markb-VPRAkNaXOzVWk0Htik3J/w, erezsh-VPRAkNaXOzVWk0Htik3J/w,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA


On Mon, 2016-05-16 at 13:39 +0300, Nikolay Borisov wrote:
> 
> I've observed a strange performance pathology with it when running ipoib 
> and using a naive iperf test. My setup has multiple machines with a mix 
> of qlogic/mellanox cards, connected via a QLogic 12300 switch. All of 
> the nodes are running at 4x 10Gbps. When I run a performance test and 
> the mellanox card is the server, i.e. it is receiving data, I get very bad 
> performance. By this I mean I cannot get more than 4 gigabits per 
> second - very low. 'perf top' clearly shows that the culprit is 
> intel_map_page, which is being called from the receive path 
> of the mellanox adapter: 
> 
> 84.26%     0.04%  ksoftirqd/0  [kernel.kallsyms]  [k] intel_map_page                            
>             |
>             --- intel_map_page
>                |          
>                |--98.38%-- ipoib_cm_alloc_rx_skb

Are you *sure* it's disabled? Can you be more specific about where the
time is spent? intel_map_page() doesn't really do much except call
into __intel_map_single()... which should return pretty much
immediately.

I'm working on improving the per-device DMA ops so that for passthrough
devices you don't end up in the IOMMU code at all, but it really
shouldn't be taking *that* long... unless you really are doing
translation.

Note that even in the case where you're doing translation, there's code
which I'm about to ask Linus to pull for 4.7, which will kill most of
the performance hit of using the IOMMU.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org                              Intel Corporation



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: intel iommu causing performance drop in mlx4 ipoib
       [not found] ` <5739A347.20807-6AxghH7DbtA@public.gmane.org>
  2016-05-16 10:54   ` David Woodhouse
@ 2016-05-16 12:05   ` Or Gerlitz
  1 sibling, 0 replies; 7+ messages in thread
From: Or Gerlitz @ 2016-05-16 12:05 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Matan Barak, Guy Shapiro,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	markb-VPRAkNaXOzVWk0Htik3J/w, Erez Shitrit, David Woodhouse,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On Mon, May 16, 2016 at 1:39 PM, Nikolay Borisov <kernel-6AxghH7DbtA@public.gmane.org> wrote:
[...]
> Essentially the majority of time is spent just receiving the packets and the
> sustained rate is 26Gbps. So the question is why merely compiling in
> intel_iommu (without enabling it via intel_iommu=on) kills performance, and only on the receive side: if

Add iommu=pt to your boot command line; this is 2009 material [1].

Or.

[1] https://lwn.net/Articles/329174/
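(For completeness, one way to make that change stick on a GRUB2 system. This is a sketch: the transformation is demonstrated on a sample line; to apply it for real, run the same sed against /etc/default/grub and regenerate grub.cfg afterwards with update-grub or grub2-mkconfig.)

```shell
# Sketch: append iommu=pt to the GRUB_CMDLINE_LINUX setting.
# Shown on a sample line rather than editing /etc/default/grub directly.
add_iommu_pt() {
    sed 's/^GRUB_CMDLINE_LINUX="\(.*\)"$/GRUB_CMDLINE_LINUX="\1 iommu=pt"/'
}

echo 'GRUB_CMDLINE_LINUX="quiet splash"' | add_iommu_pt
# → GRUB_CMDLINE_LINUX="quiet splash iommu=pt"
```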

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: intel iommu causing performance drop in mlx4 ipoib
       [not found]     ` <1463396094.2484.194.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-05-16 12:37       ` Nikolay Borisov
       [not found]         ` <5739BEEC.3070308-6AxghH7DbtA@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolay Borisov @ 2016-05-16 12:37 UTC (permalink / raw)
  To: David Woodhouse, matanb-VPRAkNaXOzVWk0Htik3J/w
  Cc: guysh-VPRAkNaXOzVWk0Htik3J/w,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	markb-VPRAkNaXOzVWk0Htik3J/w, erezsh-VPRAkNaXOzVWk0Htik3J/w,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA



On 05/16/2016 01:54 PM, David Woodhouse wrote:
> On Mon, 2016-05-16 at 13:39 +0300, Nikolay Borisov wrote:
>>
>> I've observed a strange performance pathology with it when running ipoib 
>> and using a naive iperf test. My setup has multiple machines with a mix 
>> of qlogic/mellanox cards, connected via a QLogic 12300 switch. All of 
>> the nodes are running at 4x 10Gbps. When I run a performance test and 
>> the mellanox card is the server, i.e. it is receiving data, I get very bad 
>> performance. By this I mean I cannot get more than 4 gigabits per 
>> second - very low. 'perf top' clearly shows that the culprit is 
>> intel_map_page, which is being called from the receive path 
>> of the mellanox adapter: 
>>
>> 84.26%     0.04%  ksoftirqd/0  [kernel.kallsyms]  [k] intel_map_page                            
>>             |
>>             --- intel_map_page
>>                |          
>>                |--98.38%-- ipoib_cm_alloc_rx_skb
> 
> Are you *sure* it's disabled? Can you be more specific about where the
> time is spent? intel_map_page() doesn't really do much except call
> into __intel_map_single()... which should return pretty much
> immediately.

Oops, turned out I had intel_iommu=on, though I'm sure some days ago I
didn't and performance suffered. I now tried with this option removed
and performance is again back to normal. All in all this driver +
intel_iommu is a bad combination I guess. Anyway, sorry for the noise
and thanks for the prompt reply.


Regards,
Nikolay

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: intel iommu causing performance drop in mlx4 ipoib
       [not found]         ` <5739BEEC.3070308-6AxghH7DbtA@public.gmane.org>
@ 2016-05-16 12:39           ` David Woodhouse
       [not found]             ` <1463402381.2484.202.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: David Woodhouse @ 2016-05-16 12:39 UTC (permalink / raw)
  To: Nikolay Borisov, matanb-VPRAkNaXOzVWk0Htik3J/w
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	markb-VPRAkNaXOzVWk0Htik3J/w,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	guysh-VPRAkNaXOzVWk0Htik3J/w, erezsh-VPRAkNaXOzVWk0Htik3J/w



On Mon, 2016-05-16 at 15:37 +0300, Nikolay Borisov wrote:
> 
> Oops, turned out I had intel_iommu=on, though I'm sure some days ago I
> didn't and performance suffered. I now tried with this option removed
> and performance is again back to normal. All in all this driver +
> intel_iommu is a bad combination I guess. Anyway, sorry for the noise
> and thanks for the prompt reply.

I would be very interested to see your results with the latest code
from git://git.infradead.org/intel-iommu.git (and with the IOMMU still
enabled).

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org                              Intel Corporation





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: intel iommu causing performance drop in mlx4 ipoib (OFFLIST)
       [not found]             ` <1463402381.2484.202.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2016-05-16 12:46               ` Nikolay Borisov
       [not found]                 ` <5739C10C.6040805-6AxghH7DbtA@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Nikolay Borisov @ 2016-05-16 12:46 UTC (permalink / raw)
  To: David Woodhouse; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org



On 05/16/2016 03:39 PM, David Woodhouse wrote:
> On Mon, 2016-05-16 at 15:37 +0300, Nikolay Borisov wrote:
>>
>> Oops, turned out I had intel_iommu=on, though I'm sure some days ago I
>> didn't and performance suffered. I now tried with this option removed
>> and performance is again back to normal. All in all this driver +
>> intel_iommu is a bad combination I guess. Anyway, sorry for the noise
>> and thanks for the prompt reply.
> 
> I would be very interested to see your results with the latest code
> from git://git.infradead.org/intel-iommu.git (and with the IOMMU still
> enabled).

Would I be able to apply this cleanly on 4.4, since this is the version
I'm currently using?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: intel iommu causing performance drop in mlx4 ipoib (OFFLIST)
       [not found]                 ` <5739C10C.6040805-6AxghH7DbtA@public.gmane.org>
@ 2016-05-16 12:53                   ` David Woodhouse
  0 siblings, 0 replies; 7+ messages in thread
From: David Woodhouse @ 2016-05-16 12:53 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org


On Mon, 2016-05-16 at 15:46 +0300, Nikolay Borisov wrote:
> Would I be able to apply this cleanly on 4.4, since this is the
> version I'm currently using?

Yes, it all applies cleanly as long as you also apply commit
da972fb13bc5a1baad450c11f9182e4cd0a091f6 first.
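(For anyone else backporting this, the sequence would look roughly like the following. This is an untested sketch: the remote and branch names are illustrative, the commit id is the prerequisite mentioned above, and the git steps are left commented so you can adapt them to your own tree.)

```shell
# Sketch: backport the intel-iommu tree onto a v4.4 base.
# The prerequisite commit must be cherry-picked first.
PREREQ=da972fb13bc5a1baad450c11f9182e4cd0a091f6
echo "prerequisite commit: $PREREQ"

# Inside a kernel git checkout (names illustrative):
# git checkout -b iommu-backport v4.4
# git cherry-pick "$PREREQ"
# git remote add intel-iommu git://git.infradead.org/intel-iommu.git
# git fetch intel-iommu && git merge intel-iommu/master
```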

-- 
dwmw2



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-05-16 12:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-16 10:39 intel iommu causing performance drop in mlx4 ipoib Nikolay Borisov
     [not found] ` <5739A347.20807-6AxghH7DbtA@public.gmane.org>
2016-05-16 10:54   ` David Woodhouse
     [not found]     ` <1463396094.2484.194.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-05-16 12:37       ` Nikolay Borisov
     [not found]         ` <5739BEEC.3070308-6AxghH7DbtA@public.gmane.org>
2016-05-16 12:39           ` David Woodhouse
     [not found]             ` <1463402381.2484.202.camel-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2016-05-16 12:46               ` intel iommu causing performance drop in mlx4 ipoib (OFFLIST) Nikolay Borisov
     [not found]                 ` <5739C10C.6040805-6AxghH7DbtA@public.gmane.org>
2016-05-16 12:53                   ` David Woodhouse
2016-05-16 12:05   ` intel iommu causing performance drop in mlx4 ipoib Or Gerlitz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox