From: Michal Simek <monstr@monstr.eu>
To: Eric Dumazet <eric.dumazet@gmail.com>,
netdev@vger.kernel.org, David Miller <davem@davemloft.net>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
John Williams <john.williams@petalogix.com>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org
Subject: Look for physical address from user space address/fixup - NET_DMA
Date: Wed, 08 Jun 2011 16:45:36 +0200
Message-ID: <4DEF8B10.6060904@monstr.eu>
Hi,
I am investigating how to speed up memory operations
(memcpy/memset/copy_tofrom_user/etc.) with DMA to improve Ethernet performance
(currently for PAGE_SIZE operations).

I profiled the kernel, and copy_tofrom_user is the weakest spot for network
operations. I have optimized it by loop unrolling, which gave me 20% better
throughput, but that is still not enough.

Then I added a hardware DMA engine to the design and changed the U-Boot memory
operations (this saved me 5 s of boot time when loading a 20 MB kernel over a
100 Mbit/s LAN). I have also added DMA support to the Linux memcpy (I haven't
measured the improvement, but there is some).

For copy_tofrom_user the situation is a little more complicated, but I have
prototyped it with DMA, without fixup, to see the improvement. There could be
another 20%.

Based on this, I measured where the time is spent in this code and found that
most of it, around 70% of the total, goes to looking up the physical address
for a user-space address. I need the physical address because DMA requires it.

For Microblaze I use the code shown below, but it is slow. Do you know how to
do it faster?
	pmd_t *pmdp;
	pte_t *ptep;
	/* save the in-page offset before address is overwritten below */
	unsigned long offset = address & ~PAGE_MASK;

	pmdp = pmd_offset(pud_offset(
			pgd_offset(current->mm, address),
					address), address);
	preempt_disable();
	ptep = pte_offset_map(pmdp, address);
	if (pte_present(*ptep)) {
		address = (unsigned long) page_address(pte_page(*ptep));
		/* MS: I need to add the offset within the page */
		address += offset;
		/* MS: address is still virtual here */
		address = virt_to_phys(address);
	}
	pte_unmap(ptep);
	preempt_enable();
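The in-page offset has to be taken from the original user address: the value
returned by page_address() is page-aligned, so masking it afterwards yields
zero. Below is a minimal user-space sketch of that arithmetic, assuming 4 KiB
pages. The DEMO_* macros and the three helper functions are hypothetical names
used only for illustration; they are not kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PAGE_SHIFT/PAGE_SIZE/PAGE_MASK,
 * assuming 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uintptr_t)1 << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/* Offset of addr inside its page (the low DEMO_PAGE_SHIFT bits). */
uintptr_t in_page_offset(uintptr_t addr)
{
	return addr & ~DEMO_PAGE_MASK;
}

/* Combine a page-aligned base with the offset of the original address. */
uintptr_t translate(uintptr_t page_base, uintptr_t user_addr)
{
	assert(in_page_offset(page_base) == 0);	/* base must be page-aligned */
	return page_base + in_page_offset(user_addr);
}

/* Number of bytes from user_addr up to the end of its page. */
uintptr_t bytes_to_page_end(uintptr_t user_addr)
{
	return DEMO_PAGE_SIZE - in_page_offset(user_addr);
}
```

The last helper shows why a PAGE_SIZE copy from an unaligned user address
touches two pages: only bytes_to_page_end() bytes lie in the first one, so the
second page needs its own lookup.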
Currently this lookup is the bottleneck that keeps me from further improvement.

I am not sure whether anyone has ever tried to replace copy_tofrom_user with
DMA while keeping fixup support. That is the second thing on which I would like
to hear your opinion. Would it be possible to simplify it by accessing the
user-space address and address + PAGE_SIZE? Or is there some other scheme?

There is also the NET_DMA option, where I expect DMA to be used instead of
memory operations. Is that a correct assumption? I ask because I see no IRQs
coming from the DMA engine, although the DMA test works well.

Eric, David: How is it supposed to work?
Thanks,
Michal
--
Michal Simek, Ing. (M.Eng)
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel 2.6 Microblaze Linux - http://www.monstr.eu/fdt/
Microblaze U-BOOT custodian