From: Christopher Yeoh <cyeoh@au1.ibm.com>
To: Brice Goglin <Brice.Goglin@inria.fr>
Cc: linux-kernel@vger.kernel.org,
Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [RFC][PATCH] Cross Memory Attach
Date: Thu, 16 Sep 2010 23:30:45 +0930
Message-ID: <20100916233045.73aecc26@lilo>
In-Reply-To: <4C91E01E.4070209@inria.fr>
On Thu, 16 Sep 2010 11:15:10 +0200
Brice Goglin <Brice.Goglin@inria.fr> wrote:
> On 16/09/2010 08:32, Brice Goglin wrote:
> > I am the guy doing KNEM so I can comment on this. The I/OAT part of
> > KNEM was mostly a research topic, it's mostly useless on current
> > machines since the memcpy performance is much larger than I/OAT DMA
> > Engine. We also have an offload model with a kernel thread, but it
> > wasn't used a lot so far. These features can be ignored for the
> > current discussion.
>
> I've just created a knem branch where I removed all the above, and
> some other stuff that are not necessary for normal users. So it just
> contains the region management code and two commands to copy between
> regions or between a region and some local iovecs.
When I did the original hpcc runs for CMA vs shared mem double copy I
also did some KNEM runs as a bit of a sanity check. The CMA OpenMPI
implementation actually uses the infrastructure KNEM put into the
OpenMPI shared mem btl - thanks for that, btw, it made things much
easier for me to test CMA.
Interestingly although KNEM and CMA fundamentally are doing very
similar things, at least with hpcc I didn't see as much of a gain with
KNEM as with CMA:

                     MB/s
Naturally Ordered    4      8      16     32
Base                 1235   935    622    419
CMA                  4741   3769   1977   703
KNEM                 3362   3091   1857   681

                     MB/s
Randomly Ordered     4      8      16     32
Base                 1227   947    638    412
CMA                  4666   3682   1978   710
KNEM                 3348   3050   1883   684

                     MB/s
Max Ping Pong        4      8      16     32
Base                 2028   1938   1928   1882
CMA                  7424   7510   7598   7708
KNEM                 5661   5476   6050   6290

I don't know the reason behind the difference - whether it's something
peculiar to hpcc, whether there's extra overhead in the way that
knem does its setup for copying, or whether knem wasn't configured
optimally. I haven't done any comparison IMB or NPB runs...
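
To put the gap in relative terms, the hpcc figures above can be turned into
speedups over Base and into a KNEM-to-CMA ratio. This is only a small sketch
over the posted numbers: the names RESULTS, speedup and knem_fraction_of_cma
are made up for illustration, and the column labels 4/8/16/32 are treated as
opaque keys because their meaning is not restated in this message.

```python
# Throughput in MB/s, copied from the three hpcc tables in this message.
# The keys 4/8/16/32 are the column headers as posted (treated as labels only).
RESULTS = {
    "Naturally Ordered": {
        "Base": {4: 1235, 8: 935, 16: 622, 32: 419},
        "CMA": {4: 4741, 8: 3769, 16: 1977, 32: 703},
        "KNEM": {4: 3362, 8: 3091, 16: 1857, 32: 681},
    },
    "Randomly Ordered": {
        "Base": {4: 1227, 8: 947, 16: 638, 32: 412},
        "CMA": {4: 4666, 8: 3682, 16: 1978, 32: 710},
        "KNEM": {4: 3348, 8: 3050, 16: 1883, 32: 684},
    },
    "Max Ping Pong": {
        "Base": {4: 2028, 8: 1938, 16: 1928, 32: 1882},
        "CMA": {4: 7424, 8: 7510, 16: 7598, 32: 7708},
        "KNEM": {4: 5661, 8: 5476, 16: 6050, 32: 6290},
    },
}


def speedup(test, method, column):
    """Throughput of `method` relative to Base for one table cell."""
    rows = RESULTS[test]
    return rows[method][column] / rows["Base"][column]


def knem_fraction_of_cma(test, column):
    """KNEM throughput as a fraction of CMA throughput for one table cell."""
    rows = RESULTS[test]
    return rows["KNEM"][column] / rows["CMA"][column]


if __name__ == "__main__":
    for test, rows in RESULTS.items():
        print(test)
        for column in rows["Base"]:
            print(
                f"  {column:>2}: "
                f"CMA x{speedup(test, 'CMA', column):.2f}  "
                f"KNEM x{speedup(test, 'KNEM', column):.2f}  "
                f"KNEM/CMA {knem_fraction_of_cma(test, column):.2f}"
            )
```

Running it prints one line per column for each test, which makes it easier to
see where KNEM trails CMA the most and where the two converge.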

Syscall and setup overhead do have some measurable effect - although I
don't have the numbers for it here, neither KNEM nor CMA does quite as
well with hpcc as a hacked version of hpcc where everything is declared
ahead of time as shared memory, so the receiver can just do a single
copy from userspace. I think that is representative of the theoretical
maximum gain from the single copy approach.

Chris
--
cyeoh@au.ibm.com