* Re: [SPDK] queue pair vs cpu cores
From: Walker, Benjamin @ 2017-03-29 20:46 UTC
  To: spdk

On Mon, 2017-03-27 at 17:06 -0700, Isaac Otsiabah wrote:
>  
> Ben, the implementation of your second paragraph may be a little
> easier, correct? Is there an example somewhere that shows how to use
> SPDK to do blocking I/O? It does not have to be robust, but it can serve
> as an example of how to structure the code (using threads or
> processes) to do blocking I/O for legacy applications.

The second approach is easier if you are trying to integrate with an
existing application that is implemented with a pool of threads
performing blocking I/O. We happen to have a great example of how to do
this that we're about to release: see lib/blobfs on the master branch.
It implements blocking filesystem-like calls by passing messages to a
single core that performs the operations asynchronously. We use it to
implement an alternate backend for RocksDB that bypasses the kernel
filesystem. RocksDB just happens to be implemented as a pool of threads
performing blocking I/O.


* Re: [SPDK] queue pair vs cpu cores
From: Isaac Otsiabah @ 2017-03-28  0:06 UTC
  To: spdk


Ben, the implementation of your second paragraph may be a little easier, correct? Is there an example somewhere that shows how to use SPDK to do blocking I/O? It does not have to be robust, but it can serve as an example of how to structure the code (using threads or processes) to do blocking I/O for legacy applications.

Isaac

* Re: [SPDK] queue pair vs cpu cores
From: Walker, Benjamin @ 2017-03-27 22:55 UTC
  To: spdk

Hi Raj,

It’s true - the SPDK event framework spins in reactor loops on each core, using them all up even if you aren’t busy, and it’s a model of how we think applications should generally be structured for best performance. However, the vast majority of the components in SPDK don’t require the event framework. These components don’t actually do anything unless you call one of their functions from your application. In particular, your application is in charge of making the call to poll for completions, so one quick solution is to poll less often. The ‘less often’ can even be intelligent – for instance if no I/O is outstanding don’t bother polling and maybe sleep for a short period instead. The SPDK event framework is actually capable of sleeping to save CPU cycles (off by default), so you can use that as a concrete example.
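
As a rough sketch of the "poll less often" idea (written in Python for brevity, not SPDK code): `FakeQueuePair` is a hypothetical stand-in for a real queue pair, and its `poll()` plays the role that a completion-polling call such as `spdk_nvme_qpair_process_completions()` would play in a real application. The loop polls only while I/O is outstanding and sleeps briefly otherwise. The iteration count and the 1 ms idle sleep are arbitrary assumptions.

```python
import time


class FakeQueuePair:
    """Hypothetical stand-in for a device queue pair; not an SPDK API."""

    def __init__(self):
        self._pending = []  # each entry is [polls_until_done, callback]

    def submit(self, polls_until_done, callback):
        # Pretend the device needs `polls_until_done` polls to finish this I/O.
        self._pending.append([polls_until_done, callback])

    @property
    def outstanding(self):
        return len(self._pending)

    def poll(self):
        """Advance the fake device once and run callbacks for finished I/O."""
        finished, still_pending = [], []
        for entry in self._pending:
            entry[0] -= 1
            if entry[0] <= 0:
                finished.append(entry[1])
            else:
                still_pending.append(entry)
        self._pending = still_pending
        for callback in finished:
            callback()
        return len(finished)


def run_loop(qpair, iterations, idle_sleep=0.001, sleep=time.sleep):
    """Poll only while I/O is outstanding; otherwise sleep to save CPU."""
    stats = {"polls": 0, "sleeps": 0}
    for _ in range(iterations):
        if qpair.outstanding:
            qpair.poll()
            stats["polls"] += 1
        else:
            sleep(idle_sleep)
            stats["sleeps"] += 1
    return stats
```

The sleep length trades idle CPU use against the latency of the first I/O submitted after an idle period, so it would need tuning for a real workload.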

There are other designs that work well too. For instance, many legacy applications are written with a pool of threads (or processes) doing blocking I/O operations. You can dedicate one thread to actually doing I/O with SPDK asynchronously, and then have the pool of threads pass a message to the main I/O thread and block on a semaphore until it is completed. Then, the pool of threads all goes to sleep just as if they were doing a blocking I/O operation, and only the one dedicated I/O thread spins and polls. You can, of course, combine this technique with the first one.
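
The thread-pool design could be structured roughly as follows. This is a sketch in Python rather than SPDK code, and `FakeAsyncDevice`, `Request`, `IoThread`, and `run_pool` are all hypothetical names: the fake device simply reverses the bytes it is given, where a real application would submit asynchronous I/O and poll for completions. Each pool thread posts a message and blocks on a semaphore; only the dedicated I/O thread touches the device, so the device itself needs no lock.

```python
import queue
import threading
import time


class FakeAsyncDevice:
    """Hypothetical async device: submit() returns at once, poll() completes."""

    def __init__(self):
        self._inflight = []

    def submit(self, payload, callback):
        self._inflight.append((payload, callback))

    def poll(self):
        ready, self._inflight = self._inflight, []
        for payload, callback in ready:
            callback(payload[::-1])  # the fake "I/O" just reverses the bytes
        return len(ready)


class Request:
    """One blocking call in flight, with a semaphore to wake the caller."""

    def __init__(self, payload):
        self.payload = payload
        self.result = None
        self.done = threading.Semaphore(0)

    def complete(self, result):
        self.result = result
        self.done.release()  # wake the pool thread blocked in blocking_io()


class IoThread:
    """The single thread that owns the device and does all polling."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        device = FakeAsyncDevice()  # used by this thread only
        running = True
        while running:
            # Drain messages from the pool without blocking.
            while True:
                try:
                    req = self._inbox.get_nowait()
                except queue.Empty:
                    break
                if req is None:  # stop message
                    running = False
                    break
                device.submit(req.payload, req.complete)
            # In this fake, zero completions means nothing is in flight,
            # so back off briefly instead of spinning.
            if device.poll() == 0 and running:
                time.sleep(0.0005)

    def blocking_io(self, payload):
        """Called from pool threads: looks like an ordinary blocking call."""
        req = Request(payload)
        self._inbox.put(req)
        req.done.acquire()  # sleep until the I/O thread completes the request
        return req.result

    def stop(self):
        self._inbox.put(None)
        self._thread.join()


def run_pool(io, payloads):
    """Start one legacy-style worker per payload; each makes one blocking call."""
    results = [None] * len(payloads)

    def worker(i):
        results[i] = io.blocking_io(payloads[i])

    workers = [threading.Thread(target=worker, args=(i,)) for i in range(len(payloads))]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results
```

A caller would create one `IoThread`, hand it to the existing pool, and replace each blocking read or write with `io.blocking_io(...)`. The short sleep in the idle branch is the first technique combined with this one.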

Another design is to simply limit your application to fewer CPU cores. With SPDK making the I/O path so efficient, does that enable you to get away with fewer cores? Or you can funnel more work to your application such that it is rarely idle, which is often possible in the cloud.
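
One way to picture "fewer cores" is as a core mask, which is how SPDK applications built on the event framework are usually told which cores to use. The sketch below is generic Python, not SPDK code, and the function names are hypothetical: it decodes a mask into core indices and, on Linux only, pins the calling process to those cores.

```python
import os


def cores_from_mask(mask):
    """Return the CPU indices whose bits are set in `mask` (bit 0 = core 0)."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def restrict_to_mask(mask):
    """Pin the calling process to the cores in `mask`, where the OS allows it."""
    cores = cores_from_mask(mask)
    if not cores:
        raise ValueError("core mask selects no cores")
    if hasattr(os, "sched_setaffinity"):  # available on Linux only
        # Keep only the cores this process is already allowed to run on.
        usable = os.sched_getaffinity(0) & set(cores)
        if usable:
            os.sched_setaffinity(0, usable)
    return cores
```

For example, a mask of `0x3` selects cores 0 and 1, leaving the remaining cores free for other work.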

These are at least a few ideas. I’m sure others will come up with additional clever ways of managing this.

-Ben



* [SPDK] queue pair vs cpu cores
From: Rajinikanth Pandurangan @ 2017-03-22 21:21 UTC
  To: spdk

Hi,

As per my understanding, using SPDK we can achieve the full performance of
an SSD with 1 core, and there is 1 queue pair per thread and 1 thread per
core.  Typically, applications span multiple threads.  As locking is
discouraged in SPDK, we have to map each thread to a queue pair.  By doing
so, we might end up keeping all the cores busy just for I/O polling.  How
do you guys compare this scenario with kernel mode in terms of overall
performance and utilization?  How do we control CPU utilization with no
locks while satisfying an application that spans multiple threads?

Thanks,
Raj
