linux-block.vger.kernel.org archive mirror
* Re: DMMP request-queue vs. BiO
       [not found]     ` <20241105103307.GA1385@lst.de>
@ 2024-11-07 18:35       ` John Meneghini
  2024-11-15 14:05         ` Mikulas Patocka
  0 siblings, 1 reply; 6+ messages in thread
From: John Meneghini @ 2024-11-07 18:35 UTC (permalink / raw)
  To: linux-block, dm-devel, linux-scsi
  Cc: Chris Leech, Hannes Reinecke, Christoph Hellwig, snitzer,
	Ming Lei, Benjamin Marzinski, Jonathan Brassow, Ewan Milne,
	Mikulas Patocka, bmarson, Jeff Moyer, spetrovi@redhat.com,
	Rob Evers

I've been asked to move this conversation to a public thread on the upstream mailing lists.

Background:

At ALPSS last month (Sept. 2024) Hannes and Christoph spoke with Chris and me about how they'd like to remove the 
request interface from DMMP, and asked if Red Hat would be willing to help out by running some DMMP/Bio vs. DMMP/req 
performance tests and sharing the results. The idea was: with some of the recent performance improvements in the BIO 
path upstream, we believe there may not be much of a performance difference between these two code paths, and we would 
like Red Hat's help in demonstrating that.

So Chris and I returned to Red Hat and broached this subject internally. The Red Hat performance team has agreed to 
work with us on an ad hoc basis, and we've made some preliminary plans to build a test bed that can be used to run 
performance tests with DMMP on an upstream kernel using iSCSI and FCP. Then we talked to the DMMP guys about it. They 
have some questions and asked me to discuss this topic in an email thread on linux-scsi, linux-block and dm-devel.

Some questions are:

What are the exact patches which make us think the BIO path is now performant?

Is it Ming's immutable bvecs and moving the splitting down to the driver?

I've been told these changes are only applicable if a filesystem is involved. Databases can make direct use of the dmmp 
device, so late bio splitting is not applicable for them. It is filesystems that are building larger bios. See the 
comments from Hannes and Christoph below.

I think Red Hat can help out with the performance testing, but we will need to answer some of these questions. It will 
also be important to determine exactly what kind of workload we should use with any DMMP performance tests. Will a 
simple workload generated with fio work, or do we need to test some actual database workloads as well?
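
For concreteness, the kind of simple fio job I had in mind looks something like this - an illustrative jobfile only, 
where the device path and all parameters are placeholders, not a tuned benchmark:

; illustrative fio job, not a tuned benchmark
[global]
ioengine=libaio
direct=1
time_based=1
runtime=60

[simple-mpath-test]
filename=/dev/mapper/mpatha
rw=randread
bs=4k
iodepth=32
numjobs=4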

Please reply to this public thread with your thoughts and ideas.

Thanks,

John A. Meneghini
Senior Principal Platform Storage Engineer
RHEL SST - Platform Storage Group
jmeneghi@redhat.com

On 11/5/24 05:33, Christoph Hellwig wrote:
> On Tue, Nov 05, 2024 at 08:44:45AM +0100, Hannes Reinecke wrote:
>>> I think the big change is really Ming's immutable bvecs and moving the
>>> splitting down to the driver.  This means bios are much bigger (and
>>> even bigger now with large folios for file systems supporting it).
>>>
>> Exactly. With the current code we should never merge requests; all
>> data should be assembled in the bio already.
>> (I wonder if we could trigger a WARN_ON if request merging is
>> attempted ...)
> 
> Request merging is obviously still pretty common.  For one because
> a lot of crappy file systems submit a buffer_head per block (none of
> them should be relevant for multipathing), but also because we reach
> the bio size limit at some point and just need to split.  While large
> folios reduce that a lot, not all file systems that matter support that
> (that's what the plug callback would fix IFF it turns out to be an
> issue), and last but not least I/O schedulers delay I/O to be able to
> do better merging.  My theory is that this is not important for the kind
> of storage we use multipathing for, or rather not for the pathing
> decisions.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: DMMP request-queue vs. BiO
  2024-11-07 18:35       ` DMMP request-queue vs. BiO John Meneghini
@ 2024-11-15 14:05         ` Mikulas Patocka
  2024-11-15 17:09           ` Christoph Hellwig
  2024-11-15 20:24           ` John Meneghini
  0 siblings, 2 replies; 6+ messages in thread
From: Mikulas Patocka @ 2024-11-15 14:05 UTC (permalink / raw)
  To: John Meneghini
  Cc: linux-block, dm-devel, linux-scsi, Chris Leech, Hannes Reinecke,
	Christoph Hellwig, snitzer, Ming Lei, Benjamin Marzinski,
	Jonathan Brassow, Ewan Milne, bmarson, Jeff Moyer,
	spetrovi@redhat.com, Rob Evers

Hi


On Thu, 7 Nov 2024, John Meneghini wrote:

> I've been asked to move this conversation to a public thread on the upstream
> mailing lists.
> 
> Background:
> 
> At ALPSS last month (Sept. 2024) Hannes and Christoph spoke with Chris and me
> about how they'd like to remove the request interface from DMMP, and asked if
> Red Hat would be willing to help out by running some DMMP/Bio vs. DMMP/req
> performance tests and sharing the results. The idea was: with some of the
> recent performance improvements in the BIO path upstream, we believe there
> may not be much of a performance difference between these two code paths,
> and we would like Red Hat's help in demonstrating that.
> 
> So Chris and I returned to Red Hat and broached this subject internally.
> The Red Hat performance team has agreed to work with us on an ad hoc
> basis, and we've made some preliminary plans to build a test bed that can
> be used to run performance tests with DMMP on an upstream kernel using
> iSCSI and FCP. Then we talked to the DMMP guys about it. They have some
> questions and asked me to discuss this topic in an email thread on
> linux-scsi, linux-block and dm-devel.
> 
> Some questions are:
> 
> What are the exact patches which make us think the BIO path is now performant?

There are too many changes that help increase bio size, so it's not 
possible to pick out just one or a few patches.

> Is it Ming's immutable bvecs and moving the splitting down to the driver?

Yes, splitting bios at the driver helps.

Folios also help with building larger bios.
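
To illustrate where the late split happens - a rough sketch modeled on 
nvme_ns_head_submit_bio(); bio_split_to_limits() is the real block-layer 
helper, the rest is schematic:

#include <linux/bio.h>
#include <linux/blkdev.h>

/* schematic bio-based submit hook: the bio travels down whole and is
 * only split, if at all, against the limits of the device it ends up
 * on; bio_split_to_limits() resubmits any excess and returns NULL if
 * the bio has already been ended (e.g. on a split error) */
static void example_submit_bio(struct bio *bio)
{
        bio = bio_split_to_limits(bio);
        if (!bio)
                return;

        /* ...pick a path, point the bio at it, send it down... */
        submit_bio_noacct(bio);
}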

> I've been told these changes are only applicable if a filesystem is involved.
> Databases can make direct use of the dmmp device, so late bio splitting is
> not applicable for them. It is filesystems that are building larger bios. See
> the comments from Hannes and Christoph below.

Databases should use direct I/O, and with direct I/O they can generate 
bios as big as they want.

Note that if a database uses a buffered block device, performance will be 
suboptimal, because the buffering mechanism can't create large bios; it 
only sends page-sized bios. But that case is not expected to be used - the 
database should either use a block device with direct I/O or a filesystem 
with or without direct I/O.
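
As a minimal userspace illustration of that pattern - the device path and 
the 1 MiB size are placeholders, error handling trimmed:

#define _GNU_SOURCE     /* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
        size_t len = 1 << 20;   /* 1 MiB in one request, illustrative */
        void *buf;
        int fd = open("/dev/mapper/mpatha", O_RDONLY | O_DIRECT);

        if (fd < 0)
                return 1;
        /* O_DIRECT requires aligned buffers; page alignment is the
         * conservative choice */
        if (posix_memalign(&buf, 4096, len))
                return 1;
        /* bypasses the page cache, so the whole 1 MiB can arrive at
         * the device as a single large bio instead of page-sized ones */
        if (pread(fd, buf, len, 0) < 0)
                return 1;
        free(buf);
        close(fd);
        return 0;
}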

> I think Red Hat can help out with the performance testing but we will need to
> answer some of these questions. It will also be important to determine exactly
> what kind of workload we should use with any DMMP performance tests. Will a
> simple workload generated with fio work, or do we need to test some actual
> database workloads as well?

I suggest using a real-world workload - you can use something that you 
already use to verify the performance of RHEL.

The problem with fio is that it generates I/O at random locations, so no 
bio merging is possible and it will show just the IOPS value of the 
underlying storage device.
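
One way to verify whether a candidate workload merges at all is to watch 
the merge counters in /proc/diskstats (fields 5 and 9). A minimal sketch, 
with the device name as a placeholder - for bio-based dm, watch the 
underlying path devices, since that is where merging would happen:

#include <stdio.h>
#include <string.h>

/* print the read/write merge counters for one device, e.g. ./a.out sdb */
int main(int argc, char **argv)
{
        const char *dev = argc > 1 ? argv[1] : "sda";
        char line[512];
        FILE *f = fopen("/proc/diskstats", "r");

        if (!f)
                return 1;
        while (fgets(line, sizeof(line), f)) {
                unsigned int maj, min;
                unsigned long long rd, rd_merged, rd_sec, rd_ms, wr, wr_merged;
                char name[64];

                /* fields per Documentation/admin-guide/iostats.rst:
                 * major minor name reads reads_merged sectors ms
                 * writes writes_merged ... */
                if (sscanf(line, "%u %u %63s %llu %llu %llu %llu %llu %llu",
                           &maj, &min, name, &rd, &rd_merged, &rd_sec,
                           &rd_ms, &wr, &wr_merged) == 9 &&
                    strcmp(name, dev) == 0)
                        printf("%s: reads_merged=%llu writes_merged=%llu\n",
                               name, rd_merged, wr_merged);
        }
        fclose(f);
        return 0;
}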

> Please reply to this public thread with your thoughts and ideas.
> 
> Thanks,
> 
> John A. Meneghini
> Senior Principal Platform Storage Engineer
> RHEL SST - Platform Storage Group
> jmeneghi@redhat.com

Mikulas


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: DMMP request-queue vs. BiO
  2024-11-15 14:05         ` Mikulas Patocka
@ 2024-11-15 17:09           ` Christoph Hellwig
  2024-11-15 20:28             ` John Meneghini
  2024-11-15 20:24           ` John Meneghini
  1 sibling, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2024-11-15 17:09 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: John Meneghini, linux-block, dm-devel, linux-scsi, Chris Leech,
	Hannes Reinecke, Christoph Hellwig, snitzer, Ming Lei,
	Benjamin Marzinski, Jonathan Brassow, Ewan Milne, bmarson,
	Jeff Moyer, spetrovi@redhat.com, Rob Evers

On Fri, Nov 15, 2024 at 03:05:21PM +0100, Mikulas Patocka wrote:
> Note that if a database uses a buffered block device, performance will be 
> suboptimal, because the buffering mechanism can't create large bios; it 
> only sends page-sized bios. But that case is not expected to be used - the 
> database should either use a block device with direct I/O or a filesystem 
> with or without direct I/O.

And, as pointed out in the private mail that John forwarded to the list
without my permission, if we really have a workload that cares, dm could
implement the plugging callback as done in md to operate on a batch
of bios.
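
For reference, a rough sketch of that md-style batching - blk_check_plugged()
is the real block-layer helper; the mpath_* names are hypothetical and this
is not an actual dm-mpath patch:

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/slab.h>

/* per-plug batch, allocated (zeroed) by blk_check_plugged() */
struct mpath_plug_cb {
        struct blk_plug_cb cb;
        struct bio_list bios;
};

static void mpath_unplug(struct blk_plug_cb *cb, bool from_schedule)
{
        struct mpath_plug_cb *mp = container_of(cb, struct mpath_plug_cb, cb);
        struct bio *bio;

        /* the whole batch is visible here; a path decision could be
         * made once for all of it before resubmitting */
        while ((bio = bio_list_pop(&mp->bios)))
                submit_bio_noacct(bio);
        kfree(cb);      /* the callback owns the cb, as in md */
}

static void mpath_queue_bio(struct bio *bio)
{
        struct blk_plug_cb *cb;

        cb = blk_check_plugged(mpath_unplug, NULL,
                               sizeof(struct mpath_plug_cb));
        if (cb)
                bio_list_add(&container_of(cb, struct mpath_plug_cb,
                                           cb)->bios, bio);
        else
                submit_bio_noacct(bio); /* no plug active: submit directly */
}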

Also, not building large bios is not a fundamental property of block
device writes; it happens because they use the legacy buffer_head
helpers.  That means:

  a) the same is applicable to file systems using them as well
  b) it can be fixed if someone cares enough, but apparently no one does


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: DMMP request-queue vs. BiO
  2024-11-15 14:05         ` Mikulas Patocka
  2024-11-15 17:09           ` Christoph Hellwig
@ 2024-11-15 20:24           ` John Meneghini
  1 sibling, 0 replies; 6+ messages in thread
From: John Meneghini @ 2024-11-15 20:24 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: linux-block, dm-devel, linux-scsi, Chris Leech, Hannes Reinecke,
	Christoph Hellwig, snitzer, Ming Lei, Benjamin Marzinski,
	Jonathan Brassow, Ewan Milne, bmarson, Jeff Moyer,
	spetrovi@redhat.com, Rob Evers

On 11/15/24 09:05, Mikulas Patocka wrote:
> I suggest using a real-world workload - you can use something that you
> already use to verify the performance of RHEL.
> 
> The problem with fio is that it generates I/O at random locations, so no
> bio merging is possible and it will show just the IOPS value of the
> underlying storage device.

OK. That's the information I was looking for. So we'll be sure to run some real-world workloads that hit the bio merging 
code path.

Thanks,

/John


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: DMMP request-queue vs. BiO
  2024-11-15 17:09           ` Christoph Hellwig
@ 2024-11-15 20:28             ` John Meneghini
  2024-11-18 13:05               ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: John Meneghini @ 2024-11-15 20:28 UTC (permalink / raw)
  To: Christoph Hellwig, Mikulas Patocka
  Cc: linux-block, dm-devel, linux-scsi, Chris Leech, Hannes Reinecke,
	snitzer, Ming Lei, Benjamin Marzinski, Jonathan Brassow,
	Ewan Milne, bmarson, Jeff Moyer, spetrovi@redhat.com, Rob Evers

On 11/15/24 12:09, Christoph Hellwig wrote:
> On Fri, Nov 15, 2024 at 03:05:21PM +0100, Mikulas Patocka wrote:
>> Note that if a database uses a buffered block device, performance will be
>> suboptimal, because the buffering mechanism can't create large bios; it
>> only sends page-sized bios. But that case is not expected to be used - the
>> database should either use a block device with direct I/O or a filesystem
>> with or without direct I/O.
> 
> And, as pointed out in the private mail that John forwarded to the list
> without my permission, if we really have a workload that cares, dm could

Ah come on. I deleted most of the private thread....

> implement the plugging callback as done in md to operate on a batch
> of bios.
> 
> Also, not building large bios is not a fundamental property of block
> device writes; it happens because they use the legacy buffer_head
> helpers.  That means:
> 
>    a) the same is applicable to file systems using them as well
>    b) it can be fixed if someone cares enough, but apparently no one does
> 

OK. Thanks, that's the info I was looking for.

Thanks,

/John


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: DMMP request-queue vs. BiO
  2024-11-15 20:28             ` John Meneghini
@ 2024-11-18 13:05               ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2024-11-18 13:05 UTC (permalink / raw)
  To: John Meneghini
  Cc: Christoph Hellwig, Mikulas Patocka, linux-block, dm-devel,
	linux-scsi, Chris Leech, Hannes Reinecke, snitzer, Ming Lei,
	Benjamin Marzinski, Jonathan Brassow, Ewan Milne, bmarson,
	Jeff Moyer, spetrovi@redhat.com, Rob Evers

On Fri, Nov 15, 2024 at 03:28:03PM -0500, John Meneghini wrote:
>> And, as pointed out in the private mail that John forwarded to the list
>> without my permission if we really have a workload that cares md could
>
> Ah come on. I deleted most of the private thread....

As a rule of thumb, forwarding private mail to a public list is never
acceptable without prior permission.  I'm not worried about any actual
information in this one, but it is still a breach of trust and privacy
expectations.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-11-18 13:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <2d5fe016-2941-43a4-8b7c-850b8ee1d6ce@redhat.com>
     [not found] ` <20241104073547.GA20614@lst.de>
     [not found]   ` <d9733713-eb7b-4efa-ad6b-e6b41d1df93b@suse.de>
     [not found]     ` <20241105103307.GA1385@lst.de>
2024-11-07 18:35       ` DMMP request-queue vs. BiO John Meneghini
2024-11-15 14:05         ` Mikulas Patocka
2024-11-15 17:09           ` Christoph Hellwig
2024-11-15 20:28             ` John Meneghini
2024-11-18 13:05               ` Christoph Hellwig
2024-11-15 20:24           ` John Meneghini
