public inbox for linux-kernel@vger.kernel.org
* XFS performance degradation during running cp command with big test file
From: Xiongwei Song @ 2024-10-16 11:09 UTC
  To: cem, djwong, linux-xfs; +Cc: linux-kernel

Dear Experts,

We are seeing a performance degradation on an XFS partition. We were
copying a big file (200GB ~ 250GB) to /dev/null, and 60s ~ 90s into the
cp run the read speed dropped suddenly. At the beginning, the read
speed was around 1080MB/s; after about 60s it fell to around 350MB/s.
This problem is only seen with XFS + Thick LUN.

The test environment:
Storage Model: Dell Unity XT 380, Thick/Thin LUN
Linux Version: 4.12.14

The steps to run the test:
1) Create an XFS partition with the following commands:
   parted -a opt /dev/sdb mklabel gpt mkpart sdb xfs 0% 100%
   mkfs.xfs /dev/sdbx
   mount /dev/sdbx /xfs
2) Create a ~200GB file named fileA in the partition.
3) Run the cp command to read the file created in step 2. Meanwhile,
   run iostat, vmstat and blktrace to capture logs.
   cp /xfs/fileA /dev/null

To narrow down this issue, we also ran the following experiments
for comparison:
1) Run the test with dd with XFS + Thick LUN
   dd if=/xfs/fileA of=/dev/null bs=32k status=progress
   Result: Performance degradation also occurs
   Speed: around 650MB/s at the start, dropping to around 350MB/s
   after 60s ~ 90s of the run.

2) Run the test with dd on the raw device (XFS + Thick LUN)
   dd if=/dev/sdbx of=/dev/null bs=32k status=progress
   Result: No performance degradation
   Speed: around 520MB/s

3) Run the test with cp with ext4 + Thick LUN
   cp /xfs/fileA /dev/null
   Result: No performance degradation
   Speed: around 1080MB/s

4) Run the test with cp with XFS + Thin LUN
   cp /xfs/fileA /dev/null
   Result: No performance degradation
   Speed: around 500MB/s

5) Run the test with dd with XFS + Thin LUN
   dd if=/xfs/fileA of=/dev/null bs=32k status=progress
   Result: No performance degradation
   Speed: around 500MB/s

It seems the issue can only be triggered with XFS + Thick LUN,
regardless of whether dd or cp is used to read the test file. Is
there something special about XFS in this test situation? Is this
a known issue?

Do you have any thoughts or suggestions? Also, do you need vmstat
or iostat logs or blktrace or any other logs to address this issue?

Thank you in advance.

Regards,
Bruce


* Re: XFS performance degradation during running cp command with big test file
From: Dave Chinner @ 2024-10-17  0:29 UTC
  To: Xiongwei Song; +Cc: cem, djwong, linux-xfs, linux-kernel

On Wed, Oct 16, 2024 at 07:09:29PM +0800, Xiongwei Song wrote:
> Dear Experts,
> 
> We are seeing a performance degradation on an XFS partition. We were
> copying a big file (200GB ~ 250GB) to /dev/null, and 60s ~ 90s into the
> cp run the read speed dropped suddenly. At the beginning, the read
> speed was around 1080MB/s; after about 60s it fell to around 350MB/s.
> This problem is only seen with XFS + Thick LUN.

There are so many potential things that this could be caused by.

> The test environment:
> Storage Model: Dell Unity XT 380, Thick/Thin LUN

How many CPUs, RAM, etc. does this have?  What disks and what is the
configuration of the fully provisioned LUN you are testing on?

> Linux Version: 4.12.14

You're running an ancient kernel, so the first thing to do is move
to a much more recent kernel (e.g. 6.11) and see if the same
behaviour occurs. If it does, then please answer all the other
questions I've asked and provide the information from running the
tests on the 6.11 kernel...

> The steps to run test:
> 1) Create an XFS partition with the following commands:
>    parted -a opt /dev/sdb mklabel gpt mkpart sdb xfs 0% 100%
>    mkfs.xfs /dev/sdbx
>    mount /dev/sdbx /xfs

What is the output of mkfs.xfs?

Did you drop the page cache between the initial file create and
the measured copy?

What is the layout of the file you are copying from (i.e. xfs_bmap
-vvp <file> output)?
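
A minimal sketch of how the three items above could be gathered,
assuming the /xfs mount point and the /xfs/fileA path from the original
report. The function names and the DRY_RUN switch are hypothetical
helpers, not part of any tool. xfs_info on the mounted filesystem should
report the same geometry that mkfs.xfs printed at creation time.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Collect filesystem geometry and file layout, then drop the page cache
# so that the measured read comes from storage rather than from memory.
# $1 = mount point, $2 = test file (both assumed from the original report).
collect_xfs_info() {
    mnt=$1
    file=$2
    run xfs_info "$mnt"              # filesystem geometry (AG count and size)
    run xfs_bmap -vvp "$file"        # extent layout, including the AG of each extent
    run sync                         # flush dirty data before dropping caches
    run sysctl -w vm.drop_caches=3   # drop page cache, dentries and inodes
}
```

Run as root, for example `collect_xfs_info /xfs /xfs/fileA`. With
DRY_RUN=1 set, the function only prints the commands it would run.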

> It seems the issue can only be triggered with XFS + Thick LUN,
> regardless of whether dd or cp is used to read the test file. Is
> there something special about XFS in this test situation? Is this
> a known issue?

It smells like the difference in bandwidth between the outside edge
and the inside edge of a spinning disk, and XFS is switching
allocation location of the very big file from the outside to the
inside part way through the file (e.g. because the initial AG the
file is located in is full)...
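
As a rough back-of-the-envelope check on where in the file such a switch
would fall, using only the figures from the report (about 1080MB/s for
the first 60s ~ 90s): the slowdown would begin roughly 65GB ~ 97GB into
the file. The helper name below is hypothetical, and the result only
becomes meaningful once it can be compared against the AG column of the
xfs_bmap output.

```shell
# Hypothetical helper: approximate offset (in MB) reached after reading
# at a steady rate for a given time. $1 = rate in MB/s, $2 = seconds.
offset_mb() {
    echo $(( $1 * $2 ))
}

offset_mb 1080 60   # lower bound of the reported window: 64800
offset_mb 1080 90   # upper bound of the reported window: 97200
```

If the extent map shows the file moving to a different AG at about that
offset, that would be consistent with the explanation above; if not,
the cause is more likely elsewhere.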

> Do you have any thoughts or suggestions? Also, do you need vmstat
> or iostat logs or blktrace or any other logs to address this issue?

iostat and vmstat output in 1s increments would be useful.
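
One way this capture could be scripted, as a sketch only: it assumes
iostat (from sysstat) and vmstat (from procps) are installed, and the
function name and log file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: sample iostat and vmstat once per second while a
# file is read to /dev/null. $1 = file to read, $2 = existing log directory.
capture_read_stats() {
    file=$1
    logdir=$2
    iostat -x -t 1 > "$logdir/iostat.log" &   # extended device stats, timestamped
    iostat_pid=$!
    vmstat -t 1 > "$logdir/vmstat.log" &      # memory and CPU stats, timestamped
    vmstat_pid=$!
    sleep 2                                   # record a short idle baseline first
    cp "$file" /dev/null
    # Stop the samplers; ignore errors if they have already exited.
    kill "$iostat_pid" "$vmstat_pid" 2>/dev/null || true
    wait "$iostat_pid" "$vmstat_pid" 2>/dev/null || true
}
```

For example, `capture_read_stats /xfs/fileA /tmp/logs` after dropping
the page cache, where /tmp/logs is a placeholder for any existing
directory. The timestamps make it possible to line up the point where
throughput drops in both logs.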

-Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: XFS performance degradation during running cp command with big test file
From: Xiongwei Song @ 2024-10-17  2:18 UTC
  To: Dave Chinner; +Cc: cem, djwong, linux-xfs, linux-kernel

Hi Dave,

Thank you so much for the response.

On Thu, Oct 17, 2024 at 8:29 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Wed, Oct 16, 2024 at 07:09:29PM +0800, Xiongwei Song wrote:
> > Dear Experts,
> >
> > We are seeing a performance degradation on an XFS partition. We were
> > copying a big file (200GB ~ 250GB) to /dev/null, and 60s ~ 90s into the
> > cp run the read speed dropped suddenly. At the beginning, the read
> > speed was around 1080MB/s; after about 60s it fell to around 350MB/s.
> > This problem is only seen with XFS + Thick LUN.
>
> There are so many potential things that this could be caused by.
>
> > The test environment:
> > Storage Model: Dell Unity XT 380, Thick/Thin LUN
>
> How many CPUs, RAM, etc. does this have?  What disks and what is the
> configuration of the fully provisioned LUN you are testing on?
>
> > Linux Version: 4.12.14
>
> You're running an ancient kernel, so the first thing to do is move
> to a much more recent kernel (e.g. 6.11) and see if the same
> behaviour occurs. If it does, then please answer all the other
> questions I've asked and provide the information from running the
> tests on the 6.11 kernel...
OK, sure. I will try to upgrade the kernel and run the test again.
However, I don't own the test hardware, and this issue can't be
reproduced on just any machine, so I may not be able to reply quickly.
In the worst case I will no longer have access to the hardware. Once I
have the test results, I will get back to you and answer all your
questions as soon as possible.

Thank you again.

Regards,
Bruce

>
> > The steps to run test:
> > 1) Create an XFS partition with the following commands:
> >    parted -a opt /dev/sdb mklabel gpt mkpart sdb xfs 0% 100%
> >    mkfs.xfs /dev/sdbx
> >    mount /dev/sdbx /xfs
>
> What is the output of mkfs.xfs?
>
> Did you drop the page cache between the initial file create and
> the measured copy?
>
> What is the layout of the file you are copying from (i.e. xfs_bmap
> -vvp <file> output)?
>
> > It seems the issue can only be triggered with XFS + Thick LUN,
> > regardless of whether dd or cp is used to read the test file. Is
> > there something special about XFS in this test situation? Is this
> > a known issue?
>
> It smells like the difference in bandwidth between the outside edge
> and the inside edge of a spinning disk, and XFS is switching
> allocation location of the very big file from the outside to the
> inside part way through the file (e.g. because the initial AG the
> file is located in is full)...
>
> > Do you have any thoughts or suggestions? Also, do you need vmstat
> > or iostat logs or blktrace or any other logs to address this issue?
>
> iostat and vmstat output in 1s increments would be useful.
>
> -Dave.
> --
> Dave Chinner
> david@fromorbit.com

