public inbox for linux-kernel@vger.kernel.org
* sequential I/O on SSD disk varies from 20 to 300 MBytes/s every week
@ 2015-02-02 13:07 Siim Vahtre
  2015-02-02 13:49 ` Suman Tripathi
  2015-02-25 21:50 ` Pavel Machek
  0 siblings, 2 replies; 5+ messages in thread
From: Siim Vahtre @ 2015-02-02 13:07 UTC
  To: linux-kernel; +Cc: linux-scsi, siim

Hello,

I have an extremely odd situation where the I/O speed of both SATA and
SSD disks changes every few days or weeks for no apparent reason.

The servers have a clean base install with nothing but SSH running, and
the test I am doing is the following:

# dd if=/dev/zero of=/dev/sda4 bs=1M count=10240 conv=fsync

And the results are:
1) 3.5Mbytes/s - 120Mbytes/s for SATA disks
2) 20Mbytes/s - 300Mbytes/s for SSD disks
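
Because the speed shifts over days rather than hours, one way to track
it is to repeat the same dd test on a schedule and log a timestamped
figure. What follows is only a sketch: the function names parse_dd_rate
and log_dd_run, the log path, and the added oflag=direct (to bypass the
page cache) are assumptions and not part of the test above. Like the
original command, it overwrites the target partition.

```shell
#!/bin/sh
# Pull the throughput field (e.g. "306 MB/s") out of dd's summary line.
parse_dd_rate() {
    awk -F', ' '/copied/ { print $NF }'
}

# Run the 10 GiB write test once and append "timestamp device rate" to a log.
# WARNING: destroys all data on the target device, as the original test does.
log_dd_run() {
    dev="$1"    # e.g. /dev/sda4
    log="$2"    # e.g. /var/log/dd-rate.log (hypothetical path)
    rate=$(LC_ALL=C dd if=/dev/zero of="$dev" bs=1M count=10240 \
               oflag=direct conv=fsync 2>&1 | parse_dd_rate)
    printf '%s %s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$dev" "$rate" >> "$log"
}

# Usage (e.g. from cron): log_dd_run /dev/sda4 /var/log/dd-rate.log
```

A log like this makes it possible to line up the moment a disk turns
slow with reboots, moves between rooms, or other events.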


Note that:

1) For every disk, the speed (either slow or fast) is usually consistent
for 2-14 days, and then it randomly changes.

2) The speed of one disk does not correlate with the speeds of other
disks in the same server (one can be at 100 MBytes/s while another is at
10 MBytes/s), and a month later it might be vice versa.

3) I have not yet discovered anything that triggers the change of speed.
Seemingly it is just random: in week one the speed is ~70-80 MBytes/s,
then in week two it drops to 20 MBytes/s, and a few days later it goes
to 90 MBytes/s. But the speed (slow or fast) is consistent for a longer
period of time; it does not usually change in a matter of hours.

4) Speed is slow for reads as well, but the difference is a bit less 
dramatic. (eg. 400Mbytes/s vs 500Mbytes/s).

5) The random I/O speed also changes, but sequential I/O is easier to
test.


During the testing period of about 5 months I have concluded:

1) There are 3 identical Fujitsu RX200 S6 test servers which all show the
same problem, but I also reproduced it on some Sun Fire and Dell servers.

2) The problem happens both with HW RAID (MegaRAID SAS 2108) and when
the disks are attached directly to the integrated SATA controller.

3) The problem happens with different Kernel versions (tried 3.14, 3.16, 
3.18)

4) The problem happens with both the newest and older FW/BIOS versions.

5) I have checked/replaced the cabling.

6) It is not a caching issue (controller/disk caches were off during
testing, and even enabling them had only a minor impact on the results).

7) The problem happens with both 2.5" SATA disks (12 x HGST Travelstar
1TB, 3 x WD Black 750G) and SSD disks (3 x Samsung 840 Pro).

8) I have NOT been able to reproduce it on Windows - the speeds have been 
good for all disks at all times.

9) Moving the disks (e.g. taking a currently slow disk and putting it
into another server) has mixed results: it usually triggers some change
of speed (slow becomes fast or vice versa), but not always.


The only thing that somewhat correlates with the change of speed is the
environment: the I/O speed of the disks is generally better when testing
in the office than when that exact same server is in the server room. It
might just have been luck, however.

I did not find correlation with the uptime, restarts, change of 
temperature, etc, so I assumed it might be the vibrations/rotations for 
SATA disks, but now that I have reproduced it with expensive SSD disks as 
well, I am out of ideas.

Only 20 MBytes/s on an SSD must be wrong, right? (Especially if a week
earlier or a week later it is ~300 MBytes/s.)

Any comments would be highly appreciated.

--
Siim Vahtre


* Re: sequential I/O on SSD disk varies from 20 to 300 MBytes/s every week
  2015-02-02 13:07 sequential I/O on SSD disk varies from 20 to 300 MBytes/s every week Siim Vahtre
@ 2015-02-02 13:49 ` Suman Tripathi
  2015-02-02 14:08   ` Siim Vahtre
  2015-02-25 21:50 ` Pavel Machek
  1 sibling, 1 reply; 5+ messages in thread
From: Suman Tripathi @ 2015-02-02 13:49 UTC
  To: Siim Vahtre; +Cc: Linux Kernel Mailing List, Linux SCSI List

Hi,

My previous reply failed to deliver because plain-text mode was not
enabled.

Please check whether the write cache is enabled on the drive. This can
be checked from the boot logs. Also check whether NCQ is enabled.

You can enable the write cache with:

hdparm -W 1 /dev/<sdX>
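
Both settings can also be read back at runtime rather than from the boot
logs. The sketch below is an illustration under assumptions: ncq_state
is a hypothetical helper name, and it assumes the disk is exposed
through libata, where a queue_depth of 1 in sysfs means native command
queueing is effectively off. Behind a RAID controller the value may
describe the controller's queue instead.

```shell
#!/bin/sh
# Describe NCQ state from the queue depth exported in sysfs.
# A depth of 1 means commands are issued one at a time (NCQ off on libata).
ncq_state() {
    disk="$1"              # e.g. sda
    sysfs="${2:-/sys}"     # second argument only overrides the sysfs root
    depth=$(cat "$sysfs/block/$disk/device/queue_depth") || return 1
    if [ "$depth" -gt 1 ]; then
        echo "$disk: queue_depth=$depth (queueing on)"
    else
        echo "$disk: queue_depth=$depth (queueing off)"
    fi
}

# Usage on a real system (hdparm -W with no value only reports the setting):
#   hdparm -W /dev/sda
#   ncq_state sda
```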

-- 
Thanks,
with regards,
Suman Tripathi


* Re: sequential I/O on SSD disk varies from 20 to 300 MBytes/s every week
  2015-02-02 13:49 ` Suman Tripathi
@ 2015-02-02 14:08   ` Siim Vahtre
       [not found]     ` <CAN7X1UkducENcF_+GuY7rY3q+qAhAp9H1HWMAXnFFtfTjX7Zrw@mail.gmail.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Siim Vahtre @ 2015-02-02 14:08 UTC
  To: Suman Tripathi; +Cc: Linux Kernel Mailing List, Linux SCSI List

> Please check whether write cache is enabled in the drive
> also check whether NCQ is enabled or not.

About caches, please note the quote from my e-mail below. Enabling or
disabling NCQ had no effect either.

>> 6) It is not a caching issue (controller/disk caches were off during 
>> testing, but even putting them on had minor impact on the results)


* Re: sequential I/O on SSD disk varies from 20 to 300 MBytes/s every week
       [not found]     ` <CAN7X1UkducENcF_+GuY7rY3q+qAhAp9H1HWMAXnFFtfTjX7Zrw@mail.gmail.com>
@ 2015-02-04  9:50       ` Siim Vahtre
  0 siblings, 0 replies; 5+ messages in thread
From: Siim Vahtre @ 2015-02-04  9:50 UTC
  To: Purush Gupta; +Cc: Linux Kernel Mailing List, Linux SCSI List

> Since you mentioned problem does not happen on Windows, what do you use 
> to emulate the workload (iometer ?)

CrystalDiskMark, HD Tune, ATTO Disk Benchmark

I tested for 2-3 hours and got stable results. Directly afterwards I
booted a Linux Live CD and immediately got poor results.


> and did you see the same behaviour with other Kernel versions like 
> 2.6.32.

I did not test this, as this kernel version wouldn't work for me anyway.
If you think it would be a very important data point, I can try.


> You may want to check the per device queue configuration, lsscsi -L

"queue_depth=256", same for all disks
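
Beyond queue_depth, a few other per-device block-layer settings can be
compared between a currently slow and a currently fast disk. This is a
rough sketch, not something that was run in this thread: the function
name show_queue_settings is made up for illustration, and it only reads
the standard sysfs attributes queue_depth, nr_requests, scheduler and
rotational for each sd* device.

```shell
#!/bin/sh
# Print one line per sd* disk with block-layer settings that commonly
# affect throughput, so a slow and a fast disk can be compared side by side.
show_queue_settings() {
    sysfs="${1:-/sys}"     # argument only overrides the sysfs root
    for dev in "$sysfs"/block/sd*; do
        [ -d "$dev" ] || continue
        printf '%s' "$(basename "$dev")"
        for attr in device/queue_depth queue/nr_requests \
                    queue/scheduler queue/rotational; do
            # Missing attributes are reported as n/a instead of failing.
            val=$(cat "$dev/$attr" 2>/dev/null || echo "n/a")
            printf ' %s=%s' "${attr##*/}" "$val"
        done
        printf '\n'
    done
}

# Usage: show_queue_settings
```

If the output is identical for slow and fast disks, the queue
configuration is less likely to be the cause.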



* Re: sequential I/O on SSD disk varies from 20 to 300 MBytes/s every week
  2015-02-02 13:07 sequential I/O on SSD disk varies from 20 to 300 MBytes/s every week Siim Vahtre
  2015-02-02 13:49 ` Suman Tripathi
@ 2015-02-25 21:50 ` Pavel Machek
  1 sibling, 0 replies; 5+ messages in thread
From: Pavel Machek @ 2015-02-25 21:50 UTC
  To: Siim Vahtre; +Cc: linux-kernel, linux-scsi

Hi!

> During the testing period of about 5 months I have concluded:
> 
> 1) There are 3 identical Fujitsu RX200 S6 test servers which all show the
> same problem, but I also reproduced it on some Sun Fire and Dell server.
> 
> 2) The problem happens with both HW RAID (MegaRAID SAS 2108) and when disks
> were directly on integrated SATA card.
> 
> 3) The problem happens with different Kernel versions (tried 3.14, 3.16,
> 3.18)
> 
> 4) The problem happens with newest FW/BIOS versions and on older version
> 
> 5) I have checked/replaced the cabling.
> 
> 6) It is not a caching issue (controller/disk caches were off during
> testing, but even putting them on had minor impact on the results)
> 
> 7) The problem happens with both 2.5" SATA (12 x HGST Travelstar 1TB, 3 x WD
> Black 750G), and SSD disks (3 x Samsung Pro 840)
> 
> 8) I have NOT been able to reproduce it on Windows - the speeds have been
> good for all disks at all times.
> 
> 9) Changing the disks (eg. taking currently slow disk and putting it to
> another server) has mixed results - it usually triggers some change of speed
> (slow becomes fast or vice-versa) but not always.
> 
> 
> The only thing that somewhat correlates with the change of speed is the
> environment: the IO speed of disks is generally better when testing in the
> office vs if that exact same server is in the server room. It might just
> have been luck, however.

> I did not find correlation with the uptime, restarts, change of temperature,
> etc, so I assumed it might be the vibrations/rotations for SATA disks, but
> now that I have reproduced it with expensive SSD disks as well, I am out of
> ideas.


That's strange. Vibrations could matter for rotating disks, but not for
SSDs. Does hwmon say anything interesting? Anything in SMART?
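
One possible way to look at SMART over time is to save the attribute
table at intervals and diff a snapshot from a slow period against one
from a fast period. The sketch below assumes smartmontools is installed;
smart_snapshot, the output directory and the SMARTCTL override variable
are hypothetical names used only for illustration. Disks behind the
MegaRAID controller would probably need smartctl's -d megaraid,N device
type as well, and hwmon readings can be viewed with sensors from
lm-sensors.

```shell
#!/bin/sh
# Save the SMART attribute table of one disk to a timestamped file and
# print the file name, so snapshots from slow and fast periods can be diffed.
smart_snapshot() {
    disk="$1"      # e.g. /dev/sda
    outdir="$2"    # e.g. /var/tmp/smart (hypothetical directory)
    mkdir -p "$outdir" || return 1
    out="$outdir/$(basename "$disk")-$(date -u +%Y%m%dT%H%M%SZ).txt"
    # SMARTCTL is a hypothetical override hook; it defaults to smartctl.
    "${SMARTCTL:-smartctl}" -A "$disk" > "$out"
    echo "$out"
}

# Usage: smart_snapshot /dev/sda /var/tmp/smart
```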

> Only 20Mbytes/s on SSD must be wrong, right? (Especially if week earlier or
> week later it is ~300MBytes/s).

Yes.

Can you try the disks in a different mainboard (but keep the same
software version)?

Are there any other performance problems?
									Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
