* on-off periodic hangs in scsi commands
@ 2011-01-05 18:56 Spelic
From: Spelic @ 2011-01-05 18:56 UTC (permalink / raw)
To: linux-scsi@vger.kernel.org; +Cc: open-iscsi, scst-devel
Hello
I am on kernel 2.6.36.2 (but I think I saw this problem also a long time
ago on 2.6.31).
I am having a problem of greatly reduced throughput to SCSI devices.
Commands run for about 0.3 seconds and then hang for about 1.7 seconds,
then run again for 0.3 seconds, then hang again for 1.7 seconds, and so
on.
This is very recognizable by looking at iostat -x 1.
During the 1.7 seconds of freeze (with iostat you actually see 1 second
of freeze) you can see a few outstanding writes. I saw either outstanding
reads+writes or outstanding writes alone, but I think I never saw
outstanding reads alone. For these writes, completion is returned only
after the pause (or so blktrace says). I am even wondering whether this
could be an interrupt serviced late...?
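One way to correlate the on/off periods with the wall clock is to capture
iostat with per-report timestamps, for example like this (sdc here is
simply the device that happens to be busy in my traces below):
-----------------------------------------------------------------
# -t makes sysstat's iostat print the time for each report, so the
# ~1.7 s "off" windows can be matched against the wall clock
iostat -x -t 1 | grep --line-buffered -E '^[0-9]|^sdc' > iostat-sdc.log
-----------------------------------------------------------------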
I have seen this on two very different occasions: on 3ware with real
disks, and now on iSCSI against a ramdisk.
The first case was with my 3ware 9650SE controller used as a plain SATA
controller with MD on top of it: the problem happens with MD RAID 5 doing
e.g. 4k O_DIRECT I/O on it. You can read a detailed description on the
linux-raid mailing list:
http://marc.info/?l=linux-raid&m=129377016700391&w=2
http://marc.info/?l=linux-raid&m=129377380202417&w=2
http://marc.info/?l=linux-raid&m=129422832802809&w=2
The last post is the most relevant; start from that one.
Initially I thought it was a bug in the 3ware controller, which is famous
for not being able to handle high IOPS, but then it is strange that the
problem does not happen on RAID 10, which pushes a much higher throughput
through the same controller and the same disks. The workloads are
comparable because RAID 5 in this case does not read (see the last link),
yet the hang definitely happens only with RAID 5 and never with RAID 10;
I tried many times.
Now I am seeing the same thing with open-iscsi over IPoIB against SCST
backed by a ramdisk on the other side. So it is a network path, a
completely different beast, no disks involved (it is a Linux ramdisk),
and completely different hardware for both the initiator and the target
machine. The only thing these two iSCSI machines have in common with the
storage server (the one with the 3ware) is the type of CPU: dual Xeon
5420. The mainboards also differ: Supermicro for both computers in this
case, Tyan for the storage server.
Here is an excerpt of iostat -x 1 for the iSCSI case, initiator side:
-----------------------------------------------------------------
avg-cpu:  %user  %nice %system %iowait %steal  %idle
           0.00   0.00    0.00  100.00   0.00   0.00

Device:  rrqm/s   wrqm/s     r/s     w/s   rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdb        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdc        0.00     0.00    0.00    0.00     0.00      0.00     0.00   184.00   0.00   0.00 100.00

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           0.59   0.00   10.95   67.40   0.00  21.05

Device:  rrqm/s   wrqm/s     r/s     w/s   rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdb        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdc       30.00 26676.00  934.00 5773.00 18912.00 258200.00    41.32   153.98  51.05   0.15  98.00

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           0.00   0.00    0.15   83.89   0.00  15.96

Device:  rrqm/s   wrqm/s     r/s     w/s   rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdb        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdc        0.00     0.00    0.00    0.00     0.00      0.00     0.00   172.00   0.00   0.00 100.00

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           0.78   0.00   16.96   51.33   0.00  30.93

Device:  rrqm/s   wrqm/s     r/s     w/s   rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdb        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdc       62.00 22410.00 1773.00 6236.00 35696.00 235392.00    33.85   156.14  42.12   0.12 100.00

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           0.00   0.00    0.00   80.00   0.00  20.00

Device:  rrqm/s   wrqm/s     r/s     w/s   rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdb        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdc        0.00     0.00    0.00    0.00     0.00      0.00     0.00   138.00   0.00   0.00 100.00

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           0.19   0.00   15.03   61.41   0.00  23.38

Device:  rrqm/s   wrqm/s     r/s     w/s   rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdb        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdc       19.00 21617.00  806.00 5303.00 13464.00 221096.00    38.40   136.32  44.53   0.16  99.00

avg-cpu:  %user  %nice %system %iowait %steal  %idle
           0.00   0.00    0.00   16.96   0.00  83.04

Device:  rrqm/s   wrqm/s     r/s     w/s   rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdb        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdc        0.00     0.00    0.00    0.00     0.00      0.00     0.00   152.00   0.00   0.00 100.00
-----------------------------------------------------------------
...off-on-off-on....
See what I mean?
"Top" says that all cpus are idle in the 1 second "off" period,
synchronously on both the initiator side and the target side. I can also
say that the IPoIB network is practically stuck during the transfer
bursts (pings between the two machines do not get through), while it is
totally idle during the still period and pings do get through. At first I
thought the network could be causing this, but TCP retransmissions do not
wait exactly 1 second; the synchronization is too perfect to be a TCP
problem. I have never seen a TCP transfer go full speed for 1 second,
then 0 bytes/sec for another second, then full speed again for 1
second... that is not how TCP works.
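If anyone wants to reproduce the ping check, something like this is
enough (-D just prefixes each reply with a kernel timestamp; the address
is of course specific to my IPoIB setup):
-------------------------------------------------
# 5 pings per second with timestamps; during the transfer bursts the
# icmp_seq numbers stall and then arrive in a burst when the link idles
ping -D -i 0.2 192.168.1.2 | tee ping-ipoib.log
-------------------------------------------------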
Here is an iostat -x 1 with the real disks on the 3ware controller during:
dd if=/dev/zero of=/dev/mapper/datavg1-lvol0 bs=4K oflag=direct
(it is the same with oflag=direct,sync)
Note: I am keeping queue_depth at 1 with these controllers: they don't
behave well with NCQ.
Note 2: the I/O does not pass through a filesystem.
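(Re the queue_depth note above: the depth is pinned through the standard
SCSI sysfs knob; the device names below are just the members of this
particular array on my box:)
-------------------------------------------------
# force queue_depth=1 on the array members and verify it stuck
for d in sdk sdl; do
    echo 1 > /sys/block/$d/device/queue_depth
    cat /sys/block/$d/device/queue_depth
done
-------------------------------------------------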
-------------------------------------------------
sdi        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdj        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdl        0.00     0.00    0.00    0.00     0.00      0.00     0.00     1.00   0.00   0.00 100.00  (*)
sdm        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdk        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdn        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
...[CUT]...
sdj        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdl        0.00     0.00    0.00  448.00     0.00   3584.00     8.00     0.99   4.42   2.21  99.00
sdm        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdk        0.00     0.00    0.00  448.00     0.00   3584.00     8.00     0.03   0.07   0.07   3.00
sdn        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
...[CUT]...
sdj        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdl        0.00     0.00    0.00    0.00     0.00      0.00     0.00     1.00   0.00   0.00 100.00  (*)
sdm        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdk        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdn        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
...[CUT]...
sdj        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdl        0.00     0.00    0.00  384.00     0.00   3072.00     8.00     0.98   5.13   2.55  98.00
sdm        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdk        0.00     0.00    0.00  384.00     0.00   3072.00     8.00     0.00   0.00   0.00   0.00
sdn        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
...[CUT]...
sdj        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdl        0.00     0.00    0.00    0.00     0.00      0.00     0.00     1.00   0.00   0.00 100.00  (*)
sdm        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdk        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
sdn        0.00     0.00    0.00    0.00     0.00      0.00     0.00     0.00   0.00   0.00   0.00
-------------------------------------------------
(*) <--- see the write stuck in these places... the dd write is
synchronous, so nothing else goes on until it has returned, and it does
not return for a whole second (actually 1.7 seconds).
Now I can show you a blktrace for the real-disks-on-3ware case.
Note: this is where I took the 0.3 / 1.7 second figures from. For the
iSCSI case I am not sure of the exact timing, but it also cycles roughly
every 2 seconds. The iSCSI case drifts slowly, so after a few minutes it
is no longer in sync with iostat; then, after a few more minutes, it is
in sync with iostat -x 1 again, and I could take some clear shots of the
problem like the iostat output I pasted above.
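(Something along these lines produces a trace like the one below; 8,240
is /dev/sdp on my box, so adjust the device name for yours:)
--------------------------------------------------
# trace one raw member of the MD raid5 array and decode it on the fly
blktrace -d /dev/sdp -o - | blkparse -i - > raid5-member.blktrace.txt
--------------------------------------------------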
--------------------------------------------------
8,240 2 1119 0.097304762 5319 I W [kworker/u:1]
8,240 2 1120 0.097305511 5319 C W 0 [0]
8,240 1 3333 0.097350168 1557 A W 859034104 + 8 <- (8,241) 859032056
8,240 1 3334 0.097350336 1557 Q W 859034104 + 8 [md11_raid5]
8,240 1 3335 0.097351193 1557 G W 859034104 + 8 [md11_raid5]
8,240 1 3336 0.097352168 1557 P N [md11_raid5]
8,240 1 3337 0.097352469 1557 I W 859034104 + 8 [md11_raid5]
8,240 1 3338 0.097359991 1557 U N [md11_raid5] 1
8,240 1 3339 0.097360743 1557 D W 859034104 + 8 [md11_raid5]
8,240 1 3340 0.097363489 1557 U N [md11_raid5] 1
8,240 1 3341 0.097464999 0 C W 859034104 + 8 [0]
8,240 1 3342 0.097494707 1557 U N [md11_raid5] 0
8,240 0 1116 0.097508456 5319 Q WBS [kworker/u:1]
8,240 0 1117 0.097509015 5319 G WBS [kworker/u:1]
8,240 0 1118 0.097509674 5319 P N [kworker/u:1]
8,240 0 1119 0.097509992 5319 I W [kworker/u:1]
8,240 0 1120 0.097510835 5319 C W 0 [0]
8,240 1 3343 0.097555058 1557 A W 859034112 + 8 <- (8,241) 859032064
8,240 1 3344 0.097555236 1557 Q W 859034112 + 8 [md11_raid5]
8,240 1 3345 0.097556090 1557 G W 859034112 + 8 [md11_raid5]
8,240 1 3346 0.097557062 1557 P N [md11_raid5]
8,240 1 3347 0.097557386 1557 I W 859034112 + 8 [md11_raid5]
8,240 1 3348 0.097564103 1557 U N [md11_raid5] 1
8,240 1 3349 0.097564957 1557 D W 859034112 + 8 [md11_raid5]
8,240 1 3350 0.097567754 1557 U N [md11_raid5] 1
8,240 1 3351 0.302256343 1557 U N [md11_raid5] 1
8,240 3 1117 1.999682436 0 C W 859034112 + 8 [0]
<--------- write completes only here after 1.7 secs
8,240 2 1121 1.999737611 5319 Q WBS [kworker/u:1]
8,240 2 1122 1.999738330 5319 G WBS [kworker/u:1]
8,240 2 1123 1.999738907 5319 P N [kworker/u:1]
8,240 2 1124 1.999739262 5319 I W [kworker/u:1]
8,240 2 1125 1.999740107 5319 C W 0 [0]
8,240 3 1118 1.999792262 1557 A W 859034120 + 8 <- (8,241) 859032072
8,240 3 1119 1.999792413 1557 Q W 859034120 + 8 [md11_raid5]
8,240 3 1120 1.999793213 1557 G W 859034120 + 8 [md11_raid5]
8,240 3 1121 1.999794118 1557 P N [md11_raid5]
8,240 3 1122 1.999794422 1557 I W 859034120 + 8 [md11_raid5]
8,240 3 1123 1.999801199 1557 U N [md11_raid5] 1
8,240 3 1124 1.999801884 1557 D W 859034120 + 8 [md11_raid5]
8,240 3 1125 1.999804152 1557 U N [md11_raid5] 1
8,240 1 3352 1.999972330 0 C W 859034120 + 8 [0]
8,240 0 1121 2.000030768 5319 Q WBS [kworker/u:1]
8,240 0 1122 2.000031333 5319 G WBS [kworker/u:1]
...[CUT]...
8,240 2 2236 2.097562839 5319 Q WBS [kworker/u:1]
8,240 2 2237 2.097563398 5319 G WBS [kworker/u:1]
8,240 2 2238 2.097564060 5319 P N [kworker/u:1]
8,240 2 2239 2.097564381 5319 I W [kworker/u:1]
8,240 2 2240 2.097565242 5319 C W 0 [0]
8,240 1 6656 2.097613839 1557 A W 859037688 + 8 <- (8,241) 859035640
8,240 1 6657 2.097613995 1557 Q W 859037688 + 8 [md11_raid5]
8,240 1 6658 2.097614822 1557 G W 859037688 + 8 [md11_raid5]
8,240 1 6659 2.097615667 1557 P N [md11_raid5]
8,240 1 6660 2.097615929 1557 I W 859037688 + 8 [md11_raid5]
8,240 1 6661 2.097622654 1557 U N [md11_raid5] 1
8,240 1 6662 2.097623343 1557 D W 859037688 + 8 [md11_raid5]
8,240 1 6663 2.097625494 1557 U N [md11_raid5] 1
8,240 1 6664 2.097729515 0 C W 859037688 + 8 [0]
8,240 1 6665 2.097758720 1557 U N [md11_raid5] 0
8,240 0 2236 2.097772162 5319 Q WBS [kworker/u:1]
8,240 0 2237 2.097772725 5319 G WBS [kworker/u:1]
8,240 0 2238 2.097773386 5319 P N [kworker/u:1]
8,240 0 2239 2.097773705 5319 I W [kworker/u:1]
8,240 0 2240 2.097774617 5319 C W 0 [0]
8,240 1 6666 2.097823090 1557 A W 859037696 + 8 <- (8,241) 859035648
8,240 1 6667 2.097823331 1557 Q W 859037696 + 8 [md11_raid5]
8,240 1 6668 2.097824152 1557 G W 859037696 + 8 [md11_raid5]
8,240 1 6669 2.097824997 1557 P N [md11_raid5]
8,240 1 6670 2.097825247 1557 I W 859037696 + 8 [md11_raid5]
8,240 1 6671 2.097831340 1557 U N [md11_raid5] 1
8,240 1 6672 2.097832048 1557 D W 859037696 + 8 [md11_raid5]
8,240 1 6673 2.097834622 1557 U N [md11_raid5] 1
8,240 1 6674 2.302260290 1557 U N [md11_raid5] 1
8,240 3 2234 4.000203670 0 C W 859037696 + 8 [0]
<--------- write completes here
8,240 2 2241 4.000256295 5319 Q WBS [kworker/u:1]
8,240 2 2242 4.000256890 5319 G WBS [kworker/u:1]
8,240 2 2243 4.000257477 5319 P N [kworker/u:1]
8,240 2 2244 4.000257832 5319 I W [kworker/u:1]
8,240 2 2245 4.000258659 5319 C W 0 [0]
8,240 3 2235 4.000311545 1557 A W 859037704 + 8 <- (8,241) 859035656
8,240 3 2236 4.000311677 1557 Q W 859037704 + 8 [md11_raid5]
8,240 3 2237 4.000312973 1557 G W 859037704 + 8 [md11_raid5]
8,240 3 2238 4.000313864 1557 P N [md11_raid5]
8,240 3 2239 4.000314120 1557 I W 859037704 + 8 [md11_raid5]
8,240 3 2240 4.000321038 1557 U N [md11_raid5] 1
8,240 3 2241 4.000321735 1557 D W 859037704 + 8 [md11_raid5]
8,240 3 2242 4.000323994 1557 U N [md11_raid5] 1
8,240 1 6675 4.000427558 0 C W 859037704 + 8 [0]
8,240 1 6676 4.000461787 1557 U N [md11_raid5] 0
8,240 0 2241 4.000476233 5319 Q WBS [kworker/u:1]
8,240 0 2242 4.000476793 5319 G WBS [kworker/u:1]
...[CUT]...
8,240 2 3516 4.111962898 5319 Q WBS [kworker/u:1]
8,240 2 3517 4.111963472 5319 G WBS [kworker/u:1]
8,240 2 3518 4.111964044 5319 P N [kworker/u:1]
8,240 2 3519 4.111964378 5319 I W [kworker/u:1]
8,240 2 3520 4.111965295 5319 C W 0 [0]
8,240 3 3804 4.112013696 1557 A W 859033592 + 8 <- (8,241) 859031544
8,240 3 3805 4.112013841 1557 Q W 859033592 + 8 [md11_raid5]
8,240 3 3806 4.112014653 1557 G W 859033592 + 8 [md11_raid5]
8,240 3 3807 4.112015874 1557 P N [md11_raid5]
8,240 3 3808 4.112016166 1557 I W 859033592 + 8 [md11_raid5]
8,240 3 3809 4.112024245 1557 U N [md11_raid5] 1
8,240 3 3810 4.112025141 1557 D W 859033592 + 8 [md11_raid5]
8,240 3 3811 4.112027938 1557 U N [md11_raid5] 1
8,240 1 10164 4.112131048 6493 C W 859033592 + 8 [0]
8,240 3 3812 4.112163132 1557 U N [md11_raid5] 0
8,240 0 3516 4.112174324 5319 Q WBS [kworker/u:1]
8,240 0 3517 4.112174914 5319 G WBS [kworker/u:1]
8,240 0 3518 4.112175476 5319 P N [kworker/u:1]
8,240 0 3519 4.112175795 5319 I W [kworker/u:1]
8,240 0 3520 4.112176739 5319 C W 0 [0]
8,240 3 3813 4.112221841 1557 A W 859033600 + 8 <- (8,241) 859031552
8,240 3 3814 4.112221952 1557 Q W 859033600 + 8 [md11_raid5]
8,240 3 3815 4.112222762 1557 G W 859033600 + 8 [md11_raid5]
8,240 3 3816 4.112223586 1557 P N [md11_raid5]
8,240 3 3817 4.112223901 1557 I W 859033600 + 8 [md11_raid5]
8,240 3 3818 4.112229650 1557 U N [md11_raid5] 1
8,240 3 3819 4.112230275 1557 D W 859033600 + 8 [md11_raid5]
8,240 3 3820 4.112232817 1557 U N [md11_raid5] 1
8,240 3 3821 4.312255629 1557 U N [md11_raid5] 1
8,240 3 3822 6.001904854 0 C W 859033600 + 8 [0]
<--------- write completes here
8,240 3 3823 6.001944705 1557 U N [md11_raid5] 0
8,240 2 3521 6.001961244 5319 Q WBS [kworker/u:1]
8,240 2 3522 6.001961855 5319 G WBS [kworker/u:1]
8,240 2 3523 6.001962360 5319 P N [kworker/u:1]
8,240 2 3524 6.001962718 5319 I W [kworker/u:1]
8,240 2 3525 6.001963488 5319 C W 0 [0]
8,240 3 3824 6.002019367 1557 A W 859033608 + 8 <- (8,241) 859031560
8,240 3 3825 6.002019532 1557 Q W 859033608 + 8 [md11_raid5]
8,240 3 3826 6.002020332 1557 G W 859033608 + 8 [md11_raid5]
8,240 3 3827 6.002021283 1557 P N [md11_raid5]
8,240 3 3828 6.002021547 1557 I W 859033608 + 8 [md11_raid5]
8,240 3 3829 6.002028883 1557 U N [md11_raid5] 1
8,240 3 3830 6.002029497 1557 D W 859033608 + 8 [md11_raid5]
8,240 3 3831 6.002032017 1557 U N [md11_raid5] 1
8,240 1 10165 6.002174214 0 C W 859033608 + 8 [0]
8,240 0 3521 6.002243288 5319 Q WBS [kworker/u:1]
8,240 0 3522 6.002248444 5319 G WBS [kworker/u:1]
--------------------------------------------------
You see how regular that is? It hangs at exactly .30--.31 every time and
resumes 1.7 seconds later at .00, +/- about 500us!
Maybe it drifts very slowly, something like 500us per cycle, but it is
almost perfectly periodic...
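(To put a number on the gaps, the dispatch-to-completion latency per
sector can be pulled straight out of the blkparse text above; this
assumes the default column layout shown there, with the action in field
6, the RWBS flags in field 7 and the sector in field 8:)
--------------------------------------------------
# remember the dispatch (D) time of every write sector and print the
# delay until its completion (C); the long poles should all be ~1.7 s
awk '$6 == "D" && $7 == "W" { d[$8] = $4 }
     $6 == "C" && $7 == "W" && ($8 in d) {
         printf "%s %.6f\n", $8, $4 - d[$8]; delete d[$8]
     }' raid5-member.blktrace.txt | sort -k2 -rn | head
--------------------------------------------------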
Have you seen anything similar?
As I said, I think I saw this bug with XFS two years ago on
2.6.31-rc-something with an MD RAID 6 on the same 3ware.
So if it is a kernel bug, it has been there for a long time.
What do you think?
I am cc-ing the open-iscsi and SCST people to see if they can suggest a
way to debug the iSCSI case. In particular it would be very interesting
to get the number of commands received and replied to on the target side
and match it against the number of commands submitted from the initiator
side (though I suspect these counters do not exist). If the target side
says it has already replied, but the initiator side only sees the reply
after 1 second... that would be very meaningful. The bad thing is that I
do not have much time to debug this :-(
Thank you
* Re: on-off periodic hangs in scsi commands
@ 2011-01-06 19:25 ` Vladislav Bolkhovitin
From: Vladislav Bolkhovitin @ 2011-01-06 19:25 UTC (permalink / raw)
To: Spelic
Cc: linux-scsi@vger.kernel.org, open-iscsi, scst-devel
Spelic, on 01/05/2011 09:56 PM wrote:
> I am cc-ing the open-iscsi and SCST people to see if they can suggest a
> way to debug the iSCSI case. In particular it would be very interesting
> to get the number of commands received and replied to on the target side
> and match it against the number of commands submitted from the initiator
> side (though I suspect these counters do not exist). If the target side
> says it has already replied, but the initiator side only sees the reply
> after 1 second... that would be very meaningful. The bad thing is that I
> do not have much time to debug this :-(
For SCST you can enable "scsi" logging. Then you will see all the
incoming SCSI commands and the responses to them in the kernel log. To
deal with the large volume of logs this facility can produce, you will
need to switch your logging application (syslogd?) into async mode (no
fsync after each logged line).
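Roughly like this (the exact path depends on how your SCST build was
configured -- newer builds expose trace_level in sysfs, older proc-based
ones under /proc/scsi_tgt -- and the syslog file name is only an example):
-------------------------------------------------
# enable SCSI command/response logging in SCST
echo "add scsi" > /sys/kernel/scst_tgt/trace_level    # sysfs builds
# echo "add scsi" > /proc/scsi_tgt/trace_level        # older proc builds

# async logging: a leading "-" before the file name in syslog.conf tells
# syslogd/rsyslogd not to fsync after every line, e.g.:
#   kern.*    -/var/log/kern.log
-------------------------------------------------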
Vlad
* Re: on-off periodic hangs in scsi commands
@ 2011-01-06 21:06 ` Spelic
From: Spelic @ 2011-01-06 21:06 UTC (permalink / raw)
To: Spelic; +Cc: linux-scsi@vger.kernel.org, open-iscsi, scst-devel
On 01/05/2011 07:56 PM, Spelic wrote:
> Hello
> I am on kernel 2.6.36.2 (but I think I saw this problem also a long
> time ago on 2.6.31).
> I am having a problem of greatly reduced throughput to SCSI devices.
>
> ...with 3ware...
> ...and now with iscsi...
Please disregard the whole iSCSI thing, iSCSI works perfectly!
I forgot I had a firewall test running simultaneously, I'm just
stupid... let's not even talk about it :-(
The 3ware thing is real though, but I guess it is some kind of 3ware bug.
It is difficult to say for sure until I can get my hands on another
controller.
The strange thing is that only one MD array has the problem, even though
there are other similar MD RAID 5 arrays on the same controller and on
other identical controllers.
It can't be a damaged disk, because writes get stuck cyclically on all 6
drives of that array.
It can't be the cables either: each cable serves 4 drives, and 2 of the
drives sit together with 2 drives of another MD array which does not show
the problem.
Is there any way to tell Linux to poll a PCI device even if it has not
received an interrupt from it?
If I could poll the controller every 0.2 seconds, say, I could test
whether the problem goes away.
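In the meantime I can at least watch whether the controller's interrupt
counter keeps incrementing during the stuck 1.7 seconds (the grep pattern
assumes the 3w-9xxx driver labels its IRQ line that way in
/proc/interrupts):
-------------------------------------------------
# log the 3ware IRQ counters 5 times a second; if the counter freezes
# exactly while the write is stuck, the completion interrupt really is
# arriving late (or not at all)
while sleep 0.2; do
    date +%T.%N
    grep -i 3w /proc/interrupts
done > 3ware-irq.log
-------------------------------------------------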
Thank you