* sata_sil24 memory fragmentation issues
@ 2010-09-02 17:02 Jonathan Haws
2010-09-03 0:22 ` Robert Hancock
0 siblings, 1 reply; 5+ messages in thread
From: Jonathan Haws @ 2010-09-02 17:02 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org
I am having some issues with the sata_sil24 driver. It appears that when memory gets fragmented enough, bad things start to happen. However, this only occurs when I am receiving large amounts of data over the network as well.
Here is my test setup: I am running an AMCC 405EX processor on their Kilauea development board. I have a PCIe SATA controller based on the 3531 single port chip (which uses the sata_sil24 driver). I have a program that simply dumps data out to disk. When I am running that program, I am also running ping -s 8500 <some-ip>.
Here is the output:
8508 bytes from 172.31.22.21: seq=137 ttl=128 time=1.306 ms
CNT: 129 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.17188 MB/s AVG RD: 0.00000 MB/s
8508 bytes from 172.31.22.21: seq=138 ttl=128 time=1.254 ms
CNT: 130 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.16279 MB/s AVG RD: 0.00000 MB/s
8508 bytes from 172.31.22.21: seq=139 ttl=128 time=1.291 ms
CNT: 131 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15385 MB/s AVG RD: 0.00000 MB/s
8508 bytes from 172.31.22.21: seq=140 ttl=128 time=1.254 ms
CNT: 132 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15267 MB/s AVG RD: 0.00000 MB/s
sata: page allocation failure. order:0, mode:0x22
Call Trace:
[ccad9a10] [c0006ef0] show_stack+0x44/0x16c (unreliable)
[ccad9a50] [c006f9f0] __alloc_pages_nodemask+0x38c/0x4f8
[ccad9ad0] [c01a7e3c] emac_poll_rx+0x5ac/0x768
[ccad9b10] [c01a28e4] mal_poll+0xa8/0x1ec
[ccad9b40] [c01d3eec] net_rx_action+0x9c/0x1b4
[ccad9b70] [c003b3c0] __do_softirq+0xc4/0x148
[ccad9bb0] [c0004d18] do_softirq+0x78/0x80
[ccad9bc0] [c003afac] irq_exit+0x64/0x7c
[ccad9bd0] [c0005210] do_IRQ+0x9c/0xb4
[ccad9bf0] [c000fa7c] ret_from_except+0x0/0x18
[ccad9cb0] [00000001] 0x1
[ccad9cd0] [c00c2a68] generic_write_end+0x24/0xe0
[ccad9d00] [c0069cc0] generic_file_buffered_write+0x18c/0x304
[ccad9d90] [c006a38c] __generic_file_aio_write_nolock+0x288/0x4fc
[ccad9e00] [c006a8e4] generic_file_aio_write+0x68/0xf8
[ccad9e30] [c009a9b4] do_sync_write+0xc4/0x138
[ccad9ef0] [c009b41c] vfs_write+0xb4/0x158
[ccad9f10] [c009ba4c] sys_write+0x4c/0x90
[ccad9f40] [c000f434] ret_from_syscall+0x0/0x3c
Mem-Info:
DMA per-cpu:
CPU 0: hi: 90, btch: 15 usd: 53
Active_anon:554 active_file:1753 inactive_anon:608
inactive_file:49910 unevictable:0 dirty:6020 writeback:0 unstable:0
free:192 slab:2573 mapped:445 pagetables:25 bounce:0
DMA free:768kB min:2036kB low:2544kB high:3052kB active_anon:2216kB inactive_anon:2432kB active_file:7012kB inactive_file:199640kB unevictable:0kB present:260096kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 2*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 768kB
51673 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
65536 pages RAM
1400 pages reserved
51949 pages shared
12273 pages non-shared
emac_alloc_rx_skb: allocating new page after consumption
I am seeing a few different issues - one, the SATA driver is failing to allocate memory, even though there is memory available. Also, the ibm_newemac driver is failing to allocate memory even though there is memory available.
I have already modified the ibm_newemac driver to only ever allocate single pages. Before, it would allocate based on MTU size - so if I had the MTU set to 9000, it would try and allocate 4 contiguous pages, which it would fail because there were no contiguous 16k pages available.
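For reference, the relationship between buffer size and allocation order described above can be sketched as follows. This is only an illustration, assuming 4 kB pages (consistent with the 4kB base block in the Mem-Info dump) and ignoring any per-buffer overhead the driver adds. The name alloc_order is a hypothetical helper, not a kernel function.

```python
PAGE_SIZE = 4096  # assumption: 4 kB pages, as suggested by the 4kB entries in the dump


def alloc_order(nbytes, page_size=PAGE_SIZE):
    """Return the smallest order such that (2 ** order) pages hold nbytes.

    The page allocator hands out physically contiguous blocks whose size
    is a power-of-two number of pages, so a 3-page request is rounded up
    to a 4-page (order 2) block.
    """
    pages = -(-nbytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


# MTU-sized buffers: a 9000-byte buffer needs 3 pages, rounded up to 4 pages.
jumbo_order = alloc_order(9000)
standard_order = alloc_order(1500)
print(jumbo_order, (1 << jumbo_order) * PAGE_SIZE // 1024, "kB")
print(standard_order, (1 << standard_order) * PAGE_SIZE // 1024, "kB")
```

Under these assumptions a 9000-byte buffer maps to an order-2 request (4 contiguous pages, 16 kB), while a 1500-byte buffer fits in a single order-0 page.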
Can anyone help me with this issue? I have been trying to move to Linux on this hardware for a long time, but just cannot seem to get past this issue. I am very new to kernel development and am not very familiar with the ATA system at all, let alone the memory manager. I had a hard enough time getting to understand the networking stack (which I still feel like I don't understand completely).
Thanks for the help.
Jonathan
* Re: sata_sil24 memory fragmentation issues
2010-09-02 17:02 sata_sil24 memory fragmentation issues Jonathan Haws
@ 2010-09-03 0:22 ` Robert Hancock
2010-09-03 14:46 ` Jonathan Haws
0 siblings, 1 reply; 5+ messages in thread
From: Robert Hancock @ 2010-09-03 0:22 UTC (permalink / raw)
To: Jonathan Haws; +Cc: linux-kernel@vger.kernel.org
On 09/02/2010 11:02 AM, Jonathan Haws wrote:
> I am having some issues with the sata_sil24 driver. It appears that when memory gets fragmented enough, bad things start to happen. However, this only occurs when I am receiving large amounts of data over the network as well.
>
> Here is my test setup: I am running an AMCC 405EX processor on their Kilauea development board. I have a PCIe SATA controller based on the 3531 single port chip (which uses the sata_sil24 driver). I have a program that simply dumps data out to disk. When I am running that program, I am also running ping -s 8500 <some-ip>.
>
> Here is the output:
>
> 8508 bytes from 172.31.22.21: seq=137 ttl=128 time=1.306 ms
> CNT: 129 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.17188 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=138 ttl=128 time=1.254 ms
> CNT: 130 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.16279 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=139 ttl=128 time=1.291 ms
> CNT: 131 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15385 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=140 ttl=128 time=1.254 ms
> CNT: 132 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15267 MB/s AVG RD: 0.00000 MB/s
> sata: page allocation failure. order:0, mode:0x22
sata is the process name, I believe? Not sure the SATA driver is
involved here at all.
> Call Trace:
> [ccad9a10] [c0006ef0] show_stack+0x44/0x16c (unreliable)
> [ccad9a50] [c006f9f0] __alloc_pages_nodemask+0x38c/0x4f8
> [ccad9ad0] [c01a7e3c] emac_poll_rx+0x5ac/0x768
> [ccad9b10] [c01a28e4] mal_poll+0xa8/0x1ec
> [ccad9b40] [c01d3eec] net_rx_action+0x9c/0x1b4
> [ccad9b70] [c003b3c0] __do_softirq+0xc4/0x148
> [ccad9bb0] [c0004d18] do_softirq+0x78/0x80
> [ccad9bc0] [c003afac] irq_exit+0x64/0x7c
> [ccad9bd0] [c0005210] do_IRQ+0x9c/0xb4
> [ccad9bf0] [c000fa7c] ret_from_except+0x0/0x18
> [ccad9cb0] [00000001] 0x1
> [ccad9cd0] [c00c2a68] generic_write_end+0x24/0xe0
> [ccad9d00] [c0069cc0] generic_file_buffered_write+0x18c/0x304
> [ccad9d90] [c006a38c] __generic_file_aio_write_nolock+0x288/0x4fc
> [ccad9e00] [c006a8e4] generic_file_aio_write+0x68/0xf8
> [ccad9e30] [c009a9b4] do_sync_write+0xc4/0x138
> [ccad9ef0] [c009b41c] vfs_write+0xb4/0x158
> [ccad9f10] [c009ba4c] sys_write+0x4c/0x90
> [ccad9f40] [c000f434] ret_from_syscall+0x0/0x3c
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 90, btch: 15 usd: 53
> Active_anon:554 active_file:1753 inactive_anon:608
> inactive_file:49910 unevictable:0 dirty:6020 writeback:0 unstable:0
> free:192 slab:2573 mapped:445 pagetables:25 bounce:0
> DMA free:768kB min:2036kB low:2544kB high:3052kB active_anon:2216kB inactive_anon:2432kB active_file:7012kB inactive_file:199640kB unevictable:0kB present:260096kB pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> DMA: 2*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 768kB
> 51673 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 0, delete 0, find 0/0
> Free swap = 0kB
> Total swap = 0kB
> 65536 pages RAM
> 1400 pages reserved
> 51949 pages shared
> 12273 pages non-shared
> emac_alloc_rx_skb: allocating new page after consumption
>
> I am seeing a few different issues - one, the SATA driver is failing to allocate memory, even though there is memory available. Also, the ibm_newemac driver is failing to allocate memory even though there is memory available.
>
> I have already modified the ibm_newemac driver to only ever allocate single pages. Before, it would allocate based on MTU size - so if I had the MTU set to 9000, it would try and allocate 4 contiguous pages, which it would fail because there were no contiguous 16k pages available.
>
> Can anyone help me with this issue? I have been trying to move to Linux on this hardware for a long time, but just cannot seem to get past this issue. I am very new to kernel development and am not very familiar with the ATA system at all, let alone the memory manager. I had a hard enough time getting to understand the networking stack (which I still feel like I don't understand completely).
>
> Thanks for the help.
>
> Jonathan
>
>
* RE: sata_sil24 memory fragmentation issues
2010-09-03 0:22 ` Robert Hancock
@ 2010-09-03 14:46 ` Jonathan Haws
2010-09-03 15:54 ` Jonathan Haws
2010-09-03 16:47 ` Robert Hancock
0 siblings, 2 replies; 5+ messages in thread
From: Jonathan Haws @ 2010-09-03 14:46 UTC (permalink / raw)
To: Robert Hancock; +Cc: linux-kernel@vger.kernel.org
> I am having some issues with the sata_sil24 driver. It appears that when memory gets fragmented enough, bad things start to happen. However, this only occurs when I am receiving large amounts of data over the network as well.
>
> Here is my test setup: I am running an AMCC 405EX processor on their Kilauea development board. I have a PCIe SATA controller based on the 3531 single port chip (which uses the sata_sil24 driver). I have a program that simply dumps data out to disk. When I am running that program, I am also running ping -s 8500 <some-ip>.
>
> Here is the output:
>
> 8508 bytes from 172.31.22.21: seq=137 ttl=128 time=1.306 ms
> CNT: 129 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.17188 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=138 ttl=128 time=1.254 ms
> CNT: 130 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.16279 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=139 ttl=128 time=1.291 ms
> CNT: 131 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15385 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=140 ttl=128 time=1.254 ms
> CNT: 132 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15267 MB/s AVG RD: 0.00000 MB/s
> sata: page allocation failure. order:0, mode:0x22
>sata is the process name, I believe? Not sure the SATA driver is
>involved here at all.
I think it is, because if you look at the call trace, the exception occurs down in the kernel. The driver I am using is the sata_sil24 driver, and from some searches online, others have experienced similar problems when the system is under heavy load (such as a high level of network interrupts). Unfortunately the solution to those problems is to go with a different SATA controller, which is not an option for me.
However, when you mention that the driver is not involved, are you implying that there may be a bug in my program? I will go back and look through my code, but it is a really dumb program - I have a large statically allocated buffer that I write to disk over and over again. I will go back and check to make sure I am not doing anything stupid, but I don't think I am.
Thanks,
Jonathan
* RE: sata_sil24 memory fragmentation issues
2010-09-03 14:46 ` Jonathan Haws
@ 2010-09-03 15:54 ` Jonathan Haws
2010-09-03 16:47 ` Robert Hancock
1 sibling, 0 replies; 5+ messages in thread
From: Jonathan Haws @ 2010-09-03 15:54 UTC (permalink / raw)
To: Robert Hancock; +Cc: linux-kernel@vger.kernel.org
> I am having some issues with the sata_sil24 driver. It appears that when memory gets fragmented enough, bad things start to happen. However, this only occurs when I am receiving large amounts of data over the network as well.
>
> Here is my test setup: I am running an AMCC 405EX processor on their Kilauea development board. I have a PCIe SATA controller based on the 3531 single port chip (which uses the sata_sil24 driver). I have a program that simply dumps data out to disk. When I am running that program, I am also running ping -s 8500 <some-ip>.
>
> Here is the output:
>
> 8508 bytes from 172.31.22.21: seq=137 ttl=128 time=1.306 ms
> CNT: 129 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.17188 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=138 ttl=128 time=1.254 ms
> CNT: 130 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.16279 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=139 ttl=128 time=1.291 ms
> CNT: 131 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15385 MB/s AVG RD: 0.00000 MB/s
> 8508 bytes from 172.31.22.21: seq=140 ttl=128 time=1.254 ms
> CNT: 132 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15267 MB/s AVG RD: 0.00000 MB/s
> sata: page allocation failure. order:0, mode:0x22
>sata is the process name, I believe? Not sure the SATA driver is
>involved here at all.
I think it is, because if you look at the call trace, the exception occurs down in the kernel. The driver I am using is the sata_sil24 driver, and from some searches online, others have experienced similar problems when the system is under heavy load (such as a high level of network interrupts). Unfortunately the solution to those problems is to go with a different SATA controller, which is not an option for me.
However, when you mention that the driver is not involved, are you implying that there may be a bug in my program? I will go back and look through my code, but it is a really dumb program - I have a large statically allocated buffer that I write to disk over and over again. I will go back and check to make sure I am not doing anything stupid, but I don't think I am.
Here is another crash dump. This one shows the error coming from kswapd0. Any thoughts?
kswapd0: page allocation failure. order:2, mode:0x4020
Call Trace:
[cfff9de0] [c000711c] show_stack+0x44/0x16c (unreliable)
[cfff9e20] [c00746b4] __alloc_pages_nodemask+0x3c8/0x570
[cfff9ec0] [c007487c] __get_free_pages+0x20/0x50
[cfff9ed0] [c009e82c] __kmalloc_track_caller+0xcc/0xec
[cfff9ef0] [c01e16a8] __alloc_skb+0x64/0x124
[cfff9f10] [c01cc1c8] emac_poll_rx+0x45c/0x7cc
[cfff9f50] [c01c766c] mal_poll+0xa8/0x1ec
[cfff9f80] [c01ed61c] net_rx_action+0x9c/0x1a4
[cfff9fb0] [c0039c70] __do_softirq+0xac/0x124
[cfff9ff0] [c000cfd4] call_do_softirq+0x14/0x24
[ce433c60] [c0005238] do_softirq+0x84/0x90
[ce433c80] [c0039798] irq_exit+0x54/0x6c
[ce433c90] [c00052a8] do_IRQ+0x64/0x158
[ce433cc0] [c000dce0] ret_from_except+0x0/0x18
[ce433d80] [ce433e30] 0xce433e30
[ce433e00] [c0078718] __pagevec_release+0x28/0x44
[ce433e20] [c007a308] move_active_pages_to_lru+0xfc/0x1b0
[ce433e90] [c007a9dc] shrink_active_list+0x284/0x35c
[ce433f00] [c007c990] kswapd+0x3c4/0x540
[ce433fb0] [c004f7d0] kthread+0x7c/0x80
[ce433ff0] [c000d484] kernel_thread+0x4c/0x68
Mem-Info:
DMA per-cpu:
CPU 0: hi: 90, btch: 15 usd: 89
active_anon:895 inactive_anon:224 isolated_anon:32
active_file:400 inactive_file:50824 isolated_file:0
unevictable:0 dirty:1475 writeback:0 unstable:0
free:195 slab_reclaimable:986 slab_unreclaimable:234
mapped:424 shmem:0 pagetables:25 bounce:0
DMA free:780kB min:2036kB low:2544kB high:3052kB active_anon:3580kB inactive_anon:896kB active_file:1600kB inactive_file:203296kB unevictable:0kB isolated(anon):128kB isolated(file):0kB present:260096kB mlocked:0kB dirty:5900kB writeback:0kB mapped:1696kB shmem:0kB slab_reclaimable:3944kB slab_unreclaimable:936kB kernel_stack:280kB pagetables:100kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:64 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 25*4kB 7*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 780kB
51224 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
65536 pages RAM
1485 pages reserved
50830 pages shared
13188 pages non-shared
It appears to me that I am running out of memory. I should note that this is an embedded system with not a whole lot of memory and no swapfile. Also, for this run I am using the standard drivers (the stock EMAC network driver rather than my modified one, and a stock sata_sil24). Am I just dead in the water or is there something I can do to get around this?
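The numbers in the Mem-Info dump above can be checked with a short script. This is a minimal sketch, assuming the line formats shown in this particular dump and a 4 kB base page; the helper names (parse_buddy_line, blocks_at_least, parse_watermarks) are hypothetical and only illustrate the arithmetic.

```python
import re

# Lines copied from the kswapd0 dump above (zone line shortened to the
# fields used here).
BUDDY_LINE = ("DMA: 25*4kB 7*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB "
              "1*512kB 0*1024kB 0*2048kB 0*4096kB = 780kB")
ZONE_LINE = "DMA free:780kB min:2036kB low:2544kB high:3052kB"


def parse_buddy_line(line):
    """Return {block_size_kb: free_block_count} from a buddy free-list line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def total_free_kb(blocks):
    """Sum the free memory over all block sizes."""
    return sum(size * count for size, count in blocks.items())


def blocks_at_least(blocks, order, base_kb=4):
    """Count free blocks large enough to satisfy a request of this order."""
    min_kb = base_kb << order
    return sum(count for size, count in blocks.items() if size >= min_kb)


def parse_watermarks(line):
    """Extract the free/min/low/high values (in kB) from a zone line."""
    return {key: int(val)
            for key, val in re.findall(r"\b(free|min|low|high):(\d+)kB", line)}


blocks = parse_buddy_line(BUDDY_LINE)
marks = parse_watermarks(ZONE_LINE)

print(total_free_kb(blocks))             # 780, matching the "= 780kB" total
print(blocks_at_least(blocks, order=2))  # 4 free blocks of 16 kB or larger
print(marks["free"] < marks["min"])      # True: free memory is below "min"
```

The free list still holds a few blocks of 16 kB or larger, but total free memory (780 kB) is well below the zone's min watermark (2036 kB), which may be relevant to why the order-2 allocation fails even though the free list is not empty.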
Thanks,
Jonathan
* Re: sata_sil24 memory fragmentation issues
2010-09-03 14:46 ` Jonathan Haws
2010-09-03 15:54 ` Jonathan Haws
@ 2010-09-03 16:47 ` Robert Hancock
1 sibling, 0 replies; 5+ messages in thread
From: Robert Hancock @ 2010-09-03 16:47 UTC (permalink / raw)
To: Jonathan Haws; +Cc: linux-kernel@vger.kernel.org
On Fri, Sep 3, 2010 at 8:46 AM, Jonathan Haws <Jonathan.Haws@sdl.usu.edu> wrote:
>> I am having some issues with the sata_sil24 driver. It appears that when memory gets fragmented enough, bad things start to happen. However, this only occurs when I am receiving large amounts of data over the network as well.
>>
>> Here is my test setup: I am running an AMCC 405EX processor on their Kilauea development board. I have a PCIe SATA controller based on the 3531 single port chip (which uses the sata_sil24 driver). I have a program that simply dumps data out to disk. When I am running that program, I am also running ping -s 8500 <some-ip>.
>>
>> Here is the output:
>>
>> 8508 bytes from 172.31.22.21: seq=137 ttl=128 time=1.306 ms
>> CNT: 129 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.17188 MB/s AVG RD: 0.00000 MB/s
>> 8508 bytes from 172.31.22.21: seq=138 ttl=128 time=1.254 ms
>> CNT: 130 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.16279 MB/s AVG RD: 0.00000 MB/s
>> 8508 bytes from 172.31.22.21: seq=139 ttl=128 time=1.291 ms
>> CNT: 131 WRIT: 34603008 RATE: 33.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15385 MB/s AVG RD: 0.00000 MB/s
>> 8508 bytes from 172.31.22.21: seq=140 ttl=128 time=1.254 ms
>> CNT: 132 WRIT: 35651584 RATE: 34.00000 MB/s READ: 0 RATE: 0.00000 MB/s AVG WR: 34.15267 MB/s AVG RD: 0.00000 MB/s
>> sata: page allocation failure. order:0, mode:0x22
>
>>sata is the process name, I believe? Not sure the SATA driver is
>>involved here at all.
>
> I think it is, because if you look at the call trace, the exception occurs down in the kernel. The driver I am using is the sata_sil24 driver, and from some searches online, others have experienced similar problems when the system is under heavy load (such as a high level of network interrupts). Unfortunately the solution to those problems is to go with a different SATA controller, which is not an option for me.
>
> However, when you mention that the driver is not involved, are you implying that there may be a bug in my program? I will go back and look through my code, but it is a really dumb program - I have a large statically allocated buffer that I write to disk over and over again. I will go back and check to make sure I am not doing anything stupid, but I don't think I am.
I meant that "sata" is just the process name (I assume), it's not
really anything to do with the SATA driver. Normally SATA host
controller drivers don't really allocate memory at runtime so this
wouldn't really be an issue with them. Network controllers do in order
to handle received packets, though - it appears that for some reason
the memory allocation by the network driver is failing.
I'm not really sure why that is - it seems like you do have memory
available. Hopefully some VM guru can pipe up with a suggestion.