* Re: [BUG] AS io-scheduler.
@ 2007-07-15 15:20 Ian Kumlien
2007-07-16 16:31 ` Chuck Ebbert
0 siblings, 1 reply; 11+ messages in thread
From: Ian Kumlien @ 2007-07-15 15:20 UTC (permalink / raw)
To: Linux-kernel
[-- Attachment #1: Type: text/plain, Size: 4615 bytes --]
Sorry, due to mental fatigue i actually forgot to repor that this was
with 2.6.22.1
Since this is mailed from outside the ml, i'm attaching the original
message:
---
Hi,
I had emerge --sync failing several times...
So i checked dmesg and found some info, attached further down.
This is a old VIA C3 machine with one disk, it's been running most
kernels in the 2.6.x series with no problems until now.
PS. Don't forget to CC me
DS.
BUG: unable to handle kernel paging request at virtual address ea86ac54
printing eip:
c022dfec
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge
CPU: 0
EIP: 0060:[<c022dfec>] Not tainted VLI
EFLAGS: 00010082 (2.6.22.1 #26)
EIP is at as_can_break_anticipation+0xc/0x190
eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844
esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70
ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068
Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000)
Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844
00000000
dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000
dfcffb9c
00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30
c04d1ec0
Call Trace:
[<c022efc8>] as_add_request+0xa8/0xc0
[<c0227a76>] elv_insert+0xa6/0x150
[<c016e96e>] bio_phys_segments+0xe/0x20
[<c022af64>] __make_request+0x384/0x490
[<c02add1e>] ide_do_request+0x6ee/0x890
[<c02294ab>] generic_make_request+0x18b/0x1c0
[<c022b596>] submit_bio+0xa6/0xb0
[<c013b7b8>] mempool_alloc+0x28/0xa0
[<c016bb66>] __find_get_block+0xf6/0x130
[<c016e0bc>] bio_alloc_bioset+0x8c/0xf0
[<c016b647>] submit_bh+0xb7/0xe0
[<c016c1f8>] ll_rw_block+0x78/0x90
[<c019c85d>] search_by_key+0xdd/0xd20
[<c016c201>] ll_rw_block+0x81/0x90
[<c011f190>] irq_exit+0x40/0x60
[<c01066e4>] do_IRQ+0x94/0xb0
[<c0104bc3>] common_interrupt+0x23/0x30
[<c018beca>] reiserfs_read_locked_inode+0x6a/0x490
[<c018e580>] reiserfs_find_actor+0x0/0x20
[<c018c33b>] reiserfs_iget+0x4b/0x80
[<c018e570>] reiserfs_init_locked_inode+0x0/0x10
[<c0189824>] reiserfs_lookup+0xa4/0xf0
[<c0157b03>] do_lookup+0xa3/0x140
[<c0159265>] __link_path_walk+0x615/0xa20
[<c0168a18>] __mark_inode_dirty+0x28/0x150
[<c01631c1>] mntput_no_expire+0x11/0x50
[<c01596b2>] link_path_walk+0x42/0xb0
[<c0159960>] do_path_lookup+0x130/0x150
[<c015a190>] __user_walk_fd+0x30/0x50
[<c0154766>] vfs_lstat_fd+0x16/0x40
[<c01547df>] sys_lstat64+0xf/0x30
[<c0103c42>] syscall_call+0x7/0xb
=======================
Code: c0 8b 44 cb 0c 8b 40 08 29 d0 f7 d0 c1 e8 1f 83 f0 01 eb 02 31 c0
5b c3 8d b4 26 00 00 00 00 55 57 56 53 83 ec 04 89 c3 89 14 24 <8b> 90
b4 00 00 00 85 d2 75 04 0f 0b eb fe 83 3c 24 00 74 0c 8b
EIP: [<c022dfec>] as_can_break_anticipation+0xc/0x190 SS:ESP
0068:ceff6a70
WARNING: at block/as-iosched.c:862 as_remove_queued_request()
[<c022e560>] as_remove_queued_request+0x40/0xc0
[<c022e684>] as_move_to_dispatch+0xa4/0x110
[<c0236626>] __delay+0x6/0x10
[<c022ee03>] as_dispatch_request+0x2d3/0x310
[<c02b39b2>] ide_dma_start+0x22/0x30
[<c02b6be0>] ide_do_rw_disk+0x310/0x3f0
[<c02279b5>] elv_next_request+0x105/0x120
[<c02ad876>] ide_do_request+0x246/0x890
[<c03d5c9d>] schedule+0x45d/0x4c0
[<c022e7f0>] as_work_handler+0x0/0x20
[<c022a331>] blk_start_queueing+0x11/0x20
[<c022e7ff>] as_work_handler+0xf/0x20
[<c0126bab>] run_workqueue+0x6b/0xe0
[<c0127200>] worker_thread+0x0/0xc0
[<c01272b2>] worker_thread+0xb2/0xc0
[<c01297e0>] autoremove_wake_function+0x0/0x40
[<c0129658>] kthread+0x38/0x60
[<c0129620>] kthread+0x0/0x60
[<c0104d87>] kernel_thread_helper+0x7/0x10
=======================
WARNING: at block/as-iosched.c:966 as_move_to_dispatch()
[<c022e6b3>] as_move_to_dispatch+0xd3/0x110
[<c022ee03>] as_dispatch_request+0x2d3/0x310
[<c02b39b2>] ide_dma_start+0x22/0x30
[<c02b6be0>] ide_do_rw_disk+0x310/0x3f0
[<c02279b5>] elv_next_request+0x105/0x120
[<c02ad876>] ide_do_request+0x246/0x890
[<c03d5c9d>] schedule+0x45d/0x4c0
[<c022e7f0>] as_work_handler+0x0/0x20
[<c022a331>] blk_start_queueing+0x11/0x20
[<c022e7ff>] as_work_handler+0xf/0x20
[<c0126bab>] run_workqueue+0x6b/0xe0
[<c0127200>] worker_thread+0x0/0xc0
[<c01272b2>] worker_thread+0xb2/0xc0
[<c01297e0>] autoremove_wake_function+0x0/0x40
[<c0129658>] kthread+0x38/0x60
[<c0129620>] kthread+0x0/0x60
[<c0104d87>] kernel_thread_helper+0x7/0x10
=======================
--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [BUG] AS io-scheduler. 2007-07-15 15:20 [BUG] AS io-scheduler Ian Kumlien @ 2007-07-16 16:31 ` Chuck Ebbert 2007-07-16 17:29 ` Jens Axboe 2007-07-17 1:44 ` Rene Herman 0 siblings, 2 replies; 11+ messages in thread From: Chuck Ebbert @ 2007-07-16 16:31 UTC (permalink / raw) To: pomac; +Cc: Linux-kernel, Jens Axboe, Nick Piggin On 07/15/2007 11:20 AM, Ian Kumlien wrote: > I had emerge --sync failing several times... > > So i checked dmesg and found some info, attached further down. > This is a old VIA C3 machine with one disk, it's been running most > kernels in the 2.6.x series with no problems until now. > > PS. Don't forget to CC me > DS. > > BUG: unable to handle kernel paging request at virtual address ea86ac54 > printing eip: > c022dfec > *pde = 00000000 > Oops: 0000 [#1] > Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge > CPU: 0 > EIP: 0060:[<c022dfec>] Not tainted VLI > EFLAGS: 00010082 (2.6.22.1 #26) > EIP is at as_can_break_anticipation+0xc/0x190 > eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 > esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 > ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 > Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) > Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 > 00000000 > dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000 > dfcffb9c > 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30 > c04d1ec0 > Call Trace: > [<c022efc8>] as_add_request+0xa8/0xc0 > [<c0227a76>] elv_insert+0xa6/0x150 > [<c016e96e>] bio_phys_segments+0xe/0x20 > [<c022af64>] __make_request+0x384/0x490 > [<c02add1e>] ide_do_request+0x6ee/0x890 > [<c02294ab>] generic_make_request+0x18b/0x1c0 > [<c022b596>] submit_bio+0xa6/0xb0 > [<c013b7b8>] mempool_alloc+0x28/0xa0 > [<c016bb66>] __find_get_block+0xf6/0x130 > [<c016e0bc>] bio_alloc_bioset+0x8c/0xf0 > [<c016b647>] submit_bh+0xb7/0xe0 > [<c016c1f8>] ll_rw_block+0x78/0x90 > [<c019c85d>] search_by_key+0xdd/0xd20 > [<c016c201>] ll_rw_block+0x81/0x90 > [<c011f190>] irq_exit+0x40/0x60 > [<c01066e4>] do_IRQ+0x94/0xb0 > [<c0104bc3>] common_interrupt+0x23/0x30 > [<c018beca>] reiserfs_read_locked_inode+0x6a/0x490 > [<c018e580>] reiserfs_find_actor+0x0/0x20 > [<c018c33b>] reiserfs_iget+0x4b/0x80 > [<c018e570>] reiserfs_init_locked_inode+0x0/0x10 > [<c0189824>] reiserfs_lookup+0xa4/0xf0 > [<c0157b03>] do_lookup+0xa3/0x140 > [<c0159265>] __link_path_walk+0x615/0xa20 > [<c0168a18>] __mark_inode_dirty+0x28/0x150 > [<c01631c1>] mntput_no_expire+0x11/0x50 > [<c01596b2>] link_path_walk+0x42/0xb0 > [<c0159960>] do_path_lookup+0x130/0x150 > [<c015a190>] __user_walk_fd+0x30/0x50 > [<c0154766>] vfs_lstat_fd+0x16/0x40 > [<c01547df>] sys_lstat64+0xf/0x30 > [<c0103c42>] syscall_call+0x7/0xb > ======================= static int as_can_break_anticipation(struct as_data *ad, struct request *rq) { struct io_context *ioc; struct as_io_context *aic; ioc = ad->io_context; <======== ad is bogus BUG_ON(!ioc); Call chain is: as_add_request as_update_rq: if (ad->antic_status == ANTIC_WAIT_REQ || ad->antic_status == ANTIC_WAIT_NEXT) { if (as_can_break_anticipation(ad, rq)) as_antic_stop(ad); } So somehow 'ad' became invalid between the time ad->antic_status was checked and as_can_break_anticipation() tried to access ad->io_context? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] AS io-scheduler. 2007-07-16 16:31 ` Chuck Ebbert @ 2007-07-16 17:29 ` Jens Axboe 2007-07-16 19:49 ` Ian Kumlien 2007-07-17 1:44 ` Rene Herman 1 sibling, 1 reply; 11+ messages in thread From: Jens Axboe @ 2007-07-16 17:29 UTC (permalink / raw) To: Chuck Ebbert; +Cc: pomac, Linux-kernel, Nick Piggin On Mon, Jul 16 2007, Chuck Ebbert wrote: > On 07/15/2007 11:20 AM, Ian Kumlien wrote: > > I had emerge --sync failing several times... > > > > So i checked dmesg and found some info, attached further down. > > This is a old VIA C3 machine with one disk, it's been running most > > kernels in the 2.6.x series with no problems until now. > > > > PS. Don't forget to CC me > > DS. > > > > BUG: unable to handle kernel paging request at virtual address ea86ac54 > > printing eip: > > c022dfec > > *pde = 00000000 > > Oops: 0000 [#1] > > Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge > > CPU: 0 > > EIP: 0060:[<c022dfec>] Not tainted VLI > > EFLAGS: 00010082 (2.6.22.1 #26) > > EIP is at as_can_break_anticipation+0xc/0x190 > > eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 > > esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 > > ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 > > Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) > > Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 > > 00000000 > > dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000 > > dfcffb9c > > 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30 > > c04d1ec0 > > Call Trace: > > [<c022efc8>] as_add_request+0xa8/0xc0 > > [<c0227a76>] elv_insert+0xa6/0x150 > > [<c016e96e>] bio_phys_segments+0xe/0x20 > > [<c022af64>] __make_request+0x384/0x490 > > [<c02add1e>] ide_do_request+0x6ee/0x890 > > [<c02294ab>] generic_make_request+0x18b/0x1c0 > > [<c022b596>] submit_bio+0xa6/0xb0 > > [<c013b7b8>] mempool_alloc+0x28/0xa0 > > [<c016bb66>] __find_get_block+0xf6/0x130 > > [<c016e0bc>] bio_alloc_bioset+0x8c/0xf0 > > [<c016b647>] submit_bh+0xb7/0xe0 > > [<c016c1f8>] ll_rw_block+0x78/0x90 > > [<c019c85d>] search_by_key+0xdd/0xd20 > > [<c016c201>] ll_rw_block+0x81/0x90 > > [<c011f190>] irq_exit+0x40/0x60 > > [<c01066e4>] do_IRQ+0x94/0xb0 > > [<c0104bc3>] common_interrupt+0x23/0x30 > > [<c018beca>] reiserfs_read_locked_inode+0x6a/0x490 > > [<c018e580>] reiserfs_find_actor+0x0/0x20 > > [<c018c33b>] reiserfs_iget+0x4b/0x80 > > [<c018e570>] reiserfs_init_locked_inode+0x0/0x10 > > [<c0189824>] reiserfs_lookup+0xa4/0xf0 > > [<c0157b03>] do_lookup+0xa3/0x140 > > [<c0159265>] __link_path_walk+0x615/0xa20 > > [<c0168a18>] __mark_inode_dirty+0x28/0x150 > > [<c01631c1>] mntput_no_expire+0x11/0x50 > > [<c01596b2>] link_path_walk+0x42/0xb0 > > [<c0159960>] do_path_lookup+0x130/0x150 > > [<c015a190>] __user_walk_fd+0x30/0x50 > > [<c0154766>] vfs_lstat_fd+0x16/0x40 > > [<c01547df>] sys_lstat64+0xf/0x30 > > [<c0103c42>] syscall_call+0x7/0xb > > ======================= > > static int as_can_break_anticipation(struct as_data *ad, struct request *rq) > { > struct io_context *ioc; > struct as_io_context *aic; > > ioc = ad->io_context; <======== ad is bogus > BUG_ON(!ioc); > > > Call chain is: > > as_add_request > as_update_rq: > if (ad->antic_status == ANTIC_WAIT_REQ > || ad->antic_status == ANTIC_WAIT_NEXT) { > if (as_can_break_anticipation(ad, rq)) > as_antic_stop(ad); > } > > > So somehow 'ad' became invalid between the time ad->antic_status was > checked and as_can_break_anticipation() tried to access ad->io_context? That's impossible, ad is persistent unless the io scheduler is attempted removed. Did you fiddle with switching io schedulers while this happened? If not, then something corrupted your memory. And I'm not aware of any io scheduler switching bugs, so the oops would still be highly suspect if so. You write emerge - are you using an experimental compiler? Or did you recently change hardware? Is it warmer than usual? -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] AS io-scheduler. 2007-07-16 17:29 ` Jens Axboe @ 2007-07-16 19:49 ` Ian Kumlien 2007-07-16 19:56 ` Jens Axboe 0 siblings, 1 reply; 11+ messages in thread From: Ian Kumlien @ 2007-07-16 19:49 UTC (permalink / raw) To: Jens Axboe; +Cc: Chuck Ebbert, Linux-kernel, Nick Piggin [-- Attachment #1: Type: text/plain, Size: 5033 bytes --] On mån, 2007-07-16 at 19:29 +0200, Jens Axboe wrote: > On Mon, Jul 16 2007, Chuck Ebbert wrote: > > On 07/15/2007 11:20 AM, Ian Kumlien wrote: > > > I had emerge --sync failing several times... > > > > > > So i checked dmesg and found some info, attached further down. > > > This is a old VIA C3 machine with one disk, it's been running most > > > kernels in the 2.6.x series with no problems until now. > > > > > > PS. Don't forget to CC me > > > DS. > > > > > > BUG: unable to handle kernel paging request at virtual address ea86ac54 > > > printing eip: > > > c022dfec > > > *pde = 00000000 > > > Oops: 0000 [#1] > > > Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge > > > CPU: 0 > > > EIP: 0060:[<c022dfec>] Not tainted VLI > > > EFLAGS: 00010082 (2.6.22.1 #26) > > > EIP is at as_can_break_anticipation+0xc/0x190 > > > eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 > > > esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 > > > ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 > > > Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) > > > Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 > > > 00000000 > > > dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000 > > > dfcffb9c > > > 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30 > > > c04d1ec0 > > > Call Trace: > > > [<c022efc8>] as_add_request+0xa8/0xc0 > > > [<c0227a76>] elv_insert+0xa6/0x150 > > > [<c016e96e>] bio_phys_segments+0xe/0x20 > > > [<c022af64>] __make_request+0x384/0x490 > > > [<c02add1e>] ide_do_request+0x6ee/0x890 > > > [<c02294ab>] generic_make_request+0x18b/0x1c0 > > > [<c022b596>] submit_bio+0xa6/0xb0 > > > [<c013b7b8>] mempool_alloc+0x28/0xa0 > > > [<c016bb66>] __find_get_block+0xf6/0x130 > > > [<c016e0bc>] bio_alloc_bioset+0x8c/0xf0 > > > [<c016b647>] submit_bh+0xb7/0xe0 > > > [<c016c1f8>] ll_rw_block+0x78/0x90 > > > [<c019c85d>] search_by_key+0xdd/0xd20 > > > [<c016c201>] ll_rw_block+0x81/0x90 > > > [<c011f190>] irq_exit+0x40/0x60 > > > [<c01066e4>] do_IRQ+0x94/0xb0 > > > [<c0104bc3>] common_interrupt+0x23/0x30 > > > [<c018beca>] reiserfs_read_locked_inode+0x6a/0x490 > > > [<c018e580>] reiserfs_find_actor+0x0/0x20 > > > [<c018c33b>] reiserfs_iget+0x4b/0x80 > > > [<c018e570>] reiserfs_init_locked_inode+0x0/0x10 > > > [<c0189824>] reiserfs_lookup+0xa4/0xf0 > > > [<c0157b03>] do_lookup+0xa3/0x140 > > > [<c0159265>] __link_path_walk+0x615/0xa20 > > > [<c0168a18>] __mark_inode_dirty+0x28/0x150 > > > [<c01631c1>] mntput_no_expire+0x11/0x50 > > > [<c01596b2>] link_path_walk+0x42/0xb0 > > > [<c0159960>] do_path_lookup+0x130/0x150 > > > [<c015a190>] __user_walk_fd+0x30/0x50 > > > [<c0154766>] vfs_lstat_fd+0x16/0x40 > > > [<c01547df>] sys_lstat64+0xf/0x30 > > > [<c0103c42>] syscall_call+0x7/0xb > > > ======================= > > > > static int as_can_break_anticipation(struct as_data *ad, struct request *rq) > > { > > struct io_context *ioc; > > struct as_io_context *aic; > > > > ioc = ad->io_context; <======== ad is bogus > > BUG_ON(!ioc); > > > > > > Call chain is: > > > > as_add_request > > as_update_rq: > > if (ad->antic_status == ANTIC_WAIT_REQ > > || ad->antic_status == ANTIC_WAIT_NEXT) { > > if (as_can_break_anticipation(ad, rq)) > > as_antic_stop(ad); > > } > > > > > > So somehow 'ad' became invalid between the time ad->antic_status was > > checked and as_can_break_anticipation() tried to access ad->io_context? > > That's impossible, ad is persistent unless the io scheduler is attempted > removed. Did you fiddle with switching io schedulers while this > happened? If not, then something corrupted your memory. And I'm not > aware of any io scheduler switching bugs, so the oops would still be > highly suspect if so. I wasn't fiddling with the scheduler, it's quite happily been running AS for quite some time. Actually it ran AS for the entire 2.6.21 and 2.6.20 life cycles. > You write emerge - are you using an experimental compiler? Or did you > recently change hardware? Is it warmer than usual? No change in hardware, no change in compiler either. gcc (GCC) 4.1.2 (Gentoo 4.1.2) Which is the same compiler that compiled 2.6.21 afair. It's been more humid, but not warmer... Were talking about a cpu that usually idles at 17 deg =) There might however have been more io load, right now it's 220 connections to my webserver and it's 5 days since my friend released the mix they are downloading. (I'm preparing to switch it all over to a core2duo with mirrored disks, which is a wee bit more suited to this kind of load) Sent data: 8062.27 Mb (4620.55 kbit/s) Requests: 56284 (235/min) Webserver uptime 9h -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] AS io-scheduler. 2007-07-16 19:49 ` Ian Kumlien @ 2007-07-16 19:56 ` Jens Axboe 2007-07-16 20:14 ` Ian Kumlien 0 siblings, 1 reply; 11+ messages in thread From: Jens Axboe @ 2007-07-16 19:56 UTC (permalink / raw) To: Ian Kumlien; +Cc: Chuck Ebbert, Linux-kernel, Nick Piggin On Mon, Jul 16 2007, Ian Kumlien wrote: > On mån, 2007-07-16 at 19:29 +0200, Jens Axboe wrote: > > On Mon, Jul 16 2007, Chuck Ebbert wrote: > > > On 07/15/2007 11:20 AM, Ian Kumlien wrote: > > > > I had emerge --sync failing several times... > > > > > > > > So i checked dmesg and found some info, attached further down. > > > > This is a old VIA C3 machine with one disk, it's been running most > > > > kernels in the 2.6.x series with no problems until now. > > > > > > > > PS. Don't forget to CC me > > > > DS. > > > > > > > > BUG: unable to handle kernel paging request at virtual address ea86ac54 > > > > printing eip: > > > > c022dfec > > > > *pde = 00000000 > > > > Oops: 0000 [#1] > > > > Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge > > > > CPU: 0 > > > > EIP: 0060:[<c022dfec>] Not tainted VLI > > > > EFLAGS: 00010082 (2.6.22.1 #26) > > > > EIP is at as_can_break_anticipation+0xc/0x190 > > > > eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 > > > > esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 > > > > ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 > > > > Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) > > > > Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 > > > > 00000000 > > > > dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000 > > > > dfcffb9c > > > > 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30 > > > > c04d1ec0 > > > > Call Trace: > > > > [<c022efc8>] as_add_request+0xa8/0xc0 > > > > [<c0227a76>] elv_insert+0xa6/0x150 > > > > [<c016e96e>] bio_phys_segments+0xe/0x20 > > > > [<c022af64>] __make_request+0x384/0x490 > > > > [<c02add1e>] ide_do_request+0x6ee/0x890 > > > > [<c02294ab>] generic_make_request+0x18b/0x1c0 > > > > [<c022b596>] submit_bio+0xa6/0xb0 > > > > [<c013b7b8>] mempool_alloc+0x28/0xa0 > > > > [<c016bb66>] __find_get_block+0xf6/0x130 > > > > [<c016e0bc>] bio_alloc_bioset+0x8c/0xf0 > > > > [<c016b647>] submit_bh+0xb7/0xe0 > > > > [<c016c1f8>] ll_rw_block+0x78/0x90 > > > > [<c019c85d>] search_by_key+0xdd/0xd20 > > > > [<c016c201>] ll_rw_block+0x81/0x90 > > > > [<c011f190>] irq_exit+0x40/0x60 > > > > [<c01066e4>] do_IRQ+0x94/0xb0 > > > > [<c0104bc3>] common_interrupt+0x23/0x30 > > > > [<c018beca>] reiserfs_read_locked_inode+0x6a/0x490 > > > > [<c018e580>] reiserfs_find_actor+0x0/0x20 > > > > [<c018c33b>] reiserfs_iget+0x4b/0x80 > > > > [<c018e570>] reiserfs_init_locked_inode+0x0/0x10 > > > > [<c0189824>] reiserfs_lookup+0xa4/0xf0 > > > > [<c0157b03>] do_lookup+0xa3/0x140 > > > > [<c0159265>] __link_path_walk+0x615/0xa20 > > > > [<c0168a18>] __mark_inode_dirty+0x28/0x150 > > > > [<c01631c1>] mntput_no_expire+0x11/0x50 > > > > [<c01596b2>] link_path_walk+0x42/0xb0 > > > > [<c0159960>] do_path_lookup+0x130/0x150 > > > > [<c015a190>] __user_walk_fd+0x30/0x50 > > > > [<c0154766>] vfs_lstat_fd+0x16/0x40 > > > > [<c01547df>] sys_lstat64+0xf/0x30 > > > > [<c0103c42>] syscall_call+0x7/0xb > > > > ======================= > > > > > > static int as_can_break_anticipation(struct as_data *ad, struct request *rq) > > > { > > > struct io_context *ioc; > > > struct as_io_context *aic; > > > > > > ioc = ad->io_context; <======== ad is bogus > > > BUG_ON(!ioc); > > > > > > > > > Call chain is: > > > > > > as_add_request > > > as_update_rq: > > > if (ad->antic_status == ANTIC_WAIT_REQ > > > || ad->antic_status == ANTIC_WAIT_NEXT) { > > > if (as_can_break_anticipation(ad, rq)) > > > as_antic_stop(ad); > > > } > > > > > > > > > So somehow 'ad' became invalid between the time ad->antic_status was > > > checked and as_can_break_anticipation() tried to access ad->io_context? > > > > That's impossible, ad is persistent unless the io scheduler is attempted > > removed. Did you fiddle with switching io schedulers while this > > happened? If not, then something corrupted your memory. And I'm not > > aware of any io scheduler switching bugs, so the oops would still be > > highly suspect if so. > > I wasn't fiddling with the scheduler, it's quite happily been running AS > for quite some time. OK, that rules that out then. Then your oops looks very much like hardware trouble. Perhaps a border liner PSU? Just an idea. > Actually it ran AS for the entire 2.6.21 and 2.6.20 life cycles. There's essentially only one change in AS between 2.6.21 and 2.6.22, and that is converting a jiffes vs msec error. So no real code change. > > You write emerge - are you using an experimental compiler? Or did you > > recently change hardware? Is it warmer than usual? > > No change in hardware, no change in compiler either. > > gcc (GCC) 4.1.2 (Gentoo 4.1.2) > > Which is the same compiler that compiled 2.6.21 afair. > > It's been more humid, but not warmer... Were talking about a cpu that > usually idles at 17 deg =) Probably not the heat, then. > There might however have been more io load, right now it's 220 > connections to my webserver and it's 5 days since my friend released the > mix they are downloading. > > (I'm preparing to switch it all over to a core2duo with mirrored disks, > which is a wee bit more suited to this kind of load) > > Sent data: 8062.27 Mb (4620.55 kbit/s) > Requests: 56284 (235/min) > > Webserver uptime 9h My gut says that this is the hardware falling over, for whatever reason. Since it's otherwise stable, it could be something marginal pushing it over. Like your higher load (be it CPU, or disk). You could try and boot with the noop IO scheduler and see if it still oopses. Not sure would else to suggest, your box will likely pass memtest just fine. -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] AS io-scheduler. 2007-07-16 19:56 ` Jens Axboe @ 2007-07-16 20:14 ` Ian Kumlien 2007-07-17 6:23 ` Jens Axboe 0 siblings, 1 reply; 11+ messages in thread From: Ian Kumlien @ 2007-07-16 20:14 UTC (permalink / raw) To: Jens Axboe; +Cc: Chuck Ebbert, Linux-kernel, Nick Piggin [-- Attachment #1: Type: text/plain, Size: 7146 bytes --] On mån, 2007-07-16 at 21:56 +0200, Jens Axboe wrote: > On Mon, Jul 16 2007, Ian Kumlien wrote: > > On mån, 2007-07-16 at 19:29 +0200, Jens Axboe wrote: > > > On Mon, Jul 16 2007, Chuck Ebbert wrote: > > > > On 07/15/2007 11:20 AM, Ian Kumlien wrote: > > > > > I had emerge --sync failing several times... > > > > > > > > > > So i checked dmesg and found some info, attached further down. > > > > > This is a old VIA C3 machine with one disk, it's been running most > > > > > kernels in the 2.6.x series with no problems until now. > > > > > > > > > > PS. Don't forget to CC me > > > > > DS. > > > > > > > > > > BUG: unable to handle kernel paging request at virtual address ea86ac54 > > > > > printing eip: > > > > > c022dfec > > > > > *pde = 00000000 > > > > > Oops: 0000 [#1] > > > > > Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge > > > > > CPU: 0 > > > > > EIP: 0060:[<c022dfec>] Not tainted VLI > > > > > EFLAGS: 00010082 (2.6.22.1 #26) > > > > > EIP is at as_can_break_anticipation+0xc/0x190 > > > > > eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 > > > > > esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 > > > > > ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 > > > > > Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) > > > > > Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 > > > > > 00000000 > > > > > dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000 > > > > > dfcffb9c > > > > > 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30 > > > > > c04d1ec0 > > > > > Call Trace: > > > > > [<c022efc8>] as_add_request+0xa8/0xc0 > > > > > [<c0227a76>] elv_insert+0xa6/0x150 > > > > > [<c016e96e>] bio_phys_segments+0xe/0x20 > > > > > [<c022af64>] __make_request+0x384/0x490 > > > > > [<c02add1e>] ide_do_request+0x6ee/0x890 > > > > > [<c02294ab>] generic_make_request+0x18b/0x1c0 > > > > > [<c022b596>] submit_bio+0xa6/0xb0 > > > > > [<c013b7b8>] mempool_alloc+0x28/0xa0 > > > > > [<c016bb66>] __find_get_block+0xf6/0x130 > > > > > [<c016e0bc>] bio_alloc_bioset+0x8c/0xf0 > > > > > [<c016b647>] submit_bh+0xb7/0xe0 > > > > > [<c016c1f8>] ll_rw_block+0x78/0x90 > > > > > [<c019c85d>] search_by_key+0xdd/0xd20 > > > > > [<c016c201>] ll_rw_block+0x81/0x90 > > > > > [<c011f190>] irq_exit+0x40/0x60 > > > > > [<c01066e4>] do_IRQ+0x94/0xb0 > > > > > [<c0104bc3>] common_interrupt+0x23/0x30 > > > > > [<c018beca>] reiserfs_read_locked_inode+0x6a/0x490 > > > > > [<c018e580>] reiserfs_find_actor+0x0/0x20 > > > > > [<c018c33b>] reiserfs_iget+0x4b/0x80 > > > > > [<c018e570>] reiserfs_init_locked_inode+0x0/0x10 > > > > > [<c0189824>] reiserfs_lookup+0xa4/0xf0 > > > > > [<c0157b03>] do_lookup+0xa3/0x140 > > > > > [<c0159265>] __link_path_walk+0x615/0xa20 > > > > > [<c0168a18>] __mark_inode_dirty+0x28/0x150 > > > > > [<c01631c1>] mntput_no_expire+0x11/0x50 > > > > > [<c01596b2>] link_path_walk+0x42/0xb0 > > > > > [<c0159960>] do_path_lookup+0x130/0x150 > > > > > [<c015a190>] __user_walk_fd+0x30/0x50 > > > > > [<c0154766>] vfs_lstat_fd+0x16/0x40 > > > > > [<c01547df>] sys_lstat64+0xf/0x30 > > > > > [<c0103c42>] syscall_call+0x7/0xb > > > > > ======================= > > > > > > > > static int as_can_break_anticipation(struct as_data *ad, struct request *rq) > > > > { > > > > struct io_context *ioc; > > > > struct as_io_context *aic; > > > > > > > > ioc = ad->io_context; <======== ad is bogus > > > > BUG_ON(!ioc); > > > > > > > > > > > > Call chain is: > > > > > > > > as_add_request > > > > as_update_rq: > > > > if (ad->antic_status == ANTIC_WAIT_REQ > > > > || ad->antic_status == ANTIC_WAIT_NEXT) { > > > > if (as_can_break_anticipation(ad, rq)) > > > > as_antic_stop(ad); > > > > } > > > > > > > > > > > > So somehow 'ad' became invalid between the time ad->antic_status was > > > > checked and as_can_break_anticipation() tried to access ad->io_context? > > > > > > That's impossible, ad is persistent unless the io scheduler is attempted > > > removed. Did you fiddle with switching io schedulers while this > > > happened? If not, then something corrupted your memory. And I'm not > > > aware of any io scheduler switching bugs, so the oops would still be > > > highly suspect if so. > > > > I wasn't fiddling with the scheduler, it's quite happily been running AS > > for quite some time. > > OK, that rules that out then. Then your oops looks very much like > hardware trouble. Perhaps a border liner PSU? Just an idea. It uses a laptop psu, that doesn't need cooling, this is a microitx board =) > > Actually it ran AS for the entire 2.6.21 and 2.6.20 life cycles. > > There's essentially only one change in AS between 2.6.21 and 2.6.22, and > that is converting a jiffes vs msec error. So no real code change. Hummm, odd... > > > You write emerge - are you using an experimental compiler? Or did you > > > recently change hardware? Is it warmer than usual? > > > > No change in hardware, no change in compiler either. > > > > gcc (GCC) 4.1.2 (Gentoo 4.1.2) > > > > Which is the same compiler that compiled 2.6.21 afair. > > > > It's been more humid, but not warmer... Were talking about a cpu that > > usually idles at 17 deg =) > > Probably not the heat, then. Currenlty: CPU Temp: +25.6°C This is all passively cooled btw, it was lower a while ago, bit the webserver is still loaded. > > There might however have been more io load, right now it's 220 > > connections to my webserver and it's 5 days since my friend released the > > mix they are downloading. > > > > (I'm preparing to switch it all over to a core2duo with mirrored disks, > > which is a wee bit more suited to this kind of load) > > > > Sent data: 8062.27 Mb (4620.55 kbit/s) > > Requests: 56284 (235/min) > > > > Webserver uptime 9h > > My gut says that this is the hardware falling over, for whatever reason. > Since it's otherwise stable, it could be something marginal pushing it > over. Like your higher load (be it CPU, or disk). It's only ever happened once. (But it blocked any and all rsyncs... Common io worked) This kinds of loads happens now and then, first it's slashdot finding the icculus.org mirror of postal2 (long time ago, i know) and then it's someone releasing a new mix... I always find out by noticing the amount of webconnections increase. > You could try and boot with the noop IO scheduler and see if it still > oopses. Not sure would else to suggest, your box will likely pass > memtest just fine. It's currently running with cfq since ~2 days without a problem. I really can't take it down and do a memtest on it, it's my mailserver, webserver, firewall etc etc =) Just let me know what kind of information you might want and i'll put it all up... =) -- Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 198 bytes --] ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] AS io-scheduler. 2007-07-16 20:14 ` Ian Kumlien @ 2007-07-17 6:23 ` Jens Axboe 0 siblings, 0 replies; 11+ messages in thread From: Jens Axboe @ 2007-07-17 6:23 UTC (permalink / raw) To: Ian Kumlien; +Cc: Chuck Ebbert, Linux-kernel, Nick Piggin On Mon, Jul 16 2007, Ian Kumlien wrote: > On mån, 2007-07-16 at 21:56 +0200, Jens Axboe wrote: > > On Mon, Jul 16 2007, Ian Kumlien wrote: > > > On mån, 2007-07-16 at 19:29 +0200, Jens Axboe wrote: > > > > On Mon, Jul 16 2007, Chuck Ebbert wrote: > > > > > On 07/15/2007 11:20 AM, Ian Kumlien wrote: > > > > > > I had emerge --sync failing several times... > > > > > > > > > > > > So i checked dmesg and found some info, attached further down. > > > > > > This is a old VIA C3 machine with one disk, it's been running most > > > > > > kernels in the 2.6.x series with no problems until now. > > > > > > > > > > > > PS. Don't forget to CC me > > > > > > DS. > > > > > > > > > > > > BUG: unable to handle kernel paging request at virtual address ea86ac54 > > > > > > printing eip: > > > > > > c022dfec > > > > > > *pde = 00000000 > > > > > > Oops: 0000 [#1] > > > > > > Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge > > > > > > CPU: 0 > > > > > > EIP: 0060:[<c022dfec>] Not tainted VLI > > > > > > EFLAGS: 00010082 (2.6.22.1 #26) > > > > > > EIP is at as_can_break_anticipation+0xc/0x190 > > > > > > eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 > > > > > > esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 > > > > > > ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 > > > > > > Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) > > > > > > Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 > > > > > > 00000000 > > > > > > dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000 > > > > > > dfcffb9c > > > > > > 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30 > > > > > > c04d1ec0 > > > > > > Call Trace: > > > > > > [<c022efc8>] as_add_request+0xa8/0xc0 > > > > > > [<c0227a76>] elv_insert+0xa6/0x150 > > > > > > [<c016e96e>] bio_phys_segments+0xe/0x20 > > > > > > [<c022af64>] __make_request+0x384/0x490 > > > > > > [<c02add1e>] ide_do_request+0x6ee/0x890 > > > > > > [<c02294ab>] generic_make_request+0x18b/0x1c0 > > > > > > [<c022b596>] submit_bio+0xa6/0xb0 > > > > > > [<c013b7b8>] mempool_alloc+0x28/0xa0 > > > > > > [<c016bb66>] __find_get_block+0xf6/0x130 > > > > > > [<c016e0bc>] bio_alloc_bioset+0x8c/0xf0 > > > > > > [<c016b647>] submit_bh+0xb7/0xe0 > > > > > > [<c016c1f8>] ll_rw_block+0x78/0x90 > > > > > > [<c019c85d>] search_by_key+0xdd/0xd20 > > > > > > [<c016c201>] ll_rw_block+0x81/0x90 > > > > > > [<c011f190>] irq_exit+0x40/0x60 > > > > > > [<c01066e4>] do_IRQ+0x94/0xb0 > > > > > > [<c0104bc3>] common_interrupt+0x23/0x30 > > > > > > [<c018beca>] reiserfs_read_locked_inode+0x6a/0x490 > > > > > > [<c018e580>] reiserfs_find_actor+0x0/0x20 > > > > > > [<c018c33b>] reiserfs_iget+0x4b/0x80 > > > > > > [<c018e570>] reiserfs_init_locked_inode+0x0/0x10 > > > > > > [<c0189824>] reiserfs_lookup+0xa4/0xf0 > > > > > > [<c0157b03>] do_lookup+0xa3/0x140 > > > > > > [<c0159265>] __link_path_walk+0x615/0xa20 > > > > > > [<c0168a18>] __mark_inode_dirty+0x28/0x150 > > > > > > [<c01631c1>] mntput_no_expire+0x11/0x50 > > > > > > [<c01596b2>] link_path_walk+0x42/0xb0 > > > > > > [<c0159960>] do_path_lookup+0x130/0x150 > > > > > > [<c015a190>] __user_walk_fd+0x30/0x50 > > > > > > [<c0154766>] vfs_lstat_fd+0x16/0x40 > > > > > > [<c01547df>] sys_lstat64+0xf/0x30 > > > > > > [<c0103c42>] syscall_call+0x7/0xb > > > > > > ======================= > > > > > > > > > > static int as_can_break_anticipation(struct as_data *ad, struct request *rq) > > > > > { > > > > > struct io_context *ioc; > > > > > struct as_io_context *aic; > > > > > > > > > > ioc = ad->io_context; <======== ad is bogus > > > > > BUG_ON(!ioc); > > > > > > > > > > > > > > > Call chain is: > > > > > > > > > > as_add_request > > > > > as_update_rq: > > > > > if (ad->antic_status == ANTIC_WAIT_REQ > > > > > || ad->antic_status == ANTIC_WAIT_NEXT) { > > > > > if (as_can_break_anticipation(ad, rq)) > > > > > as_antic_stop(ad); > > > > > } > > > > > > > > > > > > > > > So somehow 'ad' became invalid between the time ad->antic_status was > > > > > checked and as_can_break_anticipation() tried to access ad->io_context? > > > > > > > > That's impossible, ad is persistent unless the io scheduler is attempted > > > > removed. Did you fiddle with switching io schedulers while this > > > > happened? If not, then something corrupted your memory. And I'm not > > > > aware of any io scheduler switching bugs, so the oops would still be > > > > highly suspect if so. > > > > > > I wasn't fiddling with the scheduler, it's quite happily been running AS > > > for quite some time. > > > > OK, that rules that out then. Then your oops looks very much like > > hardware trouble. Perhaps a border liner PSU? Just an idea. > > It uses a laptop psu, that doesn't need cooling, this is a microitx > board =) Yeah I know, I've had the same setup for a "server" at some point in the past. It wasn't very stable for me under load, but that doesn't mean it's a general problem of course :-) > > You could try and boot with the noop IO scheduler and see if it still > > oopses. Not sure would else to suggest, your box will likely pass > > memtest just fine. > > It's currently running with cfq since ~2 days without a problem. > > I really can't take it down and do a memtest on it, it's my mailserver, > webserver, firewall etc etc =) And you shouldn't, as I wrote I don't think that memtest would uncover anything. > Just let me know what kind of information you might want and i'll put it > all up... =) Lets see if it remains stable with CFQ, I have no further ideas right now. The oops is impossible. -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] AS io-scheduler. 2007-07-16 16:31 ` Chuck Ebbert 2007-07-16 17:29 ` Jens Axboe @ 2007-07-17 1:44 ` Rene Herman 2007-07-17 6:24 ` Jens Axboe 1 sibling, 1 reply; 11+ messages in thread From: Rene Herman @ 2007-07-17 1:44 UTC (permalink / raw) To: Chuck Ebbert; +Cc: pomac, Linux-kernel, Jens Axboe, Nick Piggin On 07/16/2007 06:31 PM, Chuck Ebbert wrote: >> BUG: unable to handle kernel paging request at virtual address ea86ac54 >> printing eip: >> c022dfec >> *pde = 00000000 >> Oops: 0000 [#1] >> Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge >> CPU: 0 >> EIP: 0060:[<c022dfec>] Not tainted VLI >> EFLAGS: 00010082 (2.6.22.1 #26) >> EIP is at as_can_break_anticipation+0xc/0x190 >> eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 >> esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 >> ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 >> Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) >> Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 >> 00000000 >> dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000 >> dfcffb9c >> 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30 >> c04d1ec0 >> Call Trace: >> [<c022efc8>] as_add_request+0xa8/0xc0 >> [<c0227a76>] elv_insert+0xa6/0x150 >> [<c016e96e>] bio_phys_segments+0xe/0x20 >> [<c022af64>] __make_request+0x384/0x490 >> [<c02add1e>] ide_do_request+0x6ee/0x890 >> [<c02294ab>] generic_make_request+0x18b/0x1c0 >> [<c022b596>] submit_bio+0xa6/0xb0 >> [<c013b7b8>] mempool_alloc+0x28/0xa0 >> [<c016bb66>] __find_get_block+0xf6/0x130 >> [<c016e0bc>] bio_alloc_bioset+0x8c/0xf0 >> [<c016b647>] submit_bh+0xb7/0xe0 >> [<c016c1f8>] ll_rw_block+0x78/0x90 >> [<c019c85d>] search_by_key+0xdd/0xd20 >> [<c016c201>] ll_rw_block+0x81/0x90 >> [<c011f190>] irq_exit+0x40/0x60 >> [<c01066e4>] do_IRQ+0x94/0xb0 >> [<c0104bc3>] common_interrupt+0x23/0x30 >> [<c018beca>] reiserfs_read_locked_inode+0x6a/0x490 >> [<c018e580>] reiserfs_find_actor+0x0/0x20 >> [<c018c33b>] reiserfs_iget+0x4b/0x80 >> [<c018e570>] reiserfs_init_locked_inode+0x0/0x10 >> [<c0189824>] reiserfs_lookup+0xa4/0xf0 >> [<c0157b03>] do_lookup+0xa3/0x140 >> [<c0159265>] __link_path_walk+0x615/0xa20 >> [<c0168a18>] __mark_inode_dirty+0x28/0x150 >> [<c01631c1>] mntput_no_expire+0x11/0x50 >> [<c01596b2>] link_path_walk+0x42/0xb0 >> [<c0159960>] do_path_lookup+0x130/0x150 >> [<c015a190>] __user_walk_fd+0x30/0x50 >> [<c0154766>] vfs_lstat_fd+0x16/0x40 >> [<c01547df>] sys_lstat64+0xf/0x30 >> [<c0103c42>] syscall_call+0x7/0xb >> ======================= > > static int as_can_break_anticipation(struct as_data *ad, struct request *rq) > { > struct io_context *ioc; > struct as_io_context *aic; > > ioc = ad->io_context; <======== ad is bogus > BUG_ON(!ioc); > > > Call chain is: > > as_add_request > as_update_rq: > if (ad->antic_status == ANTIC_WAIT_REQ > || ad->antic_status == ANTIC_WAIT_NEXT) { > if (as_can_break_anticipation(ad, rq)) > as_antic_stop(ad); > } > > > So somehow 'ad' became invalid between the time ad->antic_status was > checked and as_can_break_anticipation() tried to access ad->io_context? Is this similar to: http://lkml.org/lkml/2007/6/4/50 ? Rene. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [BUG] AS io-scheduler. 2007-07-17 1:44 ` Rene Herman @ 2007-07-17 6:24 ` Jens Axboe 0 siblings, 0 replies; 11+ messages in thread From: Jens Axboe @ 2007-07-17 6:24 UTC (permalink / raw) To: Rene Herman; +Cc: Chuck Ebbert, pomac, Linux-kernel, Nick Piggin On Tue, Jul 17 2007, Rene Herman wrote: > On 07/16/2007 06:31 PM, Chuck Ebbert wrote: > >>> BUG: unable to handle kernel paging request at virtual address ea86ac54 >>> printing eip: >>> c022dfec >>> *pde = 00000000 >>> Oops: 0000 [#1] >>> Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge >>> CPU: 0 >>> EIP: 0060:[<c022dfec>] Not tainted VLI >>> EFLAGS: 00010082 (2.6.22.1 #26) >>> EIP is at as_can_break_anticipation+0xc/0x190 >>> eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 >>> esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 >>> ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 >>> Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) >>> Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 >>> 00000000 dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 >>> 00000000 >>> dfcffb9c 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 >>> 08ff6b30 >>> c04d1ec0 Call Trace: >>> [<c022efc8>] as_add_request+0xa8/0xc0 >>> [<c0227a76>] elv_insert+0xa6/0x150 >>> [<c016e96e>] bio_phys_segments+0xe/0x20 >>> [<c022af64>] __make_request+0x384/0x490 >>> [<c02add1e>] ide_do_request+0x6ee/0x890 >>> [<c02294ab>] generic_make_request+0x18b/0x1c0 >>> [<c022b596>] submit_bio+0xa6/0xb0 >>> [<c013b7b8>] mempool_alloc+0x28/0xa0 >>> [<c016bb66>] __find_get_block+0xf6/0x130 >>> [<c016e0bc>] bio_alloc_bioset+0x8c/0xf0 >>> [<c016b647>] submit_bh+0xb7/0xe0 >>> [<c016c1f8>] ll_rw_block+0x78/0x90 >>> [<c019c85d>] search_by_key+0xdd/0xd20 >>> [<c016c201>] ll_rw_block+0x81/0x90 >>> [<c011f190>] irq_exit+0x40/0x60 >>> [<c01066e4>] do_IRQ+0x94/0xb0 >>> [<c0104bc3>] common_interrupt+0x23/0x30 >>> [<c018beca>] reiserfs_read_locked_inode+0x6a/0x490 >>> [<c018e580>] reiserfs_find_actor+0x0/0x20 >>> [<c018c33b>] reiserfs_iget+0x4b/0x80 >>> [<c018e570>] reiserfs_init_locked_inode+0x0/0x10 >>> [<c0189824>] reiserfs_lookup+0xa4/0xf0 >>> [<c0157b03>] do_lookup+0xa3/0x140 >>> [<c0159265>] __link_path_walk+0x615/0xa20 >>> [<c0168a18>] __mark_inode_dirty+0x28/0x150 >>> [<c01631c1>] mntput_no_expire+0x11/0x50 >>> [<c01596b2>] link_path_walk+0x42/0xb0 >>> [<c0159960>] do_path_lookup+0x130/0x150 >>> [<c015a190>] __user_walk_fd+0x30/0x50 >>> [<c0154766>] vfs_lstat_fd+0x16/0x40 >>> [<c01547df>] sys_lstat64+0xf/0x30 >>> [<c0103c42>] syscall_call+0x7/0xb >>> ======================= >> static int as_can_break_anticipation(struct as_data *ad, struct request >> *rq) >> { >> struct io_context *ioc; >> struct as_io_context *aic; >> ioc = ad->io_context; <======== ad is bogus >> BUG_ON(!ioc); >> Call chain is: >> as_add_request >> as_update_rq: >> if (ad->antic_status == ANTIC_WAIT_REQ >> || ad->antic_status == ANTIC_WAIT_NEXT) { >> if (as_can_break_anticipation(ad, rq)) >> as_antic_stop(ad); >> } >> So somehow 'ad' became invalid between the time ad->antic_status was >> checked and as_can_break_anticipation() tried to access ad->io_context? > > Is this similar to: > > http://lkml.org/lkml/2007/6/4/50 Nope -- Jens Axboe ^ permalink raw reply [flat|nested] 11+ messages in thread
* [BUG] AS io-scheduler.
@ 2007-07-14 15:39 Ian Kumlien
2007-07-17 6:54 ` Neil Brown
0 siblings, 1 reply; 11+ messages in thread
From: Ian Kumlien @ 2007-07-14 15:39 UTC (permalink / raw)
To: Linux-kernel
[-- Attachment #1: Type: text/plain, Size: 4439 bytes --]
Hi,
I had emerge --sync failing several times...
So i checked dmesg and found some info, attached further down.
This is a old VIA C3 machine with one disk, it's been running most
kernels in the 2.6.x series with no problems until now.
PS. Don't forget to CC me
DS.
BUG: unable to handle kernel paging request at virtual address ea86ac54
printing eip:
c022dfec
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge
CPU: 0
EIP: 0060:[<c022dfec>] Not tainted VLI
EFLAGS: 00010082 (2.6.22.1 #26)
EIP is at as_can_break_anticipation+0xc/0x190
eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844
esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70
ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068
Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000)
Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844
00000000
dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000
dfcffb9c
00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30
c04d1ec0
Call Trace:
[<c022efc8>] as_add_request+0xa8/0xc0
[<c0227a76>] elv_insert+0xa6/0x150
[<c016e96e>] bio_phys_segments+0xe/0x20
[<c022af64>] __make_request+0x384/0x490
[<c02add1e>] ide_do_request+0x6ee/0x890
[<c02294ab>] generic_make_request+0x18b/0x1c0
[<c022b596>] submit_bio+0xa6/0xb0
[<c013b7b8>] mempool_alloc+0x28/0xa0
[<c016bb66>] __find_get_block+0xf6/0x130
[<c016e0bc>] bio_alloc_bioset+0x8c/0xf0
[<c016b647>] submit_bh+0xb7/0xe0
[<c016c1f8>] ll_rw_block+0x78/0x90
[<c019c85d>] search_by_key+0xdd/0xd20
[<c016c201>] ll_rw_block+0x81/0x90
[<c011f190>] irq_exit+0x40/0x60
[<c01066e4>] do_IRQ+0x94/0xb0
[<c0104bc3>] common_interrupt+0x23/0x30
[<c018beca>] reiserfs_read_locked_inode+0x6a/0x490
[<c018e580>] reiserfs_find_actor+0x0/0x20
[<c018c33b>] reiserfs_iget+0x4b/0x80
[<c018e570>] reiserfs_init_locked_inode+0x0/0x10
[<c0189824>] reiserfs_lookup+0xa4/0xf0
[<c0157b03>] do_lookup+0xa3/0x140
[<c0159265>] __link_path_walk+0x615/0xa20
[<c0168a18>] __mark_inode_dirty+0x28/0x150
[<c01631c1>] mntput_no_expire+0x11/0x50
[<c01596b2>] link_path_walk+0x42/0xb0
[<c0159960>] do_path_lookup+0x130/0x150
[<c015a190>] __user_walk_fd+0x30/0x50
[<c0154766>] vfs_lstat_fd+0x16/0x40
[<c01547df>] sys_lstat64+0xf/0x30
[<c0103c42>] syscall_call+0x7/0xb
=======================
Code: c0 8b 44 cb 0c 8b 40 08 29 d0 f7 d0 c1 e8 1f 83 f0 01 eb 02 31 c0
5b c3 8d b4 26 00 00 00 00 55 57 56 53 83 ec 04 89 c3 89 14 24 <8b> 90
b4 00 00 00 85 d2 75 04 0f 0b eb fe 83 3c 24 00 74 0c 8b
EIP: [<c022dfec>] as_can_break_anticipation+0xc/0x190 SS:ESP
0068:ceff6a70
WARNING: at block/as-iosched.c:862 as_remove_queued_request()
[<c022e560>] as_remove_queued_request+0x40/0xc0
[<c022e684>] as_move_to_dispatch+0xa4/0x110
[<c0236626>] __delay+0x6/0x10
[<c022ee03>] as_dispatch_request+0x2d3/0x310
[<c02b39b2>] ide_dma_start+0x22/0x30
[<c02b6be0>] ide_do_rw_disk+0x310/0x3f0
[<c02279b5>] elv_next_request+0x105/0x120
[<c02ad876>] ide_do_request+0x246/0x890
[<c03d5c9d>] schedule+0x45d/0x4c0
[<c022e7f0>] as_work_handler+0x0/0x20
[<c022a331>] blk_start_queueing+0x11/0x20
[<c022e7ff>] as_work_handler+0xf/0x20
[<c0126bab>] run_workqueue+0x6b/0xe0
[<c0127200>] worker_thread+0x0/0xc0
[<c01272b2>] worker_thread+0xb2/0xc0
[<c01297e0>] autoremove_wake_function+0x0/0x40
[<c0129658>] kthread+0x38/0x60
[<c0129620>] kthread+0x0/0x60
[<c0104d87>] kernel_thread_helper+0x7/0x10
=======================
WARNING: at block/as-iosched.c:966 as_move_to_dispatch()
[<c022e6b3>] as_move_to_dispatch+0xd3/0x110
[<c022ee03>] as_dispatch_request+0x2d3/0x310
[<c02b39b2>] ide_dma_start+0x22/0x30
[<c02b6be0>] ide_do_rw_disk+0x310/0x3f0
[<c02279b5>] elv_next_request+0x105/0x120
[<c02ad876>] ide_do_request+0x246/0x890
[<c03d5c9d>] schedule+0x45d/0x4c0
[<c022e7f0>] as_work_handler+0x0/0x20
[<c022a331>] blk_start_queueing+0x11/0x20
[<c022e7ff>] as_work_handler+0xf/0x20
[<c0126bab>] run_workqueue+0x6b/0xe0
[<c0127200>] worker_thread+0x0/0xc0
[<c01272b2>] worker_thread+0xb2/0xc0
[<c01297e0>] autoremove_wake_function+0x0/0x40
[<c0129658>] kthread+0x38/0x60
[<c0129620>] kthread+0x0/0x60
[<c0104d87>] kernel_thread_helper+0x7/0x10
=======================
--
Ian Kumlien <pomac () vapor ! com> -- http://pomac.netswarm.net
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread* Re: [BUG] AS io-scheduler. 2007-07-14 15:39 Ian Kumlien @ 2007-07-17 6:54 ` Neil Brown 0 siblings, 0 replies; 11+ messages in thread From: Neil Brown @ 2007-07-17 6:54 UTC (permalink / raw) To: pomac; +Cc: Linux-kernel On Saturday July 14, pomac@vapor.com wrote: > Hi, > > I had emerge --sync failing several times... > > So i checked dmesg and found some info, attached further down. > This is a old VIA C3 machine with one disk, it's been running most > kernels in the 2.6.x series with no problems until now. > > PS. Don't forget to CC me > DS. > > BUG: unable to handle kernel paging request at virtual address ea86ac54 ^^^^^^^^ > printing eip: > c022dfec > *pde = 00000000 > Oops: 0000 [#1] > Modules linked in: eeprom i2c_viapro vt8231 i2c_isa skge > CPU: 0 > EIP: 0060:[<c022dfec>] Not tainted VLI > EFLAGS: 00010082 (2.6.22.1 #26) > EIP is at as_can_break_anticipation+0xc/0x190 > eax: dfcdaba0 ebx: dfcdaba0 ecx: 0035ff95 edx: cb296844 ^^^^^^^^ > esi: cb296844 edi: dfcdaba0 ebp: 00000000 esp: ceff6a70 > ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068 > Process rsync (pid: 1667, ti=ceff6000 task=d59cf5b0 task.ti=ceff6000) > Stack: cb296844 00000001 cb296844 dfcdaba0 00000000 c022efc8 cb296844 > 00000000 > dfcffb9c c0227a76 dfcffb9c 00000000 c016e96e cb296844 00000000 > dfcffb9c > 00000000 c022af64 00000000 dfcffb9c 00000008 00000000 08ff6b30 > c04d1ec0 ... > Code: c0 8b 44 cb 0c 8b 40 08 29 d0 f7 d0 c1 e8 1f 83 f0 01 eb 02 31 c0 > 5b c3 8d b4 26 00 00 00 00 55 57 56 53 83 ec 04 89 c3 89 14 24 <8b> 90 > b4 00 00 00 85 d2 75 04 0f 0b eb fe 83 3c 24 00 74 0c 8b Plug that 'Code:' line into ksymoops and you get Code; fffffffb <__kernel_rt_sigreturn+1bbb/????> 26: 89 c3 mov %eax,%ebx Code; fffffffd <__kernel_rt_sigreturn+1bbd/????> 28: 89 14 24 mov %edx,(%esp) Code; 00000000 Before first symbol 2b: 8b 90 b4 00 00 00 mov 0xb4(%eax),%edx Code; 00000006 Before first symbol 31: 85 d2 test %edx,%edx Code; 00000008 Before first symbol 33: 75 04 jne 39 <_EIP+0x39> (etc.). The "Code; 00000000..." is the current EIP. So the operation it was performing was mov 0xb4(%eax),%edx Now I am no Pentium, but when I add 0xb4 to dfcdaba0 I don't get ea86ac54. My result is more like dfcdac54. So it looks like your Pentium (or whatever) got it half-right. i.e. the bottom 16 bits are right, but the top are wrong by 0ab9. I'd be taking a very serious look at your hardware. This is not a software problem. NeilBrown ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2007-07-17 6:54 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-07-15 15:20 [BUG] AS io-scheduler Ian Kumlien 2007-07-16 16:31 ` Chuck Ebbert 2007-07-16 17:29 ` Jens Axboe 2007-07-16 19:49 ` Ian Kumlien 2007-07-16 19:56 ` Jens Axboe 2007-07-16 20:14 ` Ian Kumlien 2007-07-17 6:23 ` Jens Axboe 2007-07-17 1:44 ` Rene Herman 2007-07-17 6:24 ` Jens Axboe -- strict thread matches above, loose matches on Subject: below -- 2007-07-14 15:39 Ian Kumlien 2007-07-17 6:54 ` Neil Brown
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox