* 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
@ 2001-12-21 14:11 rwhron
From: rwhron @ 2001-12-21 14:11 UTC
To: linux-kernel; +Cc: axboe
While running "dbench 32" on 2.5.2-pre1:
I noticed the test was taking much longer than usual,
and I could not do a new "login".
vmstat 8 looked like this:
r b w swpd free buff cache si so bi bo in cs us sy id
0 34 1 0 222504 12248 736088 0 0 0 0 103 59 0 0 100
1 34 1 0 222504 12248 736088 0 0 0 0 100 56 0 0 100
0 34 1 0 222504 12248 736088 0 0 0 0 103 59 0 0 100
<sysrq Sync Umount> did not print their "done" messages.
The "b" and "w" columns went up though:
r b w swpd free buff cache si so bi bo in cs us sy id
0 37 3 0 222456 12280 736092 0 0 0 0 222 269 0 0 100
There was no Oops.
2.5.1-dj3 completed dbench normally.
Configs between the 2 kernels:
diff 2.5.2-pre1 2.5.1-dj3
> CONFIG_IP_NF_QUEUE=m
2.5.1-pre1[01] and 2.5.1-final did not exhibit this behavior.
Hardware:
1333 Athlon
1GB RAM
CONFIG_HIGHMEM4G=y
CONFIG_HIGHMEM=y
--
Randy Hron
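
An aside on reading the vmstat samples in the report above: the hang shows up as a non-zero "b" column while bi and bo stay at 0 and id sits at 100. The following is a minimal Python sketch of that check, not a tool used in the thread. The names parse_vmstat and looks_stalled are made up for illustration, and the 16-column layout is assumed to match the header shown in the report.

```python
def parse_vmstat(text):
    """Parse vmstat output into a list of dicts keyed by column name."""
    rows = []
    header = None
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column header line: "r b w swpd free ..."
            header = fields
            continue
        # Keep only purely numeric rows that match the header width.
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def looks_stalled(row):
    """Assumed heuristic: tasks are blocked, no blocks move, CPU is fully idle."""
    return row["b"] > 0 and row["bi"] == 0 and row["bo"] == 0 and row["id"] == 100


# Samples copied from the report above.
sample = """
r b w swpd free buff cache si so bi bo in cs us sy id
0 34 1 0 222504 12248 736088 0 0 0 0 103 59 0 0 100
1 34 1 0 222504 12248 736088 0 0 0 0 100 56 0 0 100
0 37 3 0 222456 12280 736092 0 0 0 0 222 269 0 0 100
"""

rows = parse_vmstat(sample)
stalled = [looks_stalled(r) for r in rows]
print(stalled)
```

A row with blocked tasks but non-zero bo (disk still writing) would not be flagged, which is the distinction between a busy disk and the stall described here.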
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: Jens Axboe @ 2001-12-21 14:46 UTC
To: rwhron; +Cc: linux-kernel
On Fri, Dec 21 2001, rwhron@earthlink.net wrote:
> While running "dbench 32" on 2.5.2-pre1:
>
> I noticed the test was taking much longer than usual,
> and I could not do a new "login".
>
> vmstat 8 looked like this:

You neglected to mention what disk I/O system you are using? IDE or
SCSI, and if the latter what host adapter?

--
Jens Axboe
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: rwhron @ 2001-12-21 16:43 UTC
To: Jens Axboe; +Cc: rwhron, linux-kernel
On Fri, Dec 21, 2001 at 03:46:54PM +0100, Jens Axboe wrote:
> You neglected to mention what disk I/O system you are using? IDE or
> SCSI, and if the latter what host adapter?
>
> --
> Jens Axboe

Sorry about that. It's an IDE drive.

00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
00:0f.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 04)

CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=m
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_AUTO=y
CONFIG_BLK_DEV_IDE_MODES=y

--
Randy Hron
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: Jens Axboe @ 2001-12-21 17:01 UTC
To: rwhron; +Cc: Jens Axboe, linux-kernel
On Fri, Dec 21 2001, rwhron@earthlink.net wrote:
> On Fri, Dec 21, 2001 at 03:46:54PM +0100, Jens Axboe wrote:
> > You neglected to mention what disk I/O system you are using? IDE or
> > SCSI, and if the latter what host adapter?
> >
> > --
> > Jens Axboe
>
> Sorry about that. It's an IDE drive.
>
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
> 00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
> 00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
> 00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
> 00:0f.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
> 01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 04)
>
> CONFIG_IDE=y
> CONFIG_BLK_DEV_IDE=y
> CONFIG_BLK_DEV_IDEDISK=y
> CONFIG_IDEDISK_MULTI_MODE=y
> CONFIG_BLK_DEV_IDECD=m
> CONFIG_BLK_DEV_IDEPCI=y
> CONFIG_BLK_DEV_IDEDMA_PCI=y
> CONFIG_IDEDMA_PCI_AUTO=y
> CONFIG_BLK_DEV_IDEDMA=y
> CONFIG_IDEDMA_AUTO=y
> CONFIG_BLK_DEV_IDE_MODES=y

Thanks -- could you also try and do sysrq-t back traces when it seems
stuck?

Does a non-highmem kernel run ok?

--
Jens Axboe
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: rwhron @ 2001-12-21 18:47 UTC
To: Jens Axboe; +Cc: rwhron, linux-kernel
On Fri, Dec 21, 2001 at 06:01:56PM +0100, Jens Axboe wrote:
> Thanks -- could you also try and do sysrq-t back traces when it seems
> stuck?
>
> Does a non-highmem kernel run ok?
>
> --
> Jens Axboe

I recompiled with highmem turned off.
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set

I run a scripty that executes dbench 32, then dbench 128.
dbench 32 completed this time. dbench 128 hung similar to
dbench 32 in the previous message. I don't have the vmstat
output captured, but "b" was 128, bi and bo were 0, and idle was 100.

I couldn't save a stack trace because /bin/ed would not open a file.
I.E: ed output - no prompt about file does not exist. "w" would not
save, etc. The vmstat "b" column went up by 2 after I started
ed and tried another console login.

--

Before running dbench, I normally create a small loopback reiserfs
filesystem. This worked okay the first time I did it (with highmem).
After recompiling without highmem, I ran my "build_rootfs" script to
create a small uml root fs, and got an Oops. The same script was fine
on 2.5.1-pre[5-9] and 2.5.1-pre1[01]. (you fixed something like this
in the patches between 2.5.1-pre3 and pre4.)

I rebooted after each Oops, so the dbench's above were run after
a fresh boot.

invalid operand: 0000
CPU:    0
EIP:    0010:[<c012fbf0>]    Not tainted
EFLAGS: 00010287
eax: 00000070   ebx: 00000700   ecx: c02a45dc   edx: 00038001
esi: 00000000   edi: 00000000   ebp: f4a5a000   esp: f4a8fe38
ds: 0018   es: 0018   ss: 0018
Process mkreiserfs (pid: 135, stackpage=f4a8f000)
Stack: 00000700 00000000 00000000 f4a5a000 c023896c 00000246 f7ef1740 00000000
       00000000 fac4a887 00038001 00000070 f4a8fe98 00000700 00000000 c02a45dc
       f7ef1740 00000000 00000001 00000030 00000000 00000000 c018a4a0 c02a45dc
Call Trace: [<fac4a887>] [<c018a4a0>] [<c018a54c>] [<c018a5f6>] [<c01340f0>] [<c012c923>]
   [<c0136aff>] [<c0136a60>] [<c0126ab5>] [<c0126ee5>] [<c0126e00>] [<c0131ae6>] [<c01086eb>]
Code: 0f 0b 8b 35 04 59 29 c0 c7 44 24 18 70 00 00 00 89 74 24 14

>>EIP; c012fbf0 <create_bounce+40/250>   <=====
Trace; fac4a886 <END_OF_CODE+207b8/????>
Trace; c018a4a0 <generic_make_request+170/190>
Trace; c018a54c <submit_bio+4c/60>
Trace; c018a5f6 <submit_bh+96/a0>
Trace; c01340f0 <block_read_full_page+1a0/1c0>
Trace; c012c922 <__alloc_pages+32/170>
Trace; c0136afe <blkdev_readpage+e/20>
Trace; c0136a60 <blkdev_get_block+0/40>
Trace; c0126ab4 <do_generic_file_read+274/3f0>
Trace; c0126ee4 <generic_file_read+84/140>
Trace; c0126e00 <file_read_actor+0/60>
Trace; c0131ae6 <sys_read+96/d0>
Trace; c01086ea <system_call+32/38>
Code;  c012fbf0 <create_bounce+40/250>
00000000 <_EIP>:
Code;  c012fbf0 <create_bounce+40/250>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c012fbf2 <create_bounce+42/250>
   2:   8b 35 04 59 29 c0         mov    0xc0295904,%esi
Code;  c012fbf8 <create_bounce+48/250>
   8:   c7 44 24 18 70 00 00      movl   $0x70,0x18(%esp,1)
Code;  c012fbfe <create_bounce+4e/250>
   f:   00
Code;  c012fc00 <create_bounce+50/250>
  10:   89 74 24 14               mov    %esi,0x14(%esp,1)

I rebooted, and tried to create the loopback reiserfs again and got:

invalid operand: 0000
CPU:    0
EIP:    0010:[<c012fbf0>]    Not tainted
EFLAGS: 00010287
eax: 00000070   ebx: 00000700   ecx: c02a45dc   edx: 00038001
esi: 00000000   edi: 00000000   ebp: f4d0e000   esp: f4c31e38
ds: 0018   es: 0018   ss: 0018
Process mkreiserfs (pid: 118, stackpage=f4c31000)
Stack: 00000700 00000000 00000000 f4d0e000 f4c4c2c0 00000246 f7ef1900 00000000
       00000000 fac28887 00038001 00000070 f4c31e98 00000700 00000000 c02a45dc
       f7ef1900 00000000 00000001 00000030 00000000 00000000 c018a4a0 c02a45dc
Call Trace: [<fac28887>] [<c018a4a0>] [<c018a54c>] [<c018a5f6>] [<c01340f0>] [<c012c923>]
   [<c0136aff>] [<c0136a60>] [<c0126ab5>] [<c0126ee5>] [<c0126e00>] [<c0131ae6>] [<c01086eb>]
Code: 0f 0b 8b 35 04 59 29 c0 c7 44 24 18 70 00 00 00 89 74 24 14

>>EIP; c012fbf0 <create_bounce+40/250>   <=====
Trace; fac28886 <[loop]loop_make_request+96/200>
Trace; c018a4a0 <generic_make_request+170/190>
Trace; c018a54c <submit_bio+4c/60>
Trace; c018a5f6 <submit_bh+96/a0>
Trace; c01340f0 <block_read_full_page+1a0/1c0>
Trace; c012c922 <__alloc_pages+32/170>
Trace; c0136afe <blkdev_readpage+e/20>
Trace; c0136a60 <blkdev_get_block+0/40>
Trace; c0126ab4 <do_generic_file_read+274/3f0>
Trace; c0126ee4 <generic_file_read+84/140>
Trace; c0126e00 <file_read_actor+0/60>
Trace; c0131ae6 <sys_read+96/d0>
Trace; c01086ea <system_call+32/38>
Code;  c012fbf0 <create_bounce+40/250>
00000000 <_EIP>:
Code;  c012fbf0 <create_bounce+40/250>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c012fbf2 <create_bounce+42/250>
   2:   8b 35 04 59 29 c0         mov    0xc0295904,%esi
Code;  c012fbf8 <create_bounce+48/250>
   8:   c7 44 24 18 70 00 00      movl   $0x70,0x18(%esp,1)
Code;  c012fbfe <create_bounce+4e/250>
   f:   00
Code;  c012fc00 <create_bounce+50/250>
  10:   89 74 24 14               mov    %esi,0x14(%esp,1)

--
Randy Hron
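
For readers skimming the decoded Oops in the message above, the call path can be pulled out of the ">>EIP;" and "Trace;" lines mechanically. This is a rough Python sketch under the assumption that each such line has the form "Trace; ADDRESS <symbol+offset/size>" as in the output above. The name call_chain and the regular expression are illustrative and not part of ksymoops.

```python
import re

# Matches ">>EIP; c012fbf0 <create_bounce+40/250>" and
# "Trace; fac28886 <[loop]loop_make_request+96/200>".
TRACE_RE = re.compile(r"^(?:>>EIP|Trace);\s+([0-9a-f]+)\s+<([^>+]+)\+([0-9a-f]+)/")


def call_chain(decoded):
    """Return symbol names, innermost first, from decoded Oops text."""
    chain = []
    for line in decoded.splitlines():
        m = TRACE_RE.match(line.strip())
        if m:
            chain.append(m.group(2))
    return chain


# A few lines copied from the second Oops in the message above.
sample = """
>>EIP; c012fbf0 <create_bounce+40/250>   <=====
Trace; fac28886 <[loop]loop_make_request+96/200>
Trace; c018a4a0 <generic_make_request+170/190>
Trace; c018a54c <submit_bio+4c/60>
Code;  c012fbf0 <create_bounce+40/250>
"""

chain = call_chain(sample)
print(" <- ".join(chain))
```

Read this way, both Oopses show the same path: a read through submit_bio and generic_make_request reaches the loop driver's make_request hook, which ends in create_bounce.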
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: Jens Axboe @ 2001-12-21 22:19 UTC
To: rwhron; +Cc: linux-kernel
On Fri, Dec 21 2001, rwhron@earthlink.net wrote:
> On Fri, Dec 21, 2001 at 06:01:56PM +0100, Jens Axboe wrote:
> > Thanks -- could you also try and do sysrq-t back traces when it seems
> > stuck?
> >
> > Does a non-highmem kernel run ok?
> >
> > --
> > Jens Axboe
>
> I recompiled with highmem turned off.
> # CONFIG_HIGHMEM4G is not set
> # CONFIG_HIGHMEM64G is not set
>
> I run a scripty that executes dbench 32, then dbench 128.

Ok, please try something for me. In drivers/block/elevator.c, comment
out this block:

	if (q->last_merge) {
		__rq = list_entry_rq(q->last_merge);
		BUG_ON(__rq->flags & REQ_STARTED);
		if ((ret = elv_try_merge(__rq, bio))) {
			*req = __rq;
			return ret;
		}
	}

(just #if 0 the entire thing) -- the one inside elevator_linus_merge()

Loop back highmem issue is different, I'll take a look at that later.
I'll be pretty unresponsive over christmas, though.

--
Jens Axboe
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: rwhron @ 2001-12-21 23:55 UTC
To: Jens Axboe; +Cc: rwhron, linux-kernel
> Ok, please try something for me. In drivers/block/elevator.c, comment
> out this block:

After commenting the block of code, make clean, etc, I rebooted and ran
the dbench 32, 128 scripty. It completed dbench 32 again, but dbench
128 hung again. I could quit some tools. df, ps, wouldn't return
and didn't listen to <ctrl c>.

> Loop back highmem issue is different, I'll take a look at that later.
> I'll be pretty unresponsive over christmas, though.
>
> Jens Axboe

Enjoy the holidays!

--
Randy Hron
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: Jens Axboe @ 2001-12-24 14:03 UTC
To: rwhron; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 574 bytes --]
On Fri, Dec 21 2001, rwhron@earthlink.net wrote:
> > Ok, please try something for me. In drivers/block/elevator.c, comment
> > out this block:
>
> After commenting the block of code, make clean, etc, I rebooted and ran
> the dbench 32, 128 scripty. It completed dbench 32 again, but dbench
> 128 hung again. I could quit some tools. df, ps, wouldn't return
> and didn't listen to <ctrl c>.

What IDE controller are you using? The two other reports so far have
been with VIA, maybe that's a clue.

Anyways, could you please reproduce with this applied?

--
Jens Axboe

[-- Attachment #2: bio-252p1-2 --]
[-- Type: text/plain, Size: 6933 bytes --]
diff -ur -X exclude /opt/kernel/linux-2.5.2-pre1/drivers/block/elevator.c linux/drivers/block/elevator.c
--- /opt/kernel/linux-2.5.2-pre1/drivers/block/elevator.c	Sun Dec 23 17:11:54 2001
+++ linux/drivers/block/elevator.c	Sun Dec 23 15:53:07 2001
@@ -124,21 +124,21 @@
 inline int elv_try_merge(struct request *__rq, struct bio *bio)
 {
 	unsigned int count = bio_sectors(bio);
-
-	if (!elv_rq_merge_ok(__rq, bio))
-		return ELEVATOR_NO_MERGE;
+	int ret = ELEVATOR_NO_MERGE;

 	/*
 	 * we can merge and sequence is ok, check if it's possible
 	 */
-	if (__rq->sector + __rq->nr_sectors == bio->bi_sector) {
-		return ELEVATOR_BACK_MERGE;
-	} else if (__rq->sector - count == bio->bi_sector) {
-		__rq->elevator_sequence -= count;
-		return ELEVATOR_FRONT_MERGE;
+	if (elv_rq_merge_ok(__rq, bio)) {
+		if (__rq->sector + __rq->nr_sectors == bio->bi_sector) {
+			ret = ELEVATOR_BACK_MERGE;
+		} else if (__rq->sector - count == bio->bi_sector) {
+			__rq->elevator_sequence -= count;
+			ret = ELEVATOR_FRONT_MERGE;
+		}
 	}

-	return ELEVATOR_NO_MERGE;
+	return ret;
 }

 int elevator_linus_merge(request_queue_t *q, struct request **req,
@@ -172,15 +172,17 @@
 		 */
 		if (__rq->elevator_sequence-- <= 0)
 			break;
+
 		if (__rq->flags & (REQ_BARRIER | REQ_STARTED))
 			break;
 		if (!(__rq->flags & REQ_CMD))
 			continue;
-		if (__rq->elevator_sequence < 0)
-			break;

 		if (!*req && bio_rq_in_between(bio, __rq, &q->queue_head))
 			*req = __rq;
+
+		if (__rq->elevator_sequence < bio_sectors(bio))
+			break;

 		if ((ret = elv_try_merge(__rq, bio))) {
 			*req = __rq;
diff -ur -X exclude /opt/kernel/linux-2.5.2-pre1/drivers/block/ll_rw_blk.c linux/drivers/block/ll_rw_blk.c
--- /opt/kernel/linux-2.5.2-pre1/drivers/block/ll_rw_blk.c	Sun Dec 23 17:11:54 2001
+++ linux/drivers/block/ll_rw_blk.c	Mon Dec 24 14:50:46 2001
@@ -155,6 +155,11 @@
 	blk_queue_max_sectors(q, MAX_SECTORS);
 	blk_queue_hardsect_size(q, 512);

+	/*
+	 * by default assume old behaviour and bounce for any highmem page
+	 */
+	blk_queue_bounce_limit(q, BLK_BOUNCE_HIGH);
+
 	init_waitqueue_head(&q->queue_wait);
 }

@@ -603,9 +608,6 @@
 		return 0;

 	/* Merge is OK... */
-	if (q->last_merge == &next->queuelist)
-		q->last_merge = NULL;
-
 	req->nr_phys_segments = total_phys_segments;
 	req->nr_hw_segments = total_hw_segments;
 	return 1;
@@ -812,12 +814,8 @@
 	q->plug_tq.data = q;
 	q->queue_flags = (1 << QUEUE_FLAG_CLUSTER);
 	q->queue_lock = lock;
+	q->last_merge = NULL;

-	/*
-	 * by default assume old behaviour and bounce for any highmem page
-	 */
-	blk_queue_bounce_limit(q, BLK_BOUNCE_HIGH);
-
 	blk_queue_segment_boundary(q, 0xffffffff);

 	blk_queue_make_request(q, __make_request);
@@ -886,6 +884,12 @@
 	if (!rq && (gfp_mask & __GFP_WAIT))
 		rq = get_request_wait(q, rw);

+	if (rq) {
+		rq->flags = 0;
+		rq->buffer = NULL;
+		rq->bio = rq->biotail = NULL;
+		rq->waiting = NULL;
+	}
 	return rq;
 }

@@ -953,10 +977,15 @@
 	/*
 	 * debug stuff...
 	 */
-	if (insert_here == &q->queue_head) {
-		struct request *__rq = __elv_next_request(q);
+	if (insert_here->next != &q->queue_head) {
+		struct request *__rq = list_entry_rq(insert_here->next);

+#if 0
 		BUG_ON(__rq && (__rq->flags & REQ_STARTED));
+#else
+		if (__rq->flags & REQ_STARTED)
+			printk("add_request: irk, next is started\n");
+#endif
 	}

 	/*
@@ -972,11 +1001,15 @@
 void blkdev_release_request(struct request *req)
 {
 	struct request_list *rl = req->rl;
+	request_queue_t *q = req->q;

 	req->rq_status = RQ_INACTIVE;
 	req->q = NULL;
 	req->rl = NULL;

+	if (q && q->last_merge == &req->queuelist)
+		q->last_merge = NULL;
+
 	/*
 	 * Request may not have originated from ll_rw_blk. if not,
 	 * it didn't come out of our reserved rq pools
@@ -1571,21 +1604,23 @@

 inline void blk_recalc_rq_sectors(struct request *rq, int nsect)
 {
-	rq->hard_sector += nsect;
-	rq->hard_nr_sectors -= nsect;
-	rq->sector = rq->hard_sector;
-	rq->nr_sectors = rq->hard_nr_sectors;
+	if (rq->flags & REQ_CMD) {
+		rq->hard_sector += nsect;
+		rq->hard_nr_sectors -= nsect;
+		rq->sector = rq->hard_sector;
+		rq->nr_sectors = rq->hard_nr_sectors;

-	rq->current_nr_sectors = bio_iovec(rq->bio)->bv_len >> 9;
-	rq->hard_cur_sectors = rq->current_nr_sectors;
+		rq->current_nr_sectors = bio_iovec(rq->bio)->bv_len >> 9;
+		rq->hard_cur_sectors = rq->current_nr_sectors;

-	/*
-	 * if total number of sectors is less than the first segment
-	 * size, something has gone terribly wrong
-	 */
-	if (rq->nr_sectors < rq->current_nr_sectors) {
-		printk("blk: request botched\n");
-		rq->nr_sectors = rq->current_nr_sectors;
+		/*
+		 * if total number of sectors is less than the first segment
+		 * size, something has gone terribly wrong
+		 */
+		if (rq->nr_sectors < rq->current_nr_sectors) {
+			printk("blk: request botched\n");
+			rq->nr_sectors = rq->current_nr_sectors;
+		}
 	}
 }

diff -ur -X exclude /opt/kernel/linux-2.5.2-pre1/include/linux/blkdev.h linux/include/linux/blkdev.h
--- /opt/kernel/linux-2.5.2-pre1/include/linux/blkdev.h	Sun Dec 23 17:11:55 2001
+++ linux/include/linux/blkdev.h	Sun Dec 23 17:15:02 2001
@@ -196,8 +196,7 @@
 #define RQ_SCSI_DISCONNECTING	0xffe0

 #define QUEUE_FLAG_PLUGGED	0	/* queue is plugged */
-#define QUEUE_FLAG_NOSPLIT	1	/* can process bio over several goes */
-#define QUEUE_FLAG_CLUSTER	2	/* cluster several segments into 1 */
+#define QUEUE_FLAG_CLUSTER	1	/* cluster several segments into 1 */

 #define blk_queue_plugged(q)	test_bit(QUEUE_FLAG_PLUGGED, &(q)->queue_flags)
 #define blk_mark_plugged(q)	set_bit(QUEUE_FLAG_PLUGGED, &(q)->queue_flags)
diff -ur -X exclude /opt/kernel/linux-2.5.2-pre1/mm/highmem.c linux/mm/highmem.c
--- /opt/kernel/linux-2.5.2-pre1/mm/highmem.c	Sun Dec 23 17:11:56 2001
+++ linux/mm/highmem.c	Mon Dec 24 13:59:21 2001
@@ -25,7 +25,9 @@

 static void *page_pool_alloc(int gfp_mask, void *data)
 {
-	return alloc_page(gfp_mask);
+	int gfp = gfp_mask | (int) data;
+
+	return alloc_page(gfp);
 }

 static void page_pool_free(void *page, void *data)
@@ -252,7 +254,7 @@
 	if (isa_page_pool)
 		return 0;

-	isa_page_pool = mempool_create(ISA_POOL_SIZE, page_pool_alloc, page_pool_free, NULL);
+	isa_page_pool = mempool_create(ISA_POOL_SIZE, page_pool_alloc, page_pool_free, (void *) __GFP_DMA);
 	if (!isa_page_pool)
 		BUG();

@@ -272,7 +274,7 @@
 	int i;

 	__bio_for_each_segment(tovec, to, i, 0) {
-		fromvec = &from->bi_io_vec[i];
+		fromvec = from->bi_io_vec + i;

 		/*
 		 * not bounced
@@ -301,7 +303,7 @@
 	 * free up bounce indirect pages used
 	 */
 	__bio_for_each_segment(bvec, bio, i, 0) {
-		org_vec = &bio_orig->bi_io_vec[i];
+		org_vec = bio_orig->bi_io_vec + i;
 		if (bvec->bv_page == org_vec->bv_page)
 			continue;

@@ -394,7 +397,7 @@
 		if (!bio)
 			bio = bio_alloc(bio_gfp, (*bio_orig)->bi_vcnt);

-		to = &bio->bi_io_vec[i];
+		to = bio->bi_io_vec + i;
 		to->bv_page = mempool_alloc(pool, gfp);
 		to->bv_len = from->bv_len;
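
The reworked elv_try_merge() in the patch above makes the same decision as before but funnels it through a single return. Below is a small Python model of that decision, offered only as a reading aid. Request and try_merge are illustrative stand-ins, the merge_ok flag stands in for elv_rq_merge_ok(), and the numeric values chosen for FRONT_MERGE and BACK_MERGE are arbitrary. Only NO_MERGE being zero is implied by the "if ((ret = elv_try_merge(__rq, bio)))" test in the source.

```python
from dataclasses import dataclass

# Illustrative constants: NO_MERGE must be falsy; the other values are arbitrary here.
NO_MERGE, FRONT_MERGE, BACK_MERGE = 0, 1, 2


@dataclass
class Request:
    sector: int             # first sector covered by the queued request
    nr_sectors: int         # number of sectors in the queued request
    elevator_sequence: int  # latency budget, reduced on a front merge


def try_merge(rq, bio_sector, bio_sectors, merge_ok=True):
    """Model of the patched elv_try_merge(): one exit, default NO_MERGE."""
    ret = NO_MERGE
    if merge_ok:
        if rq.sector + rq.nr_sectors == bio_sector:
            # The bio starts exactly where the request ends.
            ret = BACK_MERGE
        elif rq.sector - bio_sectors == bio_sector:
            # The bio ends exactly where the request starts.
            rq.elevator_sequence -= bio_sectors
            ret = FRONT_MERGE
    return ret


rq = Request(sector=100, nr_sectors=8, elevator_sequence=50)
back = try_merge(rq, bio_sector=108, bio_sectors=8)
front = try_merge(rq, bio_sector=92, bio_sectors=8)
print(back, front, rq.elevator_sequence)
```

The model leaves out everything else the elevator does (queue walking, barriers, REQ_STARTED checks); it only shows the adjacency test that decides between a back merge, a front merge, and no merge.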
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: rwhron @ 2001-12-24 16:59 UTC
To: Jens Axboe; +Cc: rwhron, linux-kernel
On Mon, Dec 24, 2001 at 03:03:37PM +0100, Jens Axboe wrote:
> On Fri, Dec 21 2001, rwhron@earthlink.net wrote:
> What IDE controller are you using? The two other reports so far have
> been with VIA, maybe that's a clue.

I do have one of the perhaps buggier VIA chipsets.

00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
00:0f.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 04)

00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP])
	Subsystem: VIA Technologies, Inc. Bus Master IDE
	Flags: bus master, medium devsel, latency 32
	I/O ports at d000 [size=16]
	Capabilities: <available only to root>

It's been reliable for a long time, but it wouldn't compile an Athlon
optimized kernel until 2.4.1x. (Kernel would Oops at boot time unless
compiled with CONFIG_M586=y) It was reliable when not optimized
for Athlon.

> Anyways, could you please reproduce with this applied?
>
> --
> Jens Axboe

With the patch, it still hangs on this system. I recompiled with
CONFIG_NOHIGHMEM=y and CONFIG_M586=y, but that ended up with all processes
in "b" state during dbench 32 too.

I tried unpatched 2.5.2-pre1 on a k6-2. dbench 32 hung similarly with
32 in "b", bo and bi = 0, and id = 100. That machine is ill now and can't
find "init" when booting, boot single, or boot init=/bin/bash.

--
Randy Hron
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: Jens Axboe @ 2001-12-24 17:02 UTC
To: rwhron; +Cc: linux-kernel
On Mon, Dec 24 2001, rwhron@earthlink.net wrote:
> On Mon, Dec 24, 2001 at 03:03:37PM +0100, Jens Axboe wrote:
> > On Fri, Dec 21 2001, rwhron@earthlink.net wrote:
> > What IDE controller are you using? The two other reports so far have
> > been with VIA, maybe that's a clue.
>
> I do have one of the perhaps buggier VIA chipsets.
>
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133] (rev 03)
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8363/8365 [KT133/KM133 AGP]
> 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
> 00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
> 00:07.4 Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
> 00:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
> 00:0f.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
> 01:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G400 AGP (rev 04)
>
> 00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06) (prog-if 8a [Master SecP PriP])
> 	Subsystem: VIA Technologies, Inc. Bus Master IDE
> 	Flags: bus master, medium devsel, latency 32
> 	I/O ports at d000 [size=16]
> 	Capabilities: <available only to root>
>
> It's been reliable for a long time, but it wouldn't compile an Athlon
> optimized kernel until 2.4.1x. (Kernel would Oops at boot time unless
> compiled with CONFIG_M586=y)

Ok noted

> > Anyways, could you please reproduce with this applied?
> >
> > --
> > Jens Axboe
>
> With the patch, it still hangs on this system. I recompiled with
> CONFIG_NOHIGHMEM=y and CONFIG_M586=y, but that ended up with all processes
> in "b" state during dbench 32 too.

I would suspect that, do you get any kernel messages?

> I tried unpatched 2.5.2-pre1 on a k6-2. dbench 32 hung similarly with
> 32 in "b", bo and bi = 0, and id = 100. That machine is ill now and can't
> find "init" when booting, boot single, or boot init=/bin/bash.

Please send ps -eo cmd,wchan info for a hung machine.

--
Jens Axboe
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
From: rwhron @ 2001-12-24 22:14 UTC
To: Jens Axboe; +Cc: rwhron, linux-kernel
On Mon, Dec 24, 2001 at 06:02:44PM +0100, Jens Axboe wrote:
>
> I would suspect that, do you get any kernel messages?

When the machine gets in this state, it won't save any files, so
kern.log doesn't have anything after the initial boot message.

> Please send ps -eo cmd,wchan info for a hung machine.
>
> --
> Jens Axboe

Strangely (to me anyway), when dbench 32 hangs the machine,
ps will not print anything. vmstat will continue its 8 second
cycle though.

--
Randy Hron
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state 2001-12-24 17:02 ` Jens Axboe 2001-12-24 22:14 ` rwhron @ 2001-12-27 19:07 ` rwhron 2001-12-28 11:40 ` Jens Axboe 1 sibling, 1 reply; 21+ messages in thread From: rwhron @ 2001-12-27 19:07 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-kernel On Mon, Dec 24, 2001 at 06:02:44PM +0100, Jens Axboe wrote: > > I tried unpatched 2.5.2-pre1 on a k6-2. dbench 32 hung similarly with > > 32 in "b", bo and bi = 0, and id = 100. That machine is ill now and can't > > find "init" when booting, boot single, or boot init=/bin/bash. > > Please send ps -eo cmd,wchan info for a hung machine. > > -- > Jens Axboe > I rebuilt the reiserfs that dbench writes to. Here is ps -eo cmd,wchan on the k6-2 running 2.5.2-pre2: CMD WCHAN init do_select [keventd] context_thread [ksoftirqd_CPU0] ksoftirqd [kswapd] kswapd [bdflush] bdflush [kupdated] get_request_wait [kreiserfsd] get_request_wait /usr/sbin/syslog get_request_wait /usr/sbin/klogd do_syslog [eth0] rtl8139_thread /usr/sbin/iplog do_select /usr/sbin/iplog do_poll /usr/sbin/iplog get_request_wait /usr/sbin/iplog do_select /usr/sbin/iplog wait_for_packet /usr/sbin/sshd do_select /sbin/agetty tty read_chan /bin/login -- down /usr/sbin/sshd do_select -bash wait4 -su wait4 /usr/sbin/sshd do_select -bash wait4 /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 
get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /dbench 32 get_request_wait /usr/sbin/sshd do_select /usr/sbin/sshd get_request_wait ed /tmp/ls get_request_wait ps -eo cmd,wchan - vmstat 3 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 37 2 0 25464 3224 333252 0 0 13 371 107 33 0 4 96 0 37 2 0 25460 3224 333252 0 0 0 0 102 6 0 0 100 0 37 2 0 25460 3224 333252 0 0 0 0 101 7 0 0 100 I rebooted and ran dbench 32 on a new ext2 filesystem. dbench runs okay for about 30 seconds. Towards the end of the vmstat output below, I try to ssh in, the "b" column goes up, but I don't the a bash prompt. mountain:~$ vmstat 10 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 1 0 0 0 346236 20012 6316 0 0 793 67 174 164 3 8 90 0 32 0 0 182364 21396 162428 0 0 79 3492 136 109 2 26 72 21 11 0 0 163904 21532 180264 0 0 0 11683 209 97 0 11 89 0 32 0 0 32416 23224 306540 0 0 5 6375 226 108 1 27 72 0 32 1 0 22552 23392 315972 0 0 3 9807 206 98 0 8 92 0 32 2 132 4584 7128 349660 0 0 13 2905 192 204 2 29 69 0 32 2 132 4580 7128 349660 0 0 0 0 101 44 0 0 100 0 32 2 132 4580 7128 349660 0 0 0 0 100 45 0 0 100 0 32 2 132 4580 7128 349660 0 0 0 0 100 44 0 0 100 0 32 2 132 4580 7128 349660 0 0 0 0 100 44 0 0 100 0 32 2 132 4580 7128 349660 0 0 0 0 100 44 0 0 100 0 32 2 132 4580 7128 349660 0 0 0 0 100 44 0 0 100 0 32 2 132 4580 7128 349660 0 0 0 0 101 45 0 0 100 0 35 2 132 4156 7128 349672 0 0 1 1 104 52 1 0 99 0 35 2 132 4156 7128 349672 0 0 0 0 100 44 0 0 100 Below is software, hardware, and kernel configs: Linux (none) 2.5.2-pre2 #1 Thu Dec 27 12:32:39 EST 2001 i586 unknown Gnu C 2.95.3 Gnu make 3.79.1 binutils 2.11.2 util-linux 2.11n mount 2.11n modutils 2.4.11 e2fsprogs 1.25 reiserfsprogs 3.x.0k-pre14 PPP 2.4.1 Linux C Library 2.2.4 Dynamic linker (ldd) 2.2.4 Procps 2.0.7 Net-tools 
1.60 Kbd 1.06 Sh-utils 2.0 Modules Loaded This machine has a VIA chipset. No proprietary drivers. 384 MB RAM. Root filesystem on /dev/hdc2 # not the usual /dev/hda 00:00.0 Host bridge: VIA Technologies, Inc. VT82C598 [Apollo MVP3] (rev 04) 00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP] 00:07.0 ISA bridge: VIA Technologies, Inc. VT82C586/A/B PCI-to-ISA [Apollo VP] (rev 47) 00:07.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06) 00:07.3 Host bridge: VIA Technologies, Inc. VT82C586B ACPI (rev 10) 00:13.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10) 01:00.0 VGA compatible controller: nVidia Corporation Vanta [NV6] (rev 15) 2.4.18-pre1 (and other 2.4.17* kernels run dbench 32, 128 okay on this system) This is the config difference: diff 2.5.2-pre2 2.4.18-pre1 > CONFIG_NETLINK_DEV=y < CONFIG_RAMFS=y # 2.5.2-pre2 config CONFIG_X86=y CONFIG_ISA=y CONFIG_UID16=y CONFIG_EXPERIMENTAL=y CONFIG_MODULES=y CONFIG_KMOD=y CONFIG_MK6=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_X86_L1_CACHE_SHIFT=5 CONFIG_X86_ALIGNMENT_16=y CONFIG_X86_TSC=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_NOHIGHMEM=y CONFIG_MTRR=y CONFIG_NET=y CONFIG_PCI=y CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_NAMES=y CONFIG_SYSVIPC=y CONFIG_SYSCTL=y CONFIG_KCORE_ELF=y CONFIG_BINFMT_ELF=y CONFIG_PM=y CONFIG_APM=m CONFIG_APM_DO_ENABLE=y CONFIG_BLK_DEV_FD=y CONFIG_BLK_DEV_LOOP=m CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_SIZE=4096 CONFIG_BLK_DEV_INITRD=y CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_NETFILTER=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_IP_NF_CONNTRACK=y CONFIG_IP_NF_FTP=m CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_MATCH_LIMIT=y CONFIG_IP_NF_MATCH_MULTIPORT=m CONFIG_IP_NF_MATCH_STATE=y CONFIG_IP_NF_FILTER=y CONFIG_IP_NF_NAT=y CONFIG_IP_NF_NAT_NEEDED=y CONFIG_IP_NF_TARGET_MASQUERADE=y CONFIG_IP_NF_NAT_FTP=m 
CONFIG_IDE=y CONFIG_BLK_DEV_IDE=y CONFIG_BLK_DEV_IDEDISK=y CONFIG_IDEDISK_MULTI_MODE=y CONFIG_BLK_DEV_IDECD=m CONFIG_BLK_DEV_IDEPCI=y CONFIG_BLK_DEV_IDEDMA_PCI=y CONFIG_BLK_DEV_ADMA=y CONFIG_IDEDMA_PCI_AUTO=y CONFIG_BLK_DEV_IDEDMA=y CONFIG_BLK_DEV_VIA82CXXX=y CONFIG_IDEDMA_AUTO=y CONFIG_BLK_DEV_IDE_MODES=y CONFIG_NETDEVICES=y CONFIG_NET_ETHERNET=y CONFIG_NET_PCI=y CONFIG_8139TOO=y CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_SERIAL=y CONFIG_SERIAL_CONSOLE=y CONFIG_UNIX98_PTYS=y CONFIG_UNIX98_PTY_COUNT=64 CONFIG_MOUSE=m CONFIG_PSMOUSE=y CONFIG_REISERFS_FS=y CONFIG_EXT3_FS=y CONFIG_JBD=y CONFIG_FAT_FS=m CONFIG_MSDOS_FS=m CONFIG_VFAT_FS=m CONFIG_RAMFS=y CONFIG_ISO9660_FS=m CONFIG_NTFS_FS=m CONFIG_PROC_FS=y CONFIG_DEVPTS_FS=y CONFIG_EXT2_FS=y CONFIG_CODA_FS=m CONFIG_NFS_FS=m CONFIG_NFS_V3=y CONFIG_NFSD=y CONFIG_NFSD_V3=y CONFIG_SUNRPC=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_MSDOS_PARTITION=y CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" CONFIG_NLS_CODEPAGE_437=m CONFIG_VGA_CONSOLE=y CONFIG_VIDEO_SELECT=y CONFIG_DEBUG_KERNEL=y CONFIG_MAGIC_SYSRQ=y -- Randy Hron ^ permalink raw reply [flat|nested] 21+ messages in thread
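
The ps -eo cmd,wchan listing in the message above is long, and the useful signal is how many tasks sleep in each kernel function. The following Python sketch tallies the WCHAN column. The name tally_wchan is invented for this example, and the sample is an abbreviated excerpt of the listing rather than the full output. It assumes the wait channel is always the last whitespace-separated field on a line.

```python
from collections import Counter


def tally_wchan(ps_output):
    """Count processes per wait channel in `ps -eo cmd,wchan` output."""
    counts = Counter()
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:          # skip the "CMD WCHAN" header
        parts = line.split()
        if parts:
            counts[parts[-1]] += 1  # WCHAN is the last column
    return counts


# Abbreviated excerpt of the listing above (not the full output).
sample = """CMD WCHAN
init do_select
[kupdated] get_request_wait
[kreiserfsd] get_request_wait
/usr/sbin/klogd do_syslog
/dbench 32 get_request_wait
/dbench 32 get_request_wait
ed /tmp/ls get_request_wait
"""

counts = tally_wchan(sample)
print(counts.most_common(1))
```

On the full listing the same tally would make it obvious that nearly every blocked task, including kupdated and kreiserfsd, is sleeping in get_request_wait.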
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
2001-12-27 19:07 ` rwhron
@ 2001-12-28 11:40 ` Jens Axboe
2001-12-28 14:14 ` rwhron
0 siblings, 1 reply; 21+ messages in thread
From: Jens Axboe @ 2001-12-28 11:40 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel
On Thu, Dec 27 2001, rwhron@earthlink.net wrote:
> On Mon, Dec 24, 2001 at 06:02:44PM +0100, Jens Axboe wrote:
> > > I tried unpatched 2.5.2-pre1 on a k6-2. dbench 32 hung similarly with
> > > 32 in "b", bo and bi = 0, and id = 100. That machine is ill now and can't
> > > find "init" when booting, boot single, or boot init=/bin/bash.
> >
> > Please send ps -eo cmd,wchan info for a hung machine.
> >
> > --
> > Jens Axboe
> >
>
> I rebuilt the reiserfs that dbench writes to.
> Here is ps -eo cmd,wchan on the k6-2 running 2.5.2-pre2:
Ah this is interesting, all stuck in get_request_wait. I cannot
reproduce your problem here whatever I do, no reiser though.
--
Jens Axboe
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state 2001-12-28 11:40 ` Jens Axboe @ 2001-12-28 14:14 ` rwhron 2001-12-28 14:30 ` Jens Axboe 0 siblings, 1 reply; 21+ messages in thread From: rwhron @ 2001-12-28 14:14 UTC (permalink / raw) To: Jens Axboe; +Cc: rwhron, linux-kernel On Fri, Dec 28, 2001 at 12:40:37PM +0100, Jens Axboe wrote: > > I rebuilt the reiserfs that dbench writes to. > > Here is ps -eo cmd,wchan on the k6-2 running 2.5.2-pre2: > > Ah this is interesting, all stuck in get_request_wait. I cannot > reproduce your problem here whatever I do, no reiser though. > > -- > Jens Axboe That's good news. It's probably something with my configuration or hardware. I saw the livelock on both ext2 and reiserfs. I removed these options from the config and rebuilt 2.5.2-pre2: CONFIG_PM=y CONFIG_APM=m CONFIG_APM_DO_ENABLE=y CONFIG_NTFS_FS=m CONFIG_CODA_FS=m CONFIG_NFS_FS=m CONFIG_NFS_V3=y CONFIG_NFSD=y CONFIG_NFSD_V3=y CONFIG_SUNRPC=y CONFIG_LOCKD=y CONFIG_LOCKD_V4=y CONFIG_VIDEO_SELECT=y The initial dbench on ext2 completed for 32 processes but 128 didn't: vmstat 8 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 128 1 132 14796 22136 314916 0 0 0 5467 272 122 1 13 86 0 128 1 636 3968 21844 328132 0 9 1 1338 132 125 1 18 81 0 128 1 636 3964 21844 328132 0 0 0 0 101 44 0 0 100 0 128 1 636 3964 21844 328132 0 0 0 0 101 45 0 0 100 0 128 1 636 3964 21844 328132 0 0 0 0 101 45 0 0 100 ps -eo cmd,wchan | uniq CMD WCHAN init pollwait [keventd] context_thread [ksoftirqd_CPU0] ksoftirqd [kswapd] refill_inactive [bdflush] try_to_free_buffers [kupdated] init_private_file [kreiserfsd] reiserfs_get_block /usr/sbin/syslog pollwait /usr/sbin/klogd do_syslog [eth0] timer_do_blank_screen /usr/sbin/iplog pollwait /usr/sbin/iplog select /usr/sbin/iplog rt_sigsuspend /usr/sbin/iplog pollwait /usr/sbin/iplog netdev_ethtool_ioctl /usr/sbin/sshd pollwait /sbin/agetty tty is_internal /bin/login -- write_chan /usr/sbin/sshd pollwait -bash wait4 
/usr/sbin/sshd pollwait -bash wait4 /bin/bash ./chk wait4 /dbench 128 wait4 /dbench 128 down /dbench 128 write_chan /dbench 128 init_private_file /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 add_to_page_cache_unique /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down /dbench 128 write_chan /dbench 128 down ps -eo cmd,wchan - uniq do_execve I stripped down the config a little more by removing these: CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_SIZE=4096 CONFIG_BLK_DEV_INITRD=y CONFIG_IP_NF_CONNTRACK=y CONFIG_IP_NF_FTP=m CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_MATCH_LIMIT=y CONFIG_IP_NF_MATCH_MULTIPORT=m CONFIG_IP_NF_MATCH_STATE=y CONFIG_IP_NF_FILTER=y CONFIG_IP_NF_NAT=y CONFIG_IP_NF_NAT_NEEDED=y CONFIG_IP_NF_TARGET_MASQUERADE=y CONFIG_IP_NF_NAT_FTP=m CONFIG_BLK_DEV_IDECD=m CONFIG_FAT_FS=m CONFIG_MSDOS_FS=m CONFIG_VFAT_FS=m CONFIG_ISO9660_FS=m CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" CONFIG_NLS_CODEPAGE_437=m With the stripped config, I built 2.5.2-pre3. It panic'd with the stripped config. 2.5.2-pre3 panic'd yesterday on this machine's normal config too. 
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
8139too Fast Ethernet driver 0.9.22
PCI: Found IRQ 11 for device 00:13.0
IRQ routing conflict for 00:13.0, have irq 9, want irq 11
eth0: RealTek RTL8139 Fast Ethernet at 0xd8800000, 00:50:bf:25:68:f3, IRQ 9
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 32768)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
Kernel panic: Out of memory and no killable processes...
I haven't noticed any reports of this panic on 2.5.2-pre3.
Back to 2.5.2-pre2, I removed these:
CONFIG_BLK_DEV_LOOP=m
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETFILTER=y
CONFIG_BLK_DEV_VIA82CXXX=y
CONFIG_BLK_DEV_IDE_MODES=y
dbench 32 locked up again.
I re-ran dbench 32, 128 with 2.4.17rc2aa2 on this machine and
it worked fine. I'll try 2.5.1 on this machine (2.5.1 was
okay on another machine).
--
Randy Hron
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
2001-12-28 14:14 ` rwhron
@ 2001-12-28 14:30 ` Jens Axboe
2001-12-28 17:49 ` rwhron
` (2 more replies)
0 siblings, 3 replies; 21+ messages in thread
From: Jens Axboe @ 2001-12-28 14:30 UTC (permalink / raw)
To: rwhron; +Cc: linux-kernel
On Fri, Dec 28 2001, rwhron@earthlink.net wrote:
> On Fri, Dec 28, 2001 at 12:40:37PM +0100, Jens Axboe wrote:
> > > I rebuilt the reiserfs that dbench writes to.
> > > Here is ps -eo cmd,wchan on the k6-2 running 2.5.2-pre2:
> >
> > Ah this is interesting, all stuck in get_request_wait. I cannot
> > reproduce your problem here whatever I do, no reiser though.
> >
> > --
> > Jens Axboe
>
> That's good news. It's probably something with my configuration
> or hardware. I saw the livelock on both ext2 and reiserfs.
Thanks for an excellent report. I can't quite see what the problem
should be yet, especially since the problems seem to start with
2.5.2-pre1 which doesn't really have a lot of interesting changes. I'll
keep looking, though. Could you do sysrq-t for a livelocked system?
The livelocks in this mail appear different than the previous ones.
Could you try running without swap?
> With the stripped config, I built 2.5.2-pre3. It panic'd
> with the stripped config. 2.5.2-pre3 panic'd yesterday
> on this machine's normal config too.
>
> Floppy drive(s): fd0 is 1.44M
> FDC 0 is a post-1991 82077
> 8139too Fast Ethernet driver 0.9.22
> PCI: Found IRQ 11 for device 00:13.0
> IRQ routing conflict for 00:13.0, have irq 9, want irq 11
> eth0: RealTek RTL8139 Fast Ethernet at 0xd8800000, 00:50:bf:25:68:f3, IRQ 9
> NET4: Linux TCP/IP 1.0 for NET4.0
> IP Protocols: ICMP, UDP, TCP
> IP: routing cache hash table of 4096 buckets, 32Kbytes
> TCP: Hash tables configured (established 32768 bind 32768)
> NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
> Kernel panic: Out of memory and no killable processes...
>
> I haven't noticed any reports of this panic on 2.5.2-pre3.
Someone else did report a similar case. Very strange, doesn't look bio
related at all. What's the entire boot message for a 2.5.2-pre3 boot
attempt like the above?
> I re-ran dbench 32, 128 with 2.4.17rc2aa2 on this machine and
> it worked fine. I'll try 2.5.1 on this machine (2.5.1 was
> okay on another machine).
2.5.1 vs 2.5.2-preX is much more interesting.
(btw, attached patch should fix your highmem oops)
--- /opt/kernel/linux-2.5.2-pre3/include/linux/blkdev.h Fri Dec 28 11:43:04 2001
+++ include/linux/blkdev.h Fri Dec 28 15:25:36 2001
@@ -228,8 +228,8 @@
  * BLK_BOUNCE_ANY : don't bounce anything
  * BLK_BOUNCE_ISA : bounce pages above ISA DMA boundary
  */
-#define BLK_BOUNCE_HIGH ((blk_max_low_pfn + 1) << PAGE_SHIFT)
-#define BLK_BOUNCE_ANY ((blk_max_pfn + 1) << PAGE_SHIFT)
+#define BLK_BOUNCE_HIGH (blk_max_low_pfn << PAGE_SHIFT)
+#define BLK_BOUNCE_ANY (blk_max_pfn << PAGE_SHIFT)
 #define BLK_BOUNCE_ISA (ISA_DMA_THRESHOLD)
 
 extern int init_emergency_isa_pool(void);
--
Jens Axboe
^ permalink raw reply [flat|nested] 21+ messages in thread
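Archive note: the patch above drops the "+ 1" from the two bounce-limit macros. The thread does not explain why that fixes the highmem oops, so the sketch below shows only the arithmetic of the two forms. It assumes 4 KiB pages (PAGE_SHIFT of 12) and 32-bit unsigned long operands, as on i386; the page frame numbers used are hypothetical examples, not values taken from the reporter's machines.

```python
PAGE_SHIFT = 12          # assumption: 4 KiB pages, as on i386
ULONG_MASK = 0xFFFFFFFF  # assumption: 32-bit unsigned long


def limit_before_patch(pfn):
    """Pre-patch form: ((pfn + 1) << PAGE_SHIFT), truncated to 32 bits."""
    return ((pfn + 1) << PAGE_SHIFT) & ULONG_MASK


def limit_after_patch(pfn):
    """Post-patch form: (pfn << PAGE_SHIFT), truncated to 32 bits."""
    return (pfn << PAGE_SHIFT) & ULONG_MASK


if __name__ == "__main__":
    # Hypothetical pfn values: an 896 MB boundary and the last pfn below 4 GB.
    for pfn in (0x38000, 0xFFFFF):
        print(hex(pfn), hex(limit_before_patch(pfn)), hex(limit_after_patch(pfn)))
```

For an ordinary pfn the two forms differ by exactly one page (4096 bytes). For the last pfn below 4 GB the pre-patch form wraps to zero under the 32-bit assumption, while the post-patch form does not.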
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
2001-12-28 14:30 ` Jens Axboe
@ 2001-12-28 17:49 ` rwhron
2001-12-28 19:29 ` rwhron
2001-12-29 6:42 ` rwhron
2 siblings, 0 replies; 21+ messages in thread
From: rwhron @ 2001-12-28 17:49 UTC (permalink / raw)
To: Jens Axboe; +Cc: rwhron, linux-kernel
On Fri, Dec 28, 2001 at 03:30:22PM +0100, Jens Axboe wrote:
> Thanks for an excellent report. I can't quite see what the problem
> should be yet, especially since the problems seem to start with
> 2.5.2-pre1 which doesn't really have a lot of interesting changes. I'll
> keep looking, though. Could you do sysrq-t for a livelocked system?
I don't know how to do sysrq-t via serial console. If I put a
monitor and keyboard on the box, syslogd is blocked when the
livelock occurs, and I haven't figured out a workaround yet.
2.5.1 runs dbench 32, 128, by the way.
> The livelocks in this mail appear different than the previous ones.
> Could you try running without swap?
Here is without swap on 2.5.2-pre2.
vmstat 8 procs memory swap io system cpu r b w swpd free buff cache si so bi bo in cs us sy id 0 0 0 0 350756 19484 5464 0 0 0 0 100 41 0 0 100 0 0 0 0 350756 19484 5464 0 0 0 0 100 41 0 0 100 3 29 0 0 344668 19588 8464 0 0 29 0 108 70 1 1 98 0 32 1 0 184264 20824 162556 0 0 32 9123 1085 59 3 86 11 21 11 3 0 181748 20864 164916 0 0 1 10500 1503 20 1 83 16 0 32 1 0 148560 21272 196764 0 0 4 4838 893 52 2 47 51 6 26 2 0 106532 21804 237140 0 0 2 5590 836 62 2 35 64 0 32 2 0 4448 5380 353332 0 0 11 44 253 120 2 26 73 0 32 2 0 4448 5380 353332 0 0 0 0 101 41 0 0 100 0 32 2 0 4448 5380 353332 0 0 0 0 101 41 0 0 100 ps -eo cmd,wchan CMD WCHAN init do_select [keventd] context_thread [ksoftirqd_CPU0] ksoftirqd [kswapd] kswapd [bdflush] wait_on_buffer [kupdated] wait_on_buffer [kreiserfsd] reiserfs_journal_commit_thread /usr/sbin/syslog do_select /usr/sbin/klogd do_syslog [eth0] rtl8139_thread /usr/sbin/sshd do_select /sbin/agetty tty read_chan /sbin/agetty -h read_chan /usr/sbin/sshd do_select -bash wait4 /usr/sbin/sshd - -bash wait4 /bin/bash ./chk wait4 /dbench 32 wait4 /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 wait_on_buffer /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down /dbench 32 down ps -eo cmd,wchan - > > Kernel panic: Out of memory and no killable processes... > > Someone else did report a similar case. Very strange, doesn't look bio > related at all. WHat's the entire boot message for a 2.5.2-pre3 boot > attempt like the above? 
I rebuilt 2.5.2-pre3 with mrproper using the config that worked for 2.5.1 first and noticed some depmod errors during the build: if [ -r System.map ]; then /sbin/depmod -ae -F System.map 2.5.2-pre3; fi depmod: *** Unresolved symbols in /lib/modules/2.5.2-pre3/kernel/fs/nfs/nfs.o depmod: seq_escape depmod: seq_printf make[1]: Entering directory `/usr/src/linux/arch/i386/boot' sh -x ./install.sh 2.5.2-pre3 bzImage /usr/src/linux/System.map "/boot" So I removed initrd, loopback, nfs, coda, ntfs, dosfs, vfat, and rebuilt with mrproper. Here is the boot message and panic: LILO 22.1 boot: Loading lfs............. Linux version 2.5.2-pre3 (root@mountain) (gcc version 2.95.3 20010315 (release)) #1 Fri Dec 28 12:33:00 EST 2001 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 0000000018000000 (usable) BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved) On node 0 totalpages: 98304 zone(0): 4096 pages. zone(1): 94208 pages. zone(2): 0 pages. Kernel command line: BOOT_IMAGE=lfs ro root=1602 console=ttyS1,38400n8 Initializing CPU#0 Detected 501.155 MHz processor. Console: colour VGA+ 80x25 Calibrating delay loop... 999.42 BogoMIPS Memory: 385036k/393216k available (962k kernel code, 7796k reserved, 243k data, 200k init, 0k highmem) Dentry-cache hash table entries: 65536 (order: 7, 524288 bytes) Inode-cache hash table entries: 32768 (order: 6, 262144 bytes) Mount-cache hash table entries: 8192 (order: 4, 65536 bytes) Buffer-cache hash table entries: 32768 (order: 5, 131072 bytes) Page-cache hash table entries: 131072 (order: 7, 524288 bytes) CPU: L1 I Cache: 32K (32 bytes/line), D cache 32K (32 bytes/line) CPU: AMD-K6(tm) 3D processor stepping 0c Checking 'hlt' instruction... OK. 
POSIX conformance testing by UNIFIX mtrr: v1.40 (20010327) Richard Gooch (rgooch@atnf.csiro.au) mtrr: detected mtrr type: AMD K6 PCI: PCI BIOS revision 2.10 entry at 0xfb3c0, last bus=1 PCI: Using configuration type 1 PCI: Probing PCI hardware PCI: Using IRQ router VIA [1106/0586] at 00:07.0 Activating ISA DMA hang workarounds. Linux NET4.0 for Linux 2.4 Based upon Swansea University Computer Society NET3.039 Starting kswapd BIO: pool of 256 setup, 14Kb (56 bytes/bio) biovec: init pool 0, 1 entries, 12 bytes biovec: init pool 1, 4 entries, 48 bytes biovec: init pool 2, 16 entries, 192 bytes biovec: init pool 3, 64 entries, 768 bytes biovec: init pool 4, 128 entries, 1536 bytes biovec: init pool 5, 256 entries, 3072 bytes Journalled Block Device driver loaded Detected PS/2 Mouse Port. pty: 256 Unix98 ptys configured keyboard: Timeout - AT keyboard not present?(ed) keyboard: Timeout - AT keyboard not present?(f4) Serial driver version 5.05c (2001-07-08) with MANY_PORTS SHARE_IRQ SERIAL_PCI enabled ttyS00 at 0x03f8 (irq = 4) is a 16550A ttyS01 at 0x02f8 (irq = 3) is a 16550A block: 256 slots per queue, batch=32 Uniform Multi-Platform E-IDE driver Revision: 6.32 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller on PCI slot 00:07.1 VP_IDE: chipset revision 6 VP_IDE: not 100% native mode: will probe irqs later ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: VIA vt82c586b (rev 47) IDE UDMA33 controller on pci00:07.1 ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:DMA, hdd:DMA hda: Maxtor 51536U3, ATA DISK drive hdb: ATAPI CDROM, ATAPI CD/DVD-ROM drive hdc: Maxtor 52049U4, ATA DISK drive ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 ide1 at 0x170-0x177,0x376 on irq 15 blk: queue c028dcc4, I/O limit 4095Mb (mask 0xffffffff) hda: 30015216 sectors (15368 MB) w/2048KiB Cache, CHS=1868/255/63, UDMA(33) blk: queue c028e054, I/O limit 4095Mb 
(mask 0xffffffff) hdc: 40020624 sectors (20491 MB) w/2048KiB Cache, CHS=39703/16/63, UDMA(33) Partition check: hda: hda1 hda2 hda3 < hda5 hda6 hda7 > hdc: hdc1 hdc2 hdc3 < hdc5 > Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 8139too Fast Ethernet driver 0.9.22 PCI: Found IRQ 11 for device 00:13.0 IRQ routing conflict for 00:13.0, have irq 9, want irq 11 eth0: RealTek RTL8139 Fast Ethernet at 0xd8800000, 00:50:bf:25:68:f3, IRQ 9 NET4: Linux TCP/IP 1.0 for NET4.0 IP Protocols: ICMP, UDP, TCP IP: routing cache hash table of 4096 buckets, 32Kbytes TCP: Hash tables configured (established 32768 bind 32768) ip_conntrack (3072 buckets, 24576 max) ip_tables: (c)2000 Netfilter core team NET4: Unix domain sockets 1.0/SMP for Linux NET4.0. Kernel panic: Out of memory and no killable processes... > > I re-ran dbench 32, 128 with 2.4.17rc2aa2 on this machine and > > 2.5.1 vs 2.5.2-preX is much more interesting. 2.5.1 finishes dbench 32, 128 on this machine. :) Throughput 21.6466 MB/sec (NB=27.0582 MB/sec 216.466 MBit/sec) 32 procs Throughput 5.91991 MB/sec (NB=7.39989 MB/sec 59.1991 MBit/sec) 128 procs > (btw, attached patch should fix your highmem oops) > > -- > Jens Axboe I'm going to hold off testing on my highmem box for a while. BTW, the original "cannot find init" after 2.5.1-pre1 was because I had an invalid "root=" entry in lilo.conf for the kernels other than current and "old". -- Randy Hron ^ permalink raw reply [flat|nested] 21+ messages in thread
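Archive note: the hang in this thread shows up in vmstat as processes stuck in the "b" column while bi and bo stay at 0 and id stays at 100. A minimal sketch of spotting that pattern in captured vmstat output follows. It assumes the 16-column layout shown in these messages (r b w swpd free buff cache si so bi bo in cs us sy id); the function names and the two-sample threshold are hypothetical choices, not anything used by the reporter.

```python
COLUMNS = "r b w swpd free buff cache si so bi bo in cs us sy id".split()


def parse_vmstat(text):
    """Return one dict per numeric sample line; header lines are skipped."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            samples.append(dict(zip(COLUMNS, map(int, fields))))
    return samples


def is_stalled(sample):
    """Blocked processes present, yet no block I/O and a fully idle CPU."""
    return (sample["b"] > 0 and sample["bi"] == 0
            and sample["bo"] == 0 and sample["id"] == 100)


def first_stall(samples, run=2):
    """Index of the first sample that starts `run` consecutive stalled samples."""
    streak = 0
    for i, sample in enumerate(samples):
        streak = streak + 1 if is_stalled(sample) else 0
        if streak == run:
            return i - run + 1
    return None  # no stall of the requested length was seen
```

Requiring more than one consecutive sample avoids flagging a single quiet interval, which can occur on a healthy system between bursts of I/O.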
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state 2001-12-28 14:30 ` Jens Axboe 2001-12-28 17:49 ` rwhron @ 2001-12-28 19:29 ` rwhron 2001-12-29 6:42 ` rwhron 2 siblings, 0 replies; 21+ messages in thread From: rwhron @ 2001-12-28 19:29 UTC (permalink / raw) To: Jens Axboe; +Cc: rwhron, linux-kernel On Fri, Dec 28, 2001 at 03:30:22PM +0100, Jens Axboe wrote: > keep looking, though. Could you do sysrq-t for a livelocked system? > -- > Jens Axboe Using a tip from Russell King: This is while running dbench 32 on an ext2 filesystem. SysRq : Show State free sibling task PC stack pid father child younger older init S C177FF24 4608 1 0 43 3 (NOTLB) Call Trace: [<c011159a>] [<c01114dc>] [<c01398d4>] [<c0139c82>] [<c01337d6>] [<c01085b3>] keventd S 00010000 6596 2 1 7 (L-TLB) Call Trace: [<c011e245>] [<c0106efc>] ksoftirqd_CPU S C1770000 6412 3 0 4 1 (L-TLB) Call Trace: [<c01179b2>] [<c0106efc>] kswapd S C176E000 6652 4 0 5 3 (L-TLB) Call Trace: [<c01282c6>] [<c0106efc>] bdflush S 00000286 6568 5 0 6 4 (L-TLB) Call Trace: [<c0111b29>] [<c0130b53>] [<c0106efc>] kupdated D 00000048 5860 6 0 5 (L-TLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012f265>] [<c015bd08>] [<c015bd95>] [<c013dd35>] [<c01309bd>] [<c0130c45>] [<c0106efc>] kreiserfsd S D7D1BFB4 6148 7 1 25 2 (L-TLB) Call Trace: [<c011159a>] [<c01114dc>] [<c0111b7e>] [<c0177257>] [<c0106efc>] syslogd D 00000048 4788 25 1 27 7 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c01665b8>] [<c0124c35>] [<c012db02>] [<c012479c>] [<c012dc0f>] [<c01085b3>] klogd S 7FFFFFFF 2656 27 1 32 25 (NOTLB) Call Trace: [<c011153f>] [<c01dd4ad>] [<c01ddd37>] [<c01aed94>] [<c01aef9f>] [<c012d91a>] [<c01085b3>] eth0 S D7945F98 2656 32 1 41 27 (L-TLB) Call Trace: [<c011159a>] [<c01114dc>] [<c0111b7e>] [<c01a0d7e>] [<c0106efc>] sshd S 7FFFFFFF 4788 41 1 52 42 32 
(NOTLB) Call Trace: [<c011153f>] [<c01af15d>] [<c01398d4>] [<c0139c82>] [<c01085b3>] agetty S 7FFFFFFF 4364 42 1 43 41 (NOTLB) Call Trace: [<c011153f>] [<c018350d>] [<c017f786>] [<c012d855>] [<c01085b3>] agetty S 7FFFFFFF 0 43 1 42 (NOTLB) Call Trace: [<c011153f>] [<c018350d>] [<c017f786>] [<c012d855>] [<c01085b3>] sshd S 7FFFFFFF 5484 45 41 46 52 (NOTLB) Call Trace: [<c011153f>] [<c01398d4>] [<c0139c82>] [<c01085b3>] bash S 00000000 4580 46 45 59 (NOTLB) Call Trace: [<c01169ee>] [<c01085b3>] sshd S 7FFFFFFF 1568 52 41 53 45 (NOTLB) Call Trace: [<c0183b6f>] [<c011153f>] [<c01398d4>] [<c0139c82>] [<c01085b3>] bash S 00000000 2656 53 52 58 (NOTLB) Call Trace: [<c01169ee>] [<c01085b3>] vmstat S D72B5F8C 644 58 53 (NOTLB) Call Trace: [<c011159a>] [<c01114dc>] [<c011a959>] [<c01085b3>] chk S 00000000 5284 59 46 60 (NOTLB) Call Trace: [<c01169ee>] [<c01085b3>] dbench S 00000000 5208 60 59 93 (NOTLB) Call Trace: [<c01169ee>] [<c01085b3>] dbench D 00000048 5692 62 60 63 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D D7744244 5532 63 60 64 62 (NOTLB) Call Trace: [<c01073ed>] [<c0107538>] [<c01e3473>] [<c0124c35>] [<c0124c8d>] [<c015a7de>] [<c0159e43>] [<c012e304>] [<c012d48b>] [<c012d4d7>] [<c01085b3>] dbench D 00000048 5684 64 60 65 63 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5624 65 60 66 64 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5700 66 60 67 65 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] 
[<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D D7744244 5660 67 60 68 66 (NOTLB) Call Trace: [<c01073ed>] [<c0107538>] [<c01e34b9>] [<c0159886>] [<c015c03c>] [<c013655d>] [<c01366ca>] [<c012d07a>] [<c012d3b7>] [<c01085b3>] dbench D 00000048 5688 68 60 69 67 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D D7744244 5532 69 60 70 68 (NOTLB) Call Trace: [<c01073ed>] [<c0107538>] [<c01e3473>] [<c0124c35>] [<c0124c8d>] [<c015a7de>] [<c0159e43>] [<c012e304>] [<c012d48b>] [<c012d4d7>] [<c01085b3>] dbench D 00000048 5780 70 60 71 69 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5756 71 60 72 70 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5692 72 60 73 71 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D D7744244 5692 73 60 74 72 (NOTLB) Call Trace: [<c01073ed>] [<c0107538>] [<c01e3473>] [<c0124c35>] [<c0124c8d>] [<c015a7de>] [<c0159e43>] [<c012e304>] [<c012d48b>] [<c012d4d7>] [<c01085b3>] dbench D D7744244 5612 74 60 75 73 (NOTLB) Call Trace: [<c01073ed>] [<c0107538>] [<c01e3473>] [<c0124c35>] [<c0124c8d>] [<c015a7de>] [<c0159e43>] [<c012e304>] [<c012d48b>] [<c012d4d7>] [<c01085b3>] dbench D 00000048 5740 75 60 76 74 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5600 76 60 77 75 (NOTLB) Call Trace: 
[<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5448 77 60 78 76 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5692 78 60 79 77 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5640 79 60 80 78 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012f265>] [<c0159085>] [<c015a85c>] [<c015aa70>] [<c015ae09>] [<c012f8f2>] [<c012fee1>] [<c015ac38>] [<c015aff6>] [<c015ac38>] [<c0124bed>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5692 80 60 81 79 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5468 81 60 82 80 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5412 82 60 83 81 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5400 83 60 84 82 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5700 84 60 85 83 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] 
[<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5692 85 60 86 84 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5336 86 60 87 85 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D D7744244 5628 87 60 88 86 (NOTLB) Call Trace: [<c01073ed>] [<c0107538>] [<c01e34b9>] [<c0159886>] [<c015c35d>] [<c0136da4>] [<c0136e65>] [<c01085b3>] dbench D 00000048 5484 88 60 89 87 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D D7744244 5740 89 60 90 88 (NOTLB) Call Trace: [<c01073ed>] [<c0107538>] [<c01e3473>] [<c0124c35>] [<c0124c8d>] [<c015a7de>] [<c0159e43>] [<c012e304>] [<c012d48b>] [<c012d4d7>] [<c01085b3>] dbench D 00000048 5420 90 60 91 89 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5652 91 60 92 90 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5660 92 60 93 91 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] dbench D 00000048 5592 93 60 92 (NOTLB) Call Trace: [<c019656d>] [<c0196baf>] [<c0196e70>] [<c0196f04>] [<c0196fa7>] [<c012e575>] [<c012e5fe>] [<c012f1ef>] [<c012faff>] [<c012ff4b>] [<c0124c35>] [<c012d91a>] [<c01085b3>] PID CMD 
WCHAN 1 init do_select 2 [keventd] context_thread 3 [ksoftirqd_CPU0] ksoftirqd 4 [kswapd] kswapd 5 [bdflush] bdflush 6 [kupdated] get_request_wait 7 [kreiserfsd] reiserfs_journal_commit_thread 25 /usr/sbin/syslog get_request_wait 27 /usr/sbin/klogd unix_wait_for_peer 32 [eth0] rtl8139_thread 41 /usr/sbin/sshd do_select 42 /sbin/agetty tty read_chan 43 /sbin/agetty -h read_chan 45 /usr/sbin/sshd do_select 46 -bash wait4 52 /usr/sbin/sshd do_select 53 -bash wait4 59 /bin/bash ./chk wait4 60 ./dbench 32 wait4 62 ./dbench 32 get_request_wait 63 ./dbench 32 down 64 ./dbench 32 get_request_wait 65 ./dbench 32 get_request_wait 66 ./dbench 32 get_request_wait 67 ./dbench 32 down 68 ./dbench 32 get_request_wait 69 ./dbench 32 down 70 ./dbench 32 get_request_wait 71 ./dbench 32 get_request_wait 72 ./dbench 32 get_request_wait 73 ./dbench 32 down 74 ./dbench 32 down 75 ./dbench 32 get_request_wait 76 ./dbench 32 get_request_wait 77 ./dbench 32 get_request_wait 78 ./dbench 32 get_request_wait 79 ./dbench 32 get_request_wait 80 ./dbench 32 get_request_wait 81 ./dbench 32 get_request_wait 82 ./dbench 32 get_request_wait 83 ./dbench 32 get_request_wait 84 ./dbench 32 get_request_wait 85 ./dbench 32 get_request_wait 86 ./dbench 32 get_request_wait 87 ./dbench 32 down 88 ./dbench 32 get_request_wait 89 ./dbench 32 down 90 ./dbench 32 get_request_wait 91 ./dbench 32 get_request_wait 92 ./dbench 32 get_request_wait 93 ./dbench 32 get_request_wait 97 ps -eo pid,cmd,w - SysRq : Show Regs Pid: 0, comm: swapper EIP: 0010:[<c0106c03>] CPU: 0 EFLAGS: 00000246 Not tainted EAX: 00000000 EBX: c0220000 ECX: d7d1a270 EDX: d7d1a270 ESI: c0106be0 EDI: ffffe000 EBP: 0008e000 DS: 0018 ES: 0018 CR0: 8005003b CR2: 080cc00c CR3: 17d02000 CR4: 00000090 Call Trace: [<c0106c67>] [<c0105000>] [<c0105027>] SysRq : Show Memory Mem-info: Free pages: 83640kB ( 0kB HighMem) Zone:DMA freepages: 14632kB min: 128kB low: 256kB high: 384kB Zone:Normal freepages: 69008kB min: 1020kB low: 2040kB high: 3060kB 
Zone:HighMem freepages: 0kB min: 0kB low: 0kB high: 0kB ( Active: 1576, inactive: 69884, free: 20910 ) 4*4kB 3*8kB 2*16kB 3*32kB 4*64kB 3*128kB 2*256kB 0*512kB 1*1024kB 6*2048kB = 14632kB) 10*4kB 3*8kB 1*16kB 2*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 33*2048kB = 69008kB) = 0kB) Swap cache: add 0, delete 0, find 0/0, race 0+0 Free swap: 136512kB 98304 pages of RAM 0 pages of HIGHMEM 1980 reserved pages 75748 pages shared 0 pages swap cached 0 pages in page table cache Buffer memory: 4252kB -- Randy Hron ^ permalink raw reply [flat|nested] 21+ messages in thread
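Archive note: the ps listing in the message above is easier to read once the wait channels are counted. A minimal sketch follows, assuming input in the shape of "ps -eo pid,cmd,wchan" output, where the last whitespace-separated field is the WCHAN and the command may itself contain spaces. The name tally_wchan and its command_filter parameter are hypothetical.

```python
from collections import Counter


def tally_wchan(ps_output, command_filter=None):
    """Count processes per wait channel in `ps -eo pid,cmd,wchan` style text.

    If command_filter is given, only commands containing that substring count.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        # Need at least pid, one command word, and wchan; skip the header row.
        if len(fields) < 3 or fields[0] == "PID":
            continue
        wchan = fields[-1]
        command = " ".join(fields[1:-1])
        if command_filter and command_filter not in command:
            continue
        counts[wchan] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "PID CMD WCHAN",
        "60 ./dbench 32 wait4",
        "62 ./dbench 32 get_request_wait",
        "63 ./dbench 32 down",
        "64 ./dbench 32 get_request_wait",
    ])
    for wchan, n in tally_wchan(sample, "dbench").most_common():
        print(n, wchan)
```

Taking the last field as the wait channel sidesteps the spaces inside commands such as "./dbench 32", at the cost of assuming the WCHAN column is never empty.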
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
2001-12-28 14:30 ` Jens Axboe
2001-12-28 17:49 ` rwhron
2001-12-28 19:29 ` rwhron
@ 2001-12-29 6:42 ` rwhron
2001-12-29 17:33 ` Jens Axboe
2 siblings, 1 reply; 21+ messages in thread
From: rwhron @ 2001-12-29 6:42 UTC (permalink / raw)
To: Jens Axboe; +Cc: viro, linux-kernel

> > Kernel panic: Out of memory and no killable processes...
>
> Someone else did report a similar case. Very strange, doesn't look bio

Al Viro posted a fix:
http://marc.theaimsgroup.com/?l=linux-kernel&m=100959128922157&w=2

I used Al's patch and 2.5.2-pre3 boots with reiserfs root_fs
and no panic.

Below is the trace on 2.5.2-pre3 after dbench 32 livelocked.

                         free                        sibling
  task             PC    stack   pid father child younger older
init S C177FF24 4592 1 0 45 3 (NOTLB)
Call Trace: [<c01115d9>] [<c0111500>] [<c0139d54>] [<c013a102>] [<c0133c46>] [<c01085b3>]
keventd S 00010000 6580 2 1 7 (L-TLB)
Call Trace: [<c011e3f5>] [<c0106ef0>]
ksoftirqd_CPU S C1770000 6396 3 0 4 1 (L-TLB)
Call Trace: [<c0117b12>] [<c0106ef0>]
kswapd S C176E000 6636 4 0 5 3 (L-TLB)
Call Trace: [<c0128716>] [<c0106ef0>]
bdflush S 00000286 6552 5 0 6 4 (L-TLB)
Call Trace: [<c0111c69>] [<c0130fb3>] [<c0106ef0>]
kupdated D C176BFAC 5864 6 0 5 (L-TLB)
Call Trace: [<c012e96a>] [<c012eb2b>] [<c0131023>] [<c0106ef0>]
kreiserfsd S D68E9FB4 6156 7 1 25 2 (L-TLB)
Call Trace: [<c01115d9>] [<c0111500>] [<c0111cbe>] [<c0177717>] [<c0106ef0>]
syslogd D 00000048 4772 25 1 27 7 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0166aa8>] [<c0125085>] [<c012df62>] [<c0124bec>] [<c012e06f>] [<c01085b3>]
klogd S 7FFFFFFF 4772 27 1 32 25 (NOTLB)
Call Trace: [<c011157b>] [<c01e6c4d>] [<c01e74e7>] [<c01b0d77>] [<c01b0f87>] [<c012dd7a>] [<c01085b3>]
eth0 S D646FF98 0 32 1 37 27 (L-TLB)
Call Trace: [<c01115d9>] [<c0111500>] [<c0111cbe>] [<c01a125e>] [<c0106ef0>]
iplog S 7FFFFFFF 5304 37 1 38 43 32 (NOTLB)
Call Trace: [<c011157b>] [<c0139bd1>] [<c0139d54>] [<c013a102>] [<c01085b3>]
iplog S D616DF28 188 38 37 41 (NOTLB)
Call Trace: [<c01115d9>] [<c0111500>] [<c013a37c>] [<c013a57d>] [<c011191c>] [<c01085b3>]
iplog S D6169FB0 5684 39 38 40 (NOTLB)
Call Trace: [<c0107767>] [<c01085b3>]
iplog S D6165F24 6280 40 38 41 39 (NOTLB)
Call Trace: [<c01115d9>] [<c0111500>] [<c0139d54>] [<c013a102>] [<c01085b3>]
iplog S 7FFFFFFF 5656 41 38 40 (NOTLB)
Call Trace: [<c011157b>] [<c01bed35>] [<c01b51e2>] [<c01b52fe>] [<c01e960f>] [<c01b0dd5>] [<c01b1b47>] [<c011b314>] [<c011b550>] [<c011bc78>] [<c01b6c4b>] [<c01b2267>] [<c01085b3>]
sshd S 7FFFFFFF 4772 43 1 55 44 37 (NOTLB)
Call Trace: [<c011157b>] [<c01b113d>] [<c0139d54>] [<c013a102>] [<c01085b3>]
agetty S 7FFFFFFF 4468 44 1 45 43 (NOTLB)
Call Trace: [<c011157b>] [<c0183a0d>] [<c017fc76>] [<c012dcb5>] [<c01085b3>]
agetty S 7FFFFFFF 0 45 1 44 (NOTLB)
Call Trace: [<c011157b>] [<c0183a0d>] [<c017fc76>] [<c012dcb5>] [<c01085b3>]
sshd S 7FFFFFFF 548 47 43 48 55 (NOTLB)
Call Trace: [<c011157b>] [<c0139d54>] [<c013a102>] [<c01085b3>]
bash S 00000000 4564 48 47 62 (NOTLB)
Call Trace: [<c0116b4e>] [<c01085b3>]
sshd S 7FFFFFFF 0 55 43 56 47 (NOTLB)
Call Trace: [<c011157b>] [<c0139d54>] [<c013a102>] [<c01085b3>]
bash S 7FFFFFFF 2640 56 55 (NOTLB)
Call Trace: [<c011157b>] [<c0183a0d>] [<c017fc76>] [<c012dcb5>] [<c01085b3>]
chk S 00000000 0 62 48 63 (NOTLB)
Call Trace: [<c0116b4e>] [<c01085b3>]
dbench S 00000000 5192 63 62 96 (NOTLB)
Call Trace: [<c0116b4e>] [<c01085b3>]
dbench D 00000048 5372 65 63 66 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5620 66 63 67 65 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c015974b>] [<c015a1fd>] [<c015c847>] [<c0137224>] [<c01372e5>] [<c01085b3>]
dbench D 00000000 5620 67 63 68 66 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5728 68 63 69 67 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5608 69 63 70 68 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000000 5948 70 63 71 69 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5572 71 63 72 70 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5264 72 63 73 71 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5464 73 63 74 72 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5728 74 63 75 73 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012f6e5>] [<c015b32a>] [<c012f974>] [<c012fb2d>] [<c012fd51>] [<c0130341>] [<c015b0d8>] [<c015b496>] [<c015b0d8>] [<c012503d>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5528 75 63 76 74 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5676 76 63 77 75 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5332 77 63 78 76 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5584 78 63 79 77 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000000 5644 79 63 80 78 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea8e>] [<c012f974>] [<c012fb73>] [<c012fd6e>] [<c012f745>] [<c012f629>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5620 80 63 81 79 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5600 81 63 82 80 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000000 5620 82 63 83 81 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5604 83 63 84 82 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5620 84 63 85 83 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5632 85 63 86 84 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5676 86 63 87 85 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5676 87 63 88 86 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5620 88 63 89 87 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5620 89 63 90 88 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012f6e5>] [<c015b32a>] [<c012f974>] [<c012fb2d>] [<c012fd51>] [<c0130341>] [<c015b0d8>] [<c015b496>] [<c015b0d8>] [<c012503d>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5728 90 63 91 89 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5628 91 63 92 90 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000000 5676 92 63 93 91 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>]
dbench D 00000000 5948 93 63 94 92 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000000 5692 94 63 95 93 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000000 5488 95 63 96 94 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]
dbench D 00000048 5372 96 63 95 (NOTLB)
Call Trace: [<c0196a06>] [<c019708f>] [<c0197340>] [<c01973dc>] [<c019747f>] [<c012e9d5>] [<c012ea5e>] [<c012f66f>] [<c012ff5f>] [<c01303ab>] [<c0125085>] [<c012dd7a>] [<c01085b3>]

SysRq : Show Regs

Pid: 0, comm: swapper
EIP: 0010:[<c0106c03>] CPU: 0 EFLAGS: 00000246 Not tainted
EAX: 00000000 EBX: c022e000 ECX: d68e8280 EDX: d68e8280
ESI: c0106be0 EDI: ffffe000 EBP: 0008e000 DS: 0018 ES: 0018
CR0: 8005003b CR2: 40014000 CR3: 16177000 CR4: 00000090
Call Trace: [<c0106c59>] [<c0105000>] [<c0105027>]

SysRq : Show Memory
Mem-info:
Free pages: 95596kB ( 0kB HighMem)
Zone:DMA freepages: 14572kB min: 128kB low: 256kB high: 384kB
Zone:Normal freepages: 81024kB min: 1020kB low: 2040kB high: 3060kB
Zone:HighMem freepages: 0kB min: 0kB low: 0kB high: 0kB
( Active: 1427, inactive: 67074, free: 23899 )
3*4kB 2*8kB 1*16kB 2*32kB 2*64kB 4*128kB 2*256kB 0*512kB 1*1024kB 6*2048kB = 14572kB)
10*4kB 3*8kB 4*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 39*2048kB = 81024kB)
= 0kB)
Swap cache: add 0, delete 0, find 0/0, race 0+0
Free swap: 136512kB
98304 pages of RAM
0 pages of HIGHMEM
1995 reserved pages
72913 pages shared
0 pages swap cached
0 pages in page table cache
Buffer memory: 23796kB

mountain:~/dbench$ ps -eo pid,cmd,wchan
  PID CMD              WCHAN
    1 init             do_select
    2 [keventd]        context_thread
    3 [ksoftirqd_CPU0] ksoftirqd
    4 [kswapd]         kswapd
    5 [bdflush]        bdflush
    6 [kupdated]       wait_on_buffer
    7 [kreiserfsd]     reiserfs_journal_commit_thread
   25 /usr/sbin/syslog get_request_wait
   27 /usr/sbin/klogd  unix_wait_for_peer
   32 [eth0]           rtl8139_thread
   37 /usr/sbin/iplog  do_select
   38 /usr/sbin/iplog  do_poll
   39 /usr/sbin/iplog  rt_sigsuspend
   40 /usr/sbin/iplog  do_select
   41 /usr/sbin/iplog  wait_for_packet
   43 /usr/sbin/sshd   do_select
   44 /sbin/agetty tty read_chan
   45 /sbin/agetty -h  read_chan
   47 /usr/sbin/sshd   -
   48 -bash            wait4
   55 /usr/sbin/sshd   do_select
   56 -bash            read_chan
   65 ./dbench 32      get_request_wait
   66 ./dbench 32      get_request_wait
   67 ./dbench 32      get_request_wait
   68 ./dbench 32      get_request_wait
   69 ./dbench 32      get_request_wait
   70 ./dbench 32      get_request_wait
   71 ./dbench 32      get_request_wait
   72 ./dbench 32      get_request_wait
   73 ./dbench 32      get_request_wait
   74 ./dbench 32      get_request_wait
   75 ./dbench 32      get_request_wait
   76 ./dbench 32      get_request_wait
   77 ./dbench 32      get_request_wait
   78 ./dbench 32      get_request_wait
   79 ./dbench 32      get_request_wait
   80 ./dbench 32      get_request_wait
   81 ./dbench 32      get_request_wait
   82 ./dbench 32      get_request_wait
   83 ./dbench 32      get_request_wait
   84 ./dbench 32      get_request_wait
   85 ./dbench 32      get_request_wait
   86 ./dbench 32      get_request_wait
   87 ./dbench 32      get_request_wait
   88 ./dbench 32      get_request_wait
   89 ./dbench 32      get_request_wait
   90 ./dbench 32      get_request_wait
   91 ./dbench 32      get_request_wait
   92 ./dbench 32      get_request_wait
   93 ./dbench 32      get_request_wait
   94 ./dbench 32      get_request_wait
   95 ./dbench 32      get_request_wait
   96 ./dbench 32      get_request_wait
   97 ps -eo pid,cmd,w -

dbench was run on the ext2 filesystem.

mountain:~/dbench$ df -kT .
Filesystem    Type   1k-blocks      Used Available Use% Mounted on
/dev/hda6     ext2     4032092    249208   3578060   7% /home

--
Randy Hron

* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
2001-12-29 6:42 ` rwhron
@ 2001-12-29 17:33 ` Jens Axboe
2001-12-29 17:48 ` Jens Axboe
0 siblings, 1 reply; 21+ messages in thread
From: Jens Axboe @ 2001-12-29 17:33 UTC (permalink / raw)
To: rwhron; +Cc: viro, linux-kernel

On Sat, Dec 29 2001, rwhron@earthlink.net wrote:
> > > Kernel panic: Out of memory and no killable processes...
> >
> > Someone else did report a similar case. Very strange, doesn't look bio
>
> Al Viro posted a fix:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=100959128922157&w=2
>
> I used Al's patch and 2.5.2-pre3 boots with reiserfs root_fs
> and no panic.
>
> Below is the trace on 2.5.2-pre3 after dbench 32 livelocked.

Thanks, could you try with this patch? It's not a fix (haven't found the
bug yet), but I think we are looking at list corruption so please check
if this patch at least alters when it hangs etc.

--- /opt/kernel/linux-2.5.2-pre3/drivers/block/elevator.c	Sat Dec 29 12:17:53 2001
+++ drivers/block/elevator.c	Sat Dec 29 12:30:20 2001
@@ -142,7 +142,7 @@
 int elevator_linus_merge(request_queue_t *q, struct request **req,
 			 struct bio *bio)
 {
-	struct list_head *entry;
+	struct list_head *entry, *head = &q->queue_head;
 	struct request *__rq;
 	int ret;

@@ -160,17 +160,22 @@
 		}
 	}

+	if ((__rq = __elv_next_request(q)))
+		if (__rq->flags & REQ_STARTED)
+			head = head->next;
+
 	entry = &q->queue_head;
 	ret = ELEVATOR_NO_MERGE;
-	while ((entry = entry->prev) != &q->queue_head) {
+	while ((entry = entry->prev) != head) {
 		__rq = list_entry_rq(entry);

+		if (__rq->flags & (REQ_BARRIER | REQ_STARTED))
+			break;
+
 		/*
 		 * simply "aging" of requests in queue
 		 */
 		if (__rq->elevator_sequence-- <= 0)
-			break;
-		if (__rq->flags & (REQ_BARRIER | REQ_STARTED))
 			break;
 		if (!(__rq->flags & REQ_CMD))
 			continue;

--
Jens Axboe

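The scan order in the elevator.c patch can be modeled in user space. What follows is a minimal sketch, not kernel code: the struct layout, the flag values, the nr_sectors field, and the helpers list_init, list_add_tail, and scan_for_back_merge are all stand-ins assumed for illustration. Only the loop shape follows the patch: skip a started request at the head of the queue, walk backward from the tail, and test the barrier/started flags before the aging counter is decremented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; only the names echo the patch. */
enum { REQ_CMD = 1 << 0, REQ_STARTED = 1 << 1, REQ_BARRIER = 1 << 2 };

struct list_head { struct list_head *next, *prev; };

/* Simplified stand-in for the kernel's struct request. */
struct request {
	struct list_head queuelist;	/* kept first so list_entry_rq() can cast */
	unsigned int flags;
	int elevator_sequence;
	long sector;
	long nr_sectors;
};

static void list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

static struct request *list_entry_rq(struct list_head *entry)
{
	/* Valid only because queuelist is the first member of struct request. */
	return (struct request *)entry;
}

/*
 * Walk the queue from tail to head looking for a request that ends exactly
 * at `sector` (a back-merge candidate). Returns NULL when none is found.
 */
static struct request *scan_for_back_merge(struct list_head *queue_head,
					   long sector)
{
	struct list_head *head = queue_head;
	struct list_head *entry = queue_head;
	struct request *rq;

	/* A started request at the front is owned by the driver: exclude it. */
	if (queue_head->next != queue_head &&
	    (list_entry_rq(queue_head->next)->flags & REQ_STARTED))
		head = head->next;

	while ((entry = entry->prev) != head) {
		rq = list_entry_rq(entry);

		/* Flag check comes first, so aging is not applied past a barrier. */
		if (rq->flags & (REQ_BARRIER | REQ_STARTED))
			break;

		/* Simple aging: stop once a request has been passed too often. */
		if (rq->elevator_sequence-- <= 0)
			break;
		if (!(rq->flags & REQ_CMD))
			continue;
		if (rq->sector + rq->nr_sectors == sector)
			return rq;
	}
	return NULL;
}
```

In this model a queue holding one started request followed by two ordinary requests never offers the started request as a merge target, and a barrier at the tail stops the scan before any aging counter is touched.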
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
2001-12-29 17:33 ` Jens Axboe
@ 2001-12-29 17:48 ` Jens Axboe
2001-12-29 19:43 ` rwhron
0 siblings, 1 reply; 21+ messages in thread
From: Jens Axboe @ 2001-12-29 17:48 UTC (permalink / raw)
To: rwhron; +Cc: viro, linux-kernel

On Sat, Dec 29 2001, Jens Axboe wrote:
> On Sat, Dec 29 2001, rwhron@earthlink.net wrote:
> > > > Kernel panic: Out of memory and no killable processes...
> > >
> > > Someone else did report a similar case. Very strange, doesn't look bio
> >
> > Al Viro posted a fix:
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=100959128922157&w=2
> >
> > I used Al's patch and 2.5.2-pre3 boots with reiserfs root_fs
> > and no panic.
> >
> > Below is the trace on 2.5.2-pre3 after dbench 32 livelocked.
>
> Thanks, could you try with this patch? It's not a fix (haven't found the
> bug yet), but I think we are looking at list corruption so please check
> if this patch at least alters when it hangs etc.

Ah I think I got it -- appears to be down to no rechecking for empty
queue after a potential queue_lock droppage (busy I/O, no request left
get_request returns NULL, drop lock and run get_request_wait). This
explains the get_request_wait deadlock, compiling right now...

--- /opt/kernel/linux-2.5.2-pre3/drivers/block/ll_rw_blk.c	Sat Dec 29 12:17:53 2001
+++ drivers/block/ll_rw_blk.c	Sat Dec 29 12:45:04 2001
@@ -881,7 +881,9 @@

 	BUG_ON(rw != READ && rw != WRITE);

+	spin_lock_irq(q->queue_lock);
 	rq = get_request(q, rw);
+	spin_unlock_irq(q->queue_lock);

 	if (!rq && (gfp_mask & __GFP_WAIT))
 		rq = get_request_wait(q, rw);
@@ -1081,7 +1083,7 @@
 {
 	struct request *req, *freereq = NULL;
 	int el_ret, latency = 0, rw, nr_sectors, cur_nr_sectors, barrier;
-	struct list_head *insert_here = &q->queue_head;
+	struct list_head *insert_here;
 	elevator_t *elevator = &q->elevator;
 	sector_t sector;

@@ -1103,15 +1105,14 @@
 	barrier = test_bit(BIO_RW_BARRIER, &bio->bi_rw);

 	spin_lock_irq(q->queue_lock);
+again:
+	req = NULL;
+	insert_here = q->queue_head.prev;

 	if (blk_queue_empty(q) || barrier) {
 		blk_plug_device(q);
 		goto get_rq;
 	}
-
-again:
-	req = NULL;
-	insert_here = q->queue_head.prev;

 	el_ret = elevator->elevator_merge_fn(q, &req, bio);
 	switch (el_ret) {

--
Jens Axboe

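The effect of moving the again: label above the empty-queue check can be shown with a single-threaded toy model. This is a sketch under stated assumptions, not the kernel's __make_request: struct toy_queue, toy_wait_for_request, and toy_submit are hypothetical names, and the model assumes that while the submitter sleeps with the lock dropped, the last in-flight request completes and leaves the queue empty. It reflects one plausible reading of the explanation above: a request added to an empty queue without the plug step leaves nothing to start the queue again.

```c
#include <assert.h>

/* Toy stand-in for a request queue; all fields are illustrative. */
struct toy_queue {
	int nr_queued;	/* requests currently on the queue */
	int nr_free;	/* free request slots */
	int plugged;	/* set when the empty-queue path "plugs" the device */
};

/*
 * Stand-in for sleeping in get_request_wait() with the lock dropped.
 * Assumption of this model: the in-flight request completes meanwhile,
 * emptying the queue and freeing one slot.
 */
static void toy_wait_for_request(struct toy_queue *q)
{
	q->nr_queued = 0;
	q->nr_free += 1;
}

/*
 * Submit one request. With recheck_after_wait == 0 the empty-queue test
 * runs only once, before any wait (the pre-patch label placement). With
 * recheck_after_wait != 0 it runs again after every wait (post-patch).
 * Returns whether the device was plugged when the request was queued.
 */
static int toy_submit(struct toy_queue *q, int recheck_after_wait)
{
	int checked = 0;

	for (;;) {
		if (!checked || recheck_after_wait) {
			if (q->nr_queued == 0)
				q->plugged = 1;	/* blk_plug_device() stand-in */
			checked = 1;
		}
		if (q->nr_free > 0)
			break;
		toy_wait_for_request(q);	/* queue state may change here */
	}
	q->nr_free--;
	q->nr_queued++;
	return q->plugged;
}
```

Starting from a busy queue with no free slots, the variant without the recheck queues its request on a now-empty queue that was never plugged, while the variant with the recheck notices the empty queue and plugs it first.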
* Re: 2.5.2-pre1 dbench 32 hangs in vmstat "b" state
2001-12-29 17:48 ` Jens Axboe
@ 2001-12-29 19:43 ` rwhron
0 siblings, 0 replies; 21+ messages in thread
From: rwhron @ 2001-12-29 19:43 UTC (permalink / raw)
To: Jens Axboe; +Cc: rwhron, viro, linux-kernel

On Sat, Dec 29, 2001 at 06:48:37PM +0100, Jens Axboe wrote:
> Ah I think I got it -- appears to be down to no rechecking for empty
> queue after a potential queue_lock droppage (busy I/O, no request left
> get_request returns NULL, drop lock and run get_request_wait). This
> explains the get_request_wait deadlock, compiling right now...
>
> --
> Jens Axboe

Two thumbs up!!

With your ll_rw_blk.c and elevator.c patches, 2.5.2-pre3 completes
dbench 32, 128.

I'm running a more complete battery of tests and will let you know
if there are any unusual results.

Thanks!
--
Randy Hron