* 3.5-rc6 printk formatting problem during oom-kill. @ 2012-07-09 18:03 Dave Jones 2012-07-09 18:27 ` Kay Sievers 0 siblings, 1 reply; 19+ messages in thread From: Dave Jones @ 2012-07-09 18:03 UTC (permalink / raw) To: Linux Kernel; +Cc: Kay Sievers, Linus Torvalds I noticed that the format of the oom-killer output seems to have changed, and now it spews stuff like.. [49461.758070] lowmem_reserve[]: [49461.758071] 0 [49461.758071] 2643 [49461.758071] 3878 [49461.758072] 3878 [49461.758072] [49461.758072] Node 0 [49461.758075] DMA32 free:50432kB min:45888kB low:57360kB high:68832kB active_anon:1980216kB inactive_anon:494260kB active_file:276kB inactive_file:1064kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2707204kB mlocked:0kB dirty:0kB writeback:82756kB mapped:0kB shmem:0kB slab_reclaimable:15392kB slab_unreclaimable:118172kB kernel_stack:12672kB pagetables:10744kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:9756 all_unreclaimable? yes [49461.758076] lowmem_reserve[]: [49461.758076] 0 [49461.758076] 0 [49461.758077] 1234 [49461.758077] 1234 [49461.758077] [49461.758078] Node 0 [49461.758080] Normal free:21208kB min:21424kB low:26780kB high:32136kB active_anon:448068kB inactive_anon:446924kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):972kB isolated(file):0kB present:1264032kB mlocked:0kB dirty:0kB writeback:19580kB mapped:516kB shmem:5532kB slab_reclaimable:34960kB slab_unreclaimable:110764kB kernel_stack:54392kB pagetables:11596kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:16064 all_unreclaimable? yes [49461.758080] lowmem_reserve[]: [49461.758080] 0 [49461.758081] 0 [49461.758081] 0 [49461.758081] 0 [49461.758082] [49461.758082] Node 0 [49461.758082] DMA: [49461.758085] 0*4kB [49461.758086] 0*8kB [49461.758086] 3*16kB [49461.758086] 1*32kB [49461.758087] 3*64kB [49461.758087] 3*128kB [49461.758087] 3*256kB [49461.758088] 2*512kB [49461.758088] 3*1024kB [49461.758089] 3*2048kB [49461.758089] 1*4096kB [49461.758089] = 15760kB [49461.758090] Node 0 [49461.758090] DMA32: [49461.758091] 166*4kB [49461.758091] 74*8kB [49461.758092] 162*16kB [49461.758092] 314*32kB [49461.758093] 337*64kB [49461.758093] 82*128kB [49461.758094] 8*256kB [49461.758094] 1*512kB [49461.758094] 1*1024kB [49461.758095] 0*2048kB [49461.758095] 0*4096kB [49461.758096] = 49544kB [49461.758096] Node 0 [49461.758096] Normal: [49461.758097] 847*4kB [49461.758097] 1834*8kB [49461.758098] 0*16kB [49461.758098] 5*32kB [49461.758098] 3*64kB [49461.758099] 2*128kB [49461.758099] 1*256kB [49461.758100] 0*512kB [49461.758100] 0*1024kB [49461.758100] 1*2048kB [49461.758101] 0*4096kB [49461.758101] = 20972kB Does the oom-killer code need modifying, or the printk code ? I know there's been some regressions in this area recently, but this is still happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) Dave ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 18:03 3.5-rc6 printk formatting problem during oom-kill Dave Jones @ 2012-07-09 18:27 ` Kay Sievers 2012-07-09 18:48 ` Kay Sievers 0 siblings, 1 reply; 19+ messages in thread From: Kay Sievers @ 2012-07-09 18:27 UTC (permalink / raw) To: Dave Jones, Linux Kernel, Kay Sievers, Linus Torvalds, Greg Kroah-Hartmann On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote: > I noticed that the format of the oom-killer output seems to have changed, and > now it spews stuff like.. > > [49461.758070] lowmem_reserve[]: > [49461.758071] 0 > [49461.758071] 2643 > [49461.758071] 3878 > [49461.758072] 3878 > [49461.758072] > [49461.758072] Node 0 > Does the oom-killer code need modifying, or the printk code ? > I know there's been some regressions in this area recently, but this is still > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) This likely fixes it: http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD Let me check if it does, and if I can reproduce it. Thanks, Kay ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 18:27 ` Kay Sievers @ 2012-07-09 18:48 ` Kay Sievers 2012-07-09 20:31 ` Joe Perches 2012-07-12 0:54 ` Dave Jones 0 siblings, 2 replies; 19+ messages in thread From: Kay Sievers @ 2012-07-09 18:48 UTC (permalink / raw) To: Dave Jones; +Cc: Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote: > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote: > > I noticed that the format of the oom-killer output seems to have changed, and > > now it spews stuff like.. > > > > [49461.758070] lowmem_reserve[]: > > [49461.758071] 0 > > [49461.758071] 2643 > > [49461.758071] 3878 > > [49461.758072] 3878 > > [49461.758072] > > [49461.758072] Node 0 > > > Does the oom-killer code need modifying, or the printk code ? > > I know there's been some regressions in this area recently, but this is still > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) > > This likely fixes it: > http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD > > Let me check if it does, and if I can reproduce it. It looks fine here with the above mentioned patch: [ 0.000000] lowmem_reserve[]: [ 0.000000] 0 [ 0.000000] 0 [ 0.000000] 0 [ 0.000000] 0 [ 0.000000] [ 0.000000] DMA: [ 0.000000] 1*4kB [ 0.000000] 0*8kB [ 0.000000] 0*16kB [ 0.000000] 1*32kB [ 0.000000] 2*64kB [ 0.000000] 1*128kB [ 0.000000] 1*256kB [ 0.000000] 0*512kB [ 0.000000] 1*1024kB [ 0.000000] 1*2048kB [ 0.000000] 3*4096kB [ 0.000000] = 15908kB becomes: [ 0.000000] lowmem_reserve[]: 0 0 0 0 [ 0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB Thanks, Kay ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 18:48 ` Kay Sievers @ 2012-07-09 20:31 ` Joe Perches 2012-07-09 20:40 ` Greg Kroah-Hartmann 2012-07-12 0:54 ` Dave Jones 1 sibling, 1 reply; 19+ messages in thread From: Joe Perches @ 2012-07-09 20:31 UTC (permalink / raw) To: Kay Sievers; +Cc: Dave Jones, Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann On Mon, 2012-07-09 at 20:48 +0200, Kay Sievers wrote: > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote: > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote: > > > I noticed that the format of the oom-killer output seems to have changed, and > > > now it spews stuff like.. > > > > > > [49461.758070] lowmem_reserve[]: > > > [49461.758071] 0 > > > [49461.758071] 2643 > > > [49461.758071] 3878 > > > [49461.758072] 3878 > > > [49461.758072] > > > [49461.758072] Node 0 > > > > > Does the oom-killer code need modifying, or the printk code ? > > > I know there's been some regressions in this area recently, but this is still > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) > > > > This likely fixes it: > > http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD > > > > Let me check if it does, and if I can reproduce it. > > It looks fine here with the above mentioned patch: > [ 0.000000] lowmem_reserve[]: > [ 0.000000] 0 > [ 0.000000] 0 > [ 0.000000] 0 > [ 0.000000] 0 > [ 0.000000] > [ 0.000000] DMA: > [ 0.000000] 1*4kB > [ 0.000000] 0*8kB > [ 0.000000] 0*16kB > [ 0.000000] 1*32kB > [ 0.000000] 2*64kB > [ 0.000000] 1*128kB > [ 0.000000] 1*256kB > [ 0.000000] 0*512kB > [ 0.000000] 1*1024kB > [ 0.000000] 1*2048kB > [ 0.000000] 3*4096kB > [ 0.000000] = 15908kB > > becomes: > [ 0.000000] lowmem_reserve[]: 0 0 0 0 > [ 0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB Hi Kay. That single patch doesn't apply cleanly to Linus' 8c84bf4166a4698296342841a549bbee03860ac0 What else is necessary? Your tree seems to have a collection of random patches. It might be useful to clone Linus' tree and produce a branch with all the necessary printk patches in it so someone else could pull it. cheers, Joe ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 20:31 ` Joe Perches @ 2012-07-09 20:40 ` Greg Kroah-Hartmann 2012-07-09 20:44 ` Joe Perches 0 siblings, 1 reply; 19+ messages in thread From: Greg Kroah-Hartmann @ 2012-07-09 20:40 UTC (permalink / raw) To: Joe Perches; +Cc: Kay Sievers, Dave Jones, Linux Kernel, Linus Torvalds On Mon, Jul 09, 2012 at 01:31:56PM -0700, Joe Perches wrote: > On Mon, 2012-07-09 at 20:48 +0200, Kay Sievers wrote: > > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote: > > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote: > > > > I noticed that the format of the oom-killer output seems to have changed, and > > > > now it spews stuff like.. > > > > > > > > [49461.758070] lowmem_reserve[]: > > > > [49461.758071] 0 > > > > [49461.758071] 2643 > > > > [49461.758071] 3878 > > > > [49461.758072] 3878 > > > > [49461.758072] > > > > [49461.758072] Node 0 > > > > > > > Does the oom-killer code need modifying, or the printk code ? > > > > I know there's been some regressions in this area recently, but this is still > > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) > > > > > > This likely fixes it: > > > http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD > > > > > > Let me check if it does, and if I can reproduce it. > > > > It looks fine here with the above mentioned patch: > > [ 0.000000] lowmem_reserve[]: > > [ 0.000000] 0 > > [ 0.000000] 0 > > [ 0.000000] 0 > > [ 0.000000] 0 > > [ 0.000000] > > [ 0.000000] DMA: > > [ 0.000000] 1*4kB > > [ 0.000000] 0*8kB > > [ 0.000000] 0*16kB > > [ 0.000000] 1*32kB > > [ 0.000000] 2*64kB > > [ 0.000000] 1*128kB > > [ 0.000000] 1*256kB > > [ 0.000000] 0*512kB > > [ 0.000000] 1*1024kB > > [ 0.000000] 1*2048kB > > [ 0.000000] 3*4096kB > > [ 0.000000] = 15908kB > > > > becomes: > > [ 0.000000] lowmem_reserve[]: 0 0 0 0 > > [ 0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB > > Hi Kay. > > That single patch doesn't apply cleanly to Linus' > 8c84bf4166a4698296342841a549bbee03860ac0 > > What else is necessary? > > Your tree seems to have a collection of random patches. > > It might be useful to clone Linus' tree and produce a > branch with all the necessary printk patches in it so > someone else could pull it. They should all now be in my driver-core-next branch that will show up in the next linux-next release, so having a separate tree isn't necessary. thanks, greg k-h ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 20:40 ` Greg Kroah-Hartmann @ 2012-07-09 20:44 ` Joe Perches 2012-07-09 20:48 ` Kay Sievers 2012-07-09 20:56 ` Greg Kroah-Hartmann 0 siblings, 2 replies; 19+ messages in thread From: Joe Perches @ 2012-07-09 20:44 UTC (permalink / raw) To: Greg Kroah-Hartmann; +Cc: Kay Sievers, Dave Jones, Linux Kernel, Linus Torvalds On Mon, 2012-07-09 at 13:40 -0700, Greg Kroah-Hartmann wrote: > On Mon, Jul 09, 2012 at 01:31:56PM -0700, Joe Perches wrote: > > On Mon, 2012-07-09 at 20:48 +0200, Kay Sievers wrote: > > > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote: > > > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote: > > > > > I noticed that the format of the oom-killer output seems to have changed, and > > > > > now it spews stuff like.. > > > > > > > > > > [49461.758070] lowmem_reserve[]: > > > > > [49461.758071] 0 > > > > > [49461.758071] 2643 > > > > > [49461.758071] 3878 > > > > > [49461.758072] 3878 > > > > > [49461.758072] > > > > > [49461.758072] Node 0 > > > > > > > > > Does the oom-killer code need modifying, or the printk code ? > > > > > I know there's been some regressions in this area recently, but this is still > > > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) > > > > > > > > This likely fixes it: > > > > http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD > > > > > > > > Let me check if it does, and if I can reproduce it. > > > > > > It looks fine here with the above mentioned patch: > > > [ 0.000000] lowmem_reserve[]: > > > [ 0.000000] 0 > > > [ 0.000000] 0 > > > [ 0.000000] 0 > > > [ 0.000000] 0 > > > [ 0.000000] > > > [ 0.000000] DMA: > > > [ 0.000000] 1*4kB > > > [ 0.000000] 0*8kB > > > [ 0.000000] 0*16kB > > > [ 0.000000] 1*32kB > > > [ 0.000000] 2*64kB > > > [ 0.000000] 1*128kB > > > [ 0.000000] 1*256kB > > > [ 0.000000] 0*512kB > > > [ 0.000000] 1*1024kB > > > [ 0.000000] 1*2048kB > > > [ 0.000000] 3*4096kB > > > [ 0.000000] = 15908kB > > > > > > becomes: > > > [ 0.000000] lowmem_reserve[]: 0 0 0 0 > > > [ 0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB > > > > Hi Kay. > > > > That single patch doesn't apply cleanly to Linus' > > 8c84bf4166a4698296342841a549bbee03860ac0 > > > > What else is necessary? > > > > Your tree seems to have a collection of random patches. > > > > It might be useful to clone Linus' tree and produce a > > branch with all the necessary printk patches in it so > > someone else could pull it. > > They should all now be in my driver-core-next branch that will show up > in the next linux-next release, so having a separate tree isn't > necessary. I don't think so. There are real defects in the existing code. These are patches that are necessary _now_. not for a -next 3.6 future. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 20:44 ` Joe Perches @ 2012-07-09 20:48 ` Kay Sievers 2012-07-09 21:08 ` Joe Perches 2012-07-09 20:56 ` Greg Kroah-Hartmann 1 sibling, 1 reply; 19+ messages in thread From: Kay Sievers @ 2012-07-09 20:48 UTC (permalink / raw) To: Joe Perches; +Cc: Greg Kroah-Hartmann, Dave Jones, Linux Kernel, Linus Torvalds On Mon, Jul 9, 2012 at 10:44 PM, Joe Perches <joe@perches.com> wrote: >> > That single patch doesn't apply cleanly to Linus' >> > 8c84bf4166a4698296342841a549bbee03860ac0 >> > >> > What else is necessary? >> > >> > Your tree seems to have a collection of random patches. >> > >> > It might be useful to clone Linus' tree and produce a >> > branch with all the necessary printk patches in it so >> > someone else could pull it. I just worked on top of Greg's tree with the pending stuff for 3.5. >> They should all now be in my driver-core-next branch that will show up >> in the next linux-next release, so having a separate tree isn't >> necessary. > > I don't think so. > > There are real defects in the existing code. > > These are patches that are necessary _now_. > not for a -next 3.6 future. Wrong conclusion. They are not in the 3.6 branch, but still go into -next: http://git.kernel.org/?p=linux/kernel/git/gregkh/driver-core.git;a=shortlog;h=refs/heads/driver-core-linus Kay ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 20:48 ` Kay Sievers @ 2012-07-09 21:08 ` Joe Perches 0 siblings, 0 replies; 19+ messages in thread From: Joe Perches @ 2012-07-09 21:08 UTC (permalink / raw) To: Kay Sievers; +Cc: Greg Kroah-Hartmann, Dave Jones, Linux Kernel, Linus Torvalds On Mon, 2012-07-09 at 22:48 +0200, Kay Sievers wrote: > On Mon, Jul 9, 2012 at 10:44 PM, Joe Perches <joe@perches.com> wrote: > > >> > That single patch doesn't apply cleanly to Linus' > >> > 8c84bf4166a4698296342841a549bbee03860ac0 > >> > > >> > What else is necessary? > >> > > >> > Your tree seems to have a collection of random patches. > >> > > >> > It might be useful to clone Linus' tree and produce a > >> > branch with all the necessary printk patches in it so > >> > someone else could pull it. > > I just worked on top of Greg's tree with the pending stuff for 3.5. No worries Kay, as long as there's a tree with just printk patches, it'll all work out. > >> They should all now be in my driver-core-next branch that will show up > >> in the next linux-next release, so having a separate tree isn't > >> necessary. > > > > I don't think so. > > > > There are real defects in the existing code. > > > > These are patches that are necessary _now_. > > not for a -next 3.6 future. > > Wrong conclusion. They are not in the 3.6 branch, but still go into -next: > http://git.kernel.org/?p=linux/kernel/git/gregkh/driver-core.git;a=shortlog;h=refs/heads/driver-core-linus That's the drivers-core-linus branch not the -next one. Greg said -next. Anyway, it's not a problem, Greg just mistyped. I do that too. Sometimes too often. cheers, Joe ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 20:44 ` Joe Perches 2012-07-09 20:48 ` Kay Sievers @ 2012-07-09 20:56 ` Greg Kroah-Hartmann 2012-07-11 0:11 ` Dave Jones 1 sibling, 1 reply; 19+ messages in thread From: Greg Kroah-Hartmann @ 2012-07-09 20:56 UTC (permalink / raw) To: Joe Perches; +Cc: Kay Sievers, Dave Jones, Linux Kernel, Linus Torvalds On Mon, Jul 09, 2012 at 01:44:24PM -0700, Joe Perches wrote: > On Mon, 2012-07-09 at 13:40 -0700, Greg Kroah-Hartmann wrote: > > On Mon, Jul 09, 2012 at 01:31:56PM -0700, Joe Perches wrote: > > > On Mon, 2012-07-09 at 20:48 +0200, Kay Sievers wrote: > > > > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote: > > > > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote: > > > > > > I noticed that the format of the oom-killer output seems to have changed, and > > > > > > now it spews stuff like.. > > > > > > > > > > > > [49461.758070] lowmem_reserve[]: > > > > > > [49461.758071] 0 > > > > > > [49461.758071] 2643 > > > > > > [49461.758071] 3878 > > > > > > [49461.758072] 3878 > > > > > > [49461.758072] > > > > > > [49461.758072] Node 0 > > > > > > > > > > > Does the oom-killer code need modifying, or the printk code ? > > > > > > I know there's been some regressions in this area recently, but this is still > > > > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) > > > > > > > > > > This likely fixes it: > > > > > http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD > > > > > > > > > > Let me check if it does, and if I can reproduce it. > > > > > > > > It looks fine here with the above mentioned patch: > > > > [ 0.000000] lowmem_reserve[]: > > > > [ 0.000000] 0 > > > > [ 0.000000] 0 > > > > [ 0.000000] 0 > > > > [ 0.000000] 0 > > > > [ 0.000000] > > > > [ 0.000000] DMA: > > > > [ 0.000000] 1*4kB > > > > [ 0.000000] 0*8kB > > > > [ 0.000000] 0*16kB > > > > [ 0.000000] 1*32kB > > > > [ 0.000000] 2*64kB > > > > [ 0.000000] 1*128kB > > > > [ 0.000000] 1*256kB > > > > [ 0.000000] 0*512kB > > > > [ 0.000000] 1*1024kB > > > > [ 0.000000] 1*2048kB > > > > [ 0.000000] 3*4096kB > > > > [ 0.000000] = 15908kB > > > > > > > > becomes: > > > > [ 0.000000] lowmem_reserve[]: 0 0 0 0 > > > > [ 0.000000] DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB > > > > > > Hi Kay. > > > > > > That single patch doesn't apply cleanly to Linus' > > > 8c84bf4166a4698296342841a549bbee03860ac0 > > > > > > What else is necessary? > > > > > > Your tree seems to have a collection of random patches. > > > > > > It might be useful to clone Linus' tree and produce a > > > branch with all the necessary printk patches in it so > > > someone else could pull it. > > > > They should all now be in my driver-core-next branch that will show up > > in the next linux-next release, so having a separate tree isn't > > necessary. > > I don't think so. > > There are real defects in the existing code. > > These are patches that are necessary _now_. > not for a -next 3.6 future. Oops, sorry, I ment to type, "driver-core-linus" above. greg k-h ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 20:56 ` Greg Kroah-Hartmann @ 2012-07-11 0:11 ` Dave Jones 2012-07-11 0:15 ` Greg Kroah-Hartmann 0 siblings, 1 reply; 19+ messages in thread From: Dave Jones @ 2012-07-11 0:11 UTC (permalink / raw) To: Greg Kroah-Hartmann Cc: Joe Perches, Kay Sievers, Linux Kernel, Linus Torvalds On Mon, Jul 09, 2012 at 01:56:31PM -0700, Greg Kroah-Hartman wrote: > > > > That single patch doesn't apply cleanly to Linus' > > > > 8c84bf4166a4698296342841a549bbee03860ac0 > > > > > > > > What else is necessary? > > > > > > > > Your tree seems to have a collection of random patches. > > > > > > > > It might be useful to clone Linus' tree and produce a > > > > branch with all the necessary printk patches in it so > > > > someone else could pull it. > > > > > > They should all now be in my driver-core-next branch that will show up > > > in the next linux-next release, so having a separate tree isn't > > > necessary. > > > > I don't think so. > > > > There are real defects in the existing code. > > > > These are patches that are necessary _now_. > > not for a -next 3.6 future. > > Oops, sorry, I ment to type, "driver-core-linus" above. Can you get this to Linus asap please ? I just had an oops that scrolled off the screen because the Code: line suffers the same problem. Thankfully serial console caught most of it, but I had to hand-edit a lot of crap just to make it readable again. Dave ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-11 0:11 ` Dave Jones @ 2012-07-11 0:15 ` Greg Kroah-Hartmann 0 siblings, 0 replies; 19+ messages in thread From: Greg Kroah-Hartmann @ 2012-07-11 0:15 UTC (permalink / raw) To: Dave Jones, Joe Perches, Kay Sievers, Linux Kernel, Linus Torvalds On Tue, Jul 10, 2012 at 08:11:45PM -0400, Dave Jones wrote: > On Mon, Jul 09, 2012 at 01:56:31PM -0700, Greg Kroah-Hartman wrote: > > > > > That single patch doesn't apply cleanly to Linus' > > > > > 8c84bf4166a4698296342841a549bbee03860ac0 > > > > > > > > > > What else is necessary? > > > > > > > > > > Your tree seems to have a collection of random patches. > > > > > > > > > > It might be useful to clone Linus' tree and produce a > > > > > branch with all the necessary printk patches in it so > > > > > someone else could pull it. > > > > > > > > They should all now be in my driver-core-next branch that will show up > > > > in the next linux-next release, so having a separate tree isn't > > > > necessary. > > > > > > I don't think so. > > > > > > There are real defects in the existing code. > > > > > > These are patches that are necessary _now_. > > > not for a -next 3.6 future. > > > > Oops, sorry, I ment to type, "driver-core-linus" above. > > Can you get this to Linus asap please ? I just had an oops that scrolled > off the screen because the Code: line suffers the same problem. > Thankfully serial console caught most of it, but I had to hand-edit > a lot of crap just to make it readable again. It will be sent tomorrow, I wanted to get it tested in linux-next at least for one day. thanks, greg k-h ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-09 18:48 ` Kay Sievers 2012-07-09 20:31 ` Joe Perches @ 2012-07-12 0:54 ` Dave Jones 2012-07-12 13:52 ` Kay Sievers 1 sibling, 1 reply; 19+ messages in thread From: Dave Jones @ 2012-07-12 0:54 UTC (permalink / raw) To: Kay Sievers; +Cc: Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote: > On Mon, 2012-07-09 at 20:27 +0200, Kay Sievers wrote: > > On Mon, Jul 9, 2012 at 8:03 PM, Dave Jones <davej@redhat.com> wrote: > > > I noticed that the format of the oom-killer output seems to have changed, and > > > now it spews stuff like.. > > > > > > [49461.758070] lowmem_reserve[]: > > > [49461.758071] 0 > > > [49461.758071] 2643 > > > [49461.758071] 3878 > > > [49461.758072] 3878 > > > [49461.758072] > > > [49461.758072] Node 0 > > > > > Does the oom-killer code need modifying, or the printk code ? > > > I know there's been some regressions in this area recently, but this is still > > > happening on the current tree (8c84bf4166a4698296342841a549bbee03860ac0) > > > > This likely fixes it: > > http://git.kernel.org/?p=linux/kernel/git/kay/patches.git;a=blob;f=kmsg-merge-cont.patch;hb=HEAD > > > > Let me check if it does, and if I can reproduce it. > > It looks fine here with the above mentioned patch: Now that that patch is in Linus tree, I've hit what's probably a different case. Look at the modules list in this oops.. [10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child1:24295] [10016.470008] rose<4>[10016.470008] ip_set_bitmap_ipmac<4>[10016.470008] nf_conntrack_h323<4>[10016.470008] girbil_sir<4>[10016.470008] ath9k_common<4>[10016.470008] hdlcdrv<4>[10016.470008] tun<4>[10016.470008] spcp8x5<4>[10016.470008] rc_streamzap<4>[10016.470008] rc_medion_x10<4>[10016.470008] gspca_mr97310a<4>[10016.470008] hid_multitouch<4>[10016.470008] fam15h_power<4>[10016.470008] sym53c8xx<4>[10016.470008] gunze<4>[10016.470008] pata_ns87410<4>[10016.470008] snd_ymfpci<4>[10016.470008] michael_mic<4>[10016.470008] blocklayoutdriver nfs_layout_nfsv41_files nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c raid0 iTCO_wdt iTCO_vendor_support ppdev dcdbas coretemp kvm_intel kvm microcode snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device serio_raw snd_pcm lpc_ich mfd_core tg3 i2c_i801 pcspkr snd_timer i5000_edac edac_core snd i5k_amb soundcore snd_page_alloc parport_pc parport shpchp sunrpc firewire_ohci firewire_core crc_itu_t floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi video wmi [last unloaded: scsi_wait_scan] [10016.470008] irq event stamp: 82066 [10016.470008] hardirqs last enabled at (82065): [<ffffffff816b8cf0>] restore_args+0x0/0x30 [10016.470008] hardirqs last disabled at (82066): [<ffffffff816c236a>] apic_timer_interrupt+0x6a/0x80 [10016.470008] softirqs last enabled at (82064): [<ffffffff8106ef8c>] __do_softirq+0x13c/0x3e0 [10016.470008] softirqs last disabled at (82055): [<ffffffff816c2d6c>] call_softirq+0x1c/0x30 [10016.470008] CPU 1 [10016.470008] Modules linked in:<4>[10016.470008] unix_diag<4>[10016.470008] ip_set_bitmap_ipmac<4>[10016.470008] nf_conntrack_h323<4>[10016.470008] girbil_sir<4>[10016.470008] ath9k_common<4>[10016.470008] hdlcdrv<4>[10016.470008] tun<4>[10016.470008] spcp8x5<4>[10016.470008] rc_streamzap<4>[10016.470008] rc_medion_x10<4>[10016.470008] gspca_mr97310a<4>[10016.470008] hid_multitouch<4>[10016.470008] fam15h_power<4>[10016.470008] sym53c8xx<4>[10016.470008] gunze<4>[10016.470008] pata_ns87410<4>[10016.470008] snd_ymfpci<4>[10016.470008] michael_mic<4>[10016.470008] blocklayoutdriver nfs_layout_nfsv41_files nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash dm_log btrfs zlib_deflate libcrc32c raid0 iTCO_wdt iTCO_vendor_support ppdev dcdbas coretemp kvm_intel kvm microcode snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device serio_raw snd_pcm lpc_ich mfd_core tg3 i2c_i801 pcspkr snd_timer i5000_edac edac_core snd i5k_amb soundcore snd_page_alloc parport_pc parport shpchp sunrpc firewire_ohci firewire_core crc_itu_t floppy nouveau ttm drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi video wmi [last unloaded: scsi_wait_scan] [10016.470008] [10016.470008] Pid: 24295, comm: trinity-child1 Tainted: G C 3.5.0-rc6+ #80 Dell Inc. Precision WorkStation 490 /0DT031 [10016.470008] RIP: 0010:[<ffffffff810ce800>] [<ffffffff810ce800>] match_held_lock+0x190/0x190 [10016.470008] RSP: 0018:ffff88016bdf1c10 EFLAGS: 00000202 [10016.470008] RAX: 0000000000000001 RBX: ffffffff81021d43 RCX: 000000000000000f [10016.470008] RDX: ffff88016bdf1fd8 RSI: ffffffff81c2fb60 RDI: ffffffff81c2fb60 [10016.470008] RBP: ffff88016bdf1c38 R08: 000000000000000a R09: 0000000000000000 [10016.470008] R10: 0000000000000001 R11: 0000000000000001 R12: ffffffff81021db9 [10016.470008] R13: ffff88016bdf1b88 R14: ffffffff81021d43 R15: ffff88016bdf1b78 [10016.470008] FS: 00007fbad33ad700(0000) GS:ffff880226800000(0000) knlGS:0000000000000000 [10016.470008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [10016.470008] CR2: 0000000000000001 CR3: 00000001413a1000 CR4: 00000000000007e0 [10016.470008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [10016.470008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [10016.470008] Process trinity-child1 (pid: 24295, threadinfo ffff88016bdf0000, task ffff88016a244900) [10016.470008] Stack: [10016.470008] ffffffff8132ef67 ffff88016bdf1c48 000000000000229c 00000000000034b2 [10016.470008] ffffffff81c6c7c0 ffff88016bdf1c98 ffffffff812b319c ffff88016bdf1c98 [10016.470008] 0000000000000246 ffff88015a81d428 ffff88016bdf1cf0 13722bca39882609 [10016.470008] Call Trace: [10016.470008] [<ffffffff8132ef67>] ? idr_find+0xd7/0x150 [10016.470008] [<ffffffff812b319c>] sysvipc_find_ipc+0x5c/0x170 [10016.470008] [<ffffffff812b3374>] sysvipc_proc_next+0x54/0xf0 [10016.470008] [<ffffffff811e8cd9>] traverse+0xc9/0x240 [10016.470008] [<ffffffff811e91e0>] ? seq_lseek+0x120/0x120 [10016.470008] [<ffffffff811e954d>] seq_read+0x36d/0x420 [10016.470008] [<ffffffff811e91e0>] ? seq_lseek+0x120/0x120 [10016.470008] [<ffffffff811e91e0>] ? seq_lseek+0x120/0x120 [10016.470008] [<ffffffff8122f916>] proc_reg_read+0x86/0xc0 [10016.470008] [<ffffffff8122f890>] ? proc_reg_write+0xc0/0xc0 [10016.470008] [<ffffffff811c4649>] do_loop_readv_writev+0x59/0x90 [10016.470008] [<ffffffff811c4966>] do_readv_writev+0x1b6/0x1d0 [10016.470008] [<ffffffff810d462d>] ? trace_hardirqs_on+0xd/0x10 [10016.470008] [<ffffffff816b8350>] ? _raw_spin_unlock_irq+0x30/0x50 [10016.470008] [<ffffffff816c1855>] ? sysret_check+0x22/0x5d [10016.470008] [<ffffffff811c4a1c>] vfs_readv+0x3c/0x50 [10016.470008] [<ffffffff811c4c6a>] sys_preadv+0xba/0xc0 [10016.470008] [<ffffffff816c1829>] system_call_fastpath+0x16/0x1b [10016.470008] Code: 80 3d dc 14 c1 00 01 74 a0 be cb 02 00 00 48 c7 c7 dc 94 9d 81 e8 81 63 f9 ff c6 05 c2 14 c1 00 01 eb 86 0f 1f 84 00 00 00 00 00 <55> 48 89 e5 41 56 41 55 49 89 fd 41 54 65 4c 8b 24 25 c0 c8 00 Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the module listing began (2nd line). Sidenote: Who maintains sysvipc ? Dave ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-12 0:54 ` Dave Jones @ 2012-07-12 13:52 ` Kay Sievers 2012-07-12 14:05 ` Dave Jones 0 siblings, 1 reply; 19+ messages in thread From: Kay Sievers @ 2012-07-12 13:52 UTC (permalink / raw) To: Dave Jones, Kay Sievers, Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann On Thu, Jul 12, 2012 at 2:54 AM, Dave Jones <davej@redhat.com> wrote: > On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote: > > It looks fine here with the above mentioned patch: > > Now that that patch is in Linus tree, I've hit what's probably a different case. > Look at the modules list in this oops.. > > [10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child1:24295] > [10016.470008] rose<4>[10016.470008] ip_set_bitmap_ipmac<4>[10016.470008] > Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the > module listing began (2nd line). It's the output of 'dmesg' you pasted, right? I tried to force all sorts of racy print_modules() calls, and kept your trinity tool from git for hours, it looks all fine here: [ 54.664301] NET: Registered protocol family 24 [ 54.684721] Netfilter messages via NETLINK v0.30. [ 63.103847] ------------[ cut here ]------------ [ 63.103857] WARNING: at kernel/futex.c:2452 sys_get_robust_list+0x28e/0x2a0() [ 63.103859] Hardware name: 4174NEG [ 63.103861] deprecated: get_robust_list will be deleted in 2013. [ 63.103862] Modules linked in: binfmt_misc nfnetlink pppoe pppox ppp_generic slhc atm fuse arc4 i915 hid_generic i2c_algo_bit drm_kms_helper drm snd_usb_audio snd_usbmidi_lib snd_rawmidi snd_seq_device bnep bluetooth snd_hda_codec_hdmi snd_hda_codec_conexant uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media coretemp kvm_intel kvm crc32c_intel microcode i2c_i801 i2c_core iwlwifi snd_hda_intel snd_hda_codec mac80211 snd_hwdep lpc_ich snd_pcm mfd_core sdhci_pci sdhci mmc_core snd_page_alloc cfg80211 snd_timer thinkpad_acpi snd e1000e soundcore rfkill video cdc_ncm wmi usbnet mii cdc_wdm cdc_acm [ 63.103921] Pid: 837, comm: trinity-child1 Not tainted 3.5.0-0.rc6.git2.1.fc18.x86_64 #1 [ 63.103923] Call Trace: [ 63.103930] [<ffffffff8106310f>] warn_slowpath_common+0x7f/0xc0 [ 63.103934] [<ffffffff81063206>] warn_slowpath_fmt+0x46/0x50 [ 63.103938] [<ffffffff810d253d>] ? trace_hardirqs_on_caller+0x10d/0x1a0 [ 63.103941] [<ffffffff810d6ffe>] sys_get_robust_list+0x28e/0x2a0 [ 63.103946] [<ffffffff816d7d29>] system_call_fastpath+0x16/0x1b [ 63.103948] ---[ end trace 796993e76a8397be ]--- [ 63.126701] NET: Registered protocol family 23 [ 63.397208] trinity-child3 (839): Using mlock ulimits for SHM_HUGETLB is deprecated Can you easily reproduce the issue you pasted? If, could you give me the /dev/kmsg output? Thanks, Kay ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-12 13:52 ` Kay Sievers @ 2012-07-12 14:05 ` Dave Jones 2012-07-12 16:47 ` Kay Sievers 2012-07-12 17:11 ` Linus Torvalds 0 siblings, 2 replies; 19+ messages in thread From: Dave Jones @ 2012-07-12 14:05 UTC (permalink / raw) To: Kay Sievers; +Cc: Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann On Thu, Jul 12, 2012 at 03:52:17PM +0200, Kay Sievers wrote: > On Thu, Jul 12, 2012 at 2:54 AM, Dave Jones <davej@redhat.com> wrote: > > On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote: > > > > It looks fine here with the above mentioned patch: > > > > Now that that patch is in Linus tree, I've hit what's probably a different case. > > Look at the modules list in this oops.. > > > > [10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child1:24295] > > [10016.470008] rose<4>[10016.470008] ip_set_bitmap_ipmac<4>[10016.470008] > > > Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the > > module listing began (2nd line). > > It's the output of 'dmesg' you pasted, right? Yes. > I tried to force all sorts of racy print_modules() calls, and kept > your trinity tool from git for hours, it looks all fine here: > > Can you easily reproduce the issue you pasted? If, could you give me > the /dev/kmsg output? I've seen it a few times, always with the soft lockup trace. You might be able to trigger it using scripts/load-all-modules.sh from trinity.git. (Assuming you have a lot of modules built, I'm still trying to track down which one seems to be responsible). Dave ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-12 14:05 ` Dave Jones @ 2012-07-12 16:47 ` Kay Sievers 2012-07-12 17:02 ` Dave Jones 2012-07-12 17:11 ` Linus Torvalds 1 sibling, 1 reply; 19+ messages in thread From: Kay Sievers @ 2012-07-12 16:47 UTC (permalink / raw) To: Dave Jones, Kay Sievers, Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann On Thu, Jul 12, 2012 at 4:05 PM, Dave Jones <davej@redhat.com> wrote: > On Thu, Jul 12, 2012 at 03:52:17PM +0200, Kay Sievers wrote: > > On Thu, Jul 12, 2012 at 2:54 AM, Dave Jones <davej@redhat.com> wrote: > > > On Mon, Jul 09, 2012 at 08:48:51PM +0200, Kay Sievers wrote: > > > > > > It looks fine here with the above mentioned patch: > > > > > > Now that that patch is in Linus tree, I've hit what's probably a different case. > > > Look at the modules list in this oops.. > > > > > > [10016.460020] BUG: soft lockup - CPU#1 stuck for 22s! [trinity-child1:24295] > > > [10016.470008] rose<4>[10016.470008] ip_set_bitmap_ipmac<4>[10016.470008] > > > > > Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the > > > module listing began (2nd line). They do not belong together. The second line is just another call to the same print_modules() done from: arch/x86/kernel/dumpstack_64.c :: show_regs() While we already called print_modules() a few cycles earlier from: kernel/watchdog :: watchdog_timer_fn() > > I tried to force all sorts of racy print_modules() calls, and kept > > your trinity tool from git for hours, it looks all fine here: > > > > Can you easily reproduce the issue you pasted? If, could you give me > > the /dev/kmsg output? > > I've seen it a few times, always with the soft lockup trace. > > You might be able to trigger it using scripts/load-all-modules.sh > from trinity.git. (Assuming you have a lot of modules built, I'm > still trying to track down which one seems to be responsible). Hmm, it does not trigger your pattern. I tried adding an rmmod in that loop, but that crashes after a few seconds. Some modules are just not meant to be removed. :) I forced the watchdog to trigger by setting the timeout to 1s, but all looks still fine: [ 20.854010] BUG: soft lockup - CPU#0 stuck for 1000s! [trinity:247] [ 20.854010] Modules linked in: nfnetlink xfrm_user xfrm_algo pppoe pppox ppp_generic slhc atm bluetooth rfkill microcode cirrus ttm sr_mod cdrom pcspkr drm_kms_helper drm syscopyarea sysfillrect sysimgblt floppy evdev ipv6 [ 20.854010] irq event stamp: 980768 [ 20.854010] hardirqs last enabled at (980767): [<ffffffff814a425d>] __slab_alloc.constprop.65+0x3c9/0x408 [ 20.854010] hardirqs last disabled at (980768): [<ffffffff814adfaa>] apic_timer_interrupt+0x6a/0x80 [ 20.854010] softirqs last enabled at (978462): [<ffffffff8103db1f>] __do_softirq+0x11f/0x170 [ 20.854010] softirqs last disabled at (978457): [<ffffffff814ae82c>] call_softirq+0x1c/0x30 [ 20.854010] CPU 0 [ 20.854010] Modules linked in: nfnetlink xfrm_user xfrm_algo pppoe pppox ppp_generic slhc atm bluetooth rfkill microcode cirrus ttm sr_mod cdrom pcspkr drm_kms_helper drm syscopyarea sysfillrect sysimgblt floppy evdev ipv6 Could it possibly be that we get some sort of corruption somewhere else while running trinity and load modules? Kay ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-12 16:47 ` Kay Sievers @ 2012-07-12 17:02 ` Dave Jones 0 siblings, 0 replies; 19+ messages in thread From: Dave Jones @ 2012-07-12 17:02 UTC (permalink / raw) To: Kay Sievers; +Cc: Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann On Thu, Jul 12, 2012 at 06:47:57PM +0200, Kay Sievers wrote: > > > > Also, I have no idea how the hell the 'Modules linked in:' line (9th line) ended up being printed /after/ the > > > > module listing began (2nd line). > > They do not belong together. The second line is just another call to > the same print_modules() done from: > arch/x86/kernel/dumpstack_64.c :: show_regs() > > While we already called print_modules() a few cycles earlier from: > kernel/watchdog :: watchdog_timer_fn() ah, derp. Good spotting. > Hmm, it does not trigger your pattern. I tried adding an rmmod in that > loop, but that crashes after a few seconds. Some modules are just not > meant to be removed. :) yeah, that's bothersome. TTBOMK, we've never survived a 'load all modules/unload all modules' ever. > I forced the watchdog to trigger by setting the timeout to 1s, but all > looks still fine: > > Could it possibly be that we get some sort of corruption somewhere > else while running trinity and load modules? I suppose anything is possible, but it's surprisingly consistent. Dave ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-12 14:05 ` Dave Jones 2012-07-12 16:47 ` Kay Sievers @ 2012-07-12 17:11 ` Linus Torvalds 2012-07-12 18:25 ` Kay Sievers 1 sibling, 1 reply; 19+ messages in thread From: Linus Torvalds @ 2012-07-12 17:11 UTC (permalink / raw) To: Dave Jones, Kay Sievers, Linux Kernel, Linus Torvalds, Greg Kroah-Hartmann On Thu, Jul 12, 2012 at 7:05 AM, Dave Jones <davej@redhat.com> wrote: > > I've seen it a few times, always with the soft lockup trace. I bet it's because you have tons of modules, and the line ends up being *really* long. And overflows LOG_LINE_MAX. I suspect something odd happens. There are tons of odd special cases for LOG_LINE_MAX, and I bet Kay doesn't see it for the simple reason that he's not totally insane, and hasn't loaded hundreds of modules. Kay, I suspect the "continuation line" logic could easily have a rule like "If the old line is already > 80 characters, do a line break here and add TAB to the beginning of the new line" In fact, that could be really nice for things like stack dumps etc - we wouldn't have to worry about line breaks and crap, if the printk logic just makes "KERN_CONT" do a line break automatically if it doesn't fit on the screen. People who use KERN_CONT don't do it because they *need* things to be on one line (it's not guaranteed anyway), they do it because they want the output to be dense and readable. Doing auto-line-break would actually *help*. And would mean that you never hit the odd LOG_LINE_MAX cases just because somebody is printing lots of modules. Linus ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-12 17:11 ` Linus Torvalds @ 2012-07-12 18:25 ` Kay Sievers 2012-07-12 22:28 ` Kay Sievers 0 siblings, 1 reply; 19+ messages in thread From: Kay Sievers @ 2012-07-12 18:25 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dave Jones, Linux Kernel, Greg Kroah-Hartmann On Thu, Jul 12, 2012 at 7:11 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, Jul 12, 2012 at 7:05 AM, Dave Jones <davej@redhat.com> wrote: >> >> I've seen it a few times, always with the soft lockup trace. > > I bet it's because you have tons of modules, and the line ends up > being *really* long. And overflows LOG_LINE_MAX. I suspect something > odd happens. Straight to the point. That's the issue, combined with the later safety range checks, we produce somehow "ugly" output. We already flush the line out of the buffer when we can not add stuff anymore. The line length is then close to LOG_LINE_MAX, and we want to add a prefix with the timestamp during output and we reach the limit of LOG_LINE_MAX. > There are tons of odd special cases for LOG_LINE_MAX, and I bet Kay > doesn't see it for the simple reason that he's not totally insane, and > hasn't loaded hundreds of modules. Yeah, I just added a loop now that prints over-long continuation lines, and I can see parts of the effect Dave sees. > Kay, I suspect the "continuation line" logic could easily have a rule like > > "If the old line is already > 80 characters, do a line break here > and add TAB to the beginning of the new line" > > In fact, that could be really nice for things like stack dumps etc - > we wouldn't have to worry about line breaks and crap, if the printk > logic just makes "KERN_CONT" do a line break automatically if it > doesn't fit on the screen. > > People who use KERN_CONT don't do it because they *need* things to be > on one line (it's not guaranteed anyway), they do it because they want > the output to be dense and readable. Doing auto-line-break would > actually *help*. And would mean that you never hit the odd > LOG_LINE_MAX cases just because somebody is printing lots of modules. That sounds good. I'll look into that. Kay ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 3.5-rc6 printk formatting problem during oom-kill. 2012-07-12 18:25 ` Kay Sievers @ 2012-07-12 22:28 ` Kay Sievers 0 siblings, 0 replies; 19+ messages in thread From: Kay Sievers @ 2012-07-12 22:28 UTC (permalink / raw) To: Linus Torvalds; +Cc: Dave Jones, Linux Kernel, Greg Kroah-Hartmann On Thu, 2012-07-12 at 20:25 +0200, Kay Sievers wrote: > On Thu, Jul 12, 2012 at 7:11 PM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > On Thu, Jul 12, 2012 at 7:05 AM, Dave Jones <davej@redhat.com> wrote: > >> > >> I've seen it a few times, always with the soft lockup trace. > > > > I bet it's because you have tons of modules, and the line ends up > > being *really* long. And overflows LOG_LINE_MAX. I suspect something > > odd happens. > > Straight to the point. That's the issue, combined with the later > safety range checks, we produce somehow "ugly" output. > > We already flush the line out of the buffer when we can not add stuff > anymore. The line length is then close to LOG_LINE_MAX, and we want to > add a prefix with the timestamp during output and we reach the limit > of LOG_LINE_MAX. > > > There are tons of odd special cases for LOG_LINE_MAX, and I bet Kay > > doesn't see it for the simple reason that he's not totally insane, and > > hasn't loaded hundreds of modules. > > Yeah, I just added a loop now that prints over-long continuation > lines, and I can see parts of the effect Dave sees. > > > Kay, I suspect the "continuation line" logic could easily have a rule like > > > > "If the old line is already > 80 characters, do a line break here > > and add TAB to the beginning of the new line" This seems to fix the issue for me, which Dave has posted. I was able to partly reproduce it by printing continuation lines larger than 1000 bytes. The overlong lines did not leave enough room for the later syslog and timestamp prefixing. I'll look into the automatic split of lines later, it is not as trivial as this fix. While we are at the over-long log line topic, Harald just pointed me to this unrelated mail, sent an hour ago. :) "The "Modules linked in:" lines say only <omitted>, because the lines become too long and cause netconsole to eat most of the message." https://lkml.org/lkml/2012/7/12/484 Thanks, Kay From: Kay Sievers <kay@vrfy.org> Subject: kmsg - properly print over-long continuation lines Reserve PREFIX_MAX bytes in the LOG_LINE_MAX line when buffering a continuation line, to be able to properly prefix the LOG_LINE_MAX line with the syslog prefix and timestamp when printing it. Reported-By: Dave Jones <davej@redhat.com> Signed-off-by: Kay Sievers <kay@vrfy.org> --- kernel/printk.c | 33 +++++++++++++++++++-------------- 1 file changed, 19 insertions(+), 14 deletions(-) --- a/kernel/printk.c +++ b/kernel/printk.c @@ -235,7 +235,8 @@ static u32 log_next_idx; static u64 clear_seq; static u32 clear_idx; -#define LOG_LINE_MAX 1024 +#define PREFIX_MAX 32 +#define LOG_LINE_MAX 1024 - PREFIX_MAX /* record buffer */ #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) @@ -876,7 +877,7 @@ static size_t msg_print_text(const struc if (buf) { if (print_prefix(msg, syslog, NULL) + - text_len + 1>= size - len) + text_len + 1 >= size - len) break; if (prefix) @@ -907,7 +908,7 @@ static int syslog_print(char __user *buf struct log *msg; int len = 0; - text = kmalloc(LOG_LINE_MAX, GFP_KERNEL); + text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL); if (!text) return -ENOMEM; @@ -930,7 +931,8 @@ static int syslog_print(char __user *buf skip = syslog_partial; msg = log_from_idx(syslog_idx); - n = msg_print_text(msg, syslog_prev, true, text, LOG_LINE_MAX); + n = msg_print_text(msg, syslog_prev, true, text, + LOG_LINE_MAX + PREFIX_MAX); if (n - syslog_partial <= size) { /* message fits into buffer, move forward */ syslog_idx = log_next(syslog_idx); @@ -969,7 +971,7 @@ static int syslog_print_all(char __user char *text; int len = 0; - text = kmalloc(LOG_LINE_MAX, GFP_KERNEL); + text = kmalloc(LOG_LINE_MAX + PREFIX_MAX, GFP_KERNEL); if (!text) return -ENOMEM; @@ -1022,7 +1024,8 @@ static int syslog_print_all(char __user struct log *msg = log_from_idx(idx); int textlen; - textlen = msg_print_text(msg, prev, true, text, LOG_LINE_MAX); + textlen = msg_print_text(msg, prev, true, text, + LOG_LINE_MAX + PREFIX_MAX); if (textlen < 0) { len = textlen; break; @@ -1367,15 +1370,15 @@ static struct cont { bool flushed:1; /* buffer sealed and committed */ } cont; -static void cont_flush(void) +static void cont_flush(enum log_flags flags) { if (cont.flushed) return; if (cont.len == 0) return; - log_store(cont.facility, cont.level, LOG_NOCONS, cont.ts_nsec, - NULL, 0, cont.buf, cont.len); + log_store(cont.facility, cont.level, LOG_NOCONS | flags, + cont.ts_nsec, NULL, 0, cont.buf, cont.len); cont.flushed = true; } @@ -1386,7 +1389,8 @@ static bool cont_add(int facility, int l return false; if (cont.len + len > sizeof(cont.buf)) { - cont_flush(); + /* the line gets too long, split it up in separate records */ + cont_flush(LOG_CONT); return false; } @@ -1522,7 +1526,7 @@ asmlinkage int vprintk_emit(int facility * or another task also prints continuation lines. */ if (cont.len && (lflags & LOG_PREFIX || cont.owner != current)) - cont_flush(); + cont_flush(0); /* buffer line if possible, otherwise store it right away */ if (!cont_add(facility, level, text, text_len)) @@ -1540,7 +1544,7 @@ asmlinkage int vprintk_emit(int facility if (cont.len && cont.owner == current) { if (!(lflags & LOG_PREFIX)) stored = cont_add(facility, level, text, text_len); - cont_flush(); + cont_flush(0); } if (!stored) @@ -1633,7 +1637,8 @@ EXPORT_SYMBOL(printk); #else -#define LOG_LINE_MAX 0 +#define LOG_LINE_MAX 0 +#define PREFIX_MAX 0 static struct cont { size_t len; size_t cons; @@ -1938,7 +1943,7 @@ static enum log_flags console_prev; */ void console_unlock(void) { - static char text[LOG_LINE_MAX]; + static char text[LOG_LINE_MAX + PREFIX_MAX]; static u64 seen_seq; unsigned long flags; bool wake_klogd = false; ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2012-07-12 22:28 UTC | newest] Thread overview: 19+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-07-09 18:03 3.5-rc6 printk formatting problem during oom-kill Dave Jones 2012-07-09 18:27 ` Kay Sievers 2012-07-09 18:48 ` Kay Sievers 2012-07-09 20:31 ` Joe Perches 2012-07-09 20:40 ` Greg Kroah-Hartmann 2012-07-09 20:44 ` Joe Perches 2012-07-09 20:48 ` Kay Sievers 2012-07-09 21:08 ` Joe Perches 2012-07-09 20:56 ` Greg Kroah-Hartmann 2012-07-11 0:11 ` Dave Jones 2012-07-11 0:15 ` Greg Kroah-Hartmann 2012-07-12 0:54 ` Dave Jones 2012-07-12 13:52 ` Kay Sievers 2012-07-12 14:05 ` Dave Jones 2012-07-12 16:47 ` Kay Sievers 2012-07-12 17:02 ` Dave Jones 2012-07-12 17:11 ` Linus Torvalds 2012-07-12 18:25 ` Kay Sievers 2012-07-12 22:28 ` Kay Sievers
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox