* Re: I found a synchronization problem in mm/vmalloc.c
[not found] ` <002201ca936a$1fc06780$5f413680$@koh@samsung.com>
@ 2010-01-13 4:55 ` Nick Piggin
[not found] ` <1263388030.2818.6.camel@barrios-desktop>
0 siblings, 1 reply; 2+ messages in thread
From: Nick Piggin @ 2010-01-13 4:55 UTC
To: Yongseok Koh
Cc: 'Andrew Morton', gregkh, vegard.nossum, mingo, penberg,
paulmck, torvalds, linux-kernel
On Tue, Jan 12, 2010 at 06:32:09PM +0900, Yongseok Koh wrote:
> Sorry, Mr. Morton.
>
> Even though it is somewhat late, I am cc'ing the mailing list now.
>
> Thanks.
>
> -----Original Message-----
>
> On Thu, 7 Jan 2010 20:22:30 +0900
> "Yongseok Koh" <yongseok.koh@samsung.com> wrote:
>
> > Dear all,
> >
> > I'm Yongseok Koh in Korea.
> >
>
> Thanks for the report.
>
> Please do cc a mailing list when reporting bugs so that everyone else knows
> what's going on.
>
> >
> > I just got a new message in linux-2.6.28.10 (please refer to the below)
> >
> > And, one of my colleagues found that there is a synchronization
> > problem in mm/vmalloc.c
> >
> >
> >
> > In free_unmap_area_noflush(), va->flags is marked as VM_LAZY_FREE
> > first, and then vmap_lazy_nr is increased atomically.
> >
> > But in __purge_vmap_area_lazy(), while traversing vmap_area_list,
> > nr is counted by checking whether VM_LAZY_FREE is set in va->flags.
> >
> > After counting nr, the kernel reads vmap_lazy_nr atomically and
> > checks the BUG_ON condition, i.e. whether nr is greater than vmap_lazy_nr.
> >
> >
> >
> > The problem is that, if the freeing path is interrupted right after
> > marking VM_LAZY_FREE, the increment of vmap_lazy_nr can be delayed.
> >
> > Consequently, the BUG_ON condition can be met because nr counts more
> > pages than vmap_lazy_nr holds at that moment.
> >
> >
> >
> > This is highly probable when vmalloc/vfree are called
> > frequently.
> >
> > My colleagues have verified this scenario by adding a delay between
> > marking VM_LAZY_FREE and increasing vmap_lazy_nr in
> > free_unmap_area_noflush().
> >
> >
> >
> > Am I right ?
> >
>
> Looks plausible to me and as far as I can tell, current code has the same
> issue.
Yes, I think it's a good catch.
> Wakey wakey, Nick! What makes that BUG_ON() safe? Not purge_lock afaict?
No, I think it is a bug. I would say that we can just get rid of the BUG_ON
now. atomic_t is signed, so it should be OK if it momentarily goes negative
(and anyway it's only used in a heuristic).
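To make the interleaving concrete, here is a minimal user-space sketch in C11.
It is not the kernel code: struct area, lazy_nr, mark_lazy(), account_lazy(),
purge() and replay_race() are hypothetical stand-ins for struct vmap_area,
vmap_lazy_nr and the two paths discussed above, and the race is replayed
sequentially in a single thread rather than provoked on real CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define VM_LAZY_FREE 0x01u  /* flag name from the thread; value is illustrative */

/* Hypothetical stand-in for struct vmap_area: just flags and a page count. */
struct area {
    unsigned int flags;
    int nr_pages;
};

static atomic_int lazy_nr;  /* models vmap_lazy_nr (signed, like atomic_t) */

/* First half of the free path: mark the area as lazily freed. */
static void mark_lazy(struct area *va)
{
    va->flags |= VM_LAZY_FREE;
}

/* Second half of the free path: account the area's pages in the counter. */
static void account_lazy(const struct area *va)
{
    atomic_fetch_add(&lazy_nr, va->nr_pages);
}

/*
 * Purge path: count the pages of all flagged areas, report whether the
 * check "nr > counter" would have fired, then subtract nr from the counter.
 */
static bool purge(struct area *areas, int n)
{
    int nr = 0;

    for (int i = 0; i < n; i++) {
        if (areas[i].flags & VM_LAZY_FREE) {
            nr += areas[i].nr_pages;
            areas[i].flags &= ~VM_LAZY_FREE;
        }
    }

    bool would_bug = nr > atomic_load(&lazy_nr);
    atomic_fetch_sub(&lazy_nr, nr);
    return would_bug;
}

/* Replays the bad interleaving; returns the counter value right after purge. */
static int replay_race(bool *would_bug)
{
    struct area va = { .flags = 0, .nr_pages = 4 };

    atomic_store(&lazy_nr, 0);
    mark_lazy(&va);                 /* freeing path is interrupted here   */
    *would_bug = purge(&va, 1);     /* purge elsewhere sees the flag      */
    int after_purge = atomic_load(&lazy_nr);
    account_lazy(&va);              /* the delayed increment finally runs */
    return after_purge;
}
```

In this replay the check fires even though nothing is actually wrong: the
counter only dips below zero until the delayed increment runs, then returns to
zero. That is the sense in which a signed counter can tolerate the race once
the assertion is gone.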
So, thanks for the report. Would you care to send a patch, or propose
another way to fix the problem?
Thanks,
Nick
* Re: [PATCH] vmalloc: remove BUG_ON due to racy counting of VM_LAZY_FREE
[not found] ` <001c01ca98e2$231d8b10$6958a130$@koh@samsung.com>
@ 2010-01-19 12:01 ` Minchan Kim
0 siblings, 0 replies; 2+ messages in thread
From: Minchan Kim @ 2010-01-19 12:01 UTC
To: yongseok.koh
Cc: 'Nick Piggin', 'Linus Torvalds',
'Andrew Morton', gregkh, vegard.nossum,
'Ingo Molnar', penberg, paulmck, linux-kernel
On Tue, 2010-01-19 at 17:33 +0900, Yongseok Koh wrote:
> From: Yongseok Koh <yongseok.koh@samsung.com>
You don't need the above line.
We use "From:" only when we send a patch on behalf of someone else.
>
> In free_unmap_area_noflush(), va->flags is marked as VM_LAZY_FREE first, and
> then vmap_lazy_nr is increased atomically.
> But in __purge_vmap_area_lazy(), while traversing vmap_area_list, nr is
> counted by checking whether VM_LAZY_FREE is set in va->flags.
> After counting nr, the kernel reads vmap_lazy_nr atomically and checks the
> BUG_ON condition, i.e. whether nr is greater than vmap_lazy_nr, to prevent
> vmap_lazy_nr from going negative.
>
> The problem is that, if the freeing path is interrupted right after marking
> VM_LAZY_FREE, the increment of vmap_lazy_nr can be delayed.
> Consequently, the BUG_ON condition can be met because nr counts more pages
> than vmap_lazy_nr holds at that moment.
>
> This is highly probable when vmalloc/vfree are called frequently.
> The scenario has been verified by adding a delay between marking
> VM_LAZY_FREE and increasing vmap_lazy_nr in free_unmap_area_noflush().
>
> Even though vmap_lazy_nr is used for checking the high watermark, it was
> never a strict watermark.
> Although the BUG_ON condition is meant to prevent vmap_lazy_nr from going
> negative, vmap_lazy_nr is a signed variable.
> So it can temporarily go negative without harm.
>
> Consequently, removing the BUG_ON condition is the proper fix.
>
> A possible BUG_ON message looks like the one below.
>
> kernel BUG at mm/vmalloc.c:517!
> invalid opcode: 0000 [#1] SMP
> EIP: 0060:[<c04824a4>] EFLAGS: 00010297 CPU: 3
> EIP is at __purge_vmap_area_lazy+0x144/0x150
> EAX: ee8a8818 EBX: c08e77d4 ECX: e7c7ae40 EDX: c08e77ec
> ESI: 000081fe EDI: e7c7ae60 EBP: e7c7ae64 ESP: e7c7ae3c
> DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Call Trace:
> [<c0482ad9>] free_unmap_vmap_area_noflush+0x69/0x70
> [<c0482b02>] remove_vm_area+0x22/0x70
> [<c0482c15>] __vunmap+0x45/0xe0
> [<c04831ec>] vmalloc+0x2c/0x30
> Code: 8d 59 e0 eb 04 66 90 89 cb 89 d0 e8 87 fe ff ff 8b 43 20 89 da 8d 48
> e0 8d 43 20 3b 04 24 75 e7 fe 05 a8 a5 a3 c0 e9 78 ff ff ff <0f> 0b eb fe 90
> 8d b4 26 00 00 00 00 56 89 c6 b8 ac a5 a3 c0 31
> EIP: [<c04824a4>] __purge_vmap_area_lazy+0x144/0x150 SS:ESP 0068:e7c7ae3c
>
>
> Signed-off-by: Yongseok Koh <yongseok.koh@samsung.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
We discussed this in the following thread:
http://marc.info/?l=linux-kernel&m=126335856228090&w=2
Thanks for your contribution to the Linux kernel, Yongseok. :)
--
Kind regards,
Minchan Kim