linux-lvm.redhat.com archive mirror
* [linux-lvm] Snapshot causing segault
@ 2012-12-31 18:50 Tyler Gates
  2013-01-03 10:18 ` Zdenek Kabelac
  2013-01-04 11:23 ` Milan Broz
  0 siblings, 2 replies; 6+ messages in thread
From: Tyler Gates @ 2012-12-31 18:50 UTC (permalink / raw)
  To: linux-lvm


Hello everyone,
     I've been having an intermittent problem with random servers segfaulting
while creating a snapshot with lvm2-2.02.17-7.38.3 on kernel
2.6.16.60-0.93.1-bigsmp (SLES 10 SP4). The messages I get are:
###########################################
Dec 27 07:45:39 chelco-app-01 kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000001c
Dec 27 07:45:39 chelco-app-01 kernel:  printing eip:
Dec 27 07:45:39 chelco-app-01 kernel: f90ab3a7
Dec 27 07:45:39 chelco-app-01 kernel: *pde = 3780a001
Dec 27 07:45:39 chelco-app-01 kernel: Oops: 0000 [#1]
Dec 27 07:45:39 chelco-app-01 kernel: SMP
Dec 27 07:45:39 chelco-app-01 kernel: last sysfs file: /devices/pci0000:00/0000:00:02.0/0000:04:00.1/irq
Dec 27 07:45:39 chelco-app-01 kernel: Modules linked in: raw dock button battery ac loop dm_snapshot usbhid dm_mod uhci_hcd bnx2x hw_random ehci_hcd qla2xxx hpilo usbcore firmware_class scsi_transport_fc parport_pc lp parport ext3 jbd edd fan thermal processor cciss sd_mod scsi_mod
Dec 27 07:45:39 chelco-app-01 kernel: CPU:    4
Dec 27 07:45:39 chelco-app-01 kernel: EIP:    0060:[<f90ab3a7>]    Tainted: G     X VLI
Dec 27 07:45:39 chelco-app-01 kernel: EFLAGS: 00210202 (2.6.16.60-0.93.1-bigsmp #1)
Dec 27 07:45:39 chelco-app-01 kernel: EIP is at __map_bio+0x50/0x11f [dm_mod]
Dec 27 07:45:39 chelco-app-01 kernel: eax: f90960c4   ebx: 00000000   ecx: f7ff2a60   edx: f7794440
Dec 27 07:45:39 chelco-app-01 kernel: esi: f7ff2a58   edi: f90960c4   ebp: f46306c0   esp: f4c15d28
Dec 27 07:45:39 chelco-app-01 kernel: ds: 007b   es: 007b   ss: 0068
Dec 27 07:45:39 chelco-app-01 kernel: Process lvcreate (pid: 6678, threadinfo=f4c14000 task=f7838680)
Dec 27 07:45:39 chelco-app-01 kernel: Stack: <0>f7794340 f7794440 f7794440 03201ff0 00000000 03201ff0 00000000 00000008
Dec 27 07:45:39 chelco-app-01 kernel:        00000000 00000000 f90960c4 f7ff2a68 f46306c0 f90abd1b 00000000 00000001
Dec 27 07:45:39 chelco-app-01 kernel:        00000008 f428e2e0 fcdfe010 ffffffff c0113d62 00000000 0000001f f7ff2a58
Dec 27 07:45:39 chelco-app-01 kernel: Call Trace:
Dec 27 07:45:39 chelco-app-01 kernel:  [<f90abd1b>] __split_bio+0x182/0x440 [dm_mod]
Dec 27 07:45:39 chelco-app-01 kernel:  [<c0113d62>] do_flush_tlb_all+0x0/0x5d
Dec 27 07:45:39 chelco-app-01 kernel:  [<f90abff0>] __flush_deferred_io+0x17/0x20 [dm_mod]
Dec 27 07:45:39 chelco-app-01 kernel:  [<f90ac14c>] dm_resume+0x8e/0xf9 [dm_mod]
Dec 27 07:45:39 chelco-app-01 kernel:  [<f90aedd8>] dev_suspend+0x138/0x157 [dm_mod]
Dec 27 07:45:39 chelco-app-01 kernel:  [<f90af607>] ctl_ioctl+0x220/0x26e [dm_mod]
Dec 27 07:45:39 chelco-app-01 kernel:  [<f90aeca0>] dev_suspend+0x0/0x157 [dm_mod]
Dec 27 07:45:39 chelco-app-01 kernel:  [<c0179ce8>] do_ioctl+0x48/0x5e
Dec 27 07:45:39 chelco-app-01 kernel:  [<c0179f60>] vfs_ioctl+0x262/0x275
Dec 27 07:45:39 chelco-app-01 kernel:  [<c0179fc7>] sys_ioctl+0x54/0x6d
Dec 27 07:45:39 chelco-app-01 kernel:  [<c0103dcb>] sysenter_past_esp+0x54/0x79
Dec 27 07:45:39 chelco-app-01 kernel: Code: b4 0a f9 89 70 40 8b 06 83 c0 0c f0 ff 00 8b 54 24 08 8d 4e 08 8b 02 8b 52 04 89 44 24 0c 89 f8 89 54 24 10 8b 5f 04 8b 54 24 08 <ff> 53 1c 83 f8 00 89 c2 0f 8e 93 00 00 00 8b 54 24 08 8b 42 0c
#############################################################

The result is that the target volume is left suspended, and the only way to
fix it is to reboot and remove the faulty snapshot once the machine is back up.
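
As a reading aid, the call trace in the oops can be turned into a call chain. The following is a minimal sketch in Python, assuming the syslog prefix has been stripped and wrapped lines rejoined. The helper names are made up for illustration, and dropping `+0x0` entries is only a heuristic: such entries are most likely stale values left on the stack rather than real return addresses.

```python
import re

# Lines copied from the oops in this message (syslog prefix stripped).
OOPS = """\
EIP is at __map_bio+0x50/0x11f [dm_mod]
Process lvcreate (pid: 6678, threadinfo=f4c14000 task=f7838680)
Call Trace:
 [<f90abd1b>] __split_bio+0x182/0x440 [dm_mod]
 [<c0113d62>] do_flush_tlb_all+0x0/0x5d
 [<f90abff0>] __flush_deferred_io+0x17/0x20 [dm_mod]
 [<f90ac14c>] dm_resume+0x8e/0xf9 [dm_mod]
 [<f90aedd8>] dev_suspend+0x138/0x157 [dm_mod]
 [<f90af607>] ctl_ioctl+0x220/0x26e [dm_mod]
 [<f90aeca0>] dev_suspend+0x0/0x157 [dm_mod]
 [<c0179ce8>] do_ioctl+0x48/0x5e
 [<c0179f60>] vfs_ioctl+0x262/0x275
 [<c0179fc7>] sys_ioctl+0x54/0x6d
 [<c0103dcb>] sysenter_past_esp+0x54/0x79
"""

# "[<address>] symbol+0xoffset/0xsize [module]" (the module part is optional)
FRAME = re.compile(
    r"\[<[0-9a-f]{8}>\][ \t]+(\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+(?:[ \t]+\[(\w+)\])?"
)
EIP = re.compile(r"EIP is at (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:[ \t]+\[(\w+)\])?")


def faulting_frame(text):
    """Return (function, module) for the instruction that faulted, or None."""
    m = EIP.search(text)
    return (m.group(1), m.group(2)) if m else None


def call_chain(text):
    """Return (function, module) pairs, outermost caller first.

    Entries with offset +0x0 are skipped as probable stack leftovers.
    """
    frames = [
        (m.group(1), m.group(3))
        for m in FRAME.finditer(text)
        if int(m.group(2), 16) != 0
    ]
    return list(reversed(frames))


if __name__ == "__main__":
    for func, module in call_chain(OOPS) + [faulting_frame(OOPS)]:
        print(func, "[%s]" % module if module else "")
```

Read this way, the ioctl issued by lvcreate reaches dm_resume, which flushes deferred I/O through __split_bio into __map_bio, where the oops occurs. That would be consistent with the device being left suspended, since the resume never completes.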

The script I wrote to create these snapshots uses all available extents in
the volume group, which in this case was more than the size of the volume I
was trying to snapshot. Thinking this was the problem, I created the
snapshot several times with a size less than or equal to the target volume,
and it worked every time. Then I tried a size larger than the target to
provoke a crash, and it did crash, but not every time. In fact, now I can't
get it to segfault at all.

So my question is: does creating the snapshot volume with a size larger than
the target volume induce segfaults at random, or could there be another
problem lurking? If these weren't production machines I would normally just
go with a size smaller than the target, but I really need to be sure what
exactly is causing the segfaults.

Any help would be appreciated.

  -Tyler


* Re: [linux-lvm] Snapshot causing segault
  2012-12-31 18:50 [linux-lvm] Snapshot causing segault Tyler Gates
@ 2013-01-03 10:18 ` Zdenek Kabelac
  2013-01-03 13:25   ` Tyler Gates
  2013-01-03 20:04   ` Stuart D Gathman
  2013-01-04 11:23 ` Milan Broz
  1 sibling, 2 replies; 6+ messages in thread
From: Zdenek Kabelac @ 2013-01-03 10:18 UTC (permalink / raw)
  To: linux-lvm

On 31.12.2012 19:50, Tyler Gates wrote:
> Hello everyone,
>       I've been having an intermittent problem on random servers segfaulting
> while trying to create a snapshot under version  lvm2-2.02.17-7.38.3 on
> kernel 2.6.16.60-0.93.1-bigsmp (SLES 10 SP4). The messages I get are:
> [...]
>
> The result is the target volume gets suspended and the only way to fix it is
> to reboot and remove the faulty snapshot when it comes back up.
>
> Now the script I wrote that creates these snapshots will use all available
> extents from the Volume Group pool which in this case was actually larger than
> the size of the volume I was trying to snapshot. Thinking this was the
> problem, I tried creating the snapshot several times using a snapshot size
> less than or equal to the target volume and it worked every time. So, I tried
> a value larger than the target to generate a crash and it did BUT not every
> time. In fact now I can't get it to segfault at all.
>
> So my question is: is creating the snapshot volume with a size larger than the
> target volume inducing segfaults randomly or could there be another problem
> lurking? If these weren't production machines I would normally just go with a
> size smaller than the target but I really need to be sure what exactly is
> causing the segfaults.
>
> Any help would be appreciated.


Is there any special reason to use lvm2 from 2006 in 2013?
There is not much point in fixing particular bugs in source code that has
been obsolete for many years.

Can you try to use or rebuild a more recent version?

Zdenek


* Re: [linux-lvm] Snapshot causing segault
  2013-01-03 10:18 ` Zdenek Kabelac
@ 2013-01-03 13:25   ` Tyler Gates
  2013-01-03 20:04   ` Stuart D Gathman
  1 sibling, 0 replies; 6+ messages in thread
From: Tyler Gates @ 2013-01-03 13:25 UTC (permalink / raw)
  To: LVM general discussion and development


On Thu, Jan 3, 2013 at 5:18 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote:

>
> Is there any special reason to use lvm2 from 2006 in 2013?
>

Yes. It comes with a specific OS release that we tested as being stable back
in the day, which unfortunately ships older software such as this LVM
version. It wasn't until recently that I wanted to start using LVM.


> There is not much point in fixing particular bugs in source code that has
> been obsolete for many years.
>
> Can you try to use or rebuild a more recent version?
>

I realize trying a more recent version would be the best thing to do,
assuming it were easy (in this situation it would be a big hassle), but
I was hoping someone could tell me either "yes, over-allocating the
snapshot could cause this" or "it sounds like a bug in that version"
before I go through all that trouble.



* Re: [linux-lvm] Snapshot causing segault
  2013-01-03 10:18 ` Zdenek Kabelac
  2013-01-03 13:25   ` Tyler Gates
@ 2013-01-03 20:04   ` Stuart D Gathman
  1 sibling, 0 replies; 6+ messages in thread
From: Stuart D Gathman @ 2013-01-03 20:04 UTC (permalink / raw)
  To: linux-lvm

On 01/03/2013 05:18 AM, Zdenek Kabelac expounded in part:
>
>> So my question is: is creating the snapshot volume with a size larger
>> than the target volume inducing segfaults randomly or could there be
>> another problem lurking? If these weren't production machines I would
>> normally just go with a size smaller than the target but I really need
>> to be sure what exactly is causing the segfaults.
>>
>> Any help would be appreciated.
>
>
> Is there any special reason to use lvm2 from 2006 in 2013?
> There is not much point in fixing particular bugs in source code that has
> been obsolete for many years.
>
> Can you try to use or rebuild a more recent version?

Upgrading production systems can be expensive - hence the value of paid
long-term support contracts.  I don't think he is asking for a fix -
just for confirmation that it is an LVM bug, which could make upgrading
more attractive, or make the workaround (of ensuring snapshot size <=
source size) acceptable.
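
The workaround mentioned here (snapshot size <= source size) could be enforced in a snapshot script along the following lines. This is only a sketch in Python: the function names, the VG/LV names and the extent counts are hypothetical, and the extent counts would have to be read from the volume group and the origin LV (for example from `vgdisplay` and `lvdisplay` output). Nothing in this thread confirms that the snapshot size is what triggers the oops.

```python
def snapshot_extents(free_extents, origin_extents):
    """Extents for a new snapshot: all free space, but never more than the origin."""
    if free_extents <= 0 or origin_extents <= 0:
        raise ValueError("need at least one free extent and a non-empty origin")
    return min(free_extents, origin_extents)


def lvcreate_snapshot_cmd(vg, origin, snap_name, extents):
    """Build (but do not run) the lvcreate command line for the snapshot."""
    # -s: create a snapshot, -l: size in logical extents, -n: name of the new LV
    return ["lvcreate", "-s", "-l", str(extents), "-n", snap_name,
            "/dev/%s/%s" % (vg, origin)]


if __name__ == "__main__":
    # Hypothetical numbers: 500 free extents in the VG, origin LV of 200 extents.
    extents = snapshot_extents(free_extents=500, origin_extents=200)
    print(" ".join(lvcreate_snapshot_cmd("vg0", "data", "data_snap", extents)))
```

Keeping the size calculation separate from the command construction makes it easy to test the cap without touching any volumes.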


* Re: [linux-lvm] Snapshot causing segault
  2012-12-31 18:50 [linux-lvm] Snapshot causing segault Tyler Gates
  2013-01-03 10:18 ` Zdenek Kabelac
@ 2013-01-04 11:23 ` Milan Broz
  2013-01-04 14:50   ` Tyler Gates
  1 sibling, 1 reply; 6+ messages in thread
From: Milan Broz @ 2013-01-04 11:23 UTC (permalink / raw)
  To: LVM general discussion and development; +Cc: Tyler Gates

On 12/31/2012 07:50 PM, Tyler Gates wrote:
> Hello everyone,
> I've been having an intermittent problem on random servers
> segfaulting while trying to create a snapshot under version
> lvm2-2.02.17-7.38.3 on kernel 2.6.16.60-0.93.1-bigsmp (SLES 10 SP4).

This is almost surely a kernel bug, not a userspace one.
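
The register dump in the original oops is consistent with this. A small sketch in Python, using only values copied from that oops (the helper name is hypothetical), decodes the instruction at the `<ff>` marker and checks it against the reported fault address:

```python
# Values copied from the oops in the first message of this thread.
FAULT_ADDR = 0x0000001C                    # "... at virtual address 0000001c"
REGS = {"eax": 0xF90960C4, "ebx": 0x00000000, "edi": 0xF90960C4}
CODE_AT_EIP = bytes.fromhex("ff531c")      # bytes starting at the <ff> marker

# x86 32-bit register numbering as used in the ModRM byte.
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_indirect_call(code):
    """Decode 'ff /2' with an 8-bit displacement, i.e. call *disp8(%reg)."""
    opcode, modrm, disp = code[0], code[1], code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod == 1 means an 8-bit displacement; rm == 4 would need a SIB byte.
    if opcode != 0xFF or reg != 2 or mod != 1 or rm == 4:
        raise ValueError("not a 'call *disp8(%reg)' instruction")
    if disp >= 0x80:                       # disp8 is a signed byte
        disp -= 0x100
    return REG_NAMES[rm], disp


if __name__ == "__main__":
    base, disp = decode_indirect_call(CODE_AT_EIP)
    print("call *0x%x(%%%s)" % (disp, base))
    print("matches fault address:", REGS[base] + disp == FAULT_ADDR)
```

So the faulting instruction is an indirect call through a function pointer at offset 0x1c of a structure whose address, held in ebx, is NULL. The bytes just before the marker include `8b 5f 04`, which encodes `mov 0x4(%edi),%ebx`, so that NULL was loaded from memory immediately beforehand, inside dm_mod. Which device-mapper structure is involved cannot be determined from the trace alone.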

It is a very old kernel, so it is hard to say exactly which patches are needed,
but my guess is that it is very similar to https://bugzilla.redhat.com/show_bug.cgi?id=360151
(note: fixed in 2007!) with upstream patch
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=512875bd9661368da6f993205a61213b79ba1df0

(But there can be more patches; this is really old, and someone needs
to do a proper analysis of what is going on here.)

Milan


* Re: [linux-lvm] Snapshot causing segault
  2013-01-04 11:23 ` Milan Broz
@ 2013-01-04 14:50   ` Tyler Gates
  0 siblings, 0 replies; 6+ messages in thread
From: Tyler Gates @ 2013-01-04 14:50 UTC (permalink / raw)
  To: Milan Broz; +Cc: LVM general discussion and development


Thanks Milan, I believe you are right about it being a bug. I was able to
trigger the panic with the script provided, and the results are very similar
(although I'm now having a hard time recreating my problem on demand).

I should have fun finding this kernel for SLES10 SP4 even though the minor
is only 2 off...

Thanks again for everyone's help.

    Tyler




