* 2 TB wraparound on 32 bit host
@ 2010-06-11 20:57 Phillip Susi
2010-06-11 21:16 ` James Bottomley
0 siblings, 1 reply; 13+ messages in thread
From: Phillip Susi @ 2010-06-11 20:57 UTC (permalink / raw)
To: device-mapper development
I am seeing access to > 2tb on a dm target silently wrap around to 0.
Simple recreation steps:
lvcreate --type zero -L 3TB -n empty vg0
lvcreate -s vg0/empty -L 10G -n thin
mke2fs -t ext4 -E lazy_itable_init /dev/vg0/thin
e2fsck -f /dev/vg0/thin
The fsck will find block bitmap differences on a cleanly formatted fs
that seem to be caused by wraparound. Accessing block 536870912 with dd
seems to return the superblock instead of the block allocation bitmap
that should be located there.
This is using kernel 2.6.31-21-generic-pae i686 build from Ubuntu 9.10.
Is this a known issue and/or can anyone reproduce it?
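
The block number quoted above falls exactly on the 2 TB boundary, which the following minimal sketch works out. It assumes 4096-byte filesystem blocks and 512-byte sectors; neither value is stated in the report, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions, not stated in the report: 4096-byte filesystem blocks
 * and 512-byte sectors. */
#define FS_BLOCK_SIZE 4096ULL
#define SECTOR_SIZE   512ULL

/* Byte offset at which a filesystem block starts. */
static uint64_t fs_block_to_byte_offset(uint64_t block)
{
    /* Guard against overflow of the 64-bit product. */
    assert(block <= UINT64_MAX / FS_BLOCK_SIZE);
    return block * FS_BLOCK_SIZE;
}

/* Number of the 512-byte sector at which a filesystem block starts. */
static uint64_t fs_block_to_sector(uint64_t block)
{
    return fs_block_to_byte_offset(block) / SECTOR_SIZE;
}

/* Returns 1 if the sector number needs more than 32 bits, else 0. */
static int sector_exceeds_32_bits(uint64_t sector)
{
    return sector > UINT32_MAX;
}
```

Under these assumptions, block 536870912 (2^29) starts at byte 2199023255552 (2^41, i.e. 2 TB) and at sector 4294967296 (2^32), the first sector number that no longer fits in 32 bits.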
* Re: 2 TB wraparound on 32 bit host
2010-06-11 20:57 2 TB wraparound on 32 bit host Phillip Susi
@ 2010-06-11 21:16 ` James Bottomley
2010-06-12 15:45 ` Phillip Susi
0 siblings, 1 reply; 13+ messages in thread
From: James Bottomley @ 2010-06-11 21:16 UTC (permalink / raw)
To: device-mapper development
On Fri, 2010-06-11 at 16:57 -0400, Phillip Susi wrote:
> I am seeing access to > 2tb on a dm target silently wrap around to 0.
> Simple recreation steps:
>
> lvcreate --type zero -L 3TB -n empty vg0
> lvcreate -s vg0/empty -L 10G -n thin
> mke2fs -t ext4 -E lazy_itable_init /dev/vg0/thin
> e2fsck -f /dev/vg0/thin
>
> The fsck will find block bitmap differences on a cleanly formatted fs
> that seem to be caused by wraparound. Accessing block 536870912 with dd
> seems to return the superblock instead of the block allocation bitmap
> that should be located there.
>
> This is using kernel 2.6.31-21-generic-pae i686 build from Ubuntu 9.10.
>
> Is this a known issue and/or can anyone reproduce it?
So best guess is that CONFIG_LBDAF isn't set. This would make all
sector_t counts wrap at 2TB (32 bits worth of 512 bytes). It would be
rather a daft thing for a distribution not to have set, though ...
James
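
The wrap described above can be modelled in user space. This is only a sketch: `sector32_t` and `sector64_t` are hypothetical stand-ins for the two widths sector_t could take, not the kernel's own definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two possible widths of sector_t. */
typedef uint32_t sector32_t; /* models a 32-bit build without CONFIG_LBDAF */
typedef uint64_t sector64_t; /* models a build with a 64-bit sector_t */

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Convert a byte offset to a sector number, truncating to 32 bits. */
static sector32_t byte_offset_to_sector32(uint64_t bytes)
{
    return (sector32_t)(bytes >> SECTOR_SHIFT);
}

/* Convert a byte offset to a sector number, keeping all 64 bits. */
static sector64_t byte_offset_to_sector64(uint64_t bytes)
{
    return bytes >> SECTOR_SHIFT;
}
```

With the 32-bit type, a byte offset of exactly 2 TB (2^41) converts to sector 0, and an offset 4096 bytes beyond it converts to sector 8, so the access silently lands near the start of the device. The 64-bit type keeps the true sector number, 2^32.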
* Re: 2 TB wraparound on 32 bit host
2010-06-11 21:16 ` James Bottomley
@ 2010-06-12 15:45 ` Phillip Susi
2010-06-12 16:03 ` James Bottomley
0 siblings, 1 reply; 13+ messages in thread
From: Phillip Susi @ 2010-06-12 15:45 UTC (permalink / raw)
To: device-mapper development
On 06/11/2010 05:16 PM, James Bottomley wrote:
> So best guess is that CONFIG_LBDAF isn't set. This would make all
> sector_t counts wrap at 2TB (32 bits worth of 512 bytes). It would be
> rather a daft thing for a distribution not to have set, though ...
Bingo, thanks. It doesn't seem to be set on this machine running the
amd64 2.6.32 lucid build which also appears to suffer the same problem.
If this config option isn't set though, shouldn't the kernel fail
calls like llseek() that try to exceed the limit, rather than silently
wrap around to the wrong address?
* Re: 2 TB wraparound on 32 bit host
2010-06-12 15:45 ` Phillip Susi
@ 2010-06-12 16:03 ` James Bottomley
2010-06-12 16:09 ` James Bottomley
0 siblings, 1 reply; 13+ messages in thread
From: James Bottomley @ 2010-06-12 16:03 UTC (permalink / raw)
To: Phillip Susi; +Cc: device-mapper development
On Sat, 2010-06-12 at 11:45 -0400, Phillip Susi wrote:
> On 06/11/2010 05:16 PM, James Bottomley wrote:
> > So best guess is that CONFIG_LBDAF isn't set. This would make all
> > sector_t counts wrap at 2TB (32 bits worth of 512 bytes). It would be
> > rather a daft thing for a distribution not to have set, though ...
>
> Bingo, thanks. It doesn't seem to be set on this machine running the
> amd64 2.6.32 lucid build which also appears to suffer the same problem.
So is this a default kernel or did you build your own ... because if
it's a vanilla ubuntu kernel, not setting this config option would be
pretty embarrassing (not to mention annoy a lot of users)?
> If this config option isn't set though, shouldn't the kernel fail
> calls like llseek() that try to exceed the limit, rather than silently
> wrap around to the wrong address?
Not really ... it's defaulted to y; only people who know what they're
doing should set it to N. These people are mostly embedded and will
never connect > 2TB devices to their system, so checking is a waste of
time and space for them. Plus making the checks exhaustive and
foolproof is just about impossible given that we alter the underlying
size of sector_t and wrap around without warning is the C default.
James
* Re: 2 TB wraparound on 32 bit host
2010-06-12 16:03 ` James Bottomley
@ 2010-06-12 16:09 ` James Bottomley
2010-06-12 17:58 ` Phillip Susi
0 siblings, 1 reply; 13+ messages in thread
From: James Bottomley @ 2010-06-12 16:09 UTC (permalink / raw)
To: Phillip Susi; +Cc: device-mapper development
On Sat, 2010-06-12 at 11:03 -0500, James Bottomley wrote:
> On Sat, 2010-06-12 at 11:45 -0400, Phillip Susi wrote:
> > On 06/11/2010 05:16 PM, James Bottomley wrote:
> > > So best guess is that CONFIG_LBDAF isn't set. This would make all
> > > sector_t counts wrap at 2TB (32 bits worth of 512 bytes). It would be
> > > rather a daft thing for a distribution not to have set, though ...
> >
> > Bingo, thanks. It doesn't seem to be set on this machine running the
> > amd64 2.6.32 lucid build which also appears to suffer the same problem.
>
> So is this a default kernel or did you build your own ... because if
> it's a vanilla ubuntu kernel, not setting this config option would be
> pretty embarrassing (not to mention annoy a lot of users)?
>
> > If this config option isn't set though, shouldn't the kernel fail
> > calls like llseek() that try to exceed the limit, rather than silently
> > wrap around to the wrong address?
>
> Not really ... it's defaulted to y; only people who know what they're
> doing should set it to N. These people are mostly embedded and will
> never connect > 2TB devices to their system, so checking is a waste of
> time and space for them. Plus making the checks exhaustive and
> foolproof is just about impossible given that we alter the underlying
> size of sector_t and wrap around without warning is the C default.
Actually, looking at the above, this is conflicting information. On 64
bit systems (like amd64) there's no need to set CONFIG_LBDAF because
sector_t is an unsigned long, which is 64 bits. It's only on 32 bit
configurations it gets set. Initially you said you were using a PAE
i686 configuration, which would need this setting. An amd64 one
wouldn't.
Which kernel are you actually seeing the problem on, and what is
CONFIG_LBDAF set to on that kernel?
James
* Re: 2 TB wraparound on 32 bit host
2010-06-12 16:09 ` James Bottomley
@ 2010-06-12 17:58 ` Phillip Susi
2010-06-12 18:47 ` 2 TB wraparound on snapshots Phillip Susi
0 siblings, 1 reply; 13+ messages in thread
From: Phillip Susi @ 2010-06-12 17:58 UTC (permalink / raw)
To: James Bottomley; +Cc: device-mapper development
On 06/12/2010 12:09 PM, James Bottomley wrote:
> Actually, looking at the above, this is conflicting information. On 64
> bit systems (like amd64) there's no need to set CONFIG_LBDAF because
> sector_t is an unsigned long, which is 64 bits. It's only on 32 bit
> configurations it gets set. Initially you said you were using a PAE
> i686 configuration, which would need this setting. An amd64 one
> wouldn't.
>
> Which kernel are you actually seeing the problem on, and what is
> CONFIG_LBDAF set to on that kernel?
I am seeing the wraparound on two systems, one running the vanilla i386
Ubuntu Karmic kernel, and one running amd64 Lucid. I checked the config
file on the amd64 system I am on now and the symbol CONFIG_LBDAF is not
there. I just checked the i386 config and it is set there. It seems
then, that this is not the cause of the problem after all.
* Re: 2 TB wraparound on snapshots
2010-06-12 17:58 ` Phillip Susi
@ 2010-06-12 18:47 ` Phillip Susi
2010-06-12 20:03 ` James Bottomley
0 siblings, 1 reply; 13+ messages in thread
From: Phillip Susi @ 2010-06-12 18:47 UTC (permalink / raw)
To: device-mapper development
Update: it seems this is a problem specific to snapshots. I used
dmsetup to create a linear mapping > 2tb which worked fine. When
writing beyond the 2TB mark to the snapshot, it aliases exceptions
starting at sector 0.
* Re: 2 TB wraparound on snapshots
2010-06-12 18:47 ` 2 TB wraparound on snapshots Phillip Susi
@ 2010-06-12 20:03 ` James Bottomley
2010-06-12 22:03 ` Phillip Susi
0 siblings, 1 reply; 13+ messages in thread
From: James Bottomley @ 2010-06-12 20:03 UTC (permalink / raw)
To: Phillip Susi; +Cc: device-mapper development
On Sat, 2010-06-12 at 14:47 -0400, Phillip Susi wrote:
> Update: it seems this is a problem specific to snapshots. I used
> dmsetup to create a linear mapping > 2tb which worked fine. When
> writing beyond the 2TB mark to the snapshot, it aliases exceptions
> starting at sector 0.
So do you have a 64 bit system (like an amd64) that you can try this
with? If it only occurs on the i686 system, then it's likely a long to
sector conversion bug somewhere in the snapshot code. If it occurs on
both, there's some 2TB limit coded directly into snapshots.
James
* Re: 2 TB wraparound on snapshots
2010-06-12 20:03 ` James Bottomley
@ 2010-06-12 22:03 ` Phillip Susi
2010-06-15 14:57 ` 2 TB wraparound on snapshots on kernels < 2.6.33 Phillip Susi
0 siblings, 1 reply; 13+ messages in thread
From: Phillip Susi @ 2010-06-12 22:03 UTC (permalink / raw)
To: device-mapper development
On 06/12/2010 04:03 PM, James Bottomley wrote:
> So do you have a 64 bit system (like an amd64) that you can try this
> with? If it only occurs on the i686 system, then it's likely a long to
> sector conversion bug somewhere in the snapshot code. If it occurs on
> both, there's some 2TB limit coded directly into snapshots.
As I said before, it occurs on both.
* Re: 2 TB wraparound on snapshots on kernels < 2.6.33
2010-06-12 22:03 ` Phillip Susi
@ 2010-06-15 14:57 ` Phillip Susi
2010-06-16 13:13 ` Mikulas Patocka
0 siblings, 1 reply; 13+ messages in thread
From: Phillip Susi @ 2010-06-15 14:57 UTC (permalink / raw)
To: device-mapper development
After further testing of mainline kernels, it seems that the bug was
fixed between 2.6.32 and 2.6.33. Looking over the logs, I see no
changes that were intended to fix this issue, but there were quite a
number of changes to the snapshot code. I can only conclude that these
inadvertently fixed the problem.
* Re: 2 TB wraparound on snapshots on kernels < 2.6.33
2010-06-15 14:57 ` 2 TB wraparound on snapshots on kernels < 2.6.33 Phillip Susi
@ 2010-06-16 13:13 ` Mikulas Patocka
2010-06-16 13:45 ` Mikulas Patocka
0 siblings, 1 reply; 13+ messages in thread
From: Mikulas Patocka @ 2010-06-16 13:13 UTC (permalink / raw)
To: Phillip Susi; +Cc: device-mapper development
On Tue, 15 Jun 2010, Phillip Susi wrote:
> After further testing of mainline kernels, it seems that the bug was
> fixed between 2.6.32 and 2.6.33. Looking over the logs, I see no
> changes that were intended to fix this issue, but there were quite a
> number of changes to the snapshot code. I can only conclude that these
> inadvertently fixed the problem.
Hi
I wasn't able to reproduce this bug on any upstream kernel and I suspect
it is caused by incorrect patching on Ubuntu side. Ubuntu kernels
2.6.31-16 and before don't have the bug, 2.6.31-17 and above have it.
Mikulas
* Re: 2 TB wraparound on snapshots on kernels < 2.6.33
2010-06-16 13:13 ` Mikulas Patocka
@ 2010-06-16 13:45 ` Mikulas Patocka
2010-06-16 13:52 ` Phillip Susi
0 siblings, 1 reply; 13+ messages in thread
From: Mikulas Patocka @ 2010-06-16 13:45 UTC (permalink / raw)
To: device-mapper development; +Cc: Phillip Susi
On Wed, 16 Jun 2010, Mikulas Patocka wrote:
>
>
> On Tue, 15 Jun 2010, Phillip Susi wrote:
>
> > After further testing of mainline kernels, it seems that the bug was
> > fixed between 2.6.32 and 2.6.33. Looking over the logs, I see no
> > changes that were intended to fix this issue, but there were quite a
> > number of changes to the snapshot code. I can only conclude that these
> > inadvertently fixed the problem.
>
> Hi
>
> I wasn't able to reproduce this bug on any upstream kernel and I suspect
> it is caused by incorrect patching on Ubuntu side. Ubuntu kernels
> 2.6.31-16 and before don't have the bug, 2.6.31-17 and above have it.
>
> Mikulas
The bug existed upstream too, but only in the 2.6.32 kernel. The reason
was this function:
static inline chunk_t sector_to_chunk(struct dm_exception_store *store,
                                      sector_t sector)
{
        return (sector & ~store->chunk_mask) >> store->chunk_shift;
}
"store->chunk_mask" was changed to be unsigned in 2.6.32, so it was
masking the sector with a 32-bit value. In 2.6.33 that masking was removed.
Ubuntu picked that 2.6.32 patch but didn't pick further patches.
Mikulas
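
The effect of the 32-bit mask can be reproduced in user space. In C, `~store->chunk_mask` on an unsigned 32-bit field is itself a 32-bit value, and it is zero-extended when combined with a 64-bit sector, so the AND clears bits 32 and above. The sketch below is a model rather than kernel code: `struct store_model`, `make_store` and the fixed-width typedefs are hypothetical stand-ins, and the shift-only variant is just one reading of "that masking was removed".

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t; /* stand-in: 64-bit sector numbers */
typedef uint64_t chunk_t;  /* stand-in: chunk numbers as wide as sectors */

/* Hypothetical model of the two fields used by the quoted function. */
struct store_model {
    unsigned chunk_mask;  /* 32-bit, as described for 2.6.32 */
    unsigned chunk_shift; /* log2 of the chunk size in sectors */
};

/* Build a model store for a power-of-two chunk size of 2^shift sectors. */
static struct store_model make_store(unsigned chunk_shift)
{
    struct store_model s;

    assert(chunk_shift < 32);
    s.chunk_mask = (1u << chunk_shift) - 1u;
    s.chunk_shift = chunk_shift;
    return s;
}

/* Same expression as the quoted function: the inverted 32-bit mask is
 * zero-extended to 64 bits, so the upper half of the sector is lost. */
static chunk_t sector_to_chunk_masked(const struct store_model *s,
                                      sector_t sector)
{
    return (sector & ~s->chunk_mask) >> s->chunk_shift;
}

/* Variant without the mask: the shift alone discards the low bits. */
static chunk_t sector_to_chunk_shift_only(const struct store_model *s,
                                          sector_t sector)
{
    return sector >> s->chunk_shift;
}
```

With 16-sector chunks (`make_store(4)`), the masked version maps sector 2^32 to chunk 0, the same chunk as sector 0, which matches the aliasing seen on snapshots beyond 2 TB. The shift-only version maps it to chunk 2^28.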
* Re: 2 TB wraparound on snapshots on kernels < 2.6.33
2010-06-16 13:45 ` Mikulas Patocka
@ 2010-06-16 13:52 ` Phillip Susi
0 siblings, 0 replies; 13+ messages in thread
From: Phillip Susi @ 2010-06-16 13:52 UTC (permalink / raw)
To: Mikulas Patocka; +Cc: device-mapper development
Aha! I looked at the code carefully as it aroused my suspicions, but I
couldn't quite work out how it actually caused the problem. Good catch.
On 6/16/2010 9:45 AM, Mikulas Patocka wrote:
> The bug existed upstream too, but only in the 2.6.32 kernel. The reason
> was this function:
> static inline chunk_t sector_to_chunk(struct dm_exception_store *store,
>                                       sector_t sector)
> {
>         return (sector & ~store->chunk_mask) >> store->chunk_shift;
> }
>
> "store->chunk_mask" was changed to be unsigned in 2.6.32, so it was
> masking the sector with a 32-bit value. In 2.6.33 that masking was removed.
> Ubuntu picked that 2.6.32 patch but didn't pick further patches.
>
> Mikulas