* [PATCH 0/2] repair: couple of bug fixes @ 2011-10-06 1:01 Dave Chinner 2011-10-06 1:01 ` [PATCH 1/2] repair: handle repair of image files on large sector size filesystems Dave Chinner 2011-10-06 1:01 ` [PATCH 2/2] repair: fix some valgrind reported errors on i686 Dave Chinner 0 siblings, 2 replies; 8+ messages in thread From: Dave Chinner @ 2011-10-06 1:01 UTC (permalink / raw) To: xfs Hi Alex, These are a couple of bug fixes to repair that I posted a while ago that we should probably include in the 3.1.6 release. I don't recall if they were fully reviewed back when I posted them, so I'm just sending them again. _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/2] repair: handle repair of image files on large sector size filesystems 2011-10-06 1:01 [PATCH 0/2] repair: couple of bug fixes Dave Chinner @ 2011-10-06 1:01 ` Dave Chinner 2011-10-06 12:17 ` Alex Elder 2011-10-06 1:01 ` [PATCH 2/2] repair: fix some valgrind reported errors on i686 Dave Chinner 1 sibling, 1 reply; 8+ messages in thread From: Dave Chinner @ 2011-10-06 1:01 UTC (permalink / raw) To: xfs From: Dave Chinner <dchinner@redhat.com> Because repair uses direct IO, it cannot do IO smaller than a sector on the underlying device. When repairing a filesystem image, the filesystem hosting the image may have a sector size larger than the sector size of the image, and so single image sector reads and writes will fail. To avoid this, when checking a file and there is a sector size mismatch like this, turn off direct IO. While there, fix a compile bug in the IO_DEBUG option for libxfs which was found during triage. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- libxfs/rdwr.c | 2 +- repair/sb.c | 9 ++++++++- repair/xfs_repair.c | 30 ++++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+), 2 deletions(-) diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c index ec2675e..a656851 100644 --- a/libxfs/rdwr.c +++ b/libxfs/rdwr.c @@ -364,7 +364,7 @@ libxfs_getbufr(dev_t device, xfs_daddr_t blkno, int bblen) libxfs_initbuf(bp, device, blkno, blen); #ifdef IO_DEBUG printf("%lx: %s: allocated %u bytes buffer, key=%llu(%llu), %p\n", - pthread_self(), __FUNCTION__, BBTOB(len), + pthread_self(), __FUNCTION__, blen, (long long)LIBXFS_BBTOOFF64(blkno), (long long)blkno, bp); #endif diff --git a/repair/sb.c b/repair/sb.c index 0ee2345..c8df076 100644 --- a/repair/sb.c +++ b/repair/sb.c @@ -689,7 +689,14 @@ verify_set_primary_sb(xfs_sb_t *rsb, */ num_sbs = MIN(NUM_SBS, rsb->sb_agcount); skip = howmany(num_sbs, rsb->sb_agcount); - size = NUM_AGH_SECTS * rsb->sb_sectsize; + + /* + * We haven't been able to validate the sector size yet properly + * (e.g. in the case of repairing an image in a file), so we need to + * take into account sector mismatches and so use the maximum possible + * sector size rather than that in @rsb. + */ + size = NUM_AGH_SECTS * (1 << (XFS_MAX_SECTORSIZE_LOG)); retval = 0; list = NULL; num_ok = 0; diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c index 4707b83..dda9daa 100644 --- a/repair/xfs_repair.c +++ b/repair/xfs_repair.c @@ -569,6 +569,36 @@ main(int argc, char **argv) memset(&xfs_m, 0, sizeof(xfs_mount_t)); libxfs_sb_from_disk(&xfs_m.m_sb, XFS_BUF_TO_SBP(sbp)); + /* + * if the sector size of the filesystem we are trying to repair is + * smaller than that of the underlying filesystem (i.e. we are repairing + * an image), the we have to turn off direct IO because we cannot do IO + * smaller than the host filesystem's sector size. + */ + if (isa_file) { + int fd = libxfs_device_to_fd(x.ddev); + struct xfs_fsop_geom_v1 geom = { + .sectsize = BBSIZE, + }; + + if (ioctl(fd, XFS_IOC_FSGEOMETRY_V1, &geom) < 0) { + do_warn(_("Cannot get host fiesystem geometry.\n" + "Repair may fail if there is a sector size mismatch between " + "the image and the host filesystem.\n")); + } + + if (xfs_m.m_sb.sb_sectsize < geom.sectsize) { + long old_flags; + + old_flags = fcntl(fd, F_GETFL, 0); + if (fcntl(fd, F_SETFL, old_flags & ~O_DIRECT) < 0) { + do_warn(_( + "Sector size on host filesystem larger than image sector size.\n" + "Cannot turn off direct IO, so exiting.\n")); + exit(1); + } + } + } mp = libxfs_mount(&xfs_m, &xfs_m.m_sb, x.ddev, x.logdev, x.rtdev, 0); if (!mp) { -- 1.7.5.4 _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] repair: handle repair of image files on large sector size filesystems 2011-10-06 1:01 ` [PATCH 1/2] repair: handle repair of image files on large sector size filesystems Dave Chinner @ 2011-10-06 12:17 ` Alex Elder 0 siblings, 0 replies; 8+ messages in thread From: Alex Elder @ 2011-10-06 12:17 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs On Thu, 2011-10-06 at 12:01 +1100, Dave Chinner wrote: > From: Dave Chinner <dchinner@redhat.com> > > Because repair uses direct IO, it cannot do IO smaller than a sector > on the underlying device. When repairing a filesystem image, the > filesystem hosting the image may have a sector size larger than the > sector size of the image, and so single image sector reads and > writes will fail. > > To avoid this, when checking a file and there is a sector size > mismatch like this, turn off direct IO. While there, fix a compile > bug in the IO_DEBUG option for libxfs which was found during triage. > > Signed-off-by: Dave Chinner <dchinner@redhat.com> I had a few suggestions for you to consider when I reviewed this before. The second patch in this series needs an update so you have a chance to address those suggestions if you so choose. http://patchwork.xfs.org/patch/2336/ Either way, I'll mark this as I did before. Reviewed-by: Alex Elder <aelder@sgi.com> _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/2] repair: fix some valgrind reported errors on i686 2011-10-06 1:01 [PATCH 0/2] repair: couple of bug fixes Dave Chinner 2011-10-06 1:01 ` [PATCH 1/2] repair: handle repair of image files on large sector size filesystems Dave Chinner @ 2011-10-06 1:01 ` Dave Chinner 2011-10-06 12:17 ` Alex Elder 1 sibling, 1 reply; 8+ messages in thread From: Dave Chinner @ 2011-10-06 1:01 UTC (permalink / raw) To: xfs From: Dave Chinner <dchinner@redhat.com> Fix a potential prefetch read problem due to the first loop execution of pf_batch_read potentially not initialising the fsbno varaible: ==10177== Thread 6: ==10177== Conditional jump or move depends on uninitialised value(s) ==10177== at 0x8079CAB: pf_batch_read (prefetch.c:408) ==10177== by 0x6A2996D: clone (clone.S:130) ==10177== Fix a bunch of invalid read/write errors due to excessive blkmap allocations when inode forks are corrupted. These show up some time after making a blkmap allocation for 536870913 extents on i686, which is followed some time later by a crash caused bymemory corruption. This blkmap allocation size overflows 32 bits in such a way that it results in a 32 byte allocation and so access to the second extent results in access beyond the allocated memory and corrupts random memory. ==5419== Invalid write of size 4 ==5419== at 0x80507DA: blkmap_set_ext (bmap.c:260) ==5419== by 0x8055CF4: process_bmbt_reclist_int (dinode.c:712) ==5419== by 0x8056206: process_bmbt_reclist (dinode.c:813) ==5419== by 0x80579DA: process_exinode (dinode.c:1324) ==5419== by 0x8059B77: process_dinode_int (dinode.c:2036) ==5419== by 0x805ABE6: process_dinode (dinode.c:2823) ==5419== by 0x8052493: process_inode_chunk.isra.4 (dino_chunks.c:777) ==5419== by 0x8054012: process_aginodes (dino_chunks.c:1024) ==5419== by 0xFFF: ??? ==5419== Address 0x944cfb8 is 0 bytes after a block of size 32 alloc'd ==5419== at 0x48E1102: realloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) ==5419== by 0x80501F3: blkmap_alloc (bmap.c:56) ==5419== by 0x80599F5: process_dinode_int (dinode.c:2027) ==5419== by 0x805ABE6: process_dinode (dinode.c:2823) ==5419== by 0x8052493: process_inode_chunk.isra.4 (dino_chunks.c:777) ==5419== by 0x8054012: process_aginodes (dino_chunks.c:1024) ==5419== by 0xFFF: ??? Add overflow detection code into the blkmap allocation code to avoid this problem, and also free large allocations once they are finished with to avoid pinning large amounts of memory due to the occasional large extent list in a filesystem. Signed-off-by: Dave Chinner <dchinner@redhat.com> --- repair/bmap.c | 37 ++++++++++++++++++++++++++++++++++++- repair/prefetch.c | 2 +- 2 files changed, 37 insertions(+), 2 deletions(-) diff --git a/repair/bmap.c b/repair/bmap.c index 79b9f79..1127a87 100644 --- a/repair/bmap.c +++ b/repair/bmap.c @@ -47,6 +47,17 @@ blkmap_alloc( if (nex < 1) nex = 1; +#if (BITS_PER_LONG != 64) + if (nex > (INT_MAX / sizeof(bmap_ext_t) - 1)) { + do_warn( + _("Number of extents requested in blkmap_alloc (%u) overflows 32 bits.\n" + "If this is not a corruption, then will need a 64 bit system\n" + "to repair this filesystem.\n"), + nex); + return NULL; + } +#endif + key = whichfork ? ablkmap_key : dblkmap_key; blkmap = pthread_getspecific(key); if (!blkmap || blkmap->naexts < nex) { @@ -66,12 +77,27 @@ blkmap_alloc( /* * Free a block map. + * + * If the map is a large, uncommon size (say for hundreds of thousands of + * extents) then free it to release the memory. This prevents us from pinning + * large tracts of memory due to corrupted fork values or one-off fragmented + * files. Otherwise we have nothing to do but keep the memory around for the + * next inode */ void blkmap_free( blkmap_t *blkmap) { - /* nothing to do! - keep the memory around for the next inode */ + /* consider more than 100k extents rare */ + if (blkmap->naexts < 100 * 1024) + return; + + if (blkmap == pthread_getspecific(dblkmap_key)) + pthread_setspecific(dblkmap_key, NULL); + else + pthread_setspecific(ablkmap_key, NULL); + + free(blkmap); } /* @@ -218,6 +244,15 @@ blkmap_grow( } blkmap->naexts += 4; +#if (BITS_PER_LONG != 64) + if (blkmap->naexts > (INT_MAX / sizeof(bmap_ext_t) - 1)) { + do_error( + _("Number of extents requested in blkmap_grow (%u) overflows 32 bits.\n" + "You need a 64 bit system to repair this filesystem.\n"), + blkmap->naexts); + return NULL; + } +#endif blkmap = realloc(blkmap, BLKMAP_SIZE(blkmap->naexts)); if (blkmap == NULL) do_error(_("realloc failed in blkmap_grow\n")); diff --git a/repair/prefetch.c b/repair/prefetch.c index d2fdf90..da074a8 100644 --- a/repair/prefetch.c +++ b/repair/prefetch.c @@ -397,7 +397,7 @@ pf_batch_read( int len, size; int i; int inode_bufs; - unsigned long fsbno; + unsigned long fsbno = 0; unsigned long max_fsbno; char *pbuf; -- 1.7.5.4 _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] repair: fix some valgrind reported errors on i686 2011-10-06 1:01 ` [PATCH 2/2] repair: fix some valgrind reported errors on i686 Dave Chinner @ 2011-10-06 12:17 ` Alex Elder 2011-10-06 22:15 ` Dave Chinner 0 siblings, 1 reply; 8+ messages in thread From: Alex Elder @ 2011-10-06 12:17 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs On Thu, 2011-10-06 at 12:01 +1100, Dave Chinner wrote: > From: Dave Chinner <dchinner@redhat.com> > > Fix a potential prefetch read problem due to the first loop > execution of pf_batch_read potentially not initialising the fsbno > varaible: > > ==10177== Thread 6: > ==10177== Conditional jump or move depends on uninitialised value(s) > ==10177== at 0x8079CAB: pf_batch_read (prefetch.c:408) > ==10177== by 0x6A2996D: clone (clone.S:130) > ==10177== > > Fix a bunch of invalid read/write errors due to excessive blkmap > allocations when inode forks are corrupted. These show up some time > after making a blkmap allocation for 536870913 extents on i686, > which is followed some time later by a crash caused bymemory > corruption. > > This blkmap allocation size overflows 32 bits in such a > way that it results in a 32 byte allocation and so access to the > second extent results in access beyond the allocated memory and > corrupts random memory. > > ==5419== Invalid write of size 4 > ==5419== at 0x80507DA: blkmap_set_ext (bmap.c:260) > ==5419== by 0x8055CF4: process_bmbt_reclist_int (dinode.c:712) > ==5419== by 0x8056206: process_bmbt_reclist (dinode.c:813) > ==5419== by 0x80579DA: process_exinode (dinode.c:1324) > ==5419== by 0x8059B77: process_dinode_int (dinode.c:2036) > ==5419== by 0x805ABE6: process_dinode (dinode.c:2823) > ==5419== by 0x8052493: process_inode_chunk.isra.4 (dino_chunks.c:777) > ==5419== by 0x8054012: process_aginodes (dino_chunks.c:1024) > ==5419== by 0xFFF: ??? > ==5419== Address 0x944cfb8 is 0 bytes after a block of size 32 alloc'd > ==5419== at 0x48E1102: realloc (in > /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) > ==5419== by 0x80501F3: blkmap_alloc (bmap.c:56) > ==5419== by 0x80599F5: process_dinode_int (dinode.c:2027) > ==5419== by 0x805ABE6: process_dinode (dinode.c:2823) > ==5419== by 0x8052493: process_inode_chunk.isra.4 (dino_chunks.c:777) > ==5419== by 0x8054012: process_aginodes (dino_chunks.c:1024) > ==5419== by 0xFFF: ??? > > Add overflow detection code into the blkmap allocation code to avoid > this problem, and also free large allocations once they are finished > with to avoid pinning large amounts of memory due to the occasional > large extent list in a filesystem. > > Signed-off-by: Dave Chinner <dchinner@redhat.com> This is good but I have a few comments below, a couple of which really indicate you need to update this. -Alex > --- > repair/bmap.c | 37 ++++++++++++++++++++++++++++++++++++- > repair/prefetch.c | 2 +- > 2 files changed, 37 insertions(+), 2 deletions(-) > > diff --git a/repair/bmap.c b/repair/bmap.c > index 79b9f79..1127a87 100644 > --- a/repair/bmap.c > +++ b/repair/bmap.c > @@ -47,6 +47,17 @@ blkmap_alloc( > if (nex < 1) > nex = 1; > > +#if (BITS_PER_LONG != 64) This should be == 32, not != 64. (And if it were possible, sizeof (int) == 32.) > + if (nex > (INT_MAX / sizeof(bmap_ext_t) - 1)) { See the comment below about this calculation. > + do_warn( > + _("Number of extents requested in blkmap_alloc (%u) overflows 32 bits.\n" > + "If this is not a corruption, then will need a 64 bit system\n" ...then you will need... > + "to repair this filesystem.\n"), > + nex); > + return NULL; > + } > +#endif > + > key = whichfork ? ablkmap_key : dblkmap_key; > blkmap = pthread_getspecific(key); > if (!blkmap || blkmap->naexts < nex) { . . . > @@ -218,6 +244,15 @@ blkmap_grow( > } > > blkmap->naexts += 4; The check needs to go *before* you update naexts. > +#if (BITS_PER_LONG != 64) > + if (blkmap->naexts > (INT_MAX / sizeof(bmap_ext_t) - 1)) { I don't really follow this calculation. I would expect it to be based more closely on how BLKMAP_SIZE() is defined. If you move it before the increment I think it would be better to use: if (BLKMAP_SIZE(nex) >= INT_MAX - sizeof (blkent_t *)) And since this would expose the internals of what BLKMAP_SIZE() does, it might be nicer if some sort of BLKMAP_NENTS_MAX constant were defined next to the definition of BLKMAP_SIZE(). > + do_error( > + _("Number of extents requested in blkmap_grow (%u) overflows 32 bits.\n" > + "You need a 64 bit system to repair this filesystem.\n"), > + blkmap->naexts); > + return NULL; > + } > +#endif > blkmap = realloc(blkmap, BLKMAP_SIZE(blkmap->naexts)); > if (blkmap == NULL) > do_error(_("realloc failed in blkmap_grow\n")); _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] repair: fix some valgrind reported errors on i686 2011-10-06 12:17 ` Alex Elder @ 2011-10-06 22:15 ` Dave Chinner 2011-10-06 22:54 ` Alex Elder 0 siblings, 1 reply; 8+ messages in thread From: Dave Chinner @ 2011-10-06 22:15 UTC (permalink / raw) To: Alex Elder; +Cc: xfs On Thu, Oct 06, 2011 at 07:17:52AM -0500, Alex Elder wrote: > On Thu, 2011-10-06 at 12:01 +1100, Dave Chinner wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > Fix a potential prefetch read problem due to the first loop > > execution of pf_batch_read potentially not initialising the fsbno > > varaible: > > > > ==10177== Thread 6: > > ==10177== Conditional jump or move depends on uninitialised value(s) > > ==10177== at 0x8079CAB: pf_batch_read (prefetch.c:408) > > ==10177== by 0x6A2996D: clone (clone.S:130) > > ==10177== > > > > Fix a bunch of invalid read/write errors due to excessive blkmap > > allocations when inode forks are corrupted. These show up some time > > after making a blkmap allocation for 536870913 extents on i686, > > which is followed some time later by a crash caused bymemory > > corruption. > > > > This blkmap allocation size overflows 32 bits in such a > > way that it results in a 32 byte allocation and so access to the > > second extent results in access beyond the allocated memory and > > corrupts random memory. > > > > ==5419== Invalid write of size 4 > > ==5419== at 0x80507DA: blkmap_set_ext (bmap.c:260) > > ==5419== by 0x8055CF4: process_bmbt_reclist_int (dinode.c:712) > > ==5419== by 0x8056206: process_bmbt_reclist (dinode.c:813) > > ==5419== by 0x80579DA: process_exinode (dinode.c:1324) > > ==5419== by 0x8059B77: process_dinode_int (dinode.c:2036) > > ==5419== by 0x805ABE6: process_dinode (dinode.c:2823) > > ==5419== by 0x8052493: process_inode_chunk.isra.4 (dino_chunks.c:777) > > ==5419== by 0x8054012: process_aginodes (dino_chunks.c:1024) > > ==5419== by 0xFFF: ??? > > ==5419== Address 0x944cfb8 is 0 bytes after a block of size 32 alloc'd > > ==5419== at 0x48E1102: realloc (in > > /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) > > ==5419== by 0x80501F3: blkmap_alloc (bmap.c:56) > > ==5419== by 0x80599F5: process_dinode_int (dinode.c:2027) > > ==5419== by 0x805ABE6: process_dinode (dinode.c:2823) > > ==5419== by 0x8052493: process_inode_chunk.isra.4 (dino_chunks.c:777) > > ==5419== by 0x8054012: process_aginodes (dino_chunks.c:1024) > > ==5419== by 0xFFF: ??? > > > > Add overflow detection code into the blkmap allocation code to avoid > > this problem, and also free large allocations once they are finished > > with to avoid pinning large amounts of memory due to the occasional > > large extent list in a filesystem. > > > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > > This is good but I have a few comments below, a couple of > which really indicate you need to update this. > > -Alex > > > --- > > repair/bmap.c | 37 ++++++++++++++++++++++++++++++++++++- > > repair/prefetch.c | 2 +- > > 2 files changed, 37 insertions(+), 2 deletions(-) > > > > diff --git a/repair/bmap.c b/repair/bmap.c > > index 79b9f79..1127a87 100644 > > --- a/repair/bmap.c > > +++ b/repair/bmap.c > > @@ -47,6 +47,17 @@ blkmap_alloc( > > if (nex < 1) > > nex = 1; > > > > +#if (BITS_PER_LONG != 64) > > This should be == 32, not != 64. OK. > (And if it > were possible, sizeof (int) == 32.) The BITS_PER_LONG are derived from the output of the configure process, and that's what the other code uses as well. So I'm just being consistent ;). > > + if (nex > (INT_MAX / sizeof(bmap_ext_t) - 1)) { > > See the comment below about this calculation. > > > + do_warn( > > + _("Number of extents requested in blkmap_alloc (%u) overflows 32 bits.\n" > > + "If this is not a corruption, then will need a 64 bit system\n" > ...then you will need... > > > + "to repair this filesystem.\n"), > > + nex); > > + return NULL; > > + } > > +#endif > > + > > key = whichfork ? ablkmap_key : dblkmap_key; > > blkmap = pthread_getspecific(key); > > if (!blkmap || blkmap->naexts < nex) { > > . . . > > > @@ -218,6 +244,15 @@ blkmap_grow( > > } > > > > blkmap->naexts += 4; > > The check needs to go *before* you update naexts. It's in the right place - adding 4 to the unchecked extent count can push it over the limit, so we have to check it after adding 4 to it. > > +#if (BITS_PER_LONG != 64) > > + if (blkmap->naexts > (INT_MAX / sizeof(bmap_ext_t) - 1)) { > > I don't really follow this calculation. I would expect > it to be based more closely on how BLKMAP_SIZE() is > defined. > > If you move it before the increment I think it would > be better to use: > if (BLKMAP_SIZE(nex) >= INT_MAX - sizeof (blkent_t *)) We can't use BLKMAP_SIZE(), because it will overflow 32 bits. Indeed, use of BLKMAP_SIZE() on unchecked extent counts is *precisely* the cause of the memory corruption this fix addresses. In more detail: typedef struct blkmap { int naexts; int nexts; bmap_ext_t exts[1]; } blkmap_t; #define BLKMAP_SIZE(n) \ (offsetof(blkmap_t, exts) + (sizeof(bmap_ext_t) * (n))) sizeof()/offsetof() on 32 bit platforms return a 32 bit number, naexts is a 32 bit number, so BLKMAP_SIZE() will overflow if blkmap->naexts >= INT_MAX / sizeof(bmap_ext_t). IOWs, we can't use BLKMAP_SIZE to detect an overflow, because it is the code that overflows... > And since this would expose the internals of what > BLKMAP_SIZE() does, it might be nicer if some sort of > BLKMAP_NENTS_MAX constant were defined next to the > definition of BLKMAP_SIZE(). I can add a BLKMAP_NENTS_MAX constant to repair/bmap.h. Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] repair: fix some valgrind reported errors on i686 2011-10-06 22:15 ` Dave Chinner @ 2011-10-06 22:54 ` Alex Elder 2011-10-07 0:06 ` Dave Chinner 0 siblings, 1 reply; 8+ messages in thread From: Alex Elder @ 2011-10-06 22:54 UTC (permalink / raw) To: Dave Chinner; +Cc: xfs On Fri, 2011-10-07 at 09:15 +1100, Dave Chinner wrote: > On Thu, Oct 06, 2011 at 07:17:52AM -0500, Alex Elder wrote: > > On Thu, 2011-10-06 at 12:01 +1100, Dave Chinner wrote: . . . > > > @@ -47,6 +47,17 @@ blkmap_alloc( > > > if (nex < 1) > > > nex = 1; > > > > > > +#if (BITS_PER_LONG != 64) > > > > This should be == 32, not != 64. > > OK. > > > (And if it > > were possible, sizeof (int) == 32.) > > The BITS_PER_LONG are derived from the output of the configure > process, and that's what the other code uses as well. So I'm just > being consistent ;). Well, it isn't possible anyway because sizeof is a compile-time operator, so it can't be used during preprocessing. You did the right thing. > > > > + if (nex > (INT_MAX / sizeof(bmap_ext_t) - 1)) { > > > > See the comment below about this calculation. > > > > > + do_warn( > > > + _("Number of extents requested in blkmap_alloc (%u) overflows 32 bits.\n" > > > + "If this is not a corruption, then will need a 64 bit system\n" > > ...then you will need... > > > > > + "to repair this filesystem.\n"), > > > + nex); > > > + return NULL; > > > + } > > > +#endif > > > + > > > key = whichfork ? ablkmap_key : dblkmap_key; > > > blkmap = pthread_getspecific(key); > > > if (!blkmap || blkmap->naexts < nex) { > > > > . . . > > > > > @@ -218,6 +244,15 @@ blkmap_grow( > > > } > > > > > > blkmap->naexts += 4; > > > > The check needs to go *before* you update naexts. > > It's in the right place - adding 4 to the unchecked extent count can > push it over the limit, so we have to check it after adding 4 to it. My point was that I'd rather see the check be whether it *will* overflow, rather than allowing it to overflow. The other reason though, even if you do the calculation at that spot, is that you shouldn't update what blkmap records as the number of allocated extents unless you have actually updated it the amount of allocated memory. As it stands, you will have updated blkmap->naexts, then if it overflows you return before actually attempting to reallocate. In other words, blkmap->naexts should reflect the actual amount of memory allotted in the blkmap->exts array, which is not going to be the case if it returns early (and this can be avoided). > > > +#if (BITS_PER_LONG != 64) > > > + if (blkmap->naexts > (INT_MAX / sizeof(bmap_ext_t) - 1)) { > > > > I don't really follow this calculation. I would expect > > it to be based more closely on how BLKMAP_SIZE() is > > defined. > > > > > If you move it before the increment I think it would > > be better to use: > > if (BLKMAP_SIZE(nex) >= INT_MAX - sizeof (blkent_t *)) > > We can't use BLKMAP_SIZE(), because it will overflow 32 bits. Which is also a reason I was suggesting to check whether we'd be exceeding the max *before* calling the macro. > Indeed, use of BLKMAP_SIZE() on unchecked extent counts is > *precisely* the cause of the memory corruption this fix addresses. > In more detail: > > typedef struct blkmap { > int naexts; > int nexts; > bmap_ext_t exts[1]; > } blkmap_t; > > #define BLKMAP_SIZE(n) \ > (offsetof(blkmap_t, exts) + (sizeof(bmap_ext_t) * (n))) > > sizeof()/offsetof() on 32 bit platforms return a 32 bit number, > naexts is a 32 bit number, so BLKMAP_SIZE() will overflow if > blkmap->naexts >= INT_MAX / sizeof(bmap_ext_t). > > IOWs, we can't use BLKMAP_SIZE to detect an overflow, because it > is the code that overflows... I understand that argument. I'm in a hurry at the moment so I haven't thought through this very well right now. But if you do have BLKMAP_NENTS_MAX, you could check nex against that before attempting to use BLKMAP_SIZE to compute how many bytes need to be allocated. > > And since this would expose the internals of what > > BLKMAP_SIZE() does, it might be nicer if some sort of > > BLKMAP_NENTS_MAX constant were defined next to the > > definition of BLKMAP_SIZE(). > > I can add a BLKMAP_NENTS_MAX constant to repair/bmap.h. > OK. I'll check for an update later. Thanks for the explanation. -Alex _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] repair: fix some valgrind reported errors on i686 2011-10-06 22:54 ` Alex Elder @ 2011-10-07 0:06 ` Dave Chinner 0 siblings, 0 replies; 8+ messages in thread From: Dave Chinner @ 2011-10-07 0:06 UTC (permalink / raw) To: Alex Elder; +Cc: xfs On Thu, Oct 06, 2011 at 05:54:18PM -0500, Alex Elder wrote: > On Fri, 2011-10-07 at 09:15 +1100, Dave Chinner wrote: > > On Thu, Oct 06, 2011 at 07:17:52AM -0500, Alex Elder wrote: > > > On Thu, 2011-10-06 at 12:01 +1100, Dave Chinner wrote: > > > > @@ -218,6 +244,15 @@ blkmap_grow( > > > > } > > > > > > > > blkmap->naexts += 4; > > > > > > The check needs to go *before* you update naexts. > > > > It's in the right place - adding 4 to the unchecked extent count can > > push it over the limit, so we have to check it after adding 4 to it. > > My point was that I'd rather see the check be whether it *will* > overflow, rather than allowing it to overflow. > > The other reason though, even if you do the calculation > at that spot, is that you shouldn't update what blkmap > records as the number of allocated extents unless you > have actually updated it the amount of allocated memory. > As it stands, you will have updated blkmap->naexts, then > if it overflows you return before actually attempting > to reallocate. > > In other words, blkmap->naexts should reflect the actual > amount of memory allotted in the blkmap->exts array, which > is not going to be the case if it returns early (and this > can be avoided). Fair point. And just checking the single caller of blkmap_grow indicates that it doesn't handle allocation failure *at all*, so I need to fix that as well. > > IOWs, we can't use BLKMAP_SIZE to detect an overflow, because it > > is the code that overflows... > > I understand that argument. I'm in a hurry at the moment so > I haven't thought through this very well right now. > > But if you do have BLKMAP_NENTS_MAX, you could check nex > against that before attempting to use BLKMAP_SIZE to > compute how many bytes need to be allocated. Yup, that's what the patch does, just without the BLKMAP_NENTS_MAX macro. Which I've already added. ;) FWIW, because the extent count fields are all signed ints, overflow checks also need to check for values <= 0. So I've added that too. And I think now I need to split this into separate patches for each issue, as there are now about 5 different problems this one patch fixes. Cheers, Dave. -- Dave Chinner david@fromorbit.com _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2011-10-07 0:06 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-10-06 1:01 [PATCH 0/2] repair: couple of bug fixes Dave Chinner 2011-10-06 1:01 ` [PATCH 1/2] repair: handle repair of image files on large sector size filesystems Dave Chinner 2011-10-06 12:17 ` Alex Elder 2011-10-06 1:01 ` [PATCH 2/2] repair: fix some valgrind reported errors on i686 Dave Chinner 2011-10-06 12:17 ` Alex Elder 2011-10-06 22:15 ` Dave Chinner 2011-10-06 22:54 ` Alex Elder 2011-10-07 0:06 ` Dave Chinner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox