* [PATCH] xfs_repair: avoid segfault if reporting progress early in repair
@ 2013-10-17 17:50 Eric Sandeen
2013-10-17 20:05 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Eric Sandeen @ 2013-10-17 17:50 UTC (permalink / raw)
To: xfs-oss
For a very large filesystem, zeroing the log may take some time.
If we ask for progress reports frequently enough that one fires
before we finish with log zeroing, we try to use a progress format
which has not yet been set up, and segfault:
# mkfs.xfs -d size=60t,file,name=fsfile
# xfs_repair -m 9000 -o ag_stride=32 -t 1 fsfile
Phase 1 - find and verify superblock...
- reporting progress in intervals of 1 seconds
Phase 2 - using internal log
- zero log...
Segmentation fault
(gdb) bt
#0 0x0000000000426962 in progress_rpt_thread (p=0x67ad20) at progress.c:234
#1 0x0000003b98a07851 in start_thread (arg=0x7f19d8e47700) at pthread_create.c:301
#2 0x0000003b982e767d in ?? ()
#3 0x0000000000000000 in ?? ()
(gdb) p msgp
$1 = (msg_block_t *) 0x67ad20
(gdb) p msgp->format
$2 = (progress_rpt_t *) 0x0
(gdb)
I suppose we could rig up progress reports for log zeroing, but
that won't usually take terribly long; for now, be defensive
and init the message->format to NULL, and just return early
from the progress thread if we've not yet set up any message.
(Sure, global_msgs is global, and ->format is already NULL,
but to me it's worth being explicit since we will test it).
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
diff --git a/repair/progress.c b/repair/progress.c
index ab320dc..45a412e 100644
--- a/repair/progress.c
+++ b/repair/progress.c
@@ -124,6 +124,7 @@ init_progress_rpt (void)
*/
pthread_mutex_init(&global_msgs.mutex, NULL);
+ global_msgs.format = NULL;
global_msgs.count = glob_agcount;
global_msgs.interval = report_interval;
global_msgs.done = prog_rpt_done;
@@ -169,6 +170,10 @@ progress_rpt_thread (void *p)
msg_block_t *msgp = (msg_block_t *)p;
__uint64_t percent;
+ /* It's possible to get here very early w/ no progress msg set */
+ if (!msgp->format)
+ return NULL;
+
if ((msgbuf = (char *)malloc(DURATION_BUF_SIZE)) == NULL)
do_error (_("progress_rpt: cannot malloc progress msg buffer\n"));
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] xfs_repair: avoid segfault if reporting progress early in repair
2013-10-17 17:50 [PATCH] xfs_repair: avoid segfault if reporting progress early in repair Eric Sandeen
@ 2013-10-17 20:05 ` Christoph Hellwig
2013-10-17 23:12 ` Eric Sandeen
2013-10-18 17:42 ` Rich Johnston
2 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2013-10-17 20:05 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs-oss
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs_repair: avoid segfault if reporting progress early in repair
2013-10-17 17:50 [PATCH] xfs_repair: avoid segfault if reporting progress early in repair Eric Sandeen
2013-10-17 20:05 ` Christoph Hellwig
@ 2013-10-17 23:12 ` Eric Sandeen
2013-10-18 17:42 ` Rich Johnston
2 siblings, 0 replies; 4+ messages in thread
From: Eric Sandeen @ 2013-10-17 23:12 UTC (permalink / raw)
To: Eric Sandeen; +Cc: xfs-oss
On 10/17/13 12:50 PM, Eric Sandeen wrote:
> For a very large filesystem, zeroing the log may take some time.
>
> If we ask for progress reports frequently enough that one fires
> before we finish with log zeroing, we try to use a progress format
> which has not yet been set up, and segfault:
>
> # mkfs.xfs -d size=60t,file,name=fsfile
> # xfs_repair -m 9000 -o ag_stride=32 -t 1 fsfile
> Phase 1 - find and verify superblock...
> - reporting progress in intervals of 1 seconds
> Phase 2 - using internal log
> - zero log...
> Segmentation fault
>
> (gdb) bt
> #0 0x0000000000426962 in progress_rpt_thread (p=0x67ad20) at progress.c:234
> #1 0x0000003b98a07851 in start_thread (arg=0x7f19d8e47700) at pthread_create.c:301
> #2 0x0000003b982e767d in ?? ()
> #3 0x0000000000000000 in ?? ()
> (gdb) p msgp
> $1 = (msg_block_t *) 0x67ad20
> (gdb) p msgp->format
> $2 = (progress_rpt_t *) 0x0
> (gdb)
>
> I suppose we could rig up progress reports for log zeroing, but
> that won't usually take terribly long; for now, be defensive
> and init the message->format to NULL, and just return early
> from the progress thread if we've not yet set up any message.
>
> (Sure, global_msgs is global, and ->format is already NULL,
> but to me it's worth being explicit since we will test it).
>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
>
> diff --git a/repair/progress.c b/repair/progress.c
> index ab320dc..45a412e 100644
> --- a/repair/progress.c
> +++ b/repair/progress.c
> @@ -124,6 +124,7 @@ init_progress_rpt (void)
> */
>
> pthread_mutex_init(&global_msgs.mutex, NULL);
> + global_msgs.format = NULL;
> global_msgs.count = glob_agcount;
> global_msgs.interval = report_interval;
> global_msgs.done = prog_rpt_done;
> @@ -169,6 +170,10 @@ progress_rpt_thread (void *p)
> msg_block_t *msgp = (msg_block_t *)p;
> __uint64_t percent;
>
> + /* It's possible to get here very early w/ no progress msg set */
> + if (!msgp->format)
> + return NULL;
> +
> if ((msgbuf = (char *)malloc(DURATION_BUF_SIZE)) == NULL)
> do_error (_("progress_rpt: cannot malloc progress msg buffer\n"));
Dammit:
CID 1107596: Data race condition (MISSING_LOCK)
/repair/progress.c: 127 ( missing_lock)
124 */
125
126 pthread_mutex_init(&global_msgs.mutex, NULL);
>>> CID 1107596: Data race condition (MISSING_LOCK)
>>> Accessing "global_msgs.format" without holding lock "msg_block_s.mutex". Elsewhere, "global_msgs.format" is accessed with "msg_block_s.mutex" held 2 out of 2 times.
127 global_msgs.format = NULL;
128 global_msgs.count = glob_agcount;
129 global_msgs.interval = report_interval;
130 global_msgs.done = prog_rpt_done;
131 global_msgs.total = &prog_rpt_total;
Probably best to just drop the new NULL assignment, since it's a global init'd to 0 anyway, to shut up coverity?
-Eric
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] xfs_repair: avoid segfault if reporting progress early in repair
2013-10-17 17:50 [PATCH] xfs_repair: avoid segfault if reporting progress early in repair Eric Sandeen
2013-10-17 20:05 ` Christoph Hellwig
2013-10-17 23:12 ` Eric Sandeen
@ 2013-10-18 17:42 ` Rich Johnston
2 siblings, 0 replies; 4+ messages in thread
From: Rich Johnston @ 2013-10-18 17:42 UTC (permalink / raw)
To: Eric Sandeen, xfs-oss
This has been committed.
Thanks
--Rich
commit 7f2d6b811755b6b91f18aa5bd9d5980848a81267
Author: Eric Sandeen <sandeen@redhat.com>
Date: Thu Oct 17 17:50:16 2013 +0000
xfs_repair: avoid segfault if reporting progress early in repair
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-10-18 17:42 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-17 17:50 [PATCH] xfs_repair: avoid segfault if reporting progress early in repair Eric Sandeen
2013-10-17 20:05 ` Christoph Hellwig
2013-10-17 23:12 ` Eric Sandeen
2013-10-18 17:42 ` Rich Johnston
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox