* When it rains it pours....
@ 2010-06-30 12:25 Linda A. Walsh
  2010-06-30 16:21 ` xfs_restore -R -- recovering a previous restore...does it work? Linda A. Walsh
  2010-07-01  0:24 ` When it rains it pours Dave Chinner
  0 siblings, 2 replies; 9+ messages in thread
From: Linda A. Walsh @ 2010-06-30 12:25 UTC (permalink / raw)
  To: xfs-oss

Due to another bug in lvm, my restore of this partition crashed after running a few
hours (takes a lot longer to restore than to back up).

So I decided to use the "-R" option to Resume my previously left off dump:

# xfsrestore -R -p 180 -f /backups/Ishtar/torrents/torrents-100629-0-1611.dump .
  xfsrestore: using file dump (drive_simple) strategy
  xfsrestore: version 3.0.4 (dump format 3.0) - Running single-threaded
  xfsrestore: resuming restore previously begun Wed Jun 30 04:41:57 2010
  xfsrestore: examining media file 0
  xfsrestore: seeking past portion of media file already restored

Looks good so far!..Yup, and..

  xfsrestore: drive_simple.c:770: do_seek_mark: Assertion `nreadneeded64 <= ( ( intgen_t ) ( ( ( 1ull << ( ( unsigned long long )sizeof( intgen_t ) * ( unsigned long long )8 - ( 1ull + 1ull ))) - 1ull ) * 2ull + 1ull ))' failed.
  Aborted (core dumped)


Say what?  Um...is that supposed to be an error message?
 
Why can't it just tell me why "'nreadneeded64' > 0xbfffffffffffffd"
is 'bad', or what it means?

I have a feeling that the 'core dumped' message means that 
if I want my filesystem restored in the near future, I should
just restart...

*sigh*
-l

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* xfs_restore -R -- recovering a previous restore...does it work?
  2010-06-30 12:25 When it rains it pours Linda A. Walsh
@ 2010-06-30 16:21 ` Linda A. Walsh
       [not found]   ` <20100630172750.GA20764@puku.stupidest.org>
  2010-07-01  0:24 ` When it rains it pours Dave Chinner
  1 sibling, 1 reply; 9+ messages in thread
From: Linda A. Walsh @ 2010-06-30 16:21 UTC (permalink / raw)
  To: xfs-oss

So why is this crashing?

I just had it happen again -- something killed off the window it was
executing in -- system didn't go down, but it got a hangup -- a
cleaner shutdown -- but it still doesn't recover.

Under what circumstances should this work?  Or does it work?



Linda A. Walsh wrote:
> Due to another bug in lvm, my restore of this partition crashed after running a few
> hours (takes a lot longer to restore than to back up).
> 
> So I decided to use the "-R" option to Resume my previously left off dump:
> 
> # xfsrestore -R -p 180 -f /backups/Ishtar/torrents/torrents-100629-0-1611.dump .
>   xfsrestore: using file dump (drive_simple) strategy
>   xfsrestore: version 3.0.4 (dump format 3.0) - Running single-threaded
>   xfsrestore: resuming restore previously begun Wed Jun 30 04:41:57 2010
>   xfsrestore: examining media file 0
>   xfsrestore: seeking past portion of media file already restored
> 
> Looks good so far!..Yup, and..
> 
>   xfsrestore: drive_simple.c:770: do_seek_mark: Assertion `nreadneeded64 <= ( ( intgen_t ) ( ( ( 1ull << ( ( unsigned long long )sizeof( intgen_t ) * ( unsigned long long )8 - ( 1ull + 1ull ))) - 1ull ) * 2ull + 1ull ))' failed.
>   Aborted (core dumped)
> 
> 
> Say what?  Um...is that supposed to be an error message?
>  
> Why can't it just tell me why "'nreadneeded64' > 0xbfffffffffffffd"
> is 'bad', or what it means?
> 
> I have a feeling that the 'core dumped' message means that 
> if I want my filesystem restored in the near future, I should
> just restart...
> 
> *sigh*
> -l

* Re: xfs_restore -R -- recovering a previous restore...does it work?
       [not found]   ` <20100630172750.GA20764@puku.stupidest.org>
@ 2010-06-30 17:52     ` Linda A. Walsh
       [not found]       ` <20100630182755.GA23188@puku.stupidest.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Linda A. Walsh @ 2010-06-30 17:52 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: xfs-oss

Dmesg shows nothing revealing.

I am not using tape -- I'm restoring from a file.

Does it only work with tapes?  I didn't recall it saying so.

-l

Chris Wedgwood wrote:
> On Wed, Jun 30, 2010 at 09:21:02AM -0700, Linda A. Walsh wrote:
> 
>> I just had it happen again -- something killed off the window it was
>> executing in -- system didn't go down, but it got a hangup -- a
>> cleaner shutdown -- but it still doesn't recover.
> 
> dmesg ...
> 
> see anything?
> 
>> Under what circumstances should this work?  Or does it work?
> 
> it used to work for me, i've not tried it recently though, i basically
> gave up tapes (and xfsdump/restore) some time back due to size/cost
> constraints

----
Do you not keep backups at all anymore?  Or if you do, do you not 
care about the metastuff?  

Only other backup util, that I know of, that speaks meta, is 'star'.


* Re: xfs_restore -R -- recovering a previous restore...does it work?
       [not found]       ` <20100630182755.GA23188@puku.stupidest.org>
@ 2010-06-30 19:10         ` Linda A. Walsh
       [not found]           ` <20100630213653.GA26145@puku.stupidest.org>
  0 siblings, 1 reply; 9+ messages in thread
From: Linda A. Walsh @ 2010-06-30 19:10 UTC (permalink / raw)
  To: Chris Wedgwood, xfs-oss



Chris Wedgwood wrote:
> On Wed, Jun 30, 2010 at 10:52:00AM -0700, Linda A. Walsh wrote:
> 
>> Dmesg shows nothing revealing.
> 
> it seems odd the subprocess killed your terminal then
----
	No...it was attached to a terminal window that got closed
by accident.


> 
>> I am not using tape -- I'm restoring from a file.
>>
>> Does it only work with tapes?  I didn't recall it saying so.
> 
> files should work, i've just not used them personally aside from some
> testing
> 
> 
> how large is the image to be restored?

A few bytes...
2362017678616
( 2.15TB)


* Re: xfs_restore -R -- recovering a previous restore...does it work?
       [not found]           ` <20100630213653.GA26145@puku.stupidest.org>
@ 2010-07-01  0:06             ` Linda A. Walsh
  0 siblings, 0 replies; 9+ messages in thread
From: Linda A. Walsh @ 2010-07-01  0:06 UTC (permalink / raw)
  To: Chris Wedgwood, xfs-oss

Chris Wedgwood wrote:
>>> how large is the image to be restored?
>> ( 2.15TB)

> out of interest, do smaller images work?


I don't know...this was the first time I ever tried it.

Usually smaller images don't have so many opportunities to get interrupted. :-)

This one took 3h 18m, .. a crappy 189MB/s!

I was just unlucky enough to have something interrupt it 3 times in a row.
Since it was the partition that was corrupt, I'm a bit scattered: I have
no idea what might have caused that partition to go corrupt and no idea
if it is just going to turn around and happen again.  Unfortunately, it's not
like it is the easiest partition to back up & restore.

Guess I can experiment! ;-)





* Re: When it rains it pours....
  2010-06-30 12:25 When it rains it pours Linda A. Walsh
  2010-06-30 16:21 ` xfs_restore -R -- recovering a previous restore...does it work? Linda A. Walsh
@ 2010-07-01  0:24 ` Dave Chinner
  1 sibling, 0 replies; 9+ messages in thread
From: Dave Chinner @ 2010-07-01  0:24 UTC (permalink / raw)
  To: Linda A. Walsh; +Cc: xfs-oss

On Wed, Jun 30, 2010 at 05:25:58AM -0700, Linda A. Walsh wrote:
> Due to another bug in lvm, my restore of this partition crashed after running a few
> hours (takes a lot longer to restore than to back up).
> 
> So I decided to use the "-R" option to Resume my previously left off dump:
> 
> # xfsrestore -R -p 180 -f /backups/Ishtar/torrents/torrents-100629-0-1611.dump .
>  xfsrestore: using file dump (drive_simple) strategy
>  xfsrestore: version 3.0.4 (dump format 3.0) - Running single-threaded
>  xfsrestore: resuming restore previously begun Wed Jun 30 04:41:57 2010
>  xfsrestore: examining media file 0
>  xfsrestore: seeking past portion of media file already restored
> 
> Looks good so far!..Yup, and..
> 
>  xfsrestore: drive_simple.c:770: do_seek_mark: Assertion `nreadneeded64 <= ( ( intgen_t ) ( ( ( 1ull << ( ( unsigned long long )sizeof( intgen_t ) * ( unsigned long long )8 - ( 1ull + 1ull ))) - 1ull ) * 2ull + 1ull ))' failed.
>  Aborted (core dumped)
> 
> 
> Say what?  Um...is that supposed to be an error message?

No, it's an assert failure. i.e. something a developer considered
fatal and requiring debugging if it ever occurred. It's not an error
message an end user is expected to understand. ;)

> Why can't it just tell me why "'nreadneeded64' > 0xbfffffffffffffd"
> is 'bad', or what it means?

Asserts generally indicate that design constraints or assumptions
have been violated, which can be hard to explain in one line to an end
user....

From a brief look at the code, it appears that the distance between
the stream offset and the next tape mark is greater than MAXINTGEN.
MAXINTGEN evaluates as 0x7fffffff (more commonly known as INT_MAX).

Normally the file is read mark by mark, but when resuming a restore
we skip from the initial header to the checkpointed file mark in one
step. It seems like marks are normally less than 2GB apart, but
in the case of resuming, the attempt to seek from the header to the
checkpointed mark is way more than 2GB and hence it triggers the
assert.

I'm not sure yet how to fix this - I haven't dug into the drive code
in xfs_restore before so it'll take a while to understand well
enough to work out a solution.

In the meantime, I think restarting your restore from scratch is
your best bet.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* When it rains it pours
@ 2023-06-30 19:30 Limonciello, Mario
  2023-06-30 19:39 ` Bjorn Helgaas
  0 siblings, 1 reply; 9+ messages in thread
From: Limonciello, Mario @ 2023-06-30 19:30 UTC (permalink / raw)
  To: Bjorn Helgaas, open list:PCI SUBSYSTEM

Hi Bjorn,

For the _REG change that went into Linus' tree I was recently made aware 
of another system that it helps.

This system was appearing to hang during bootup while evaluating the
USB4 _OSC.
This hang happened on both the 6.1 LTS kernel and 6.4 final kernel.

In looking at the BIOS debug log shared by the reporter I noticed that
the kernel isn't hung; it's just that the BIOS was waiting to be given
the ability to access the config space.

Backporting just that _REG patch onto 6.1 LTS kernel fixes the issue.

I'm encouraging the BIOS team to try to come up with a cleaner failure 
path for the lack of _REG being called.  However there is always the 
possibility they can't or choose not to and people try to boot older 
kernels and fail.

Given how severe this boot issue is compared to the original suspend 
issue that prompted the patch I wanted to gauge how you feel about the 
risk of taking this change back to stable.

Thanks!


* Re: When it rains it pours
  2023-06-30 19:30 Limonciello, Mario
@ 2023-06-30 19:39 ` Bjorn Helgaas
  2023-06-30 19:43   ` Limonciello, Mario
  0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2023-06-30 19:39 UTC (permalink / raw)
  To: Limonciello, Mario; +Cc: Bjorn Helgaas, open list:PCI SUBSYSTEM

On Fri, Jun 30, 2023 at 02:30:56PM -0500, Limonciello, Mario wrote:
> Hi Bjorn,
> 
> For the _REG change that went into Linus' tree I was recently made aware of
> another system that it helps.
> 
> This system was appearing to hang during bootup while evaluating the USB4
> _OSC.
> This hang happened on both the 6.1 LTS kernel and 6.4 final kernel.
> 
> In looking at the BIOS debug log shared by the reporter I noticed that
> the kernel isn't hung; it's just that the BIOS was waiting to be given the
> ability to access the config space.
> 
> Backporting just that _REG patch onto 6.1 LTS kernel fixes the issue.
> 
> I'm encouraging the BIOS team to try to come up with a cleaner failure path
> for the lack of _REG being called.  However there is always the possibility
> they can't or choose not to and people try to boot older kernels and fail.
> 
> Given how severe this boot issue is compared to the original suspend issue
> that prompted the patch I wanted to gauge how you feel about the risk of
> taking this change back to stable.

I think we can do that.  But the patch is already in the pull request
for v6.5, so we'll have to wait until Linus pulls it and then ask the
stable folks to pick it up.  I don't think it should be a big deal; we
just need a mainline SHA1 for it.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst?id=v6.4#n64


* Re: When it rains it pours
  2023-06-30 19:39 ` Bjorn Helgaas
@ 2023-06-30 19:43   ` Limonciello, Mario
  0 siblings, 0 replies; 9+ messages in thread
From: Limonciello, Mario @ 2023-06-30 19:43 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Bjorn Helgaas, open list:PCI SUBSYSTEM

On 6/30/2023 14:39, Bjorn Helgaas wrote:
> On Fri, Jun 30, 2023 at 02:30:56PM -0500, Limonciello, Mario wrote:
>> Hi Bjorn,
>>
>> For the _REG change that went into Linus' tree I was recently made aware of
>> another system that it helps.
>>
>> This system was appearing to hang during bootup while evaluating the USB4
>> _OSC.
>> This hang happened on both the 6.1 LTS kernel and 6.4 final kernel.
>>
>> In looking at the BIOS debug log shared by the reporter I noticed that
>> the kernel isn't hung; it's just that the BIOS was waiting to be given the
>> ability to access the config space.
>>
>> Backporting just that _REG patch onto 6.1 LTS kernel fixes the issue.
>>
>> I'm encouraging the BIOS team to try to come up with a cleaner failure path
>> for the lack of _REG being called.  However there is always the possibility
>> they can't or choose not to and people try to boot older kernels and fail.
>>
>> Given how severe this boot issue is compared to the original suspend issue
>> that prompted the patch I wanted to gauge how you feel about the risk of
>> taking this change back to stable.
> 
> I think we can do that.  But the patch is already in the pull request
> for v6.5, so we'll have to wait until Linus pulls it and then ask the
> stable folks to pick it up.  I don't think it should be a big deal; we
> just need a mainline SHA1 for it.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst?id=v6.4#n64

OK thanks, I saw your PR was sent out but I didn't realize it wasn't 
picked yet.

I'll keep an eye out for when the SHA1 is in Linus' tree and I'll 
request it for stable when it is.

