* When it rains it pours....
@ 2010-06-30 12:25 Linda A. Walsh
2010-06-30 16:21 ` xfs_restore -R -- recovering a previous restore...does it work? Linda A. Walsh
2010-07-01 0:24 ` When it rains it pours Dave Chinner
0 siblings, 2 replies; 9+ messages in thread
From: Linda A. Walsh @ 2010-06-30 12:25 UTC (permalink / raw)
To: xfs-oss
Due to another bug in lvm, my restore of this partition crashed after running a few
hours (takes a lot longer to restore than to back up).
So I decided to use the "-R" option to Resume my previously left off dump:
# xfsrestore -R -p 180 -f /backups/Ishtar/torrents/torrents-100629-0-1611.dump .
xfsrestore: using file dump (drive_simple) strategy
xfsrestore: version 3.0.4 (dump format 3.0) - Running single-threaded
xfsrestore: resuming restore previously begun Wed Jun 30 04:41:57 2010
xfsrestore: examining media file 0
xfsrestore: seeking past portion of media file already restored
Looks good so far!..Yup, and..
xfsrestore: drive_simple.c:770: do_seek_mark: Assertion `nreadneeded64 <= ( ( intgen_t ) ( ( ( 1ull << ( ( unsigned long long )sizeof( intgen_t ) * ( unsigned long long )8 - ( 1ull + 1ull ))) - 1ull ) * 2ull + 1ull ))' failed.
Aborted (core dumped)
Say what? Um...is that supposed to be an error message?
Why can't it just tell me why "'nreadneeded64' > 0xbfffffffffffffd"
is 'bad', or what it means?
I have a feeling that the 'core dumped' message means that
if I want my filesystem restored in the near future, I should
just restart...
*sigh*
-l
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* xfs_restore -R -- recovering a previous restore...does it work?
2010-06-30 12:25 When it rains it pours Linda A. Walsh
@ 2010-06-30 16:21 ` Linda A. Walsh
[not found] ` <20100630172750.GA20764@puku.stupidest.org>
2010-07-01 0:24 ` When it rains it pours Dave Chinner
1 sibling, 1 reply; 9+ messages in thread
From: Linda A. Walsh @ 2010-06-30 16:21 UTC (permalink / raw)
To: xfs-oss
So why is this crashing?
I just had it happen again -- something killed off the window it was
executing in -- system didn't go down, but it got a hangup -- a
cleaner shutdown -- but it still doesn't recover.
Under what circumstances should this work? Or does it work?
Linda A. Walsh wrote:
> Due to another bug in lvm, my restore of this partition crashed after running a few
> hours (takes a lot longer to restore than to back up).
>
> So I decided to use the "-R" option to Resume my previously left off dump:
>
> # xfsrestore -R -p 180 -f /backups/Ishtar/torrents/torrents-100629-0-1611.dump .
> xfsrestore: using file dump (drive_simple) strategy
> xfsrestore: version 3.0.4 (dump format 3.0) - Running single-threaded
> xfsrestore: resuming restore previously begun Wed Jun 30 04:41:57 2010
> xfsrestore: examining media file 0
> xfsrestore: seeking past portion of media file already restored
>
> Looks good so far!..Yup, and..
>
> xfsrestore: drive_simple.c:770: do_seek_mark: Assertion `nreadneeded64 <= ( ( intgen_t ) ( ( ( 1ull << ( ( unsigned long long )sizeof( intgen_t ) * ( unsigned long long )8 - ( 1ull + 1ull ))) - 1ull ) * 2ull + 1ull ))' failed.
> Aborted (core dumped)
>
>
> Say what? Um...is that supposed to be an error message?
>
> Why can't it just tell me why "'nreadneeded64' > 0xbfffffffffffffd"
> is 'bad', or what it means?
>
> I have a feeling that the 'core dumped' message means that
> if I want my filesystem restored in the near future, I should
> just restart...
>
> *sigh*
> -l
* Re: When it rains it pours....
2010-06-30 12:25 When it rains it pours Linda A. Walsh
2010-06-30 16:21 ` xfs_restore -R -- recovering a previous restore...does it work? Linda A. Walsh
@ 2010-07-01 0:24 ` Dave Chinner
1 sibling, 0 replies; 9+ messages in thread
From: Dave Chinner @ 2010-07-01 0:24 UTC (permalink / raw)
To: Linda A. Walsh; +Cc: xfs-oss
On Wed, Jun 30, 2010 at 05:25:58AM -0700, Linda A. Walsh wrote:
> Due to another bug in lvm, my restore of this partition crashed after running a few
> hours (takes a lot longer to restore than to back up).
>
> So I decided to use the "-R" option to Resume my previously left off dump:
>
> # xfsrestore -R -p 180 -f /backups/Ishtar/torrents/torrents-100629-0-1611.dump .
> xfsrestore: using file dump (drive_simple) strategy
> xfsrestore: version 3.0.4 (dump format 3.0) - Running single-threaded
> xfsrestore: resuming restore previously begun Wed Jun 30 04:41:57 2010
> xfsrestore: examining media file 0
> xfsrestore: seeking past portion of media file already restored
>
> Looks good so far!..Yup, and..
>
> xfsrestore: drive_simple.c:770: do_seek_mark: Assertion `nreadneeded64 <= ( ( intgen_t ) ( ( ( 1ull << ( ( unsigned long long )sizeof( intgen_t ) * ( unsigned long long )8 - ( 1ull + 1ull ))) - 1ull ) * 2ull + 1ull ))' failed.
> Aborted (core dumped)
>
>
> Say what? Um...is that supposed to be an error message?
No, it's an assert failure. i.e. something a developer considered
fatal and requiring debugging if it ever occurred. It's not an error
message an end user is expected to understand. ;)
> Why can't it just tell me why "'nreadneeded64' > 0xbfffffffffffffd"
> is 'bad', or what it means?
Asserts generally indicate that design constraints or assumptions
have been violated, which can be hard to explain in one line to an end
user....
From a brief look at the code, it appears that the distance between
the stream offset and the next tape mark is greater than MAXINTGEN.
MAXINTGEN evaluates as 0x7fffffff (more commonly known as INT_MAX).
Normally the file is read mark by mark, but when resuming a restore
we skip from the initial header to the checkpointed file mark in one
step. It seems like marks are normally less than 2GB apart, but
in the case of resuming, the attempt to seek from the header to the
checkpointed mark is way more than 2GB and hence it triggers the
assert.
I'm not sure yet how to fix this - I haven't dug into the drive code
in xfs_restore before so it'll take a while to understand well
enough to work out a solution.
In the meantime, I think restarting your restore from scratch is
your best bet.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* When it rains it pours
@ 2023-06-30 19:30 Limonciello, Mario
2023-06-30 19:39 ` Bjorn Helgaas
0 siblings, 1 reply; 9+ messages in thread
From: Limonciello, Mario @ 2023-06-30 19:30 UTC (permalink / raw)
To: Bjorn Helgaas, open list:PCI SUBSYSTEM
Hi Bjorn,
For the _REG change that went into Linus' tree I was recently made aware
of another system that it helps.
This system was appearing to hang during bootup while evaluating the
USB4 _OSC.
This hang happened on both the 6.1 LTS kernel and 6.4 final kernel.
In looking at the BIOS debug log shared by the reporter I noticed that
the kernel isn't hung; it's just that the BIOS was waiting to be given
the ability to access the config space.
Backporting just that _REG patch onto 6.1 LTS kernel fixes the issue.
I'm encouraging the BIOS team to try to come up with a cleaner failure
path for the lack of _REG being called. However there is always the
possibility they can't or choose not to and people try to boot older
kernels and fail.
Given how severe this boot issue is compared to the original suspend
issue that prompted the patch I wanted to gauge how you feel about the
risk of taking this change back to stable.
Thanks!
* Re: When it rains it pours
2023-06-30 19:30 Limonciello, Mario
@ 2023-06-30 19:39 ` Bjorn Helgaas
2023-06-30 19:43 ` Limonciello, Mario
0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2023-06-30 19:39 UTC (permalink / raw)
To: Limonciello, Mario; +Cc: Bjorn Helgaas, open list:PCI SUBSYSTEM
On Fri, Jun 30, 2023 at 02:30:56PM -0500, Limonciello, Mario wrote:
> Hi Bjorn,
>
> For the _REG change that went into Linus' tree I was recently made aware of
> another system that it helps.
>
> This system was appearing to hang during bootup while evaluating the USB4
> _OSC.
> This hang happened on both the 6.1 LTS kernel and 6.4 final kernel.
>
> In looking at the BIOS debug log shared by the reporter I noticed that
> the kernel isn't hung; it's just that the BIOS was waiting to be given the
> ability to access the config space.
>
> Backporting just that _REG patch onto 6.1 LTS kernel fixes the issue.
>
> I'm encouraging the BIOS team to try to come up with a cleaner failure path
> for the lack of _REG being called. However there is always the possibility
> they can't or choose not to and people try to boot older kernels and fail.
>
> Given how severe this boot issue is compared to the original suspend issue
> that prompted the patch I wanted to gauge how you feel about the risk of
> taking this change back to stable.
I think we can do that. But the patch is already in the pull request
for v6.5, so we'll have to wait until Linus pulls it and then ask the
stable folks to pick it up. I don't think it should be a big deal; we
just need a mainline SHA1 for it.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst?id=v6.4#n64
* Re: When it rains it pours
2023-06-30 19:39 ` Bjorn Helgaas
@ 2023-06-30 19:43 ` Limonciello, Mario
0 siblings, 0 replies; 9+ messages in thread
From: Limonciello, Mario @ 2023-06-30 19:43 UTC (permalink / raw)
To: Bjorn Helgaas; +Cc: Bjorn Helgaas, open list:PCI SUBSYSTEM
On 6/30/2023 14:39, Bjorn Helgaas wrote:
> On Fri, Jun 30, 2023 at 02:30:56PM -0500, Limonciello, Mario wrote:
>> Hi Bjorn,
>>
>> For the _REG change that went into Linus' tree I was recently made aware of
>> another system that it helps.
>>
>> This system was appearing to hang during bootup while evaluating the USB4
>> _OSC.
>> This hang happened on both the 6.1 LTS kernel and 6.4 final kernel.
>>
>> In looking at the BIOS debug log shared by the reporter I noticed that
>> the kernel isn't hung; it's just that the BIOS was waiting to be given the
>> ability to access the config space.
>>
>> Backporting just that _REG patch onto 6.1 LTS kernel fixes the issue.
>>
>> I'm encouraging the BIOS team to try to come up with a cleaner failure path
>> for the lack of _REG being called. However there is always the possibility
>> they can't or choose not to and people try to boot older kernels and fail.
>>
>> Given how severe this boot issue is compared to the original suspend issue
>> that prompted the patch I wanted to gauge how you feel about the risk of
>> taking this change back to stable.
>
> I think we can do that. But the patch is already in the pull request
> for v6.5, so we'll have to wait until Linus pulls it and then ask the
> stable folks to pick it up. I don't think it should be a big deal; we
> just need a mainline SHA1 for it.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst?id=v6.4#n64
OK thanks, I saw your PR was sent out but I didn't realize it wasn't
picked yet.
I'll keep an eye out for when the SHA1 is in Linus' tree and I'll
request it for stable when it is.