* Moving existing internal journal log to an external device (success?)
From: fk1xdcio @ 2023-08-20 19:37 UTC (permalink / raw)
To: linux-xfs
Does this look like a sane method for moving an existing internal log to
an external device?
3 drives:
/dev/nvme0n1p1 2GB Journal mirror 0
/dev/nvme1n1p1 2GB Journal mirror 1
/dev/sda1 16TB XFS
# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1p1
/dev/nvme1n1p2
# mkfs.xfs /dev/sda1
# xfs_logprint -C journal.bin /dev/sda1
# cat journal.bin > /dev/md0
# xfs_db -x /dev/sda1
xfs_db> sb
xfs_db> write -d logstart 0
xfs_db> quit
# mount -o logdev=/dev/md0 /dev/sda1 /mnt
-------------------------
It seems to "work" and I tested with a whole bunch of data. I was also
able to move the log back to internal without issue (set logstart back
to what it was originally). I don't know enough about how the filesystem
layout works to know if this will eventually break.
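Since moving the log back depends on knowing the original logstart, it seems worth saving the superblock log fields before editing anything. A minimal sketch, assuming xfs_db prints each field as "name = value"; the helper name sb_field, the file name sb-log-fields.txt and the sample numbers are hypothetical:

```shell
# Sketch: record the superblock log fields before changing logstart.
# The real read would be done read-only on the unmounted filesystem, e.g.:
#   xfs_db -r -c 'sb 0' -c 'print logstart logblocks' /dev/sda1 > sb-log-fields.txt
# sb_field (hypothetical helper) pulls one "name = value" line out of that
# output, read from stdin.
sb_field() {
    # $1 = field name, e.g. logstart
    awk -v f="$1" '$1 == f && $2 == "=" { print $3 }'
}

# Made-up sample output, for illustration only.
printf 'logstart = 2147483654\nlogblocks = 521728\n' | sb_field logstart
```

The saved value is what would later be passed to "write -d logstart" to point the superblock back at the internal log.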
*IF* this works, why can't xfs_growfs do it?
Thanks!
* Re: Moving existing internal journal log to an external device (success?)
From: Dave Chinner @ 2023-08-20 22:14 UTC (permalink / raw)
To: fk1xdcio; +Cc: linux-xfs
On Sun, Aug 20, 2023 at 03:37:38PM -0400, fk1xdcio@duck.com wrote:
> Does this look like a sane method for moving an existing internal log to an
> external device?
>
> 3 drives:
> /dev/nvme0n1p1 2GB Journal mirror 0
> /dev/nvme1n1p1 2GB Journal mirror 1
> /dev/sda1 16TB XFS
>
> # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1p1
> /dev/nvme1n1p2
> # mkfs.xfs /dev/sda1
> # xfs_logprint -C journal.bin /dev/sda1
> # cat journal.bin > /dev/md0
> # xfs_db -x /dev/sda1
>
> xfs_db> sb
> xfs_db> write -d logstart 0
> xfs_db> quit
>
> # mount -o logdev=/dev/md0 /dev/sda1 /mnt
So you are physically moving the contents of the log whilst the
filesystem is unmounted and unchanging.
> -------------------------
>
> It seems to "work" and I tested with a whole bunch of data.
You'll get ENOSPC earlier than you think, because you just leaked
the old log space (needs to be marked free space). There might be
other issues, but you get to keep all the broken bits to yourself if
you find them.
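For a rough sense of scale, the leaked space is the old internal log's length in filesystem blocks times the block size. A minimal sketch, assuming the two values come from the superblock fields logblocks and blocksize; the function name and the numbers passed at the end are hypothetical, not taken from this filesystem:

```shell
# Sketch only: estimate how much space the abandoned internal log still holds.
# The inputs would be read from the superblock, e.g.:
#   xfs_db -r -c 'sb 0' -c 'print logblocks blocksize' /dev/sda1
leaked_log_bytes() {
    # $1 = logblocks (in filesystem blocks), $2 = blocksize (in bytes)
    echo $(( $1 * $2 ))
}

# Hypothetical values: a log of 524288 blocks with 4096-byte blocks.
leaked_log_bytes 524288 4096
```

Until something marks those blocks free again, that many bytes stay unavailable to the filesystem.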
You can probably fix that by running xfs_repair, but then....
> I was also able
> to move the log back to internal without issue (set logstart back to what it
> was originally). I don't know enough about how the filesystem layout works
> to know if this will eventually break.
.... this won't work.
i.e. you can move the log back to the original position because you
didn't mark the space the old journal used as free, so the filesystem
still thinks it is in use by something....
> *IF* this works, why can't xfs_growfs do it?
"Doctor, I can perform an amputation with a tourniquet and a chainsaw,
why can't you do that?"
Mostly you are ignoring the fact that growfs is an online operation
- actually moving the log safely and testing it rigorously is a
whole lot harder than changing a few fields with xfs_db....
Let's ignore the simple fact we can't tell the kernel to use a
different block device for the log via growfs right now (i.e. needs
a new ioctl interface) and focus on what is involved in moving the
log whilst the filesystem is mounted and actively in use.
First we need an atomic, crash safe mechanism to swap from one log
to another. We need to do that while the filesystem is running, so
it has to be done within a freeze context. Then we have to run a
transaction that initialises the new log and tells the old log where
the new log is so that if we crash before the superblock is written
log recovery will replay the log switch. Then we do a sync write of
the superblock so that the next mount will see the new log location.
Then, while the filesystem is still frozen, we have to reconfigure
the in memory log structures to use the new log (e.g. open new
buftarg, update mount pointers to the log device, change the log
state to external, reset log sequence numbers, grant heads, etc).
Finally, if we got everything correct, we then need to free the old
journal in a new transaction running in the new log to clean up the
old journal now that it is no longer in use. Then we can unfreeze
the filesystem...
Yes, you can do amputations with a chainsaw, but it's a method of
last resort that does not guarantee success and you take
responsibility for the results yourself. Turning this into a
reliable procedure that always works or fails safe for all
conditions (professional engineering!) is a whole lot more
complex...
-Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Moving existing internal journal log to an external device (success?)
From: fk1xdcio @ 2023-08-21 13:07 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-xfs@vger.kernel.org
On 2023-08-20 18:14, Dave Chinner wrote:
> On Sun, Aug 20, 2023 at 03:37:38PM -0400, fk1xdcio@duck.com wrote:
>> Does this look like a sane method for moving an existing internal log
>> to an
>> external device?
>>
>> 3 drives:
>> /dev/nvme0n1p1 2GB Journal mirror 0
>> /dev/nvme1n1p1 2GB Journal mirror 1
>> /dev/sda1 16TB XFS
>>
>> # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1p1
>> /dev/nvme1n1p2
>> # mkfs.xfs /dev/sda1
>> # xfs_logprint -C journal.bin /dev/sda1
>> # cat journal.bin > /dev/md0
>> # xfs_db -x /dev/sda1
>>
>> xfs_db> sb
>> xfs_db> write -d logstart 0
>> xfs_db> quit
>>
>> # mount -o logdev=/dev/md0 /dev/sda1 /mnt
>
> So you are physically moving the contents of the log whilst the
> filesystem is unmounted and unchanging.
>
>> -------------------------
>>
>> It seems to "work" and I tested with a whole bunch of data.
>
> You'll get ENOSPC earlier than you think, because you just leaked
> the old log space (needs to be marked free space). There might be
> other issues, but you get to keep all the broken bits to yourself if
> you find them.
It's 2GB out of terabytes so I don't really care about the space but the
"other issues" is a problem.
> You can probably fix that by running xfs_repair, but then....
>
>> I was also able
>> to move the log back to internal without issue (set logstart back to
>> what it
>> was originally). I don't know enough about how the filesystem layout
>> works
>> to know if this will eventually break.
>
> .... this won't work.
>
> i.e. you can move the log back to the original position because you
> didn't mark the space the old journal used as free, so the filesystem
> still thinks it is in use by something....
The space being leaked is fine but xfs_repair is an issue. I did some
testing and yes, if I run xfs_repair on one of these filesystems with a
moved log it causes all sorts of problems. In fact it doesn't seem to
work at all. Big problem.
>> *IF* this works, why can't xfs_growfs do it?
>
> "Doctor, I can perform an amputation with a tourniquet and a chainsaw,
> why can't you do that?"
> ,,,
> -Dave.
Yes, I understand. I was thinking more of an offline utility for doing
this but I see why that can't be done in growfs.
So I guess it doesn't really work. This is why I ask the experts. I'll
keep experimenting because I need to physically move disks around, so
being able to move the log back and forth between internal and external
would be extremely helpful.
Thanks!
* Re: Moving existing internal journal log to an external device (success?)
From: Eric Sandeen @ 2023-08-24 20:22 UTC (permalink / raw)
To: fk1xdcio, Dave Chinner; +Cc: linux-xfs@vger.kernel.org
On 8/21/23 8:07 AM, fk1xdcio@duck.com wrote:
>>> *IF* this works, why can't xfs_growfs do it?
>>
>> "Doctor, I can perform an amputation with a tourniquet and a chainsaw,
>> why can't you do that?"
>> ,,,
>> -Dave.
>
>
> Yes, I understand. I was thinking more of an offline utility for doing
> this but I see why that can't be done in growfs.
>
> So I guess it doesn't really work. This is why I ask the experts. I'll
> keep experimenting because due to the requirements of needing to
> physically move disks around, being able to move the log back and forth
> from internal to external would be extremely helpful.
>
> Thanks!
Just out of curiosity, what is your use case? Why do you need/want to
move logs around?
-Eric
* Re: Moving existing internal journal log to an external device (success?)
From: fk1xdcio @ 2023-08-26 14:43 UTC (permalink / raw)
To: Eric Sandeen; +Cc: linux-xfs@vger.kernel.org
On 2023-08-24 16:22, Eric Sandeen wrote:
> On 8/21/23 8:07 AM, fk1xdcio@duck.com wrote:
>> Yes, I understand. I was thinking more of an offline utility for doing
>> this but I see why that can't be done in growfs.
>>
>> So I guess it doesn't really work. This is why I ask the experts. I'll
>> keep experimenting because due to the requirements of needing to
>> physically move disks around, being able to move the log back and
>> forth
>> from internal to external would be extremely helpful.
>>
>> Thanks!
>
> Just out of curiosity, what is your use case? Why do you need/want to
> move logs around?
Every so often I rotate certain drives from production servers to
semi-offline servers for testing and verification. The production
servers have SSD for cache and the fast external journal but the testing
servers do not. The testing servers need to be able to test and verify
the filesystem and do periodic synchronization/mirroring but obviously
can't use the original filesystem without the journal. Also some of
these drives are moved offsite so being able to put the journal back to
its internal position would simplify things.
Of course it would be possible to have extra drives in the testing
environment that the logs could be moved to but the testing servers are
very physically limited as to what can be hooked up to them so there
really isn't enough room or ports for those extra drives. Plus the whole
offsite thing.
It's more of a "want to help make life easier" than a hard requirement.