* [PATCH 0/2] Improve dmesg output for swapfile+hibernation
@ 2024-05-22 7:46 Sukrit Bhatnagar
2024-05-22 7:46 ` [PATCH 1/2] iomap: swap: print warning for unaligned swapfile Sukrit Bhatnagar
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Sukrit Bhatnagar @ 2024-05-22 7:46 UTC (permalink / raw)
To: Rafael J. Wysocki, Pavel Machek, Christian Brauner,
Darrick J. Wong, Andrew Morton
Cc: linux-xfs, linux-pm, linux-fsdevel, linux-kernel, linux-mm,
Sukrit.Bhatnagar
While trying to use a swapfile for hibernation, I noticed that the suspend
process was failing when it tried to search for the swap to use for snapshot.
I had created the swapfile on ext4 and got the starting physical block offset
using the filefrag command.
Upon looking at the swap activation code, I realized that the iomap part is
doing some rounding up for the physical offset of the swapfile.
Then I checked the block size of the filesystem, which was actually set to
1KB by default in my environment.
(This was in buildroot, using the genimage utility to create the VM disk
partitions, filesystems etc.)
The block offset is rounded-up and stored in the swap extents metadata by
iomap code, but as the exact value is lost for 1KB-block filesystem, hibernate
cannot read back the swap header after it finishes writing data pages to swap.
Note that this is not a bug in my understanding. Both swapfile and hibernate
subsystems have the correct handling of this edge case, individually.
Another observation was that we need to rely on external commands, such as
filefrag for getting the swapfile offset value. This value can be conveniently
printed in dmesg output when doing swapon.
Sukrit Bhatnagar (2):
iomap: swap: print warning for unaligned swapfile
mm: swap: print starting physical block offset in swapon
fs/iomap/swapfile.c | 10 ++++++++++
mm/swapfile.c | 3 ++-
2 files changed, 12 insertions(+), 1 deletion(-)
--
2.34.1
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH 1/2] iomap: swap: print warning for unaligned swapfile 2024-05-22 7:46 [PATCH 0/2] Improve dmesg output for swapfile+hibernation Sukrit Bhatnagar @ 2024-05-22 7:46 ` Sukrit Bhatnagar 2024-05-22 18:02 ` Chris Li 2024-05-22 7:46 ` [PATCH 2/2] mm: swap: print starting physical block offset in swapon Sukrit Bhatnagar 2024-05-23 19:45 ` [PATCH 0/2] Improve dmesg output for swapfile+hibernation Pavel Machek 2 siblings, 1 reply; 14+ messages in thread From: Sukrit Bhatnagar @ 2024-05-22 7:46 UTC (permalink / raw) To: Rafael J. Wysocki, Pavel Machek, Christian Brauner, Darrick J. Wong, Andrew Morton Cc: linux-xfs, linux-pm, linux-fsdevel, linux-kernel, linux-mm, Sukrit.Bhatnagar When creating a swapfile on a filesystem with block size less than the PAGE_SIZE, there is a possibility that the starting physical block is not page-aligned, which results in rounding up that value before setting it in the first swap extent. But now that the value is rounded up, we have lost the actual offset location of the first physical block. The starting physical block value is needed in hibernation when using a swapfile, i.e., the resume_offset. After we have written the snapshot pages, some values will be set in the swap header which is accessed using that offset location. However, it will not find the swap header if the offset value was rounded up and results in an error. The swapfile offset being unaligned should not fail the swapon activation as the swap extents will always have the alignment. Therefore, just print a warning if an unaligned swapfile is activated when hibernation is enabled. Signed-off-by: Sukrit Bhatnagar <Sukrit.Bhatnagar@sony.com> --- fs/iomap/swapfile.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c index 5fc0ac36dee3..1f7b189089dd 100644 --- a/fs/iomap/swapfile.c +++ b/fs/iomap/swapfile.c @@ -49,6 +49,16 @@ static int iomap_swapfile_add_extent(struct iomap_swapfile_info *isi) next_ppage = ALIGN_DOWN(iomap->addr + iomap->length, PAGE_SIZE) >> PAGE_SHIFT; +#ifdef CONFIG_HIBERNATION + /* + * Print a warning if the starting physical block is not aligned + * to PAGE_SIZE (for filesystems using smaller block sizes). + * This will fail the hibernation suspend as we need to read + * the swap header later using the starting block offset. + */ + if (!iomap->offset && iomap->addr & PAGE_MASK) + pr_warn("swapon: starting physical offset not page-aligned\n"); +#endif /* Skip too-short physical extents. */ if (first_ppage >= next_ppage) return 0; -- 2.34.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] iomap: swap: print warning for unaligned swapfile 2024-05-22 7:46 ` [PATCH 1/2] iomap: swap: print warning for unaligned swapfile Sukrit Bhatnagar @ 2024-05-22 18:02 ` Chris Li 2024-05-27 10:54 ` Sukrit.Bhatnagar 0 siblings, 1 reply; 14+ messages in thread From: Chris Li @ 2024-05-22 18:02 UTC (permalink / raw) To: Sukrit Bhatnagar Cc: Rafael J. Wysocki, Pavel Machek, Christian Brauner, Darrick J. Wong, Andrew Morton, linux-xfs, linux-pm, linux-fsdevel, linux-kernel, linux-mm Hi Sukrit, It seems that you need the swap file block start address to read the swap file headers. This warning still requires the user to read the dmesg. The kernel still does not have the swapfile header at resume. In other words, it does not fix the issue. I don't know the suspend/resume code enough, will adding recording the physical start address of the swapfile in swap_info_struct help you address this problem? The suspend code can write that value to "somewhere* for resume to pick it up. Let's find a proper way to fix this issue rather than just warning on it. Chris On Wed, May 22, 2024 at 12:42 AM Sukrit Bhatnagar <Sukrit.Bhatnagar@sony.com> wrote: > > When creating a swapfile on a filesystem with block size less than the > PAGE_SIZE, there is a possibility that the starting physical block is not > page-aligned, which results in rounding up that value before setting it > in the first swap extent. But now that the value is rounded up, we have > lost the actual offset location of the first physical block. > > The starting physical block value is needed in hibernation when using a > swapfile, i.e., the resume_offset. After we have written the snapshot > pages, some values will be set in the swap header which is accessed using > that offset location. However, it will not find the swap header if the > offset value was rounded up and results in an error. > > The swapfile offset being unaligned should not fail the swapon activation > as the swap extents will always have the alignment. > > Therefore, just print a warning if an unaligned swapfile is activated > when hibernation is enabled. > > Signed-off-by: Sukrit Bhatnagar <Sukrit.Bhatnagar@sony.com> > --- > fs/iomap/swapfile.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c > index 5fc0ac36dee3..1f7b189089dd 100644 > --- a/fs/iomap/swapfile.c > +++ b/fs/iomap/swapfile.c > @@ -49,6 +49,16 @@ static int iomap_swapfile_add_extent(struct iomap_swapfile_info *isi) > next_ppage = ALIGN_DOWN(iomap->addr + iomap->length, PAGE_SIZE) >> > PAGE_SHIFT; > > +#ifdef CONFIG_HIBERNATION > + /* > + * Print a warning if the starting physical block is not aligned > + * to PAGE_SIZE (for filesystems using smaller block sizes). > + * This will fail the hibernation suspend as we need to read > + * the swap header later using the starting block offset. > + */ > + if (!iomap->offset && iomap->addr & PAGE_MASK) > + pr_warn("swapon: starting physical offset not page-aligned\n"); > +#endif > /* Skip too-short physical extents. */ > if (first_ppage >= next_ppage) > return 0; > -- > 2.34.1 > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 1/2] iomap: swap: print warning for unaligned swapfile 2024-05-22 18:02 ` Chris Li @ 2024-05-27 10:54 ` Sukrit.Bhatnagar 0 siblings, 0 replies; 14+ messages in thread From: Sukrit.Bhatnagar @ 2024-05-27 10:54 UTC (permalink / raw) To: Chris Li Cc: Rafael J. Wysocki, Pavel Machek, Christian Brauner, Darrick J. Wong, Andrew Morton, linux-xfs@vger.kernel.org, linux-pm@vger.kernel.org, linux-fsdevel, linux-kernel, linux-mm Hi Chris, On 2024-05-23 03:02, Chris Li wrote: > Hi Sukrit, > > It seems that you need the swap file block start address to read the > swap file headers. > This warning still requires the user to read the dmesg. The kernel > still does not have the swapfile header at resume. In other words, it > does not fix the issue. This was not intended to be a fix. I had created this patch thinking that adding an actual fix for this might be non-trivial and may not be desirable to everyone, especially when swapfile+hibernation is not commonly used. > I don't know the suspend/resume code enough, will adding recording the > physical start address of the swapfile in swap_info_struct help you > address this problem? The suspend code can write that value to > "somewhere* for resume to pick it up. > > Let's find a proper way to fix this issue rather than just warning on it. Adding a new member in swap_info_struct for the physical block offset will help in the fix, I think. It can even be enclosed in #ifdef HIBERNATION if nothing else needs that value. This value can be checked by hibernate[1] as: sis->start_offset == first_se(sis)->start_block Another possible solution might be to add a new swap flag like SWP_SUSP or SWP_HIBERNATE and set it only when we don't have this rounding up issue. Since we will boot the kernel normally first and later switch to our pre-hibernate kernel after image load from swap, the value cannot be set inside the kernel memory. We pass the start block address in kernel commandline parameter (resume_offset) for next boot. [1]: The hibernate swapfile checking code is the function swap_type_of() in mm/swapfile.c -- Sukrit ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/2] mm: swap: print starting physical block offset in swapon 2024-05-22 7:46 [PATCH 0/2] Improve dmesg output for swapfile+hibernation Sukrit Bhatnagar 2024-05-22 7:46 ` [PATCH 1/2] iomap: swap: print warning for unaligned swapfile Sukrit Bhatnagar @ 2024-05-22 7:46 ` Sukrit Bhatnagar 2024-05-22 14:56 ` Darrick J. Wong 2024-05-23 19:45 ` [PATCH 0/2] Improve dmesg output for swapfile+hibernation Pavel Machek 2 siblings, 1 reply; 14+ messages in thread From: Sukrit Bhatnagar @ 2024-05-22 7:46 UTC (permalink / raw) To: Rafael J. Wysocki, Pavel Machek, Christian Brauner, Darrick J. Wong, Andrew Morton Cc: linux-xfs, linux-pm, linux-fsdevel, linux-kernel, linux-mm, Sukrit.Bhatnagar When a swapfile is created for hibernation purposes, we always need the starting physical block offset, which is usually determined using userspace commands such as filefrag. It would be good to have that value printed when we do swapon and get that value directly from dmesg. Signed-off-by: Sukrit Bhatnagar <Sukrit.Bhatnagar@sony.com> --- mm/swapfile.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index f6ca215fb92f..53c9187d5fbe 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3264,8 +3264,9 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) (swap_flags & SWAP_FLAG_PRIO_MASK) >> SWAP_FLAG_PRIO_SHIFT; enable_swap_info(p, prio, swap_map, cluster_info); - pr_info("Adding %uk swap on %s. Priority:%d extents:%d across:%lluk %s%s%s%s\n", + pr_info("Adding %uk swap on %s. Priority:%d extents:%d start:%llu across:%lluk %s%s%s%s\n", K(p->pages), name->name, p->prio, nr_extents, + (unsigned long long)first_se(p)->start_block, K((unsigned long long)span), (p->flags & SWP_SOLIDSTATE) ? "SS" : "", (p->flags & SWP_DISCARDABLE) ? "D" : "", -- 2.34.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] mm: swap: print starting physical block offset in swapon 2024-05-22 7:46 ` [PATCH 2/2] mm: swap: print starting physical block offset in swapon Sukrit Bhatnagar @ 2024-05-22 14:56 ` Darrick J. Wong 2024-05-22 14:59 ` Christoph Hellwig 2024-05-27 11:02 ` Sukrit.Bhatnagar 0 siblings, 2 replies; 14+ messages in thread From: Darrick J. Wong @ 2024-05-22 14:56 UTC (permalink / raw) To: Sukrit Bhatnagar Cc: Rafael J. Wysocki, Pavel Machek, Christian Brauner, Andrew Morton, linux-xfs, linux-pm, linux-fsdevel, linux-kernel, linux-mm On Wed, May 22, 2024 at 04:46:58PM +0900, Sukrit Bhatnagar wrote: > When a swapfile is created for hibernation purposes, we always need > the starting physical block offset, which is usually determined using > userspace commands such as filefrag. If you always need this value, then shouldn't it be exported via sysfs or somewhere so that you can always get to it? The kernel ringbuffer can overwrite log messages, swapfiles can get disabled, etc. > It would be good to have that value printed when we do swapon and get > that value directly from dmesg. > > Signed-off-by: Sukrit Bhatnagar <Sukrit.Bhatnagar@sony.com> > --- > mm/swapfile.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/swapfile.c b/mm/swapfile.c > index f6ca215fb92f..53c9187d5fbe 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -3264,8 +3264,9 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) > (swap_flags & SWAP_FLAG_PRIO_MASK) >> SWAP_FLAG_PRIO_SHIFT; > enable_swap_info(p, prio, swap_map, cluster_info); > > - pr_info("Adding %uk swap on %s. Priority:%d extents:%d across:%lluk %s%s%s%s\n", > + pr_info("Adding %uk swap on %s. Priority:%d extents:%d start:%llu across:%lluk %s%s%s%s\n", > K(p->pages), name->name, p->prio, nr_extents, > + (unsigned long long)first_se(p)->start_block, Last time I looked, start_block was in units of PAGE_SIZE, despite add_swap_extent confusingly (ab)using the sector_t type. Wherever you end up reporting this value, it ought to be converted to something more common (like byte offset or 512b-block offset). Also ... if this is a swap *file* then reporting the path and the physical storage device address is not that helpful. Exposing the block device major/minor and block device address would be much more useful, wouldn't it? (Not that I have any idea what the "suspend process" in the cover letter refers to -- suspend and hibernate have been broken on xfs forever...) --D > K((unsigned long long)span), > (p->flags & SWP_SOLIDSTATE) ? "SS" : "", > (p->flags & SWP_DISCARDABLE) ? "D" : "", > -- > 2.34.1 > > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] mm: swap: print starting physical block offset in swapon 2024-05-22 14:56 ` Darrick J. Wong @ 2024-05-22 14:59 ` Christoph Hellwig 2024-05-27 11:02 ` Sukrit.Bhatnagar 1 sibling, 0 replies; 14+ messages in thread From: Christoph Hellwig @ 2024-05-22 14:59 UTC (permalink / raw) To: Darrick J. Wong Cc: Sukrit Bhatnagar, Rafael J. Wysocki, Pavel Machek, Christian Brauner, Andrew Morton, linux-xfs, linux-pm, linux-fsdevel, linux-kernel, linux-mm On Wed, May 22, 2024 at 07:56:37AM -0700, Darrick J. Wong wrote: > On Wed, May 22, 2024 at 04:46:58PM +0900, Sukrit Bhatnagar wrote: > > When a swapfile is created for hibernation purposes, we always need > > the starting physical block offset, which is usually determined using > > userspace commands such as filefrag. > > If you always need this value, then shouldn't it be exported via sysfs > or somewhere so that you can always get to it? The kernel ringbuffer > can overwrite log messages, swapfiles can get disabled, etc. Scraping a block address from anything is just broken. Wher is the code using this? It needs a proper kernel interface. Same about the warning crap in patch 1. ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 2/2] mm: swap: print starting physical block offset in swapon 2024-05-22 14:56 ` Darrick J. Wong 2024-05-22 14:59 ` Christoph Hellwig @ 2024-05-27 11:02 ` Sukrit.Bhatnagar 1 sibling, 0 replies; 14+ messages in thread From: Sukrit.Bhatnagar @ 2024-05-27 11:02 UTC (permalink / raw) To: Darrick J. Wong Cc: Rafael J. Wysocki, Pavel Machek, Christian Brauner, Andrew Morton, linux-xfs@vger.kernel.org, linux-pm@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Hi Darrick, On 2024-05-22 23:56, Darrick Wong wrote: > On Wed, May 22, 2024 at 04:46:58PM +0900, Sukrit Bhatnagar wrote: >> When a swapfile is created for hibernation purposes, we always need >> the starting physical block offset, which is usually determined using >> userspace commands such as filefrag. > > If you always need this value, then shouldn't it be exported via sysfs > or somewhere so that you can always get to it? The kernel ringbuffer > can overwrite log messages, swapfiles can get disabled, etc. I agree on using appropriate kernel interfaces instead of kernel log. >> It would be good to have that value printed when we do swapon and get >> that value directly from dmesg. >> >> Signed-off-by: Sukrit Bhatnagar <Sukrit.Bhatnagar@sony.com> >> --- >> mm/swapfile.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> diff --git a/mm/swapfile.c b/mm/swapfile.c >> index f6ca215fb92f..53c9187d5fbe 100644 >> --- a/mm/swapfile.c >> +++ b/mm/swapfile.c >> @@ -3264,8 +3264,9 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) >> (swap_flags & SWAP_FLAG_PRIO_MASK) >> SWAP_FLAG_PRIO_SHIFT; >> enable_swap_info(p, prio, swap_map, cluster_info); >> - pr_info("Adding %uk swap on %s. Priority:%d extents:%d across:%lluk %s%s%s%s\n", >> + pr_info("Adding %uk swap on %s. Priority:%d extents:%d start:%llu across:%lluk %s%s%s%s\n", >> K(p->pages), name->name, p->prio, nr_extents, >> + (unsigned long long)first_se(p)->start_block, > > Last time I looked, start_block was in units of PAGE_SIZE, despite > add_swap_extent confusingly (ab)using the sector_t type. Wherever you > end up reporting this value, it ought to be converted to something more > common (like byte offset or 512b-block offset). I could not find any swap-related entries in the sysfs, but there is /proc/swaps which shows the enabled swaps in a table. A column for this start offset could be added there, which as you have mentioned, should be in a unit such as bytes instead of PAGE_SIZE blocks. > Also ... if this is a swap *file* then reporting the path and the > physical storage device address is not that helpful. Exposing the block > device major/minor and block device address would be much more useful, > wouldn't it? For exposing information about swap file path, I think it wouldn't make much difference (at least for the hibernate case) as we can always do the file-path -> bdev-path -> major:minor conversion in userspace. > (Not that I have any idea what the "suspend process" in the cover letter > refers to -- suspend and hibernate have been broken on xfs forever...) By suspend process, I meant the series of steps taken when we trigger hibernate's suspend-to-disk. Not the task that started it. (Wrong choice of words, my bad). -- Sukrit ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/2] Improve dmesg output for swapfile+hibernation 2024-05-22 7:46 [PATCH 0/2] Improve dmesg output for swapfile+hibernation Sukrit Bhatnagar 2024-05-22 7:46 ` [PATCH 1/2] iomap: swap: print warning for unaligned swapfile Sukrit Bhatnagar 2024-05-22 7:46 ` [PATCH 2/2] mm: swap: print starting physical block offset in swapon Sukrit Bhatnagar @ 2024-05-23 19:45 ` Pavel Machek 2024-05-27 11:06 ` Sukrit.Bhatnagar 2 siblings, 1 reply; 14+ messages in thread From: Pavel Machek @ 2024-05-23 19:45 UTC (permalink / raw) To: Sukrit Bhatnagar Cc: Rafael J. Wysocki, Christian Brauner, Darrick J. Wong, Andrew Morton, linux-xfs, linux-pm, linux-fsdevel, linux-kernel, linux-mm [-- Attachment #1: Type: text/plain, Size: 561 bytes --] Hi! > While trying to use a swapfile for hibernation, I noticed that the suspend > process was failing when it tried to search for the swap to use for snapshot. > I had created the swapfile on ext4 and got the starting physical block offset > using the filefrag command. How is swapfile for hibernation supposed to work? I'm afraid that can't work, and we should just not allow hibernation if there's anything else than just one swap partition. Best regards, Pavel -- People of Russia, stop Putin before his war on Ukraine escalates. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 195 bytes --] ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 0/2] Improve dmesg output for swapfile+hibernation 2024-05-23 19:45 ` [PATCH 0/2] Improve dmesg output for swapfile+hibernation Pavel Machek @ 2024-05-27 11:06 ` Sukrit.Bhatnagar 2024-05-27 11:20 ` Christoph Hellwig 0 siblings, 1 reply; 14+ messages in thread From: Sukrit.Bhatnagar @ 2024-05-27 11:06 UTC (permalink / raw) To: Pavel Machek Cc: Rafael J. Wysocki, Christian Brauner, Darrick J. Wong, Andrew Morton, linux-xfs@vger.kernel.org, linux-pm@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Hi Pavel! On 2024-05-24 04:45, Pavel Machek wrote: > Hi! > >> While trying to use a swapfile for hibernation, I noticed that the suspend >> process was failing when it tried to search for the swap to use for snapshot. >> I had created the swapfile on ext4 and got the starting physical block offset >> using the filefrag command. > > How is swapfile for hibernation supposed to work? I'm afraid that > can't work, and we should just not allow hibernation if there's > anything else than just one swap partition. I am not sure what you mean. We can pass the starting physical block offset of a swapfile into /sys/power/resume_offset, and hibernate can directly read/write into it using the swap extents information created by iomap during swapon. On resume, the kernel would read this offset value from the commandline parameters, and then access the swapfile. I find having a swapfile option for hibernate useful, in the scenarios where it is hard to modify the partitioning scheme, or to have a dedicated swap partition. Are there any plans to remove swapfile support from hibernation? -- Sukrit ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/2] Improve dmesg output for swapfile+hibernation 2024-05-27 11:06 ` Sukrit.Bhatnagar @ 2024-05-27 11:20 ` Christoph Hellwig 2024-05-27 12:51 ` Sukrit.Bhatnagar 0 siblings, 1 reply; 14+ messages in thread From: Christoph Hellwig @ 2024-05-27 11:20 UTC (permalink / raw) To: Sukrit.Bhatnagar@sony.com Cc: Pavel Machek, Rafael J. Wysocki, Christian Brauner, Darrick J. Wong, Andrew Morton, linux-xfs@vger.kernel.org, linux-pm@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org On Mon, May 27, 2024 at 11:06:11AM +0000, Sukrit.Bhatnagar@sony.com wrote: > We can pass the starting physical block offset of a swapfile into > /sys/power/resume_offset, and hibernate can directly read/write > into it using the swap extents information created by iomap during > swapon. On resume, the kernel would read this offset value from > the commandline parameters, and then access the swapfile. Reading a physical address from userspace is not a proper interface. What is this code even trying to do with it? ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 0/2] Improve dmesg output for swapfile+hibernation 2024-05-27 11:20 ` Christoph Hellwig @ 2024-05-27 12:51 ` Sukrit.Bhatnagar 2024-05-27 12:58 ` Christoph Hellwig 0 siblings, 1 reply; 14+ messages in thread From: Sukrit.Bhatnagar @ 2024-05-27 12:51 UTC (permalink / raw) To: Christoph Hellwig Cc: Pavel Machek, Rafael J. Wysocki, Christian Brauner, Darrick J. Wong, Andrew Morton, linux-xfs@vger.kernel.org, linux-pm@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Hi Christoph, On 2024-05-27 20:20, Christoph Hellwig wrote: > On Mon, May 27, 2024 at 11:06:11AM +0000, Sukrit.Bhatnagar@sony.com wrote: >> We can pass the starting physical block offset of a swapfile into >> /sys/power/resume_offset, and hibernate can directly read/write >> into it using the swap extents information created by iomap during >> swapon. On resume, the kernel would read this offset value from >> the commandline parameters, and then access the swapfile. > > Reading a physical address from userspace is not a proper interface. > What is this code even trying to do with it? I understand your point. Ideally, the low-level stuff such as finding the physical block offset should not be handled in the userspace. In my understanding, the resume offset in hibernate is used as follows. Suspend - Hibernate looks up the swap/swapfile using the details we pass in the sysfs entries, in the function swsusp_swap_check(): * /sys/power/resume - path/uuid/major:minor of the swap partition (or non-swap partition for swapfile) * /sys/power/resume_offset - physical offset of the swapfile in that partition * If no resume device is specified, it just uses the first available swap! - It then proceeds to write the image to the specified swap. (The allocation of swap pages is done by the swapfile code internally.) - When writing is finished, the swap header needs to be updated with some metadata, in the function mark_swapfiles(). * Hibernate creates bio requests to read/write the header (which is the first page of swap) using that physical block offset. Resume - Hibernate gets the partition and offset values from kernel command-line parameters "resume" and "resume_offset" (which must be set from userspace, not ideal). - It checks for valid hibernate swap signature by reading the swap header. * Hibernate creates bio requests again, using the physical block offset, but the one from kernel command-line this time. - Then it restores image and resumes into the previously saved kernel. -- Sukrit ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 0/2] Improve dmesg output for swapfile+hibernation 2024-05-27 12:51 ` Sukrit.Bhatnagar @ 2024-05-27 12:58 ` Christoph Hellwig 2024-05-31 14:15 ` Sukrit.Bhatnagar 0 siblings, 1 reply; 14+ messages in thread From: Christoph Hellwig @ 2024-05-27 12:58 UTC (permalink / raw) To: Sukrit.Bhatnagar@sony.com Cc: Christoph Hellwig, Pavel Machek, Rafael J. Wysocki, Christian Brauner, Darrick J. Wong, Andrew Morton, linux-xfs@vger.kernel.org, linux-pm@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org On Mon, May 27, 2024 at 12:51:07PM +0000, Sukrit.Bhatnagar@sony.com wrote: > In my understanding, the resume offset in hibernate is used as follows. > > Suspend > - Hibernate looks up the swap/swapfile using the details we pass in the > sysfs entries, in the function swsusp_swap_check(): > * /sys/power/resume - path/uuid/major:minor of the swap partition (or > non-swap partition for swapfile) > * /sys/power/resume_offset - physical offset of the swapfile in that > partition > * If no resume device is specified, it just uses the first available swap! > - It then proceeds to write the image to the specified swap. > (The allocation of swap pages is done by the swapfile code internally.) Where "it" is userspace code? If so, that already seems unsafe for a swap device, but definitely is a no-go for a swapfile. > - Hibernate gets the partition and offset values from kernel command-line > parameters "resume" and "resume_offset" (which must be set from > userspace, not ideal). Or is it just for these parameters? In which case we "only" need to specify the swap file, which would then need code in the file system driver to resolve the logical to physical mapping as swap files don't need to be contiguous. ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH 0/2] Improve dmesg output for swapfile+hibernation 2024-05-27 12:58 ` Christoph Hellwig @ 2024-05-31 14:15 ` Sukrit.Bhatnagar 0 siblings, 0 replies; 14+ messages in thread From: Sukrit.Bhatnagar @ 2024-05-31 14:15 UTC (permalink / raw) To: Christoph Hellwig Cc: Pavel Machek, Rafael J. Wysocki, Christian Brauner, Darrick J. Wong, Andrew Morton, linux-xfs@vger.kernel.org, linux-pm@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org On 2024-05-27 21:58, Christoph Hellwig wrote: > On Mon, May 27, 2024 at 12:51:07PM +0000, Sukrit.Bhatnagar@sony.com wrote: >> In my understanding, the resume offset in hibernate is used as follows. >> >> Suspend >> - Hibernate looks up the swap/swapfile using the details we pass in the >> sysfs entries, in the function swsusp_swap_check(): >> * /sys/power/resume - path/uuid/major:minor of the swap partition (or >> non-swap partition for swapfile) >> * /sys/power/resume_offset - physical offset of the swapfile in that >> partition >> * If no resume device is specified, it just uses the first available >> swap! - It then proceeds to write the image to the specified swap. >> (The allocation of swap pages is done by the swapfile code >> internally.) > > Where "it" is userspace code? If so, that already seems unsafe for > a swap device, but definitely is a no-go for a swapfile. By "it", I meant the hibernate code running in kernel space. Once userspace triggers hibernation by `echo disk > /sys/power/state` or a systemd wrapper program etc., and userspace tasks are frozen, everything happens within kernel context. >> - Hibernate gets the partition and offset values from kernel command-line >> parameters "resume" and "resume_offset" (which must be set from >> userspace, not ideal). > > Or is it just for these parameters? In which case we "only" need to > specify the swap file, which would then need code in the file system > driver to resolve the logical to physical mapping as swap files don't > need to be contiguous. Yes, it is just for setting these parameters in sysfs entries and in kernel commandline. I think specifying the swapfile path *may* not work because when we resume from hibernation, the filesystems are not yet mounted (except for the case when someone is resuming from initramfs stage). Using the block device + physical offset, this procedure becomes Independent of the filesystem and the mounted status. And since the system swap information is lost on reboot/shutdown, the kernel which loads the hibernation image will not know what swaps were enabled when the image was created. -- Sukrit ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-05-31 14:16 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-05-22 7:46 [PATCH 0/2] Improve dmesg output for swapfile+hibernation Sukrit Bhatnagar 2024-05-22 7:46 ` [PATCH 1/2] iomap: swap: print warning for unaligned swapfile Sukrit Bhatnagar 2024-05-22 18:02 ` Chris Li 2024-05-27 10:54 ` Sukrit.Bhatnagar 2024-05-22 7:46 ` [PATCH 2/2] mm: swap: print starting physical block offset in swapon Sukrit Bhatnagar 2024-05-22 14:56 ` Darrick J. Wong 2024-05-22 14:59 ` Christoph Hellwig 2024-05-27 11:02 ` Sukrit.Bhatnagar 2024-05-23 19:45 ` [PATCH 0/2] Improve dmesg output for swapfile+hibernation Pavel Machek 2024-05-27 11:06 ` Sukrit.Bhatnagar 2024-05-27 11:20 ` Christoph Hellwig 2024-05-27 12:51 ` Sukrit.Bhatnagar 2024-05-27 12:58 ` Christoph Hellwig 2024-05-31 14:15 ` Sukrit.Bhatnagar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).