* Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
@ 2012-12-04 19:44 Thorn Roby
  2012-12-05  2:01 ` Liu Bo
  0 siblings, 1 reply; 6+ messages in thread
From: Thorn Roby @ 2012-12-04 19:44 UTC (permalink / raw)
  To: linux-btrfs

I am trying to use a btrfs filesystem (Oracle Linux 6.3 X86_64, UEK kernel)  
with data RAID0 and metadata RAID1, mounted as follows:

/dev/sda3 on /data type btrfs (rw,noatime,compress-force=zlib,noacl)

dmesg shows 

btrfs: force zlib compression
btrfs: disk space caching is enabled

My application is MongoDB, which creates a series of 2GB datafiles, possibly 
using sparse allocation (at least I know it does on XFS), which reports the 
following for each datafile:

Tue Dec  4 12:05:26.538 [FileAllocator] allocating new datafile 
/data/db/lfs/lfs.19, filling with zeroes...
Tue Dec  4 12:05:26.544 [FileAllocator] done allocating datafile 
/data/db/lfs/lfs.19, size: 2047MB,  took 0.006 secs

If I create a new filesystem and mount it with compress-force=zlib, then rsync a 
number of these 2GB files containing MongoDB data from another system, I get 
roughly 4:1 compression. I also notice that CPU usage remains at 100% and the 
rsync speed is roughly 70% of network speed. If I do the same thing using LZO, I 
see low CPU usage, transmission speed is 100% of network speed and compression 
is about 2:1. These results are what I expect.

However, if I run the MongoDB process directly on the machine, using 
compress-force=zlib (haven't tried LZO yet) with the same mount options, and load data 
into the database from another system via database inserts, there is no evidence 
of compression. This is consistently verified by the output of df, btrfs fi df 
and an internal report from the database which shows row count and average row 
size (560 bytes). I also see low CPU usage (however this is not conclusive since 
the file write rate from the database process is roughly 10 times slower than a 
direct write using rsync). 

[root@eng-mongodb-t1 lfs]# btrfs fi df /data
Data, RAID0: total=37.95GB, used=33.84GB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=45.35MB
Metadata: total=8.00MB, used=0.00

[root@eng-mongodb-t1 lfs]# df -h
/dev/sda3              40G   34G  4.2G  90% /data
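
To compare the space actually consumed against the logical size of the datafiles, the `btrfs fi df` output can be turned into byte counts. The following is only a sketch: the helper names (`parse_fi_df`, `data_used_bytes`) are made up for illustration, the sample text is copied from the output above, and the unit table assumes binary multiples.

```python
import re

# Sample text copied from the `btrfs fi df /data` output in this message.
SAMPLE = """\
Data, RAID0: total=37.95GB, used=33.84GB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=45.35MB
Metadata: total=8.00MB, used=0.00
"""

# Assumption: units are binary multiples. Newer tools print KiB/MiB/GiB.
UNITS = {
    "": 1, "B": 1,
    "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4,
    "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4,
}

LINE_RE = re.compile(
    r"^(?P<kind>\w+)(?:, (?P<profile>\w+))?: "
    r"total=(?P<total>[\d.]+)(?P<tunit>[A-Za-z]*), "
    r"used=(?P<used>[\d.]+)(?P<uunit>[A-Za-z]*)$"
)


def parse_fi_df(text):
    """Return one dict per recognised line, with sizes converted to bytes."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip prompts, blank lines, anything unexpected
        rows.append({
            "kind": m["kind"],
            "profile": m["profile"],  # None for lines without a RAID profile
            "total": float(m["total"]) * UNITS[m["tunit"]],
            "used": float(m["used"]) * UNITS[m["uunit"]],
        })
    return rows


def data_used_bytes(rows):
    """Sum the 'used' figure over all Data lines."""
    return sum(r["used"] for r in rows if r["kind"] == "Data")


if __name__ == "__main__":
    rows = parse_fi_df(SAMPLE)
    print("data bytes used on disk:", int(data_used_bytes(rows)))
```

Dividing the summed logical size of the datafiles by this figure gives a rough compression ratio; a value close to 1:1 matches the "no evidence of compression" observation.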

I tried chattr +c on the database directory but it doesn't appear to be 
supported in Oracle Linux (nor was it necessary in the case where I rsynced the 
files and saw the expected compression).


Is there something about the (possible) initial sparse allocation followed by 
zero-filling that might be disabling compression as the datafile is later 
overwritten by the database process, despite the compress-force flag?

filefrag -v shows this, but I'm still not sure how to interpret it:

Filesystem type is: 9123683e
File size of lfs.18 is 2146435072 (524032 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  9182208               2 
   1       2  9706240  9182209      1 
   2       3  9182211  9706240  14348 
   3   14351  9706241  9196558      1 
   4   14352  9196560  9706241  28126 
   5   42478  9224686          481554 eof
lfs.18: 5 extents found
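
One way to read the listing is to add up the extent lengths and check for gaps between consecutive logical offsets. The sketch below does that for the output pasted above; `parse_filefrag` and `has_holes` are hypothetical helper names, and the parser assumes the column layout shown here (the `expected` column may be empty). As far as I understand, btrfs limits a compressed extent to 128KiB (32 blocks of 4096 bytes), so extents far larger than that would suggest the data was written uncompressed, but treat that reading as an assumption.

```python
# Sample text copied from the `filefrag -v` output in this message.
SAMPLE = """\
Filesystem type is: 9123683e
File size of lfs.18 is 2146435072 (524032 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  9182208               2
   1       2  9706240  9182209      1
   2       3  9182211  9706240  14348
   3   14351  9706241  9196558      1
   4   14352  9196560  9706241  28126
   5   42478  9224686          481554 eof
lfs.18: 5 extents found
"""

# Assumption: 128KiB compressed-extent limit with 4096-byte blocks.
MAX_COMPRESSED_BLOCKS = 32


def parse_filefrag(text):
    """Parse extent rows into dicts; header and summary lines are skipped."""
    extents = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or not tokens[0].isdigit():
            continue
        nums = [int(t) for t in tokens if t.isdigit()]
        flags = [t for t in tokens if not t.isdigit()]
        if len(nums) == 4:    # the `expected` column is empty
            _ext, logical, physical, length = nums
        elif len(nums) == 5:  # the `expected` column is present
            _ext, logical, physical, _expected, length = nums
        else:
            continue
        extents.append({"logical": logical, "physical": physical,
                        "length": length, "flags": flags})
    return extents


def has_holes(extents):
    """True if any extent does not start where the previous one ended."""
    position = 0
    for e in extents:
        if e["logical"] != position:
            return True
        position = e["logical"] + e["length"]
    return False


if __name__ == "__main__":
    extents = parse_filefrag(SAMPLE)
    print("blocks mapped:", sum(e["length"] for e in extents))
    print("holes:", has_holes(extents))
    big = [e["length"] for e in extents if e["length"] > MAX_COMPRESSED_BLOCKS]
    print("extents larger than 128KiB:", big)
```

For this file the extent lengths add up to the full 524032 blocks with no gaps, so at the time of the listing the file was fully mapped rather than sparse.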



* Re: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
  2012-12-04 19:44 Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application Thorn Roby
@ 2012-12-05  2:01 ` Liu Bo
  2012-12-05 17:21   ` Thorn Roby
  2012-12-05 17:42   ` Thorn Roby
  0 siblings, 2 replies; 6+ messages in thread
From: Liu Bo @ 2012-12-05  2:01 UTC (permalink / raw)
  To: Thorn Roby; +Cc: linux-btrfs

On Tue, Dec 04, 2012 at 07:44:16PM +0000, Thorn Roby wrote:
> I am trying to use a btrfs filesystem (Oracle Linux 6.3 X86_64, UEK kernel)  
> with data RAID0 and metadata RAID1, mounted as follows:
> 
> /dev/sda3 on /data type btrfs (rw,noatime,compress-force=zlib,noacl)
> 
> dmesg shows 
> 
> btrfs: force zlib compression
> btrfs: disk space caching is enabled
> 
> My application is MongoDB, which creates a series of 2GB datafiles, possibly 
> using sparse allocation (at least I know it does on XFS), which reports the 
> following for each datafile:
> 
> Tue Dec  4 12:05:26.538 [FileAllocator] allocating new datafile 
> /data/db/lfs/lfs.19, filling with zeroes...
> Tue Dec  4 12:05:26.544 [FileAllocator] done allocating datafile 
> /data/db/lfs/lfs.19, size: 2047MB,  took 0.006 secs
> 
> If I create a new filesystem and mount it with compress-force=zlib, then rsync a 
> number of these 2GB files containing MongoDB data from another system, I get 
> roughly 4:1 compression. I also notice that CPU usage remains at 100% and the 
> rsync speed is roughly 70% of network speed. If I do the same thing using LZO, I 
> see low CPU usage, transmission speed is 100% of network speed and compression 
> is about 2:1. These results are what I expect.
> 
> However, if I run the MongoDB process directly on the machine, using 
> compress-force=zlib (haven't tried LZO yet) with the same mount options, and load data 
> into the database from another system via database inserts, there is no evidence 
> of compression. This is consistently verified by the output of df, btrfs fi df 
> and an internal report from the database which shows row count and average row 
> size (560 bytes). I also see low CPU usage (however this is not conclusive since 
> the file write rate from the database process is roughly 10 times slower than a 
> direct write using rsync). 
> 
> [root@eng-mongodb-t1 lfs]# btrfs fi df /data
> Data, RAID0: total=37.95GB, used=33.84GB
> Data: total=8.00MB, used=0.00
> System, RAID1: total=8.00MB, used=4.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=1.00GB, used=45.35MB
> Metadata: total=8.00MB, used=0.00
> 
> [root@eng-mongodb-t1 lfs]# df -h
> /dev/sda3              40G   34G  4.2G  90% /data
> 
> I tried chattr +c on the database directory but it doesn't appear to be 
> supported in Oracle Linux (nor was it necessary in the case where I rsynced the 
> files and saw the expected compression).
> Filesystem type is: 9123683e
> File size of lfs.18 is 2146435072 (524032 blocks, blocksize 4096)
>  ext logical physical expected length flags
>    0       0  9182208               2 
>    1       2  9706240  9182209      1 
>    2       3  9182211  9706240  14348 
>    3   14351  9706241  9196558      1 
>    4   14352  9196560  9706241  28126 
>    5   42478  9224686          481554 eof
> lfs.18: 5 extents found
> 
> 
> Is there something about the (possible) initial sparse allocation followed by 
> zero-filling that might be disabling compression as the datafile is later 
> overwritten by the database process, despite the compress-force flag?

Well, it shouldn't be that. As the name suggests, with compress-force
btrfs always tries to compress the data via zlib/lzo.

So the earlier rsync case is just 'write a number of 2G files', while the
current application case is 'write a number of 2G files' plus 'overwritten
by others later'. Is that right?

And was the 'btrfs fi df' output above captured after the overwrite, or
before it?

thanks,


* RE: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
  2012-12-05  2:01 ` Liu Bo
@ 2012-12-05 17:21   ` Thorn Roby
  2012-12-05 17:42   ` Thorn Roby
  1 sibling, 0 replies; 6+ messages in thread
From: Thorn Roby @ 2012-12-05 17:21 UTC (permalink / raw)
  To: bo.li.liu@oracle.com; +Cc: linux-btrfs@vger.kernel.org

Thanks for your response. Yes, the rsync copies a number of the 2GB files written by the same database software on another system on XFS, and compression is successful. These files consist mostly of plain text log output and are highly compressible (4:1 via zlib).

When the database is running locally on btrfs, the I/O pattern is that the database allocates a new 2GB datafile (perhaps using a sparse allocation call, as I know it does on XFS) and then says it fills the file with zeroes (but I'm not certain this actually happens). At that point database inserts are first appended to a transaction log (a separate non-sparse 2GB file), and shortly thereafter a batch copy operation (I'm not sure of the size) copies a set of new rows from the transaction log and appends them to the main datafiles (which are sparse).

Up until now I have included both the transaction log and the datafiles within the btrfs filesystem. Next I'm going to try putting the transaction log on a separate filesystem.

The btrfs fi df output was taken after the database load process was stopped, so it reflects a static set of the 2GB datafiles (plus the transaction log and a few other database system files).


* RE: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
  2012-12-05  2:01 ` Liu Bo
  2012-12-05 17:21   ` Thorn Roby
@ 2012-12-05 17:42   ` Thorn Roby
  2012-12-06  6:59     ` Liu Bo
  1 sibling, 1 reply; 6+ messages in thread
From: Thorn Roby @ 2012-12-05 17:42 UTC (permalink / raw)
  To: bo.li.liu@oracle.com; +Cc: linux-btrfs@vger.kernel.org

My previous reply was incorrect on one point: the data is never copied from the transaction log into the sparse datafiles. Instead, the application writes the same data independently to both locations.
Also, I failed to mention that the files are memory-mapped, and it's possible that the write operations attempt to use direct I/O (O_DIRECT), which I believe is disabled by btrfs with compression. Is it possible that an attempt to use direct I/O or memory-mapped files would prevent compression?



* Re: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
  2012-12-05 17:42   ` Thorn Roby
@ 2012-12-06  6:59     ` Liu Bo
  2012-12-06 20:34       ` Thorn Roby
  0 siblings, 1 reply; 6+ messages in thread
From: Liu Bo @ 2012-12-06  6:59 UTC (permalink / raw)
  To: Thorn Roby; +Cc: linux-btrfs@vger.kernel.org

On Wed, Dec 05, 2012 at 05:42:51PM +0000, Thorn Roby wrote:
> My previous reply was incorrect on one point: the data is never copied from the transaction log into the sparse datafiles. Instead, the application writes the same data independently to both locations.
> Also, I failed to mention that the files are memory-mapped, and it's possible that the write operations attempt to use direct I/O (O_DIRECT), which I believe is disabled by btrfs with compression. Is it possible that an attempt to use direct I/O or memory-mapped files would prevent compression?
> 

Actually, writing with direct I/O (O_DIRECT) falls back to buffered
writes for safety, and mmap'ed files just dirty pages, which should
behave the same as buffered writes, so there may be some other reason.
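
For illustration only, here is a small sketch of the two write paths being compared: a store through a shared mapping followed by a flush, and an ordinary buffered write followed by fsync. The function names are hypothetical and this says nothing about how MongoDB itself issues its writes; it only shows that both paths leave the same bytes in the file.

```python
import mmap
import os


def write_via_mmap(path, offset, payload):
    """Modify an existing file through a shared memory mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(payload)] = payload  # dirties the mapped pages
            m.flush()                                  # ask for write-back of the mapping


def write_buffered(path, offset, payload):
    """Modify an existing file with an ordinary buffered write."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)
        f.flush()              # drain Python's userspace buffer
        os.fsync(f.fileno())   # ask the kernel to write the dirty pages
```

Writing the same payload at the same offset into two identical files, one with each function, should produce byte-identical results.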

thanks,
liubo


* RE: Mounted compress-force=zlib, compresses when files copied in, but not when written directly by application
  2012-12-06  6:59     ` Liu Bo
@ 2012-12-06 20:34       ` Thorn Roby
  0 siblings, 0 replies; 6+ messages in thread
From: Thorn Roby @ 2012-12-06 20:34 UTC (permalink / raw)
  To: bo.li.liu@oracle.com; +Cc: linux-btrfs@vger.kernel.org

It appears to be an issue with the initial sparse file allocation. When I manually create 2GB datafiles using dd from /dev/zero, instead of allowing MongoDB to allocate them as needed, compression seems to be working correctly.
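
A rough sketch of the two ways of creating a datafile that are being contrasted here: a sparse file that only has its size set, and a file whose zeroes are actually written out, which is what dd from /dev/zero does. The function names are made up for illustration, and I have not confirmed which allocation call MongoDB really uses.

```python
import os


def create_sparse(path, size):
    """Create a file of the given size without writing any data blocks."""
    with open(path, "wb") as f:
        f.truncate(size)


def create_zero_filled(path, size, chunk=1024 * 1024):
    """Create a file by explicitly writing zeroes, like dd if=/dev/zero."""
    zeros = bytes(chunk)
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(zeros[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())


def allocated_bytes(path):
    """Space the filesystem reports as allocated (st_blocks is in 512-byte units)."""
    return os.stat(path).st_blocks * 512
```

Both files report the same size and read back as zeroes; they differ only in how the filesystem has allocated them, which `allocated_bytes` or `filefrag -v` can show.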

