public inbox for linux-kernel@vger.kernel.org
* Re: [RFC 0/6] Backing Store for sysfs
@ 2003-10-06 18:19 Christian Borntraeger
  0 siblings, 0 replies; 34+ messages in thread
From: Christian Borntraeger @ 2003-10-06 18:19 UTC
  To: Greg KH; +Cc: Al Viro, Patrick Mochel, LKML, dipankar

Hi Greg,

I just did a test run. There is still more free memory than with a stock 
kernel. I guess some cache entries aged and dropped out of existence.
I guess more cache entries will be removed if I put memory pressure on the 
system.
Please correct me if I am wrong, but sysfs dentry and inode caches are 
currently unswappable, right?
But now to the results:


------standard after boot----------
cat /proc/meminfo 
MemTotal:       795612 kB 
MemFree:        175904 kB 
Buffers:          2620 kB 
Cached:         257948 kB 
SwapCached:          0 kB 
Active:          11280 kB 
Inactive:       251392 kB 
HighTotal:           0 kB 
HighFree:            0 kB 
LowTotal:       795612 kB 
LowFree:        175904 kB 
SwapTotal:     1355416 kB 
SwapFree:      1355416 kB 
Dirty:            1044 kB 
Writeback:           0 kB 
Mapped:           5032 kB 
Slab:           346220 kB 
Committed_AS:     4580 kB 
PageTables:        140 kB 
VmallocTotal: 4294139904 kB 
VmallocUsed:      2108 kB 
VmallocChunk: 4294137796 kB 



------with patch after boot-----------
cat /proc/meminfo 
MemTotal:       795612 kB 
MemFree:        702416 kB 
Buffers:          2604 kB 
Cached:          17328 kB 
SwapCached:          0 kB 
Active:          11040 kB 
Inactive:        11080 kB 
HighTotal:           0 kB 
HighFree:            0 kB 
LowTotal:       795612 kB 
LowFree:        702416 kB 
SwapTotal:     1355416 kB 
SwapFree:      1355416 kB 
Dirty:            1040 kB 
Writeback:           0 kB 
Mapped:           5016 kB 
Slab:            61004 kB 
Committed_AS:     4580 kB 
PageTables:        136 kB 
VmallocTotal: 4294139904 kB 
VmallocUsed:      1308 kB 
VmallocChunk: 4294138596 kB 

------with patch after find /sys-----------
cat /proc/meminfo 
MemTotal:       795612 kB 
MemFree:        312312 kB 
Buffers:          2568 kB 
Cached:         257868 kB 
SwapCached:          0 kB 
Active:          11284 kB 
Inactive:       251304 kB 
HighTotal:           0 kB 
HighFree:            0 kB 
LowTotal:       795612 kB 
LowFree:        312312 kB 
SwapTotal:     1355416 kB 
SwapFree:      1355416 kB 
Dirty:               0 kB 
Writeback:           0 kB 
Mapped:           5016 kB 
Slab:           210608 kB 
Committed_AS:     4580 kB 
PageTables:        136 kB 
VmallocTotal: 4294139904 kB 
VmallocUsed:      1308 kB 
VmallocChunk: 4294138596 kB 
[root@53v15g05 root]# 

By the way, I noticed that this patch slows down the find.
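
The three dumps above can be compared with a few lines of Python. This is
only a sketch: the helper names (parse_meminfo, delta_kb) are made up for
illustration, and only the MemFree, Cached and Slab values are copied from
the dumps.

```python
def parse_meminfo(text):
    """Parse 'Key:  value kB' lines (as in /proc/meminfo) into a dict of kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


# Values copied from the three dumps in this mail (kB).
STOCK_BOOT = parse_meminfo("MemFree: 175904 kB\nCached: 257948 kB\nSlab: 346220 kB")
PATCH_BOOT = parse_meminfo("MemFree: 702416 kB\nCached: 17328 kB\nSlab: 61004 kB")
PATCH_FIND = parse_meminfo("MemFree: 312312 kB\nCached: 257868 kB\nSlab: 210608 kB")


def delta_kb(before, after, key):
    """Change of one field between two snapshots, in kB."""
    return after[key] - before[key]


for key in ("MemFree", "Slab"):
    print(key, "patched boot vs. stock boot:", delta_kb(STOCK_BOOT, PATCH_BOOT, key), "kB")
    print(key, "patched find vs. stock boot:", delta_kb(STOCK_BOOT, PATCH_FIND, key), "kB")
    print(key, "patched find vs. patched boot:", delta_kb(PATCH_BOOT, PATCH_FIND, key), "kB")
```

Note that these are system-wide numbers, so the deltas are not exclusively
due to sysfs.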

cheers

-- 
Mit freundlichen Grüßen / Best Regards

Christian Bornträger
IBM Deutschland Entwicklung GmbH
eServer SW  System Evaluation + Test
email: CBORNTRA@de.ibm.com
Tel +49  7031 16 1975


To:     Christian Borntraeger/Germany/IBM@IBMDE
cc:     Al Viro <viro@parcelfarce.linux.theplanet.co.uk>, Patrick Mochel 
<mochel@osdl.org>, LKML <linux-kernel@vger.kernel.org>, 
dipankar@in.ltcfwd.linux.ibm.com 
Subject:        Re: [RFC 0/6] Backing Store for sysfs


On Mon, Oct 06, 2003 at 07:38:06PM +0200, Christian Borntraeger wrote:
> Greg KH wrote:
> 
> > On Mon, Oct 06, 2003 at 02:29:15PM +0530, Maneesh Soni wrote:
> >> 
> >> 2.6.0-test6          With patches.
> >> -----------------
> >> dentry_cache (active)                2520                    2544
> >> inode_cache (active)         1058                    1050
> >> LowFree                      875032 KB               874748 KB
> > 
> > So with these patches we actually eat up more LowFree if all sysfs
> > entries are searched, and make the dentry_cache bigger?  That's not good
> > :(
> [...]
> > information for that kobject.  So I don't see any savings in these
> > patches, do you?
> 
> I do. As stated earlier, with 20000 devices on a S390 guest I have around 
> 350MB slab memory after rebooting. 
> With this patch, the slab memory reduces to 60MB. 

That's good.  But what happens after you run a find over the sysfs tree?
Which is essentially what udev will be doing :)

thanks,

greg k-h




* Re: [RFC 0/6] Backing Store for sysfs
@ 2003-10-06 17:38 Christian Borntraeger
  2003-10-06 17:41 ` Greg KH
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Borntraeger @ 2003-10-06 17:38 UTC
  To: Greg KH; +Cc: Al Viro, Patrick Mochel, LKML, Dipankar Sarma

Greg KH wrote:

> On Mon, Oct 06, 2003 at 02:29:15PM +0530, Maneesh Soni wrote:
>> 
>> 2.6.0-test6          With patches.
>> -----------------
>> dentry_cache (active)                2520                    2544
>> inode_cache (active)         1058                    1050
>> LowFree                      875032 KB               874748 KB
> 
> So with these patches we actually eat up more LowFree if all sysfs
> entries are searched, and make the dentry_cache bigger?  That's not good
> :(
[...]
> information for that kobject.  So I don't see any savings in these
> patches, do you?

I do. As stated earlier, with 20000 devices on a S390 guest I have around 
350MB slab memory after rebooting. 
With this patch, the slab memory reduces to 60MB. 
This becomes even more nasty as the kernel crashes during bootup if I only 
give this guest 256M (this happens with the current sysfs, not with this 
patch):

fixpoint divide exception: 0009 [#1] 
CPU:    0    Not tainted 
Process cio/0 (pid: 18, task: 000000000b84a810, ksp: 000000000b81f0a8) 
Krnl PSW : 0700000180000000 0000000000066aa2 
Krnl GPRS: 000000000000245e 0000000000000000 0000000000000000 0000000000000000 
           00000000003b5110 0000000000000000 0000000000000000 000000000030c008 
           0000000000000044 0000000000000020 000000000030be00 00000000009fb8b0 
           00000000009fb880 00000000002b12b0 00000000000668f0 000000000b81f0a8 
Krnl ACRS: 00000000 00000000 00000000 00000000 
           00000000 00000000 00000000 00000000 
           00000000 00000000 00000000 00000000 
           00000000 00000000 00000000 00000000 
Krnl Code: eb 13 00 3f 00 0c b9 08 00 13 58 40 a4 04 a7 28 00 64 8a 20 
Call Trace: 
 [<00000000000671c2>] shrink_zone+0x9e/0xc4 
 [<00000000000672c2>] shrink_caches+0xda/0xf4 
 [<00000000000673ae>] try_to_free_pages+0xd2/0x1b4 
 [<000000000005d812>] __alloc_pages+0x2aa/0x48c 
 [<000000000005da42>] __get_free_pages+0x4e/0x8c 
 [<0000000000061bfa>] cache_grow+0x116/0x40c 
 [<00000000000620ec>] cache_alloc_refill+0x1fc/0x328 
 [<000000000006258a>] kmem_cache_alloc+0xa2/0xb0 
 [<000000000009e094>] alloc_inode+0x1bc/0x1c0 
 [<000000000009ee40>] new_inode+0x20/0xb0 
 [<00000000000c50bc>] sysfs_new_inode+0x2c/0xb4 
 [<00000000000c519a>] sysfs_create+0x56/0xe0 
 [<00000000000c5bba>] sysfs_add_file+0xd2/0xf8 
 [<00000000000c6dce>] create_files+0x3e/0x84 
 [<00000000000c6e80>] sysfs_create_group+0x6c/0xe4 
 [<000000000016a508>] io_subchannel_register+0x54/0xec 
 [<000000000004b5ce>] worker_thread+0x21e/0x31c 
 [<0000000000019b68>] kernel_thread_starter+0x14/0x1c 

I agree that this patch is still borked; even some of the s390 devices don't 
work. Nevertheless, the idea to make this dentry/inode-cache memory 
freeable is good. I don't know why, but each device currently eats much 
more slab memory than a pagesize.
As far as I understood Dipankar's mail, his patch is more a 
proof-of-concept than a mergeable patch. If we find another solution to 
reduce the memory consumption of sysfs, I would be happy to accept 
different ideas.
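
As a rough check of the "more than a pagesize per device" remark, the
figures from this mail can be divided out. This is a back-of-the-envelope
sketch only: it charges all slab memory to the devices (an overestimate),
takes "20000 devices" literally, and assumes a 4096-byte page.

```python
PAGE_SIZE_BYTES = 4096          # assumption: 4 KiB pages
DEVICES = 20000                 # "20000 devices", taken literally
SLAB_STOCK_KB = 350 * 1024      # "around 350MB" with the current sysfs
SLAB_PATCHED_KB = 60 * 1024     # "60MB" with the patch


def slab_bytes_per_device(slab_kb, devices):
    """Average slab bytes per device if all slab is charged to devices."""
    return slab_kb * 1024 / devices


for label, slab_kb in (("stock", SLAB_STOCK_KB), ("patched", SLAB_PATCHED_KB)):
    per_dev = slab_bytes_per_device(slab_kb, DEVICES)
    print(f"{label}: {per_dev:.0f} bytes/device, "
          f"{per_dev / PAGE_SIZE_BYTES:.2f} pages/device")
```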

cheers Christian

-- 
Mit freundlichen Grüßen / Best Regards

Christian Bornträger
IBM Deutschland Entwicklung GmbH
eServer SW  System Evaluation + Test
email: CBORNTRA@de.ibm.com
Tel +49  7031 16 1975


* Re: [RFC 0/6] Backing Store for sysfs
@ 2003-10-06 12:34 Christian Borntraeger
  0 siblings, 0 replies; 34+ messages in thread
From: Christian Borntraeger @ 2003-10-06 12:34 UTC
  To: Maneesh Soni; +Cc: Al Viro, Patrick Mochel, Greg KH, linux-kernel

> Hi,
> 
> The following patch set(mailed separately) provides a prototype for a backing 
> store mechanism for sysfs. Currently sysfs pins all its dentries and inodes in 
> memory there by wasting kernel lowmem even when it is not mounted. 
> 
> With this patch set we create sysfs dentry whenever it is required like 
> other real filesystems and, age and free it as per the dcache rules. We
> now save significant amount of Lowmem by avoiding un-necessary pinning. 

A more mature patch could be a possible solution to some problems we faced 
with sysfs.
I have an s390 test system with ~20000 devices. Memory consumption _without_ 
this patch is horribly high.
Slab uses 346028 kB of memory, most of it dentry and inode cache. 
I tried the patch: it boots and memory usage is much better, but it is 
somewhat broken with our ccw devices, as I cannot bring up our ccwgroup 
network devices. 
Therefore I don't have reliable memory results.
Almost nobody would use 20000 devices on an S390, but with a shared 
OSA card, 100 or 200 devices is realistic. Even in this case, memory 
consumption is much higher than with 2.4.

I still have to look more closely at this patch to see whether there are 
deeper problems. 
Until I find something, I think this patch could be really helpful for 
computers with lots of devices.

-- 
Mit freundlichen Grüßen / Best Regards

Christian Bornträger
IBM Deutschland Entwicklung GmbH
eServer SW  System Evaluation + Test
email: CBORNTRA@de.ibm.com
Tel +49  7031 16 1975


* [RFC 0/6] Backing Store for sysfs
@ 2003-10-06  8:59 Maneesh Soni
  2003-10-06 16:08 ` Greg KH
  2003-10-06 18:44 ` Patrick Mochel
  0 siblings, 2 replies; 34+ messages in thread
From: Maneesh Soni @ 2003-10-06  8:59 UTC
  To: Al Viro, Patrick Mochel, Greg KH; +Cc: LKML, Dipankar Sarma


Hi,

The following patch set (mailed separately) provides a prototype for a backing 
store mechanism for sysfs. Currently sysfs pins all its dentries and inodes in 
memory, thereby wasting kernel lowmem even when it is not mounted. 

With this patch set we create a sysfs dentry whenever it is required, like 
other real filesystems, and age and free it as per the dcache rules. We
now save a significant amount of lowmem by avoiding unnecessary pinning. 
The following numbers were taken on a 2-way system with 6 disks and 2 NICs 
with about 1028 dentries. The numbers are just indicative as they are 
collected system-wide from /proc and are not exclusively for sysfs.

                                2.6.0-test6     With patches

Right after system is booted
----------------------------
dentry_cache (active)           2343            1315
inode_cache (active)            1058            30
LowFree                         875096 KB       875900 KB

After mounting sysfs
--------------------
dentry_cache (active)           2350            1321
inode_cache (active)            1058            31
LowFree                         875096 KB       875836 KB

After "find /sys"
-----------------
dentry_cache (active)           2520            2544
inode_cache (active)            1058            1050
LowFree                         875032 KB       874748 KB

After un-mounting sysfs
-----------------------
dentry_cache (active)           2363            1319
inode_cache (active)            1058            30
LowFree                         875032 KB       875836 KB
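
The differences between the two columns can be worked out per stage with a
short script. This is only an illustration: the STAGES layout and the
patched_minus_stock helper are made up here, and the values are copied from
the table above.

```python
# (2.6.0-test6, with patches) pairs copied from the table above.
STAGES = {
    "after boot": {
        "dentry_cache": (2343, 1315),
        "inode_cache": (1058, 30),
        "LowFree_kB": (875096, 875900),
    },
    "after mounting sysfs": {
        "dentry_cache": (2350, 1321),
        "inode_cache": (1058, 31),
        "LowFree_kB": (875096, 875836),
    },
    "after find /sys": {
        "dentry_cache": (2520, 2544),
        "inode_cache": (1058, 1050),
        "LowFree_kB": (875032, 874748),
    },
    "after un-mounting sysfs": {
        "dentry_cache": (2363, 1319),
        "inode_cache": (1058, 30),
        "LowFree_kB": (875032, 875836),
    },
}


def patched_minus_stock(stage):
    """Per-metric difference (with patches minus 2.6.0-test6) for one stage."""
    return {name: patched - stock
            for name, (stock, patched) in STAGES[stage].items()}


for stage in STAGES:
    print(stage, patched_minus_stock(stage))
```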


The main idea is not to create the dentry in the sysfs_create_xxx calls but 
to create it when it is first looked up. We now have a lookup() 
inode_operation and open and close file_operations for sysfs directory inodes. 

The backing store is based on the kobjects, which are always in memory.
sysfs lookup is based on the hierarchy of kobjects. As the current kobject 
infrastructure does not provide any means to traverse a kobject's children or 
siblings, two-way hierarchy lookup was not possible. To allow this, new fields 
are added to the kobject structure. This ended up increasing the size of a 
kobject from 52 bytes to 108 bytes, but saves one dentry and inode per kobject.
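
The lookup-time creation described above can be modelled in a few lines of
Python. This is a toy model of the idea only, not code from the patch set:
the class names (KObject, Dentry, LazySysfs) and the shrink() method are
hypothetical stand-ins, and a plain dict plays the role of the dcache.

```python
class KObject:
    """Always-resident node that knows its parent and its children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self


class Dentry:
    """Freeable cache entry that points back at its kobject."""

    def __init__(self, kobj):
        self.kobj = kobj


class LazySysfs:
    def __init__(self, root):
        self.root = root
        self.dcache = {(): Dentry(root)}  # only the root entry is pinned

    def lookup(self, path):
        """Walk the kobject tree, creating dentries only for visited nodes."""
        parts = tuple(p for p in path.split("/") if p)
        kobj = self.root
        for depth, name in enumerate(parts, start=1):
            kobj = kobj.children.get(name)
            if kobj is None:
                raise FileNotFoundError(path)
            key = parts[:depth]
            if key not in self.dcache:
                self.dcache[key] = Dentry(kobj)
        return self.dcache[parts]

    def find(self):
        """Visit every node, like 'find /sys': populates the whole cache."""
        stack = [((), self.root)]
        while stack:
            prefix, kobj = stack.pop()
            for name, child in kobj.children.items():
                path = prefix + (name,)
                self.lookup("/".join(path))
                stack.append((path, child))

    def shrink(self):
        """Drop every cached entry except the root; kobjects stay in memory."""
        self.dcache = {(): self.dcache[()]}


root = KObject("sys")
devices = KObject("devices", root)
for i in range(3):
    KObject(f"dev{i}", devices)

fs = LazySysfs(root)
fs.lookup("devices/dev0")
print("after one lookup:", len(fs.dcache))
fs.find()
print("after find:", len(fs.dcache))
fs.shrink()
print("after shrink:", len(fs.dcache))
```

In this model a full walk fills the cache with one entry per kobject, which
mirrors the "After find /sys" numbers, while shrink() shows why the entries
become freeable once they are no longer pinned.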

The details of the patches are in the following mails. For testing please
apply all the patches, as they are split just for ease of review.

Please send me comments on the approach, the implementation, anything I 
missed, and suggestions for improvement.

Thanks,
Maneesh

-- 
Maneesh Soni
Linux Technology Center, 
IBM Software Lab, Bangalore, India
email: maneesh@in.ibm.com
Phone: 91-80-5044999 Fax: 91-80-5268553
T/L : 9243696


Thread overview: 34+ messages
2003-10-06 19:01           ` [RFC 0/6] Backing Store for sysfs Pascal Schmidt
2003-10-06 19:10             ` Greg KH
2003-10-07  0:15               ` Pascal Schmidt
2003-10-06 18:19 Christian Borntraeger
  -- strict thread matches above, loose matches on Subject: below --
2003-10-06 17:38 Christian Borntraeger
2003-10-06 17:41 ` Greg KH
2003-10-06 18:00   ` Kevin P. Fleming
2003-10-06 18:11     ` Greg KH
2003-10-06 18:23       ` Kevin P. Fleming
2003-10-06 18:30         ` Greg KH
2003-10-06 18:38           ` Kevin P. Fleming
2003-10-07  8:30           ` Maneesh Soni
2003-10-06 12:34 Christian Borntraeger
2003-10-06  8:59 Maneesh Soni
2003-10-06 16:08 ` Greg KH
2003-10-06 17:31   ` Dipankar Sarma
2003-10-06 17:38     ` Greg KH
2003-10-06 18:01       ` Dipankar Sarma
2003-10-06 18:09         ` Greg KH
2003-10-06 18:31           ` Dipankar Sarma
2003-10-06 18:34             ` Greg KH
2003-10-07  9:08               ` Andreas Jellinghaus
2003-10-06 18:44 ` Patrick Mochel
2003-10-06 19:27   ` Dipankar Sarma
2003-10-06 19:30     ` viro
2003-10-06 20:01       ` Dipankar Sarma
2003-10-06 20:34         ` viro
2003-10-07  4:47       ` Maneesh Soni
2003-10-06 19:33     ` Patrick Mochel
2003-10-06 20:26       ` Dipankar Sarma
2003-10-06 20:29         ` Patrick Mochel
2003-10-07  4:31           ` Maneesh Soni
2003-10-07  5:25             ` Nick Piggin
2003-10-07  7:17               ` Maneesh Soni
