* Re: oom-killer killing even if memory is available?
[not found] ` <20090317112842.3b8e7724@osiris.boeblingen.de.ibm.com>
@ 2009-03-17 10:49 ` Nick Piggin
2009-03-17 11:39 ` Heiko Carstens
2009-03-20 5:08 ` Wu Fengguang
0 siblings, 2 replies; 3+ messages in thread
From: Nick Piggin @ 2009-03-17 10:49 UTC
To: Heiko Carstens, linux-fsdevel
Cc: Andrew Morton, linux-mm, Mel Gorman, Nick Piggin,
Martin Schwidefsky, Andreas Krebbel
On Tuesday 17 March 2009 21:28:42 Heiko Carstens wrote:
> On Tue, 17 Mar 2009 11:17:38 +0100
>
> Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> > On Tue, 17 Mar 2009 02:46:05 -0700
> >
> > Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > Mar 16 21:40:40 t6360003 kernel: Active_anon:372 active_file:45 inactive_anon:154
> > > > Mar 16 21:40:40 t6360003 kernel: inactive_file:152 unevictable:987 dirty:0 writeback:188 unstable:0
> > > > Mar 16 21:40:40 t6360003 kernel: free:146348 slab:875833 mapped:805 pagetables:378 bounce:0
> > > > Mar 16 21:40:40 t6360003 kernel: DMA free:467728kB min:4064kB low:5080kB high:6096kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:116kB unevictable:0kB present:2068480kB pages_scanned:0 all_unreclaimable? no
> > > > Mar 16 21:40:40 t6360003 kernel: lowmem_reserve[]: 0 2020 2020
> > > > Mar 16 21:40:40 t6360003 kernel: Normal free:117664kB min:4064kB low:5080kB high:6096kB active_anon:1488kB inactive_anon:616kB active_file:188kB inactive_file:492kB unevictable:3948kB present:2068480kB pages_scanned:128 all_unreclaimable? no
> > > > Mar 16 21:40:40 t6360003 kernel: lowmem_reserve[]: 0 0 0
> > >
> > > The scanner has wrung pretty much all it can out of the reclaimable
> > > pages - the LRUs are nearly empty. There's a few hundred MB free and
> > > apparently we don't have four physically contiguous free pages
> > > anywhere. It's believable.
> > >
> > > The question is: where the heck did all your memory go? You have 2GB
> > > of ZONE_NORMAL memory in that machine, but only a tenth of it is
> > > visible to the page reclaim code.
> > >
> > > Something must have allocated (and possibly leaked) it.
> >
> > Looks like most of the memory went for dentries and inodes.
> > slabtop output:
> >
> > Active / Total Objects (% used) : 8172165 / 8326954 (98.1%)
> > Active / Total Slabs (% used) : 903692 / 903698 (100.0%)
> > Active / Total Caches (% used) : 91 / 144 (63.2%)
> > Active / Total Size (% used) : 3251262.44K / 3281384.22K (99.1%)
> > Minimum / Average / Maximum Object : 0.02K / 0.39K / 1024.00K
> >
> >    OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > 3960036 3960017  99%    0.59K 660006        6   2640024K inode_cache
> > 4137155 3997581  96%    0.20K 217745       19    870980K dentry
> >   69776   69744  99%    0.80K  17444        4     69776K ext3_inode_cache
> >   96792   92892  95%    0.10K   2616       37     10464K buffer_head
> >   10024    9895  98%    0.54K   1432        7      5728K radix_tree_node
> >    1093    1087  99%    4.00K   1093        1      4372K size-4096
> >   14805   14711  99%    0.25K    987       15      3948K size-256
> >    2400    2381  99%    0.80K    480        5      1920K shmem_inode_cache
>
> FWIW, after "echo 3 > /proc/sys/vm/drop_caches" it looks like this:
>
> Active / Total Objects (% used) : 7965003 / 8153578 (97.7%)
> Active / Total Slabs (% used) : 882511 / 882511 (100.0%)
> Active / Total Caches (% used) : 90 / 144 (62.5%)
> Active / Total Size (% used) : 3173487.59K / 3211091.64K (98.8%)
> Minimum / Average / Maximum Object : 0.02K / 0.39K / 1024.00K
>
>    OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> 3960036 3960007  99%    0.59K 660006        6   2640024K inode_cache
> 4137155 3962636  95%    0.20K 217745       19    870980K dentry
>    1097    1097 100%    4.00K   1097        1      4388K size-4096
>   14805   14667  99%    0.25K    987       15      3948K size-256
>    2400    2381  99%    0.80K    480        5      1920K shmem_inode_cache
>    1404    1404 100%    1.00K    351        4      1404K size-1024
>     152     152 100%    5.59K    152        1      1216K task_struct
>    1302     347  26%    0.54K    186        7       744K radix_tree_node
>     370     359  97%    2.00K    185        2       740K size-2048
>    9381    4316  46%    0.06K    159       59       636K size-64
>       8       8 100%   64.00K      8        1       512K size-65536
>
> So, are we leaking dentries and inodes?
Yes, probably leaking dentries, which pin inodes. I don't know that slab
leak debugging is going to help you because it won't find what is holding
the refcount.

Cc linux-fsdevel. Which kernel is this? Config as well.
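
One way to see how little drop_caches reclaimed is to diff the two slabtop listings quoted above. The following is a minimal Python sketch and not something posted in the thread: the helper names (parse_slabtop_row, by_name) are invented for illustration, and it assumes slabtop's default eight-column row layout. Only the inode_cache and dentry rows are copied in.

```python
def parse_slabtop_row(line):
    """Split one slabtop data row (assumed default 8-column layout) into a dict."""
    objs, active, _use, _obj_size, slabs, _per_slab, cache_size, name = line.split()
    return {
        "name": name,
        "objs": int(objs),
        "active": int(active),
        "slabs": int(slabs),
        "cache_kb": int(cache_size.rstrip("K")),
    }


def by_name(text):
    """Index parsed rows by cache name (hypothetical helper)."""
    rows = (parse_slabtop_row(line) for line in text.splitlines() if line.strip())
    return {row["name"]: row for row in rows}


# Rows copied from the slabtop output before "echo 3 > /proc/sys/vm/drop_caches".
BEFORE = """\
3960036 3960017 99% 0.59K 660006 6 2640024K inode_cache
4137155 3997581 96% 0.20K 217745 19 870980K dentry
"""

# The same two caches after drop_caches.
AFTER = """\
3960036 3960007 99% 0.59K 660006 6 2640024K inode_cache
4137155 3962636 95% 0.20K 217745 19 870980K dentry
"""

before, after = by_name(BEFORE), by_name(AFTER)

report = {}
for name, old in before.items():
    new = after[name]
    freed = old["active"] - new["active"]
    report[name] = {
        "freed_objects": freed,
        "freed_fraction": freed / old["active"],
        "freed_kb": old["cache_kb"] - new["cache_kb"],
    }
    print(
        f"{name}: {freed} active objects freed "
        f"({report[name]['freed_fraction']:.2%}), "
        f"{report[name]['freed_kb']}K of slab pages returned"
    )
```

With the quoted figures, only 10 inode_cache objects and 34945 dentries (under 1%) went away and no slab pages were returned, which fits the suspicion that the objects are pinned by a reference rather than merely cached.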
* Re: oom-killer killing even if memory is available?
From: Heiko Carstens @ 2009-03-17 11:39 UTC
To: Nick Piggin
Cc: linux-fsdevel, Andrew Morton, linux-mm, Mel Gorman, Nick Piggin,
Martin Schwidefsky, Andreas Krebbel
On Tue, 17 Mar 2009 21:49:35 +1100
Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> On Tuesday 17 March 2009 21:28:42 Heiko Carstens wrote:
> > > Looks like most of the memory went for dentries and inodes.
> > > slabtop output:
> > >
> > > Active / Total Objects (% used) : 8172165 / 8326954 (98.1%)
> > > Active / Total Slabs (% used) : 903692 / 903698 (100.0%)
> > > Active / Total Caches (% used) : 91 / 144 (63.2%)
> > > Active / Total Size (% used) : 3251262.44K / 3281384.22K (99.1%)
> > > Minimum / Average / Maximum Object : 0.02K / 0.39K / 1024.00K
> > >
> > >    OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > > 3960036 3960017  99%    0.59K 660006        6   2640024K inode_cache
> > > 4137155 3997581  96%    0.20K 217745       19    870980K dentry
> > >   69776   69744  99%    0.80K  17444        4     69776K ext3_inode_cache
> > >   96792   92892  95%    0.10K   2616       37     10464K buffer_head
> > >   10024    9895  98%    0.54K   1432        7      5728K radix_tree_node
> > >    1093    1087  99%    4.00K   1093        1      4372K size-4096
> > >   14805   14711  99%    0.25K    987       15      3948K size-256
> > >    2400    2381  99%    0.80K    480        5      1920K shmem_inode_cache
> >
> > FWIW, after "echo 3 > /proc/sys/vm/drop_caches" it looks like this:
> >
> > Active / Total Objects (% used) : 7965003 / 8153578 (97.7%)
> > Active / Total Slabs (% used) : 882511 / 882511 (100.0%)
> > Active / Total Caches (% used) : 90 / 144 (62.5%)
> > Active / Total Size (% used) : 3173487.59K / 3211091.64K (98.8%)
> > Minimum / Average / Maximum Object : 0.02K / 0.39K / 1024.00K
> >
> >    OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > 3960036 3960007  99%    0.59K 660006        6   2640024K inode_cache
> > 4137155 3962636  95%    0.20K 217745       19    870980K dentry
> >    1097    1097 100%    4.00K   1097        1      4388K size-4096
> >   14805   14667  99%    0.25K    987       15      3948K size-256
> >    2400    2381  99%    0.80K    480        5      1920K shmem_inode_cache
> >    1404    1404 100%    1.00K    351        4      1404K size-1024
> >     152     152 100%    5.59K    152        1      1216K task_struct
> >    1302     347  26%    0.54K    186        7       744K radix_tree_node
> >     370     359  97%    2.00K    185        2       740K size-2048
> >    9381    4316  46%    0.06K    159       59       636K size-64
> >       8       8 100%   64.00K      8        1       512K size-65536
> >
> > So, are we leaking dentries and inodes?
>
> Yes, probably leaking dentries, which pin inodes. I don't know that slab
> leak debugging is going to help you because it won't find what is holding
> the refcount.
>
> Cc linux-fsdevel. Which kernel is this? Config as well.

This is a 2.6.28 kernel with some private patches on top, but none
of them touches fs code.

Hmm... if needed we could retry with a plain vanilla 2.6.28.x kernel.
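
If the test is repeated on a vanilla kernel, one cheap check for pinned versus merely cached dentries is /proc/sys/fs/dentry-state, whose first two fields are the total dentry count and the number of unused (unreferenced, reclaimable) dentries. The Python sketch below is only an illustration under stated assumptions: the helper names are invented, and the SAMPLE string is made-up data, not a reading taken from the affected machine.

```python
from collections import namedtuple

# First four fields of /proc/sys/fs/dentry-state; the file has two
# further placeholder fields, which are ignored here.
DentryState = namedtuple("DentryState", "nr_dentry nr_unused age_limit want_pages")


def parse_dentry_state(text):
    """Parse the whitespace-separated integers of dentry-state."""
    fields = [int(tok) for tok in text.split()]
    return DentryState(*fields[:4])


def in_use_fraction(state):
    """Fraction of dentries that are still referenced, i.e. not on the unused list."""
    if state.nr_dentry == 0:
        return 0.0
    return (state.nr_dentry - state.nr_unused) / state.nr_dentry


# Hypothetical sample, NOT measured on the system in this thread.
SAMPLE = "4000000\t35000\t45\t0\t0\t0\n"

state = parse_dentry_state(SAMPLE)
print(f"dentries: {state.nr_dentry}, unused: {state.nr_unused}, "
      f"in use: {in_use_fraction(state):.1%}")
```

On a live system the text would come from open("/proc/sys/fs/dentry-state").read(). A high in-use fraction that persists after drop_caches would be consistent with something holding references, though it does not by itself identify the holder.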
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.28
# Mon Jan 19 12:43:38 2009
#
CONFIG_SCHED_MC=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_BUG=y
CONFIG_NO_IOMEM=y
CONFIG_NO_DMA=y
CONFIG_GENERIC_LOCKBREAK=y
CONFIG_PGSTE=y
CONFIG_VIRT_CPU_ACCOUNTING=y
CONFIG_S390=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_TREE=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CGROUPS is not set
CONFIG_GROUP_SCHED=y
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_RT_GROUP_SCHED=y
CONFIG_USER_SCHED=y
# CONFIG_CGROUP_SCHED is not set
# CONFIG_SYSFS_DEPRECATED_V2 is not set
CONFIG_RELAY=y
CONFIG_NAMESPACES=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
# CONFIG_USER_NS is not set
# CONFIG_PID_NS is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_COMPAT_BRK=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_AIO=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_PROFILING=y
# CONFIG_MARKERS is not set
CONFIG_OPROFILE=m
CONFIG_HAVE_OPROFILE=y
CONFIG_KPROBES=y
CONFIG_KRETPROBES=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_USE_GENERIC_SMP_HELPERS=y
# CONFIG_HAVE_GENERIC_DMA_COHERENT is not set
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_BLK_DEV_IO_TRACE=y
# CONFIG_BLK_DEV_BSG is not set
# CONFIG_BLK_DEV_INTEGRITY is not set
CONFIG_BLOCK_COMPAT=y
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
CONFIG_DEFAULT_DEADLINE=y
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="deadline"
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_CLASSIC_RCU is not set
# CONFIG_FREEZER is not set
#
# Base setup
#
#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_64BIT=y
CONFIG_SMP=y
CONFIG_NR_CPUS=64
CONFIG_HOTPLUG_CPU=y
CONFIG_COMPAT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_AUDIT_ARCH=y
CONFIG_S390_SWITCH_AMODE=y
CONFIG_S390_EXEC_PROTECT=y
#
# Code generation options
#
# CONFIG_MARCH_G5 is not set
# CONFIG_MARCH_Z900 is not set
CONFIG_MARCH_Z990=y
# CONFIG_MARCH_Z9_109 is not set
# CONFIG_MARCH_Z10 is not set
CONFIG_PACK_STACK=y
CONFIG_SMALL_STACK=y
CONFIG_CHECK_STACK=y
CONFIG_STACK_GUARD=512
# CONFIG_WARN_STACK is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
#
# Kernel preemption
#
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_RCU=y
CONFIG_RCU_TRACE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_DEFAULT=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
CONFIG_SELECT_MEMORY_MODEL=y
# CONFIG_FLATMEM_MANUAL is not set
# CONFIG_DISCONTIGMEM_MANUAL is not set
CONFIG_SPARSEMEM_MANUAL=y
CONFIG_SPARSEMEM=y
CONFIG_HAVE_MEMORY_PRESENT=y
CONFIG_SPARSEMEM_EXTREME=y
CONFIG_SPARSEMEM_VMEMMAP_ENABLE=y
CONFIG_SPARSEMEM_VMEMMAP=y
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTPLUG_SPARSE=y
CONFIG_MEMORY_HOTREMOVE=y
CONFIG_PAGEFLAGS_EXTENDED=y
CONFIG_SPLIT_PTLOCK_CPUS=4
CONFIG_MIGRATION=y
CONFIG_RESOURCES_64BIT=y
CONFIG_PHYS_ADDR_T_64BIT=y
CONFIG_ZONE_DMA_FLAG=1
CONFIG_BOUNCE=y
CONFIG_VIRT_TO_BUS=y
CONFIG_UNEVICTABLE_LRU=y
#
# I/O subsystem configuration
#
CONFIG_MACHCHK_WARNING=y
CONFIG_QDIO=y
CONFIG_CHSC_SCH=m
#
# Misc
#
CONFIG_IPL=y
# CONFIG_IPL_TAPE is not set
CONFIG_IPL_VM=y
CONFIG_BINFMT_ELF=y
CONFIG_COMPAT_BINFMT_ELF=y
# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
# CONFIG_HAVE_AOUT is not set
# CONFIG_BINFMT_MISC is not set
CONFIG_FORCE_MAX_ZONEORDER=9
# CONFIG_PROCESS_DEBUG is not set
CONFIG_PFAULT=y
CONFIG_SHARED_KERNEL=y
CONFIG_CMM=m
CONFIG_CMM_PROC=y
CONFIG_CMM_IUCV=y
CONFIG_PAGE_STATES=y
CONFIG_APPLDATA_BASE=y
CONFIG_APPLDATA_MEM=m
CONFIG_APPLDATA_OS=m
CONFIG_APPLDATA_NET_SUM=m
CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=100
CONFIG_SCHED_HRTICK=y
CONFIG_S390_HYPFS_FS=y
CONFIG_KEXEC=y
CONFIG_ZFCPDUMP=y
CONFIG_S390_GUEST=y
CONFIG_KMSG_IDS=y
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_USER=m
CONFIG_XFRM_SUB_POLICY=y
# CONFIG_XFRM_MIGRATE is not set
CONFIG_XFRM_STATISTICS=y
CONFIG_XFRM_IPCOMP=y
CONFIG_NET_KEY=m
# CONFIG_NET_KEY_MIGRATE is not set
CONFIG_IUCV=y
CONFIG_AFIUCV=m
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_ASK_IP_FIB_HASH=y
# CONFIG_IP_FIB_TRIE is not set
CONFIG_IP_FIB_HASH=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=y
CONFIG_NET_IPGRE=y
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=y
CONFIG_INET_XFRM_TUNNEL=y
CONFIG_INET_TUNNEL=y
CONFIG_INET_XFRM_MODE_TRANSPORT=y
CONFIG_INET_XFRM_MODE_TUNNEL=y
CONFIG_INET_XFRM_MODE_BEET=m
CONFIG_INET_LRO=m
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_TCP_MD5SIG=y
CONFIG_IPV6=y
CONFIG_IPV6_PRIVACY=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
# CONFIG_IPV6_OPTIMISTIC_DAD is not set
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_MIP6=y
CONFIG_INET6_XFRM_TUNNEL=m
CONFIG_INET6_TUNNEL=m
CONFIG_INET6_XFRM_MODE_TRANSPORT=y
CONFIG_INET6_XFRM_MODE_TUNNEL=y
CONFIG_INET6_XFRM_MODE_BEET=m
CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION=m
CONFIG_IPV6_SIT=m
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=m
CONFIG_IPV6_MULTIPLE_TABLES=y
CONFIG_IPV6_SUBTREES=y
# CONFIG_IPV6_MROUTE is not set
CONFIG_NETLABEL=y
# CONFIG_NETWORK_SECMARK is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_NETFILTER_ADVANCED=y
CONFIG_BRIDGE_NETFILTER=y
#
# Core Netfilter Configuration
#
CONFIG_NETFILTER_NETLINK=m
CONFIG_NETFILTER_NETLINK_QUEUE=m
CONFIG_NETFILTER_NETLINK_LOG=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CT_ACCT=y
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_EVENTS=y
# CONFIG_NF_CT_PROTO_DCCP is not set
CONFIG_NF_CT_PROTO_SCTP=m
# CONFIG_NF_CT_PROTO_UDPLITE is not set
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
# CONFIG_NF_CONNTRACK_PPTP is not set
# CONFIG_NF_CONNTRACK_SANE is not set
CONFIG_NF_CONNTRACK_SIP=m
# CONFIG_NF_CONNTRACK_TFTP is not set
CONFIG_NF_CT_NETLINK=m
# CONFIG_NETFILTER_TPROXY is not set
CONFIG_NETFILTER_XTABLES=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
CONFIG_NETFILTER_XT_TARGET_RATEEST=m
CONFIG_NETFILTER_XT_TARGET_TRACE=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_TARGET_TCPOPTSTRIP=m
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_IPRANGE=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_OWNER=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PHYSDEV=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_RATEEST=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
# CONFIG_NETFILTER_XT_MATCH_RECENT is not set
# CONFIG_NETFILTER_XT_MATCH_SCTP is not set
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_TIME=m
CONFIG_NETFILTER_XT_MATCH_U32=m
CONFIG_IP_VS=m
# CONFIG_IP_VS_IPV6 is not set
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=12
#
# IPVS transport protocol load balancing support
#
# CONFIG_IP_VS_PROTO_TCP is not set
# CONFIG_IP_VS_PROTO_UDP is not set
# CONFIG_IP_VS_PROTO_ESP is not set
# CONFIG_IP_VS_PROTO_AH is not set
#
# IPVS scheduler
#
# CONFIG_IP_VS_RR is not set
# CONFIG_IP_VS_WRR is not set
# CONFIG_IP_VS_LC is not set
# CONFIG_IP_VS_WLC is not set
# CONFIG_IP_VS_LBLC is not set
# CONFIG_IP_VS_LBLCR is not set
# CONFIG_IP_VS_DH is not set
# CONFIG_IP_VS_SH is not set
# CONFIG_IP_VS_SED is not set
# CONFIG_IP_VS_NQ is not set
#
# IPVS application helper
#
#
# IP: Netfilter Configuration
#
CONFIG_NF_DEFRAG_IPV4=m
CONFIG_NF_CONNTRACK_IPV4=m
CONFIG_NF_CONNTRACK_PROC_COMPAT=y
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_NF_NAT=m
CONFIG_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_REDIRECT=m
# CONFIG_NF_NAT_SNMP_BASIC is not set
CONFIG_NF_NAT_PROTO_SCTP=m
CONFIG_NF_NAT_FTP=m
CONFIG_NF_NAT_IRC=m
# CONFIG_NF_NAT_TFTP is not set
CONFIG_NF_NAT_AMANDA=m
# CONFIG_NF_NAT_PPTP is not set
CONFIG_NF_NAT_H323=m
CONFIG_NF_NAT_SIP=m
CONFIG_IP_NF_MANGLE=m
# CONFIG_IP_NF_TARGET_CLUSTERIP is not set
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_TTL=m
CONFIG_IP_NF_RAW=m
# CONFIG_IP_NF_SECURITY is not set
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
#
# IPv6: Netfilter Configuration
#
CONFIG_NF_CONNTRACK_IPV6=m
CONFIG_IP6_NF_QUEUE=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_AH=m
CONFIG_IP6_NF_MATCH_EUI64=m
CONFIG_IP6_NF_MATCH_FRAG=m
CONFIG_IP6_NF_MATCH_OPTS=m
CONFIG_IP6_NF_MATCH_HL=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m
CONFIG_IP6_NF_MATCH_MH=m
CONFIG_IP6_NF_MATCH_RT=m
CONFIG_IP6_NF_TARGET_LOG=m
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_REJECT=m
CONFIG_IP6_NF_MANGLE=m
CONFIG_IP6_NF_TARGET_HL=m
CONFIG_IP6_NF_RAW=m
# CONFIG_IP6_NF_SECURITY is not set
CONFIG_BRIDGE_NF_EBTABLES=m
CONFIG_BRIDGE_EBT_BROUTE=m
CONFIG_BRIDGE_EBT_T_FILTER=m
CONFIG_BRIDGE_EBT_T_NAT=m
CONFIG_BRIDGE_EBT_802_3=m
CONFIG_BRIDGE_EBT_AMONG=m
CONFIG_BRIDGE_EBT_ARP=m
CONFIG_BRIDGE_EBT_IP=m
# CONFIG_BRIDGE_EBT_IP6 is not set
CONFIG_BRIDGE_EBT_LIMIT=m
CONFIG_BRIDGE_EBT_MARK=m
CONFIG_BRIDGE_EBT_PKTTYPE=m
CONFIG_BRIDGE_EBT_STP=m
CONFIG_BRIDGE_EBT_VLAN=m
CONFIG_BRIDGE_EBT_ARPREPLY=m
CONFIG_BRIDGE_EBT_DNAT=m
CONFIG_BRIDGE_EBT_MARK_T=m
CONFIG_BRIDGE_EBT_REDIRECT=m
CONFIG_BRIDGE_EBT_SNAT=m
CONFIG_BRIDGE_EBT_LOG=m
CONFIG_BRIDGE_EBT_ULOG=m
CONFIG_BRIDGE_EBT_NFLOG=m
# CONFIG_IP_DCCP is not set
CONFIG_IP_SCTP=m
# CONFIG_SCTP_DBG_MSG is not set
# CONFIG_SCTP_DBG_OBJCNT is not set
CONFIG_SCTP_HMAC_NONE=y
# CONFIG_SCTP_HMAC_SHA1 is not set
# CONFIG_SCTP_HMAC_MD5 is not set
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
CONFIG_STP=y
CONFIG_BRIDGE=y
CONFIG_VLAN_8021Q=y
# CONFIG_VLAN_8021Q_GVRP is not set
# CONFIG_DECNET is not set
CONFIG_LLC=y
CONFIG_LLC2=m
CONFIG_IPX=m
# CONFIG_IPX_INTERN is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_SCHED is not set
CONFIG_NET_CLS_ROUTE=y
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_NET_TCPPROBE is not set
CONFIG_CAN=m
CONFIG_CAN_RAW=m
CONFIG_CAN_BCM=m
#
# CAN Device Drivers
#
CONFIG_CAN_VCAN=m
CONFIG_CAN_DEBUG_DEVICES=y
# CONFIG_AF_RXRPC is not set
# CONFIG_PHONET is not set
CONFIG_FIB_RULES=y
# CONFIG_RFKILL is not set
# CONFIG_NET_9P is not set
# CONFIG_PCMCIA is not set
CONFIG_CCW=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
CONFIG_EXTRA_FIRMWARE=""
# CONFIG_DEBUG_DRIVER is not set
# CONFIG_DEBUG_DEVRES is not set
CONFIG_SYS_HYPERVISOR=y
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
CONFIG_BLK_DEV=y
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=40960
CONFIG_BLK_DEV_XIP=y
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set
#
# S/390 block device drivers
#
CONFIG_BLK_DEV_XPRAM=m
CONFIG_DCSSBLK=y
CONFIG_DASD=y
CONFIG_DASD_PROFILE=y
CONFIG_DASD_ECKD=y
CONFIG_DASD_FBA=y
CONFIG_DASD_DIAG=y
CONFIG_DASD_EER=y
CONFIG_VIRTIO_BLK=y
CONFIG_MISC_DEVICES=y
# CONFIG_EEPROM_93CX6 is not set
CONFIG_ENCLOSURE_SERVICES=m
# CONFIG_C2PORT is not set
#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
# CONFIG_SCSI_DMA is not set
CONFIG_SCSI_TGT=m
CONFIG_SCSI_NETLINK=y
# CONFIG_SCSI_PROC_FS is not set
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=y
CONFIG_CHR_DEV_OSST=y
CONFIG_BLK_DEV_SR=y
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=y
CONFIG_CHR_DEV_SCH=m
CONFIG_SCSI_ENCLOSURE=m
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_SCAN_ASYNC=y
CONFIG_SCSI_WAIT_SCAN=m
#
# SCSI Transports
#
# CONFIG_SCSI_SPI_ATTRS is not set
CONFIG_SCSI_FC_ATTRS=y
CONFIG_SCSI_ISCSI_ATTRS=m
CONFIG_SCSI_SAS_ATTRS=m
CONFIG_SCSI_SAS_LIBSAS=m
CONFIG_SCSI_SAS_HOST_SMP=y
CONFIG_SCSI_SAS_LIBSAS_DEBUG=y
CONFIG_SCSI_SRP_ATTRS=m
CONFIG_SCSI_SRP_TGT_ATTRS=y
CONFIG_SCSI_LOWLEVEL=y
CONFIG_ISCSI_TCP=m
# CONFIG_SCSI_DEBUG is not set
CONFIG_ZFCP=y
# CONFIG_SCSI_DH is not set
CONFIG_MD=y
CONFIG_BLK_DEV_MD=m
# CONFIG_MD_LINEAR is not set
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
# CONFIG_MD_RAID456 is not set
CONFIG_MD_MULTIPATH=m
# CONFIG_MD_FAULTY is not set
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_DEBUG is not set
CONFIG_DM_CRYPT=m
CONFIG_DM_SNAPSHOT=m
CONFIG_DM_MIRROR=m
CONFIG_DM_ZERO=m
CONFIG_DM_MULTIPATH=m
# CONFIG_DM_DELAY is not set
CONFIG_DM_UEVENT=y
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_BONDING=m
# CONFIG_MACVLAN is not set
# CONFIG_EQUALIZER is not set
CONFIG_TUN=y
CONFIG_VETH=m
CONFIG_NET_ETHERNET=y
# CONFIG_MII is not set
# CONFIG_IBM_NEW_EMAC_ZMII is not set
# CONFIG_IBM_NEW_EMAC_RGMII is not set
# CONFIG_IBM_NEW_EMAC_TAH is not set
# CONFIG_IBM_NEW_EMAC_EMAC4 is not set
# CONFIG_IBM_NEW_EMAC_NO_FLOW_CTRL is not set
# CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT is not set
# CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR is not set
CONFIG_NETDEV_1000=y
CONFIG_NETDEV_10000=y
CONFIG_TR=y
# CONFIG_WAN is not set
#
# S/390 network device drivers
#
CONFIG_LCS=m
CONFIG_CTCM=m
CONFIG_NETIUCV=m
CONFIG_SMSGIUCV=m
CONFIG_CLAW=m
CONFIG_QETH=m
CONFIG_QETH_L2=m
CONFIG_QETH_L3=m
CONFIG_QETH_IPV6=y
CONFIG_CCWGROUP=m
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NETCONSOLE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NET_POLL_CONTROLLER is not set
CONFIG_VIRTIO_NET=y
#
# Character devices
#
CONFIG_DEVKMEM=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_HVC_DRIVER=y
CONFIG_HVC_IUCV=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_HW_RANDOM=m
# CONFIG_HW_RANDOM_VIRTIO is not set
# CONFIG_R3964 is not set
CONFIG_RAW_DRIVER=m
CONFIG_MAX_RAW_DEVS=256
CONFIG_HANGCHECK_TIMER=m
#
# S/390 character device drivers
#
CONFIG_TN3270=y
CONFIG_TN3270_TTY=y
CONFIG_TN3270_FS=y
CONFIG_TN3270_CONSOLE=y
CONFIG_TN3215=y
CONFIG_TN3215_CONSOLE=y
CONFIG_CCW_CONSOLE=y
CONFIG_SCLP_TTY=y
CONFIG_SCLP_CONSOLE=y
CONFIG_SCLP_VT220_TTY=y
CONFIG_SCLP_VT220_CONSOLE=y
CONFIG_SCLP_CPI=m
CONFIG_SCLP_ASYNC=m
CONFIG_S390_TAPE=m
#
# S/390 tape interface support
#
CONFIG_S390_TAPE_BLOCK=y
#
# S/390 tape hardware support
#
CONFIG_S390_TAPE_34XX=m
CONFIG_S390_TAPE_3590=m
CONFIG_VMLOGRDR=m
CONFIG_VMCP=m
CONFIG_MONREADER=m
CONFIG_MONWRITER=m
CONFIG_S390_VMUR=m
# CONFIG_POWER_SUPPLY is not set
CONFIG_THERMAL=y
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
CONFIG_ZVM_WATCHDOG=m
# CONFIG_REGULATOR is not set
CONFIG_MEMSTICK=m
CONFIG_MEMSTICK_DEBUG=y
#
# MemoryStick drivers
#
CONFIG_MEMSTICK_UNSAFE_RESUME=y
CONFIG_MSPRO_BLOCK=m
#
# MemoryStick Host Controller Drivers
#
# CONFIG_NEW_LEDS is not set
# CONFIG_ACCESSIBILITY is not set
# CONFIG_STAGING is not set
#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
CONFIG_EXT2_FS_XIP=y
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
# CONFIG_EXT4_FS is not set
CONFIG_FS_XIP=y
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_JBD2=m
CONFIG_JBD2_DEBUG=y
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=m
# CONFIG_REISERFS_CHECK is not set
CONFIG_REISERFS_PROC_INFO=y
CONFIG_REISERFS_FS_XATTR=y
CONFIG_REISERFS_FS_POSIX_ACL=y
# CONFIG_REISERFS_FS_SECURITY is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
CONFIG_FILE_LOCKING=y
CONFIG_XFS_FS=m
# CONFIG_XFS_QUOTA is not set
CONFIG_XFS_POSIX_ACL=y
# CONFIG_XFS_RT is not set
# CONFIG_XFS_DEBUG is not set
CONFIG_GFS2_FS=m
CONFIG_GFS2_FS_LOCKING_DLM=m
CONFIG_OCFS2_FS=m
CONFIG_OCFS2_FS_O2CB=m
CONFIG_OCFS2_FS_USERSPACE_CLUSTER=m
# CONFIG_OCFS2_FS_STATS is not set
CONFIG_OCFS2_DEBUG_MASKLOG=y
# CONFIG_OCFS2_DEBUG_FS is not set
# CONFIG_OCFS2_COMPAT_JBD is not set
CONFIG_DNOTIFY=y
CONFIG_INOTIFY=y
CONFIG_INOTIFY_USER=y
CONFIG_QUOTA=y
# CONFIG_QUOTA_NETLINK_INTERFACE is not set
CONFIG_PRINT_QUOTA_WARNING=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=m
CONFIG_QUOTACTL=y
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set
CONFIG_FUSE_FS=m
CONFIG_GENERIC_ACL=y
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
# CONFIG_MSDOS_FS is not set
# CONFIG_VFAT_FS is not set
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_PROC_SYSCTL=y
CONFIG_PROC_PAGE_MONITOR=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_CONFIGFS_FS=m
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_OMFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
CONFIG_NETWORK_FILESYSTEMS=y
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_NFSD=m
CONFIG_NFSD_V2_ACL=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V3_ACL=y
# CONFIG_NFSD_V4 is not set
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=m
CONFIG_NFS_ACL_SUPPORT=m
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=m
CONFIG_SUNRPC_GSS=m
# CONFIG_SUNRPC_REGISTER_V4 is not set
CONFIG_RPCSEC_GSS_KRB5=m
CONFIG_RPCSEC_GSS_SPKM3=m
CONFIG_SMB_FS=m
CONFIG_SMB_NLS_DEFAULT=y
CONFIG_SMB_NLS_REMOTE="cp437"
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
CONFIG_IBM_PARTITION=y
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
# CONFIG_MINIX_SUBPARTITION is not set
# CONFIG_SOLARIS_X86_PARTITION is not set
# CONFIG_UNIXWARE_DISKLABEL is not set
# CONFIG_LDM_PARTITION is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
CONFIG_SUN_PARTITION=y
# CONFIG_KARMA_PARTITION is not set
# CONFIG_EFI_PARTITION is not set
# CONFIG_SYSV68_PARTITION is not set
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
CONFIG_NLS_CODEPAGE_850=m
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
CONFIG_NLS_ISO8859_15=y
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
CONFIG_NLS_UTF8=m
CONFIG_DLM=m
CONFIG_DLM_DEBUG=y
#
# Kernel hacking
#
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
# CONFIG_PRINTK_TIME is not set
CONFIG_ENABLE_WARN_DEPRECATED=y
CONFIG_ENABLE_MUST_CHECK=y
CONFIG_FRAME_WARN=2048
CONFIG_MAGIC_SYSRQ=y
# CONFIG_UNUSED_SYMBOLS is not set
CONFIG_DEBUG_FS=y
CONFIG_HEADERS_CHECK=y
CONFIG_DEBUG_KERNEL=y
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
# CONFIG_TIMER_STATS is not set
# CONFIG_DEBUG_OBJECTS is not set
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_PREEMPT is not set
# CONFIG_DEBUG_RT_MUTEXES is not set
# CONFIG_RT_MUTEX_TESTER is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_MUTEXES is not set
# CONFIG_DEBUG_LOCK_ALLOC is not set
# CONFIG_PROVE_LOCKING is not set
# CONFIG_LOCK_STAT is not set
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
# CONFIG_DEBUG_VM is not set
# CONFIG_DEBUG_WRITECOUNT is not set
CONFIG_DEBUG_MEMORY_INIT=y
CONFIG_DEBUG_LIST=y
# CONFIG_DEBUG_SG is not set
CONFIG_FRAME_POINTER=y
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_KPROBES_SANITY_TEST=y
CONFIG_BACKTRACE_SELF_TEST=m
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
CONFIG_LKDTM=m
# CONFIG_FAULT_INJECTION is not set
CONFIG_LATENCYTOP=y
CONFIG_SYSCTL_SYSCALL_CHECK=y
CONFIG_HAVE_FUNCTION_TRACER=y
#
# Tracers
#
# CONFIG_FUNCTION_TRACER is not set
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_PREEMPT_TRACER is not set
# CONFIG_SCHED_TRACER is not set
# CONFIG_CONTEXT_SWITCH_TRACER is not set
# CONFIG_BOOT_TRACER is not set
# CONFIG_STACK_TRACER is not set
# CONFIG_BUILD_DOCSRC is not set
# CONFIG_DYNAMIC_PRINTK_DEBUG is not set
# CONFIG_SAMPLES is not set
# CONFIG_DEBUG_PAGEALLOC is not set
#
# Security options
#
# CONFIG_KEYS is not set
CONFIG_SECURITY=y
# CONFIG_SECURITYFS is not set
# CONFIG_SECURITY_NETWORK is not set
# CONFIG_SECURITY_FILE_CAPABILITIES is not set
CONFIG_SECURITY_DEFAULT_MMAP_MIN_ADDR=0
CONFIG_CRYPTO=y
#
# Crypto core or helper
#
# CONFIG_CRYPTO_FIPS is not set
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD=m
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=y
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG=m
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
CONFIG_CRYPTO_GF128MUL=m
CONFIG_CRYPTO_NULL=m
# CONFIG_CRYPTO_CRYPTD is not set
CONFIG_CRYPTO_AUTHENC=m
CONFIG_CRYPTO_TEST=m
#
# Authenticated Encryption with Associated Data
#
CONFIG_CRYPTO_CCM=m
CONFIG_CRYPTO_GCM=m
CONFIG_CRYPTO_SEQIV=m
#
# Block modes
#
CONFIG_CRYPTO_CBC=y
CONFIG_CRYPTO_CTR=m
# CONFIG_CRYPTO_CTS is not set
CONFIG_CRYPTO_ECB=y
CONFIG_CRYPTO_LRW=m
CONFIG_CRYPTO_PCBC=m
CONFIG_CRYPTO_XTS=m
#
# Hash modes
#
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_XCBC=m
#
# Digest
#
CONFIG_CRYPTO_CRC32C=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_MICHAEL_MIC=m
CONFIG_CRYPTO_RMD128=m
CONFIG_CRYPTO_RMD160=m
CONFIG_CRYPTO_RMD256=m
CONFIG_CRYPTO_RMD320=m
CONFIG_CRYPTO_SHA1=m
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
CONFIG_CRYPTO_TGR192=m
CONFIG_CRYPTO_WP512=m
#
# Ciphers
#
CONFIG_CRYPTO_AES=m
# CONFIG_CRYPTO_ANUBIS is not set
CONFIG_CRYPTO_ARC4=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_CAMELLIA=m
CONFIG_CRYPTO_CAST5=m
CONFIG_CRYPTO_CAST6=m
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_FCRYPT=m
CONFIG_CRYPTO_KHAZAD=m
CONFIG_CRYPTO_SALSA20=m
CONFIG_CRYPTO_SEED=m
CONFIG_CRYPTO_SERPENT=m
CONFIG_CRYPTO_TEA=m
CONFIG_CRYPTO_TWOFISH=m
CONFIG_CRYPTO_TWOFISH_COMMON=m
#
# Compression
#
CONFIG_CRYPTO_DEFLATE=y
CONFIG_CRYPTO_LZO=m
#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
CONFIG_CRYPTO_HW=y
CONFIG_ZCRYPT=m
# CONFIG_ZCRYPT_MONOLITHIC is not set
CONFIG_CRYPTO_SHA1_S390=m
CONFIG_CRYPTO_SHA256_S390=m
CONFIG_CRYPTO_SHA512_S390=m
CONFIG_CRYPTO_DES_S390=m
CONFIG_CRYPTO_AES_S390=m
CONFIG_S390_PRNG=m
#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
CONFIG_CRC_T10DIF=m
CONFIG_CRC_ITU_T=y
CONFIG_CRC32=y
CONFIG_CRC7=m
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_LZO_COMPRESS=m
CONFIG_LZO_DECOMPRESS=m
CONFIG_TEXTSEARCH=y
CONFIG_TEXTSEARCH_KMP=m
CONFIG_TEXTSEARCH_BM=m
CONFIG_TEXTSEARCH_FSM=m
CONFIG_PLIST=y
CONFIG_HAVE_KVM=y
CONFIG_VIRTUALIZATION=y
CONFIG_KVM=y
CONFIG_VIRTIO=y
CONFIG_VIRTIO_RING=y
CONFIG_VIRTIO_BALLOON=y
* Re: oom-killer killing even if memory is available?
2009-03-17 10:49 ` oom-killer killing even if memory is available? Nick Piggin
2009-03-17 11:39 ` Heiko Carstens
@ 2009-03-20 5:08 ` Wu Fengguang
1 sibling, 0 replies; 3+ messages in thread
From: Wu Fengguang @ 2009-03-20 5:08 UTC (permalink / raw)
To: Nick Piggin
Cc: Heiko Carstens, linux-fsdevel, Andrew Morton, linux-mm,
Mel Gorman, Nick Piggin, Martin Schwidefsky, Andreas Krebbel
[-- Attachment #1: Type: text/plain, Size: 4864 bytes --]
On Tue, Mar 17, 2009 at 09:49:35PM +1100, Nick Piggin wrote:
> On Tuesday 17 March 2009 21:28:42 Heiko Carstens wrote:
> > On Tue, 17 Mar 2009 11:17:38 +0100
> >
> > Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
> > > On Tue, 17 Mar 2009 02:46:05 -0700
> > >
> > > Andrew Morton <akpm@linux-foundation.org> wrote:
> > > > > Mar 16 21:40:40 t6360003 kernel: Active_anon:372 active_file:45
> > > > > inactive_anon:154 Mar 16 21:40:40 t6360003 kernel: inactive_file:152
> > > > > unevictable:987 dirty:0 writeback:188 unstable:0 Mar 16 21:40:40
> > > > > t6360003 kernel: free:146348 slab:875833 mapped:805 pagetables:378
> > > > > bounce:0 Mar 16 21:40:40 t6360003 kernel: DMA free:467728kB
> > > > > min:4064kB low:5080kB high:6096kB active_anon:0kB inactive_anon:0kB
> > > > > active_file:0kB inactive_file:116kB unevictable:0kB present:2068480kB
> > > > > pages_scanned:0 all_unreclaimable? no Mar 16 21:40:40 t6360003
> > > > > kernel: lowmem_reserve[]: 0 2020 2020 Mar 16 21:40:40 t6360003
> > > > > kernel: Normal free:117664kB min:4064kB low:5080kB high:6096kB
> > > > > active_anon:1488kB inactive_anon:616kB active_file:188kB
> > > > > inactive_file:492kB unevictable:3948kB present:2068480kB
> > > > > pages_scanned:128 all_unreclaimable? no Mar 16 21:40:40 t6360003
> > > > > kernel: lowmem_reserve[]: 0 0 0
> > > >
> > > > The scanner has wrung pretty much all it can out of the reclaimable
> > > > pages - the LRUs are nearly empty. There's a few hundred MB free and
> > > > apparently we don't have four physically contiguous free pages
> > > > anywhere. It's believeable.
> > > >
> > > > The question is: where the heck did all your memory go? You have 2GB
> > > > of ZONE_NORMAL memory in that machine, but only a tenth of it is
> > > > visible to the page reclaim code.
> > > >
> > > > Something must have allocated (and possibly leaked) it.
> > >
> > > Looks like most of the memory went for dentries and inodes.
> > > slabtop output:
> > >
> > > Active / Total Objects (% used) : 8172165 / 8326954 (98.1%)
> > > Active / Total Slabs (% used) : 903692 / 903698 (100.0%)
> > > Active / Total Caches (% used) : 91 / 144 (63.2%)
> > > Active / Total Size (% used) : 3251262.44K / 3281384.22K (99.1%)
> > > Minimum / Average / Maximum Object : 0.02K / 0.39K / 1024.00K
> > >
> > > OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
> > > 3960036 3960017 99% 0.59K 660006 6 2640024K inode_cache
> > > 4137155 3997581 96% 0.20K 217745 19 870980K dentry
> > > 69776 69744 99% 0.80K 17444 4 69776K ext3_inode_cache
> > > 96792 92892 95% 0.10K 2616 37 10464K buffer_head
> > > 10024 9895 98% 0.54K 1432 7 5728K radix_tree_node
> > > 1093 1087 99% 4.00K 1093 1 4372K size-4096
> > > 14805 14711 99% 0.25K 987 15 3948K size-256
> > > 2400 2381 99% 0.80K 480 5 1920K shmem_inode_cache
> >
> > FWIW, after "echo 3 > /proc/sys/vm/drop_caches" it looks like this:
> >
> > Active / Total Objects (% used) : 7965003 / 8153578 (97.7%)
> > Active / Total Slabs (% used) : 882511 / 882511 (100.0%)
> > Active / Total Caches (% used) : 90 / 144 (62.5%)
> > Active / Total Size (% used) : 3173487.59K / 3211091.64K (98.8%)
> > Minimum / Average / Maximum Object : 0.02K / 0.39K / 1024.00K
> >
> > OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
> > 3960036 3960007 99% 0.59K 660006 6 2640024K inode_cache
> > 4137155 3962636 95% 0.20K 217745 19 870980K dentry
> > 1097 1097 100% 4.00K 1097 1 4388K size-4096
> > 14805 14667 99% 0.25K 987 15 3948K size-256
> > 2400 2381 99% 0.80K 480 5 1920K shmem_inode_cache
> > 1404 1404 100% 1.00K 351 4 1404K size-1024
> > 152 152 100% 5.59K 152 1 1216K task_struct
> > 1302 347 26% 0.54K 186 7 744K radix_tree_node
> > 370 359 97% 2.00K 185 2 740K size-2048
> > 9381 4316 46% 0.06K 159 59 636K size-64
> > 8 8 100% 64.00K 8 1 512K size-65536
> >
> > So, are we leaking dentries and inodes?
>
> Yes, probably leaking dentries, which pin inodes. I don't know that slab
> leak debugging is going to help you because it won't find what is holding
> the refcount.
Heiko, what's the output of `lsof`?
The attached filecache patch may also help debugging.
Usage:
	# run patched kernel, with CONFIG_PROC_FILECACHE and CONFIG_PROC_FILECACHE_EXTRAS
	modprobe filecache
	echo ls all > /proc/filecache
	cp /proc/filecache filecache-`date +'%F'`
This will dump all the cached inodes with their file name, refcount and creator.
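If the dump turns out to be large, a small userspace helper can tally how many cached inodes each creating process accounts for. What follows is only a rough sketch and not part of the patch. It assumes the column order printed by ii_show() when CONFIG_PROC_FILECACHE_EXTRAS is enabled (ino, size, cached, cached%, refcnt, state, accessed, uid, process), and that the process name contains no whitespace. The names fc_row, fc_parse_row, fc_count and fc_tally_stream are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed row of a /proc/filecache listing (EXTRAS columns assumed). */
struct fc_row {
	unsigned long ino;
	unsigned long long size_kb;
	unsigned long cached_kb;
	int percent;
	int refcnt;
	char state[8];
	unsigned int accessed;
	unsigned int uid;
	char comm[17];
};

/* Hypothetical per-creator counter; a fixed-size table keeps the sketch short. */
#define MAX_CREATORS 256
struct fc_tally {
	char comm[17];
	unsigned long inodes;
};

static struct fc_tally tally[MAX_CREATORS];
static int ntally;

/* Returns 1 if the line parsed as a data row, 0 for headers or anything else. */
int fc_parse_row(const char *line, struct fc_row *r)
{
	if (line[0] == '#')
		return 0;
	return sscanf(line, "%lu %llu %lu %d %d %7s %u %u %16s",
		      &r->ino, &r->size_kb, &r->cached_kb, &r->percent,
		      &r->refcnt, r->state, &r->accessed, &r->uid,
		      r->comm) == 9;
}

/* Count one more inode for the given creator; new names are dropped once the table is full. */
void fc_count(const char *comm)
{
	int i;

	assert(comm != NULL);
	for (i = 0; i < ntally; i++) {
		if (strcmp(tally[i].comm, comm) == 0) {
			tally[i].inodes++;
			return;
		}
	}
	if (ntally < MAX_CREATORS) {
		snprintf(tally[ntally].comm, sizeof(tally[ntally].comm),
			 "%s", comm);
		tally[ntally].inodes = 1;
		ntally++;
	}
}

/* Read a saved dump line by line and tally every row that parses. */
void fc_tally_stream(FILE *in)
{
	char line[8192];
	struct fc_row r;
	int at_line_start = 1;

	while (fgets(line, sizeof(line), in)) {
		size_t len = strlen(line);

		/* Skip continuation chunks of over-long lines. */
		if (at_line_start && fc_parse_row(line, &r))
			fc_count(r.comm);
		at_line_start = (len > 0 && line[len - 1] == '\n');
	}
}
```

A main() that passes an fopen()ed copy of the saved dump to fc_tally_stream() and then prints the tally[] entries should show at a glance whether one process created most of the inodes that are still being held.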
Thanks,
Fengguang
[-- Attachment #2: filecache-2.6.28.patch --]
[-- Type: text/x-diff, Size: 33812 bytes --]
--- linux-2.6.orig/include/linux/mm.h
+++ linux-2.6/include/linux/mm.h
@@ -27,6 +27,7 @@ extern unsigned long max_mapnr;
extern unsigned long num_physpages;
extern void * high_memory;
extern int page_cluster;
+extern char * const zone_names[];
#ifdef CONFIG_SYSCTL
extern int sysctl_legacy_va_layout;
--- linux-2.6.orig/mm/page_alloc.c
+++ linux-2.6/mm/page_alloc.c
@@ -104,7 +104,7 @@ int sysctl_lowmem_reserve_ratio[MAX_NR_Z
EXPORT_SYMBOL(totalram_pages);
-static char * const zone_names[MAX_NR_ZONES] = {
+char * const zone_names[MAX_NR_ZONES] = {
#ifdef CONFIG_ZONE_DMA
"DMA",
#endif
--- linux-2.6.orig/fs/dcache.c
+++ linux-2.6/fs/dcache.c
@@ -1943,7 +1943,10 @@ char *__d_path(const struct path *path,
if (dentry == root->dentry && vfsmnt == root->mnt)
break;
- if (dentry == vfsmnt->mnt_root || IS_ROOT(dentry)) {
+ if (unlikely(!vfsmnt)) {
+ if (IS_ROOT(dentry))
+ break;
+ } else if (dentry == vfsmnt->mnt_root || IS_ROOT(dentry)) {
/* Global root? */
if (vfsmnt->mnt_parent == vfsmnt) {
goto global_root;
--- linux-2.6.orig/lib/radix-tree.c
+++ linux-2.6/lib/radix-tree.c
@@ -564,7 +564,6 @@ out:
}
EXPORT_SYMBOL(radix_tree_tag_clear);
-#ifndef __KERNEL__ /* Only the test harness uses this at present */
/**
* radix_tree_tag_get - get a tag on a radix tree node
* @root: radix tree root
@@ -627,7 +626,6 @@ int radix_tree_tag_get(struct radix_tree
}
}
EXPORT_SYMBOL(radix_tree_tag_get);
-#endif
/**
* radix_tree_next_hole - find the next hole (not-present entry)
--- linux-2.6.orig/fs/inode.c
+++ linux-2.6/fs/inode.c
@@ -82,6 +82,10 @@ static struct hlist_head *inode_hashtabl
*/
DEFINE_SPINLOCK(inode_lock);
+EXPORT_SYMBOL(inode_in_use);
+EXPORT_SYMBOL(inode_unused);
+EXPORT_SYMBOL(inode_lock);
+
/*
* iprune_mutex provides exclusion between the kswapd or try_to_free_pages
* icache shrinking path, and the umount path. Without this exclusion,
@@ -108,6 +112,14 @@ static void wake_up_inode(struct inode *
wake_up_bit(&inode->i_state, __I_LOCK);
}
+static inline void inode_created_by(struct inode *inode, struct task_struct *task)
+{
+#ifdef CONFIG_PROC_FILECACHE_EXTRAS
+ inode->i_cuid = task->uid;
+ memcpy(inode->i_comm, task->comm, sizeof(task->comm));
+#endif
+}
+
static struct inode *alloc_inode(struct super_block *sb)
{
static const struct address_space_operations empty_aops;
@@ -183,6 +195,7 @@ static struct inode *alloc_inode(struct
}
inode->i_private = NULL;
inode->i_mapping = mapping;
+ inode_created_by(inode, current);
}
return inode;
}
@@ -247,6 +260,8 @@ void __iget(struct inode * inode)
inodes_stat.nr_unused--;
}
+EXPORT_SYMBOL(__iget);
+
/**
* clear_inode - clear an inode
* @inode: inode to clear
@@ -1353,6 +1368,16 @@ void inode_double_unlock(struct inode *i
}
EXPORT_SYMBOL(inode_double_unlock);
+
+struct hlist_head * get_inode_hash_budget(unsigned long index)
+{
+ if (index >= (1 << i_hash_shift))
+ return NULL;
+
+ return inode_hashtable + index;
+}
+EXPORT_SYMBOL_GPL(get_inode_hash_budget);
+
static __initdata unsigned long ihash_entries;
static int __init set_ihash_entries(char *str)
{
--- linux-2.6.orig/fs/super.c
+++ linux-2.6/fs/super.c
@@ -45,6 +45,9 @@
LIST_HEAD(super_blocks);
DEFINE_SPINLOCK(sb_lock);
+EXPORT_SYMBOL(super_blocks);
+EXPORT_SYMBOL(sb_lock);
+
/**
* alloc_super - create new superblock
* @type: filesystem type superblock should belong to
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -230,6 +230,7 @@ unsigned long shrink_slab(unsigned long
up_read(&shrinker_rwsem);
return ret;
}
+EXPORT_SYMBOL(shrink_slab);
/* Called without lock on whether page is mapped, so answer is unstable */
static inline int page_mapping_inuse(struct page *page)
--- linux-2.6.orig/mm/swap_state.c
+++ linux-2.6/mm/swap_state.c
@@ -44,6 +44,7 @@ struct address_space swapper_space = {
.i_mmap_nonlinear = LIST_HEAD_INIT(swapper_space.i_mmap_nonlinear),
.backing_dev_info = &swap_backing_dev_info,
};
+EXPORT_SYMBOL_GPL(swapper_space);
#define INC_CACHE_INFO(x) do { swap_cache_info.x++; } while (0)
--- linux-2.6.orig/Documentation/filesystems/proc.txt
+++ linux-2.6/Documentation/filesystems/proc.txt
@@ -266,6 +266,7 @@ Table 1-4: Kernel info in /proc
driver Various drivers grouped here, currently rtc (2.4)
execdomains Execdomains, related to security (2.4)
fb Frame Buffer devices (2.4)
+ filecache Query/drop in-memory file cache
fs File system parameters, currently nfs/exports (2.4)
ide Directory containing info about the IDE subsystem
interrupts Interrupt usage
@@ -456,6 +457,88 @@ varies by architecture and compile optio
> cat /proc/meminfo
+..............................................................................
+
+filecache:
+
+Provides access to the in-memory file cache.
+
+To list an index of all cached files:
+
+ echo ls > /proc/filecache
+ cat /proc/filecache
+
+The output looks like:
+
+ # filecache 1.0
+ # ino size cached cached% state refcnt dev file
+ 1026334 91 92 100 -- 66 03:02(hda2) /lib/ld-2.3.6.so
+ 233608 1242 972 78 -- 66 03:02(hda2) /lib/tls/libc-2.3.6.so
+ 65203 651 476 73 -- 1 03:02(hda2) /bin/bash
+ 1026445 261 160 61 -- 10 03:02(hda2) /lib/libncurses.so.5.5
+ 235427 10 12 100 -- 44 03:02(hda2) /lib/tls/libdl-2.3.6.so
+
+FIELD INTRO
+---------------------------------------------------------------------------
+ino inode number
+size inode size in KB
+cached cached size in KB
+cached% percent of file data cached
+state1 '-' clean; 'd' metadata dirty; 'D' data dirty
+state2 '-' unlocked; 'L' locked, normally indicates file being written out
+refcnt file reference count, it's an in-kernel one, not exactly open count
+dev major:minor numbers in hex, followed by a descriptive device name
+file file path _inside_ the filesystem. There are several special names:
+ '(noname)': the file name is not available
+ '(03:02)': the file is a block device file of major:minor
+ '...(deleted)': the named file has been deleted from the disk
+
+To list the cached pages of a particular file:
+
+ echo /bin/bash > /proc/filecache
+ cat /proc/filecache
+
+ # file /bin/bash
+ # flags R:referenced A:active U:uptodate D:dirty W:writeback M:mmap
+ # idx len state refcnt
+ 0 36 RAU__M 3
+ 36 1 RAU__M 2
+ 37 8 RAU__M 3
+ 45 2 RAU___ 1
+ 47 6 RAU__M 3
+ 53 3 RAU__M 2
+ 56 2 RAU__M 3
+
+FIELD INTRO
+----------------------------------------------------------------------------
+idx page index
+len number of pages which are cached and share the same state
+state page state of the flags listed in line two
+refcnt page reference count
+
+Careful users may notice that the file name to be queried is remembered between
+commands. Internally, the module has a global variable to store the file name
+parameter, so that it can be inherited by newly opened /proc/filecache file.
+However it can lead to interference for multiple queriers. The solution here
+is to obey a rule: only root can interactively change the file name parameter;
+normal users must go for scripts to access the interface. Scripts should do it
+by following the code example below:
+
+ filecache = open("/proc/filecache", "rw");
+ # avoid polluting the global parameter filename
+ filecache.write("set private");
+
+To instruct the kernel to drop clean caches, dentries and inodes from memory,
+causing that memory to become free:
+
+ # drop clean file data cache (i.e. file backed pagecache)
+ echo drop pagecache > /proc/filecache
+
+ # drop clean file metadata cache (i.e. dentries and inodes)
+ echo drop slabcache > /proc/filecache
+
+Note that the drop commands are non-destructive operations: dirty objects are
+not freeable, so the user should run `sync' first.
MemTotal: 16344972 kB
MemFree: 13634064 kB
--- /dev/null
+++ linux-2.6/fs/proc/filecache.c
@@ -0,0 +1,1045 @@
+/*
+ * fs/proc/filecache.c
+ *
+ * Copyright (C) 2006, 2007 Fengguang Wu <wfg@mail.ustc.edu.cn>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/radix-tree.h>
+#include <linux/page-flags.h>
+#include <linux/pagevec.h>
+#include <linux/pagemap.h>
+#include <linux/vmalloc.h>
+#include <linux/writeback.h>
+#include <linux/buffer_head.h>
+#include <linux/parser.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#include <linux/file.h>
+#include <linux/namei.h>
+#include <linux/module.h>
+#include <asm/uaccess.h>
+
+/*
+ * Increase minor version when new columns are added;
+ * Increase major version when existing columns are changed.
+ */
+#define FILECACHE_VERSION "1.0"
+
+/* Internal buffer sizes. The larger the more efficient. */
+#define SBUF_SIZE (128<<10)
+#define IWIN_PAGE_ORDER 3
+#define IWIN_SIZE ((PAGE_SIZE<<IWIN_PAGE_ORDER) / sizeof(struct inode *))
+
+/*
+ * Session management.
+ *
+ * Each opened /proc/filecache file is associated with a session object.
+ * Also there is a global_session that maintains status across open()/close()
+ * (i.e. the lifetime of an opened file), so that a casual user can query the
+ * filecache via _multiple_ simple shell commands like
+ * 'echo cat /bin/bash > /proc/filecache; cat /proc/filecache'.
+ *
+ * session.query_file is the file whose cache info is to be queried.
+ * Its value determines what we get on read():
+ * - NULL: ii_*() called to show the inode index
+ * - filp: pg_*() called to show the page groups of a filp
+ *
+ * session.query_file is
+ * - cloned from global_session.query_file on open();
+ * - updated on write("cat filename");
+ * note that the new file will also be saved in global_session.query_file if
+ * session.private_session is false.
+ */
+
+struct session {
+ /* options */
+ int private_session;
+ unsigned long ls_options;
+ dev_t ls_dev;
+
+ /* parameters */
+ struct file *query_file;
+
+ /* seqfile pos */
+ pgoff_t start_offset;
+ pgoff_t next_offset;
+
+ /* inode at last pos */
+ struct {
+ unsigned long pos;
+ unsigned long state;
+ struct inode *inode;
+ struct inode *pinned_inode;
+ } ipos;
+
+ /* inode window */
+ struct {
+ unsigned long cursor;
+ unsigned long origin;
+ unsigned long size;
+ struct inode **inodes;
+ } iwin;
+};
+
+static struct session global_session;
+
+/*
+ * Session address is stored in proc_file->f_ra.start:
+ * we assume that there will be no readahead for proc_file.
+ */
+static struct session *get_session(struct file *proc_file)
+{
+ return (struct session *)proc_file->f_ra.start;
+}
+
+static void set_session(struct file *proc_file, struct session *s)
+{
+ BUG_ON(proc_file->f_ra.start);
+ proc_file->f_ra.start = (unsigned long)s;
+}
+
+static void update_global_file(struct session *s)
+{
+ if (s->private_session)
+ return;
+
+ if (global_session.query_file)
+ fput(global_session.query_file);
+
+ global_session.query_file = s->query_file;
+
+ if (global_session.query_file)
+ get_file(global_session.query_file);
+}
+
+/*
+ * Cases of the name:
+ * 1) NULL (ls/la, or quit)
+ * s->query_file = global_session.query_file = 0;
+ * 2) "" (new session)
+ * s->query_file = global_session.query_file;
+ * 3) a regular file name (cat newfile)
+ * s->query_file = global_session.query_file = newfile;
+ */
+static int session_update_file(struct session *s, char *name)
+{
+ static DEFINE_MUTEX(mutex); /* protects global_session.query_file */
+ int err = 0;
+
+ mutex_lock(&mutex);
+
+ /*
+ * We are to quit, or to list the cached files.
+ * Reset *.query_file.
+ */
+ if (!name) {
+ if (s->query_file) {
+ fput(s->query_file);
+ s->query_file = NULL;
+ }
+ update_global_file(s);
+ goto out;
+ }
+
+ /*
+ * This is a new session.
+ * Inherit options/parameters from global ones.
+ */
+ if (name[0] == '\0') {
+ *s = global_session;
+ if (s->query_file)
+ get_file(s->query_file);
+ goto out;
+ }
+
+ /*
+ * Open the named file.
+ */
+ if (s->query_file)
+ fput(s->query_file);
+ s->query_file = filp_open(name, O_RDONLY|O_LARGEFILE, 0);
+ if (IS_ERR(s->query_file)) {
+ err = PTR_ERR(s->query_file);
+ s->query_file = NULL;
+ } else
+ update_global_file(s);
+
+out:
+ mutex_unlock(&mutex);
+
+ return err;
+}
+
+static struct session *session_create(void)
+{
+ struct session *s;
+ int err = 0;
+
+ s = kmalloc(sizeof(*s), GFP_KERNEL);
+ if (s)
+ err = session_update_file(s, "");
+ else
+ err = -ENOMEM;
+
+ return err ? ERR_PTR(err) : s;
+}
+
+static void session_release(struct session *s)
+{
+ if (s->ipos.pinned_inode)
+ iput(s->ipos.pinned_inode);
+ if (s->query_file)
+ fput(s->query_file);
+ kfree(s);
+}
+
+
+/*
+ * Listing of cached files.
+ *
+ * Usage:
+ * echo > /proc/filecache # enter listing mode
+ * cat /proc/filecache # get the file listing
+ */
+
+/* code style borrowed from ib_srp.c */
+enum {
+ LS_OPT_ERR = 0,
+ LS_OPT_NOCLEAN = 1 << 0,
+ LS_OPT_NODIRTY = 1 << 1,
+ LS_OPT_NOUNUSED = 1 << 2,
+ LS_OPT_EMPTY = 1 << 3,
+ LS_OPT_ALL = 1 << 4,
+ LS_OPT_DEV = 1 << 5,
+};
+
+static match_table_t ls_opt_tokens = {
+ { LS_OPT_NOCLEAN, "noclean" },
+ { LS_OPT_NODIRTY, "nodirty" },
+ { LS_OPT_NOUNUSED, "nounused" },
+ { LS_OPT_EMPTY, "empty" },
+ { LS_OPT_ALL, "all" },
+ { LS_OPT_DEV, "dev=%s" },
+ { LS_OPT_ERR, NULL }
+};
+
+static int ls_parse_options(const char *buf, struct session *s)
+{
+ substring_t args[MAX_OPT_ARGS];
+ char *options, *sep_opt;
+ char *p;
+ int token;
+ int ret = 0;
+
+ if (!buf)
+ return 0;
+ options = kstrdup(buf, GFP_KERNEL);
+ if (!options)
+ return -ENOMEM;
+
+ s->ls_options = 0;
+ sep_opt = options;
+ while ((p = strsep(&sep_opt, " ")) != NULL) {
+ if (!*p)
+ continue;
+
+ token = match_token(p, ls_opt_tokens, args);
+
+ switch (token) {
+ case LS_OPT_NOCLEAN:
+ case LS_OPT_NODIRTY:
+ case LS_OPT_NOUNUSED:
+ case LS_OPT_EMPTY:
+ case LS_OPT_ALL:
+ s->ls_options |= token;
+ break;
+ case LS_OPT_DEV:
+ p = match_strdup(args);
+ if (!p) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ if (*p == '/') {
+ struct kstat stat;
+ struct nameidata nd;
+ ret = path_lookup(p, LOOKUP_FOLLOW, &nd);
+ if (!ret)
+ ret = vfs_getattr(nd.path.mnt,
+ nd.path.dentry, &stat);
+ if (!ret)
+ s->ls_dev = stat.rdev;
+ } else
+ s->ls_dev = simple_strtoul(p, NULL, 0);
+ /* printk("%lx %s\n", (long)s->ls_dev, p); */
+ kfree(p);
+ break;
+
+ default:
+ printk(KERN_WARNING "unknown parameter or missing value "
+ "'%s' in ls command\n", p);
+ ret = -EINVAL;
+ goto out;
+ }
+ }
+
+out:
+ kfree(options);
+ return ret;
+}
+
+/*
+ * Add possible filters here.
+ * No permission check: we cannot verify the path's permission anyway.
+ * We simply demand root privilege for accessing /proc/filecache.
+ */
+static int may_show_inode(struct session *s, struct inode *inode)
+{
+ if (!atomic_read(&inode->i_count))
+ return 0;
+ if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE))
+ return 0;
+ if (!inode->i_mapping)
+ return 0;
+
+ if (s->ls_dev && s->ls_dev != inode->i_sb->s_dev)
+ return 0;
+
+ if (s->ls_options & LS_OPT_ALL)
+ return 1;
+
+ if (!(s->ls_options & LS_OPT_EMPTY) && !inode->i_mapping->nrpages)
+ return 0;
+
+ if ((s->ls_options & LS_OPT_NOCLEAN) && !(inode->i_state & I_DIRTY))
+ return 0;
+
+ if ((s->ls_options & LS_OPT_NODIRTY) && (inode->i_state & I_DIRTY))
+ return 0;
+
+ if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
+ S_ISLNK(inode->i_mode) || S_ISBLK(inode->i_mode)))
+ return 0;
+
+ return 1;
+}
+
+/*
+ * Full: there are more data following.
+ */
+static int iwin_full(struct session *s)
+{
+ return !s->iwin.cursor ||
+ s->iwin.cursor > s->iwin.origin + s->iwin.size;
+}
+
+static int iwin_push(struct session *s, struct inode *inode)
+{
+ if (!may_show_inode(s, inode))
+ return 0;
+
+ s->iwin.cursor++;
+
+ if (s->iwin.size >= IWIN_SIZE)
+ return 1;
+
+ if (s->iwin.cursor > s->iwin.origin)
+ s->iwin.inodes[s->iwin.size++] = inode;
+ return 0;
+}
+
+/*
+ * Traverse the inode lists in order - newest first.
+ * And fill @s->iwin.inodes with inodes positioned in [@pos, @pos+IWIN_SIZE).
+ */
+static int iwin_fill(struct session *s, unsigned long pos)
+{
+ struct inode *inode;
+ struct super_block *sb;
+
+ s->iwin.origin = pos;
+ s->iwin.cursor = 0;
+ s->iwin.size = 0;
+
+ /*
+ * We have a cursor inode, clean and expected to be unchanged.
+ */
+ if (s->ipos.inode && pos >= s->ipos.pos &&
+ !(s->ipos.state & I_DIRTY) &&
+ s->ipos.state == s->ipos.inode->i_state) {
+ inode = s->ipos.inode;
+ s->iwin.cursor = s->ipos.pos;
+ goto continue_from_saved;
+ }
+
+ if (s->ls_options & LS_OPT_NODIRTY)
+ goto clean_inodes;
+
+ spin_lock(&sb_lock);
+ list_for_each_entry(sb, &super_blocks, s_list) {
+ if (s->ls_dev && s->ls_dev != sb->s_dev)
+ continue;
+
+ list_for_each_entry(inode, &sb->s_dirty, i_list) {
+ if (iwin_push(s, inode))
+ goto out_full_unlock;
+ }
+ list_for_each_entry(inode, &sb->s_io, i_list) {
+ if (iwin_push(s, inode))
+ goto out_full_unlock;
+ }
+ }
+ spin_unlock(&sb_lock);
+
+clean_inodes:
+ list_for_each_entry(inode, &inode_in_use, i_list) {
+ if (iwin_push(s, inode))
+ goto out_full;
+continue_from_saved:
+ ;
+ }
+
+ if (s->ls_options & LS_OPT_NOUNUSED)
+ return 0;
+
+ list_for_each_entry(inode, &inode_unused, i_list) {
+ if (iwin_push(s, inode))
+ goto out_full;
+ }
+
+ return 0;
+
+out_full_unlock:
+ spin_unlock(&sb_lock);
+out_full:
+ return 1;
+}
+
+static struct inode *iwin_inode(struct session *s, unsigned long pos)
+{
+ if ((iwin_full(s) && pos >= s->iwin.origin + s->iwin.size)
+ || pos < s->iwin.origin)
+ iwin_fill(s, pos);
+
+ if (pos >= s->iwin.cursor)
+ return NULL;
+
+ s->ipos.pos = pos;
+ s->ipos.inode = s->iwin.inodes[pos - s->iwin.origin];
+ BUG_ON(!s->ipos.inode);
+ return s->ipos.inode;
+}
+
+static void show_inode(struct seq_file *m, struct inode *inode)
+{
+ char state[] = "--"; /* dirty, locked */
+ struct dentry *dentry;
+ loff_t size = i_size_read(inode);
+ unsigned long nrpages;
+ int percent;
+ int refcnt;
+ int shift;
+
+ if (!size)
+ size++;
+
+ if (inode->i_mapping)
+ nrpages = inode->i_mapping->nrpages;
+ else {
+ nrpages = 0;
+ WARN_ON(1);
+ }
+
+ for (shift = 0; (size >> shift) > ULONG_MAX / 128; shift += 12)
+ ;
+ percent = min(100UL, (((100 * nrpages) >> shift) << PAGE_CACHE_SHIFT) /
+ (unsigned long)(size >> shift));
+
+ if (inode->i_state & (I_DIRTY_DATASYNC|I_DIRTY_PAGES))
+ state[0] = 'D';
+ else if (inode->i_state & I_DIRTY_SYNC)
+ state[0] = 'd';
+
+ if (inode->i_state & I_LOCK)
+ state[1] = 'L';
+
+ refcnt = 0;
+ list_for_each_entry(dentry, &inode->i_dentry, d_alias) {
+ refcnt += atomic_read(&dentry->d_count);
+ }
+
+ seq_printf(m, "%10lu %10llu %8lu %7d ",
+ inode->i_ino,
+ DIV_ROUND_UP(size, 1024),
+ nrpages << (PAGE_CACHE_SHIFT - 10),
+ percent);
+
+ seq_printf(m, "%6d %5s ",
+ refcnt,
+ state);
+
+#ifdef CONFIG_PROC_FILECACHE_EXTRAS
+ seq_printf(m, "%8u %5u %-16s",
+ inode->i_access_count,
+ inode->i_cuid,
+ inode->i_comm);
+#endif
+
+ seq_printf(m, "%02x:%02x(%s)\t",
+ MAJOR(inode->i_sb->s_dev),
+ MINOR(inode->i_sb->s_dev),
+ inode->i_sb->s_id);
+
+ if (list_empty(&inode->i_dentry)) {
+ if (!atomic_read(&inode->i_count))
+ seq_puts(m, "(noname)\n");
+ else
+ seq_printf(m, "(%02x:%02x)\n",
+ imajor(inode), iminor(inode));
+ } else {
+ struct path path = {
+ .mnt = NULL,
+ .dentry = list_entry(inode->i_dentry.next,
+ struct dentry, d_alias)
+ };
+
+ seq_path(m, &path, " \t\n\\");
+ seq_putc(m, '\n');
+ }
+}
+
+static int ii_show(struct seq_file *m, void *v)
+{
+ unsigned long index = *(loff_t *) v;
+ struct session *s = m->private;
+ struct inode *inode;
+
+ if (index == 0) {
+ seq_puts(m, "# filecache " FILECACHE_VERSION "\n");
+ seq_puts(m, "# ino size cached cached% "
+ "refcnt state "
+#ifdef CONFIG_PROC_FILECACHE_EXTRAS
+ "accessed uid process "
+#endif
+ "dev\t\tfile\n");
+ }
+
+ inode = iwin_inode(s, index);
+ show_inode(m, inode);
+
+ return 0;
+}
+
+static void *ii_start(struct seq_file *m, loff_t *pos)
+{
+ struct session *s = m->private;
+
+ s->iwin.size = 0;
+ s->iwin.inodes = (struct inode **)
+ __get_free_pages(GFP_KERNEL, IWIN_PAGE_ORDER);
+ if (!s->iwin.inodes)
+ return NULL;
+
+ spin_lock(&inode_lock);
+
+ return iwin_inode(s, *pos) ? pos : NULL;
+}
+
+static void *ii_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ struct session *s = m->private;
+
+ (*pos)++;
+ return iwin_inode(s, *pos) ? pos : NULL;
+}
+
+static void ii_stop(struct seq_file *m, void *v)
+{
+ struct session *s = m->private;
+ struct inode *inode = s->ipos.inode;
+
+ if (!s->iwin.inodes)
+ return;
+
+ if (inode) {
+ __iget(inode);
+ s->ipos.state = inode->i_state;
+ }
+ spin_unlock(&inode_lock);
+
+ free_pages((unsigned long) s->iwin.inodes, IWIN_PAGE_ORDER);
+ if (s->ipos.pinned_inode)
+ iput(s->ipos.pinned_inode);
+ s->ipos.pinned_inode = inode;
+}
+
+/*
+ * Listing of cached page ranges of a file.
+ *
+ * Usage:
+ * echo 'file name' > /proc/filecache
+ * cat /proc/filecache
+ */
+
+unsigned long page_mask;
+#define PG_MMAP PG_lru /* reuse any non-relevant flag */
+#define PG_BUFFER PG_swapcache /* ditto */
+#define PG_DIRTY PG_error /* ditto */
+#define PG_WRITEBACK PG_buddy /* ditto */
+
+/*
+ * Page state names, prefixed by their abbreviations.
+ */
+struct {
+ unsigned long mask;
+ const char *name;
+ int faked;
+} page_flag [] = {
+ {1 << PG_referenced, "R:referenced", 0},
+ {1 << PG_active, "A:active", 0},
+ {1 << PG_MMAP, "M:mmap", 1},
+
+ {1 << PG_uptodate, "U:uptodate", 0},
+ {1 << PG_dirty, "D:dirty", 0},
+ {1 << PG_writeback, "W:writeback", 0},
+ {1 << PG_reclaim, "X:readahead", 0},
+
+ {1 << PG_private, "P:private", 0},
+ {1 << PG_owner_priv_1, "O:owner", 0},
+
+ {1 << PG_BUFFER, "b:buffer", 1},
+ {1 << PG_DIRTY, "d:dirty", 1},
+ {1 << PG_WRITEBACK, "w:writeback", 1},
+};
+
+static unsigned long page_flags(struct page* page)
+{
+ unsigned long flags;
+ struct address_space *mapping = page_mapping(page);
+
+ flags = page->flags & page_mask;
+
+ if (page_mapped(page))
+ flags |= (1 << PG_MMAP);
+
+ if (page_has_buffers(page))
+ flags |= (1 << PG_BUFFER);
+
+ if (mapping) {
+ if (radix_tree_tag_get(&mapping->page_tree,
+ page_index(page),
+ PAGECACHE_TAG_WRITEBACK))
+ flags |= (1 << PG_WRITEBACK);
+
+ if (radix_tree_tag_get(&mapping->page_tree,
+ page_index(page),
+ PAGECACHE_TAG_DIRTY))
+ flags |= (1 << PG_DIRTY);
+ }
+
+ return flags;
+}
+
+static int pages_similiar(struct page* page0, struct page* page)
+{
+ if (page_count(page0) != page_count(page))
+ return 0;
+
+ if (page_flags(page0) != page_flags(page))
+ return 0;
+
+ return 1;
+}
+
+static void show_range(struct seq_file *m, struct page* page, unsigned long len)
+{
+ int i;
+ unsigned long flags;
+
+ if (!m || !page)
+ return;
+
+ seq_printf(m, "%lu\t%lu\t", page->index, len);
+
+ flags = page_flags(page);
+ for (i = 0; i < ARRAY_SIZE(page_flag); i++)
+ seq_putc(m, (flags & page_flag[i].mask) ?
+ page_flag[i].name[0] : '_');
+
+ seq_printf(m, "\t%d\n", page_count(page));
+}
+
+#define BATCH_LINES 100
+static pgoff_t show_file_cache(struct seq_file *m,
+ struct address_space *mapping, pgoff_t start)
+{
+ int i;
+ int lines = 0;
+ pgoff_t len = 0;
+ struct pagevec pvec;
+ struct page *page;
+ struct page *page0 = NULL;
+
+ for (;;) {
+ pagevec_init(&pvec, 0);
+ pvec.nr = radix_tree_gang_lookup(&mapping->page_tree,
+ (void **)pvec.pages, start + len, PAGEVEC_SIZE);
+
+ if (pvec.nr == 0) {
+ show_range(m, page0, len);
+ start = ULONG_MAX;
+ goto out;
+ }
+
+ if (!page0)
+ page0 = pvec.pages[0];
+
+ for (i = 0; i < pvec.nr; i++) {
+ page = pvec.pages[i];
+
+ if (page->index == start + len &&
+ pages_similiar(page0, page))
+ len++;
+ else {
+ show_range(m, page0, len);
+ page0 = page;
+ start = page->index;
+ len = 1;
+ if (++lines > BATCH_LINES)
+ goto out;
+ }
+ }
+ }
+
+out:
+ return start;
+}
+
+static int pg_show(struct seq_file *m, void *v)
+{
+ struct session *s = m->private;
+ struct file *file = s->query_file;
+ pgoff_t offset;
+
+ if (!file)
+ return ii_show(m, v);
+
+ offset = *(loff_t *) v;
+
+ if (!offset) { /* print header */
+ int i;
+
+ seq_puts(m, "# file ");
+ seq_path(m, &file->f_path, " \t\n\\");
+
+ seq_puts(m, "\n# flags");
+ for (i = 0; i < ARRAY_SIZE(page_flag); i++)
+ seq_printf(m, " %s", page_flag[i].name);
+
+ seq_puts(m, "\n# idx\tlen\tstate\t\trefcnt\n");
+ }
+
+ s->start_offset = offset;
+ s->next_offset = show_file_cache(m, file->f_mapping, offset);
+
+ return 0;
+}
+
+static void *file_pos(struct file *file, loff_t *pos)
+{
+ loff_t size = i_size_read(file->f_mapping->host);
+ pgoff_t end = DIV_ROUND_UP(size, PAGE_CACHE_SIZE);
+ pgoff_t offset = *pos;
+
+ return offset < end ? pos : NULL;
+}
+
+static void *pg_start(struct seq_file *m, loff_t *pos)
+{
+ struct session *s = m->private;
+ struct file *file = s->query_file;
+ pgoff_t offset = *pos;
+
+ if (!file)
+ return ii_start(m, pos);
+
+ rcu_read_lock();
+
+ if (offset - s->start_offset == 1)
+ *pos = s->next_offset;
+ return file_pos(file, pos);
+}
+
+static void *pg_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ struct session *s = m->private;
+ struct file *file = s->query_file;
+
+ if (!file)
+ return ii_next(m, v, pos);
+
+ *pos = s->next_offset;
+ return file_pos(file, pos);
+}
+
+static void pg_stop(struct seq_file *m, void *v)
+{
+ struct session *s = m->private;
+ struct file *file = s->query_file;
+
+ if (!file)
+ return ii_stop(m, v);
+
+ rcu_read_unlock();
+}
+
+struct seq_operations seq_filecache_op = {
+ .start = pg_start,
+ .next = pg_next,
+ .stop = pg_stop,
+ .show = pg_show,
+};
+
+/*
+ * Implement the manual drop-all-pagecache function
+ */
+
+#define MAX_INODES (PAGE_SIZE / sizeof(struct inode *))
+static int drop_pagecache(void)
+{
+ struct hlist_head *head;
+ struct hlist_node *node;
+ struct inode *inode;
+ struct inode **inodes;
+ unsigned long i, j, k;
+ int err = 0;
+
+ inodes = (struct inode **)__get_free_pages(GFP_KERNEL, IWIN_PAGE_ORDER);
+ if (!inodes)
+ return -ENOMEM;
+
+ for (i = 0; (head = get_inode_hash_budget(i)); i++) {
+ if (hlist_empty(head))
+ continue;
+
+ j = 0;
+ cond_resched();
+
+ /*
+ * Grab some inodes.
+ */
+ spin_lock(&inode_lock);
+ hlist_for_each (node, head) {
+ inode = hlist_entry(node, struct inode, i_hash);
+ if (!atomic_read(&inode->i_count))
+ continue;
+ if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE))
+ continue;
+ if (!inode->i_mapping || !inode->i_mapping->nrpages)
+ continue;
+ __iget(inode);
+ inodes[j++] = inode;
+ if (j >= MAX_INODES)
+ break;
+ }
+ spin_unlock(&inode_lock);
+
+ /*
+ * Free clean pages.
+ */
+ for (k = 0; k < j; k++) {
+ inode = inodes[k];
+ invalidate_mapping_pages(inode->i_mapping, 0, -1);
+ iput(inode);
+ }
+
+ /*
+ * Simply ignore the remaining inodes.
+ */
+ if (j >= MAX_INODES && !err) {
+ printk(KERN_WARNING
+ "Too many collisions in inode hash table.\n"
+ "Please boot with a larger ihash_entries=XXX.\n");
+ err = -EAGAIN;
+ }
+ }
+
+ free_pages((unsigned long) inodes, IWIN_PAGE_ORDER);
+ return err;
+}
+
+static void drop_slabcache(void)
+{
+ int nr_objects;
+
+ do {
+ nr_objects = shrink_slab(1000, GFP_KERNEL, 1000);
+ } while (nr_objects > 10);
+}
+
+/*
+ * Proc file operations.
+ */
+
+static int filecache_open(struct inode *inode, struct file *proc_file)
+{
+ struct seq_file *m;
+ struct session *s;
+ unsigned size;
+ char *buf = NULL;
+ int ret;
+
+ if (!try_module_get(THIS_MODULE))
+ return -ENOENT;
+
+ s = session_create();
+ if (IS_ERR(s)) {
+ ret = PTR_ERR(s);
+ s = NULL; /* never kfree() an ERR_PTR in the error path */
+ goto out;
+ }
+ set_session(proc_file, s);
+
+ size = SBUF_SIZE;
+ buf = kmalloc(size, GFP_KERNEL);
+ if (!buf) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = seq_open(proc_file, &seq_filecache_op);
+ if (!ret) {
+ m = proc_file->private_data;
+ m->private = s;
+ m->buf = buf;
+ m->size = size;
+ }
+
+out:
+ if (ret) {
+ kfree(s);
+ kfree(buf);
+ module_put(THIS_MODULE);
+ }
+ return ret;
+}
+
+static int filecache_release(struct inode *inode, struct file *proc_file)
+{
+ struct session *s = get_session(proc_file);
+ int ret;
+
+ session_release(s);
+ ret = seq_release(inode, proc_file);
+ module_put(THIS_MODULE);
+ return ret;
+}
+
+static ssize_t filecache_write(struct file *proc_file, const char __user * buffer,
+ size_t count, loff_t *ppos)
+{
+ struct session *s;
+ char *name;
+ int err = 0;
+
+ if (count >= PATH_MAX + 5)
+ return -ENAMETOOLONG;
+
+ name = kmalloc(count+1, GFP_KERNEL);
+ if (!name)
+ return -ENOMEM;
+
+ if (copy_from_user(name, buffer, count)) {
+ err = -EFAULT;
+ goto out;
+ }
+
+ /* strip the optional newline */
+ if (count && name[count-1] == '\n')
+ name[count-1] = '\0';
+ else
+ name[count] = '\0';
+
+ s = get_session(proc_file);
+ if (!strcmp(name, "set private")) {
+ s->private_session = 1;
+ goto out;
+ }
+
+ if (!strncmp(name, "cat ", 4)) {
+ err = session_update_file(s, name+4);
+ goto out;
+ }
+
+ if (!strncmp(name, "ls", 2)) {
+ err = session_update_file(s, NULL);
+ if (!err)
+ err = ls_parse_options(name+2, s);
+ if (!err && !s->private_session) {
+ global_session.ls_dev = s->ls_dev;
+ global_session.ls_options = s->ls_options;
+ }
+ goto out;
+ }
+
+ if (!strncmp(name, "drop pagecache", 14)) {
+ err = drop_pagecache();
+ goto out;
+ }
+
+ if (!strncmp(name, "drop slabcache", 14)) {
+ drop_slabcache();
+ goto out;
+ }
+
+ /* err = -EINVAL; */
+ err = session_update_file(s, name);
+
+out:
+ kfree(name);
+
+ return err ? err : count;
+}
+
+static struct file_operations proc_filecache_fops = {
+ .owner = THIS_MODULE,
+ .open = filecache_open,
+ .release = filecache_release,
+ .write = filecache_write,
+ .read = seq_read,
+ .llseek = seq_lseek,
+};
+
+
+static __init int filecache_init(void)
+{
+ int i;
+ struct proc_dir_entry *entry;
+
+ entry = create_proc_entry("filecache", 0600, NULL);
+ if (!entry)
+ return -ENOMEM;
+ entry->proc_fops = &proc_filecache_fops;
+
+ for (page_mask = i = 0; i < ARRAY_SIZE(page_flag); i++)
+ if (!page_flag[i].faked)
+ page_mask |= page_flag[i].mask;
+
+ return 0;
+}
+
+static void __exit filecache_exit(void)
+{
+ remove_proc_entry("filecache", NULL);
+ if (global_session.query_file)
+ fput(global_session.query_file);
+}
+
+MODULE_AUTHOR("Fengguang Wu <wfg@mail.ustc.edu.cn>");
+MODULE_LICENSE("GPL");
+
+module_init(filecache_init);
+module_exit(filecache_exit);
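
A note on show_file_cache() above: it walks the page cache in index order and folds each run of consecutive, similar pages into one output line. The same run-length idea can be sketched in user space as follows. This is only an illustration under assumptions: struct fake_page, struct page_range and coalesce_pages() are hypothetical names, and "similar" is assumed here to mean equal flags and reference count, since pages_similiar() is defined outside this part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields the patch reads from struct page. */
struct fake_page {
	unsigned long index;	/* page offset within the file */
	unsigned int flags;	/* simplified flag bits */
	int count;		/* reference count */
};

/* One output line: a run of consecutive pages sharing flags and count. */
struct page_range {
	unsigned long start;
	unsigned long len;
	unsigned int flags;
	int count;
};

/*
 * Fold pages[] (sorted by ascending index) into runs. A page extends the
 * current run only if it directly follows it and has the same flags and
 * count; otherwise it starts a new run. At most max runs are written to
 * out[]; the return value is the number of runs written.
 */
static size_t coalesce_pages(const struct fake_page *pages, size_t n,
			     struct page_range *out, size_t max)
{
	size_t i, nr = 0;

	assert(pages != NULL && out != NULL);

	for (i = 0; i < n; i++) {
		const struct fake_page *p = &pages[i];

		if (nr > 0) {
			struct page_range *r = &out[nr - 1];

			if (p->index == r->start + r->len &&
			    p->flags == r->flags && p->count == r->count) {
				r->len++;
				continue;
			}
		}

		if (nr == max)
			break;	/* output full; loosely like the BATCH_LINES cutoff */

		out[nr].start = p->index;
		out[nr].len = 1;
		out[nr].flags = p->flags;
		out[nr].count = p->count;
		nr++;
	}

	return nr;
}
```

With pages at indexes 0, 1, 2 in one state, index 3 in another, and 7, 8 back in the first state, this yields three ranges: (start 0, len 3), (start 3, len 1) and (start 7, len 2).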
--- linux-2.6.orig/include/linux/fs.h
+++ linux-2.6/include/linux/fs.h
@@ -685,6 +685,12 @@ struct inode {
void *i_security;
#endif
void *i_private; /* fs or device private pointer */
+
+#ifdef CONFIG_PROC_FILECACHE_EXTRAS
+ unsigned int i_access_count; /* opened how many times? */
+ uid_t i_cuid; /* opened first by which user? */
+ char i_comm[16]; /* opened first by which app? */
+#endif
};
/*
@@ -773,6 +779,13 @@ static inline unsigned imajor(const stru
return MAJOR(inode->i_rdev);
}
+static inline void inode_accessed(struct inode *inode)
+{
+#ifdef CONFIG_PROC_FILECACHE_EXTRAS
+ inode->i_access_count++;
+#endif
+}
+
extern struct block_device *I_BDEV(struct inode *inode);
struct fown_struct {
@@ -1907,6 +1920,7 @@ extern void remove_inode_hash(struct ino
static inline void insert_inode_hash(struct inode *inode) {
__insert_inode_hash(inode, inode->i_ino);
}
+struct hlist_head * get_inode_hash_budget(unsigned long index);
extern struct file * get_empty_filp(void);
extern void file_move(struct file *f, struct list_head *list);
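
The inode_accessed() helper in the fs.h hunk above is called unconditionally and compiles to nothing when CONFIG_PROC_FILECACHE_EXTRAS is off. A minimal user-space sketch of that pattern, assuming a hypothetical FILECACHE_EXTRAS macro and a made-up struct fake_inode in place of the real kernel types:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stands in for CONFIG_PROC_FILECACHE_EXTRAS being enabled. */
#define FILECACHE_EXTRAS 1

/* Hypothetical miniature of struct inode with only the tracked field. */
struct fake_inode {
#ifdef FILECACHE_EXTRAS
	unsigned int access_count;	/* opened how many times? */
#endif
	int unused;	/* keeps the struct non-empty when tracking is off */
};

/* Call sites invoke this unconditionally; it does nothing when tracking is off. */
static inline void fake_inode_accessed(struct fake_inode *inode)
{
	assert(inode != NULL);
#ifdef FILECACHE_EXTRAS
	inode->access_count++;
#endif
}

/* Returns the recorded open count, or 0 when tracking is compiled out. */
static inline unsigned int fake_inode_opens(const struct fake_inode *inode)
{
	assert(inode != NULL);
#ifdef FILECACHE_EXTRAS
	return inode->access_count;
#else
	return 0;
#endif
}
```

Keeping the #ifdef inside the helper means a caller such as __dentry_open() needs no conditional of its own. As in the patch, the increment here is a plain, non-atomic one.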
--- linux-2.6.orig/fs/open.c
+++ linux-2.6/fs/open.c
@@ -828,6 +828,7 @@ static struct file *__dentry_open(struct
goto cleanup_all;
}
+ inode_accessed(inode);
f->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
--- linux-2.6.orig/fs/Kconfig
+++ linux-2.6/fs/Kconfig
@@ -750,6 +750,36 @@ config CONFIGFS_FS
Both sysfs and configfs can and should exist together on the
same system. One is not a replacement for the other.
+config PROC_FILECACHE
+ tristate "/proc/filecache support"
+ default m
+ depends on PROC_FS
+ help
+ This option creates the file /proc/filecache, which lets you query
+ and drop the files cached in memory.
+
+ A quick start guide:
+
+ # echo 'ls' > /proc/filecache
+ # head /proc/filecache
+
+ # echo 'cat /bin/bash' > /proc/filecache
+ # head /proc/filecache
+
+ # echo 'drop pagecache' > /proc/filecache
+ # echo 'drop slabcache' > /proc/filecache
+
+ For more details, please check Documentation/filesystems/proc.txt.
+
+ It can be a handy tool for sysadmins and desktop users.
+
+config PROC_FILECACHE_EXTRAS
+ bool "track extra states"
+ default y
+ depends on PROC_FILECACHE
+ help
+ Track extra state that costs a little more time and space.
+
endmenu
menu "Miscellaneous filesystems"
--- linux-2.6.orig/fs/proc/Makefile
+++ linux-2.6/fs/proc/Makefile
@@ -2,7 +2,8 @@
# Makefile for the Linux proc filesystem routines.
#
-obj-$(CONFIG_PROC_FS) += proc.o
+obj-$(CONFIG_PROC_FS) += proc.o
+obj-$(CONFIG_PROC_FILECACHE) += filecache.o
proc-y := nommu.o task_nommu.o
proc-$(CONFIG_MMU) := mmu.o task_mmu.o
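
The quick start guide in the Kconfig help drives the interface with echo and head. The same two steps could be done from a C program roughly as below. This is a sketch, not part of the patch: filecache_command() and filecache_read() are hypothetical helper names, and it assumes, as the quick start does, that a command written in one open() is still in effect for a later, separate open() of /proc/filecache.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write one command string (e.g. "ls" or "cat /bin/bash") to the control
 * file at path. Returns 0 on success, -1 on any error or short write.
 */
static int filecache_command(const char *path, const char *cmd)
{
	size_t len;
	ssize_t written;
	int fd;

	assert(path != NULL && cmd != NULL);

	len = strlen(cmd);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	written = write(fd, cmd, len);
	close(fd);
	return (written == (ssize_t)len) ? 0 : -1;
}

/*
 * Read up to size - 1 bytes from path into buf and NUL-terminate it.
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t filecache_read(const char *path, char *buf, size_t size)
{
	ssize_t total = 0, n;
	int fd;

	assert(path != NULL && buf != NULL);

	if (size == 0)
		return -1;
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	while ((size_t)total < size - 1) {
		n = read(fd, buf + total, size - 1 - (size_t)total);
		if (n < 0) {
			close(fd);
			return -1;
		}
		if (n == 0)
			break;	/* end of file */
		total += n;
	}
	close(fd);
	buf[total] = '\0';
	return total;
}
```

Calling filecache_command("/proc/filecache", "cat /bin/bash") and then filecache_read("/proc/filecache", buf, sizeof(buf)) would mirror the echo and head pair from the help text, provided the module is loaded and the caller has permission to open the 0600 proc entry.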