Java Programs Use a Lot of Virtual Memory on Linux

Today I wanted to check on a few newly deployed programs, so I logged in to the server and ran top.

[Screenshot: top output]

WTF? My little program is using nearly 20 GB of virtual memory?

That can't go on, so I checked my JVM parameters right away:

[Screenshot: JVM parameter settings]

The parameters look fine: the JVM is configured for 1 to 2 GB. So why is virtual memory usage so high?

Let's look at the memory map: pmap -x 196586 | grep anon

[Screenshot: pmap output]

Huh, what are all these regions of exactly 64 MB?
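To put a number on "a whole bunch", the 64 MB regions can be counted rather than eyeballed. This is a minimal sketch that assumes the usual `pmap -x` column layout (address, Kbytes, RSS, dirty, mode, mapping); the function name `count_64m_regions` is made up for illustration. It only counts regions of exactly 65536 KB, so a 64 MB reservation that pmap shows as two adjacent pieces would be missed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count anonymous mappings of exactly 64 MB (65536 KB).
# Reads `pmap -x <pid>` output on stdin; column 2 is assumed to be Kbytes.
count_64m_regions() {
    awk '/anon/ && $2 == 65536 { n++ } END { print n + 0 }'
}

# Example with the PID used in the text:
#   pmap -x 196586 | count_64m_regions
```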

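As we know, glibc's memory management on Linux uses something rather clever called an arena. When glibc allocates memory, large requests are served from the main arena, while small requests are served from a cache set up when the thread is created. Arenas are memory pools introduced to solve the performance problem of memory allocation. On a 64-bit system each arena defaults to 64 MB, and a process can have at most cores*8 arenas. On a 4-core server that means at most 4*8 = 32 arenas, or 32*64 = 2048 MB. My server, sized for the business load, has 12 cores, so the arenas of a single process alone can take up 12*8*64 = 6144 MB, about 6 GB of virtual memory.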
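The arithmetic above is easy to repeat for any core count. The sketch below hard-codes the two figures quoted in the text (8 arenas per core and 64 MB per arena on 64-bit systems); the function name `arena_reserve_mb` is invented for this example.

```shell
#!/usr/bin/env bash
# Upper bound, in MB, on what glibc arenas can reserve for one process,
# using the figures from the text: 8 arenas per core, 64 MB per arena.
arena_reserve_mb() {
    local cores=$1
    local arenas_per_core=8   # 64-bit default quoted in the text
    local arena_mb=64         # per-arena size quoted in the text
    echo $(( cores * arenas_per_core * arena_mb ))
}

arena_reserve_mb 4    # prints 2048
arena_reserve_mb 12   # prints 6144
```

On the machine itself, `arena_reserve_mb "$(nproc)"` gives the bound for the local core count.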

To quote the original source:

Before, malloc tried to emulate a per-core memory pool. Every time when contention for all existing memory pools was detected a new pool is created. Threads stay with the last used pool if possible… This never worked 100% because a thread can be descheduled while executing a malloc call. When some other thread tries to use the memory pool used in the call it would detect contention. A second problem is that if multiple threads on multiple core/sockets happily use malloc without contention memory from the same pool is used by different cores/on different sockets. This can lead to false sharing and definitely additional cross traffic because of the meta information updates. There are more potential problems not worth going into here in detail.
The changes which are in glibc now create per-thread memory pools. This can eliminate false sharing in most cases. The meta data is usually accessed only in one thread (which hopefully doesn’t get migrated off its assigned core). To prevent the memory handling from blowing up the address space use too much the number of memory pools is capped. By default we create up to two memory pools per core on 32-bit machines and up to eight memory per core on 64-bit machines. The code delays testing for the number of cores (which is not cheap, we have to read /proc/stat) until there are already two or eight memory pools allocated, respectively.

While these changes might increase the number of memory pools which are created (and thus increase the address space they use) the number can be controlled. Because using the old mechanism there could be a new pool being created whenever there are collisions the total number could in theory be higher. Unlikely but true, so the new mechanism is more predictable.

… Memory use is not that much of a premium anymore and most of the memory pool doesn’t actually require memory until it is used, only address space… We have done internally some measurements of the effects of the new implementation and they can be quite dramatic.
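The quote's point that a pool mostly costs address space, not memory, can be checked per process by comparing the virtual size with the resident size. A small sketch, assuming a Linux /proc filesystem; `vm_summary` is a name chosen for this example.

```shell
#!/usr/bin/env bash
# Print the virtual size (VmSize) and resident size (VmRSS) of a process,
# as reported by the kernel in /proc/<pid>/status.
vm_summary() {
    local pid=$1
    grep -E '^Vm(Size|RSS):' "/proc/$pid/status"
}

# Example with the PID used in the text:
#   vm_summary 196586
```

A large gap between the two values means most of the reported virtual memory is reserved address space that has not been touched.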

Second source:

Red Hat Enterprise Linux 6 features version 2.11 of glibc, providing many features and enhancements, including… An enhanced dynamic memory allocation (malloc) behaviour enabling higher scalability across many sockets and cores. This is achieved by assigning threads their own memory pools and by avoiding locking in some situations. The amount of additional memory used for the memory pools (if any) can be controlled using the environment variables MALLOC_ARENA_TEST and MALLOC_ARENA_MAX. MALLOC_ARENA_TEST specifies that a test for the number of cores is performed once the number of memory pools reaches this value. MALLOC_ARENA_MAX sets the maximum number of memory pools used, regardless of the number of cores.

So the maximum number of arenas can be capped with the MALLOC_ARENA_MAX environment variable. I applied it right away:

[Screenshot: setting MALLOC_ARENA_MAX]
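The exact command from the screenshot is not preserved, so the following is only a sketch of one way to apply the cap. The wrapper name `run_with_arena_cap`, the value 4, the heap flags and `app.jar` are all assumptions for illustration, not the original settings.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: start a command with MALLOC_ARENA_MAX set in its
# environment, so glibc sees the cap from the first malloc call onward.
run_with_arena_cap() {
    local cap=$1
    shift
    env MALLOC_ARENA_MAX="$cap" "$@"
}

# Example (cap, JVM flags and jar name are placeholders):
#   run_with_arena_cap 4 java -Xms1g -Xmx2g -jar app.jar
```

Setting the variable per launch keeps the cap next to the command it applies to; a plain `export MALLOC_ARENA_MAX=4` in the startup script before the `java` line has the same effect.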

After running it, memory usage did not seem to change at all, and it even went up a little:

[Screenshot: memory usage after the change]

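glibc reads MALLOC_ARENA_MAX only when a process starts, so the JVM that was already running was not affected. I restarted the application and checked again: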

[Screenshot: memory usage after the restart]
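An environment variable only reaches processes started after it is set, which fits the behaviour seen here. One way to confirm that a given process actually carries the setting is to read its environment from /proc. This sketch assumes Linux and permission to read the target process's environ file; `arena_max_of_pid` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the MALLOC_ARENA_MAX value a process was started with, or nothing
# if the variable was not in its environment at exec time.
arena_max_of_pid() {
    local pid=$1
    # /proc/<pid>/environ holds NUL-separated NAME=value pairs.
    cat "/proc/$pid/environ" | tr '\0' '\n' | sed -n 's/^MALLOC_ARENA_MAX=//p'
}

# Example (substitute the PID of the restarted JVM):
#   arena_max_of_pid <pid>
```

Empty output for the old process and the configured value for the restarted one would match what the restart showed.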

I watched it for a while and memory usage stayed stable. Hooray, problem solved!