V2EX = way to explore
V2EX 是一个关于分享和探索的地方
现在注册
已注册用户请  登录
simonlu9
V2EX  ›  程序员

Java gc 回收导致 cpu 虚高,用了 jstack 分析,一点头绪都没有,希望大家帮忙分析一下

  •  
  •   simonlu9 · 2020-04-09 10:05:34 +08:00 · 4021 次点击
    这是一个创建于 1668 天前的主题,其中的信息可能已经有所发展或是发生改变。

    2020-04-08 15:47:54 Full thread dump Java HotSpot(TM) Server VM (25.152-b16 mixed mode):

    "threadDeathWatcher-3-1" #29 daemon prio=1 os_prio=0 tid=0x1cf3e800 nid=0x5d6c sleeping[0x1845c000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.apache.hbase.thirdparty.io.netty.util.ThreadDeathWatcher$Watcher.run(ThreadDeathWatcher.java:152) at org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138) at java.lang.Thread.run(Thread.java:748)

    "RpcClient-timer-pool1-t1" #25 daemon prio=5 os_prio=0 tid=0x16d91800 nid=0x5d6b sleeping[0x184ad000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.waitForNextTick(HashedWheelTimer.java:560) at org.apache.hbase.thirdparty.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:459) at java.lang.Thread.run(Thread.java:748)

    "Idle-Rpc-Conn-Sweeper-pool2-t1" #26 daemon prio=5 os_prio=0 tid=0x1cd51400 nid=0x5d68 waiting on condition [0x193fe000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x32b10890> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

    "ReadOnlyZKClient-linux1:2181,linux2:2181,linux3:2181@0x01b403ee" #22 daemon prio=10 os_prio=0 tid=0x1faf7000 nid=0x5d65 waiting on condition [0x197fd000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x32bf7ff8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.DelayQueue.poll(DelayQueue.java:259) at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient.run(ReadOnlyZKClient.java:303) at org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$27/31456712.run(Unknown Source) at java.lang.Thread.run(Thread.java:748)

    "OkHttp ConnectionPool" #18 daemon prio=5 os_prio=0 tid=0x1b4cf400 nid=0x5d5f in Object.wait() [0x1ddad000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:460) at com.squareup.okhttp.ConnectionPool.performCleanup(ConnectionPool.java:305) - locked <0x329bedb8> (a com.squareup.okhttp.ConnectionPool) at com.squareup.okhttp.ConnectionPool.runCleanupUntilPoolIsEmpty(ConnectionPool.java:242) at com.squareup.okhttp.ConnectionPool.access$000(ConnectionPool.java:54) at com.squareup.okhttp.ConnectionPool$1.run(ConnectionPool.java:97) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

    "Thread-3" #16 prio=5 os_prio=0 tid=0x1dc7d000 nid=0x5d57 runnable [0x1ddfe000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.net.SocketInputStream.read(SocketInputStream.java:127) at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:199) at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43) at redis.clients.jedis.Protocol.process(Protocol.java:154) at redis.clients.jedis.Protocol.read(Protocol.java:219) at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:309) at redis.clients.jedis.Connection.getRawObjectMultiBulkReply(Connection.java:276) at redis.clients.jedis.JedisPubSub.process(JedisPubSub.java:123) at redis.clients.jedis.JedisPubSub.proceed(JedisPubSub.java:117) at redis.clients.jedis.Jedis.subscribe(Jedis.java:2779) at redis.clients.jedis.JedisCluster$153.execute(JedisCluster.java:1693) at redis.clients.jedis.JedisCluster$153.execute(JedisCluster.java:1690) at redis.clients.jedis.JedisClusterCommand.runWithAnyNode(JedisClusterCommand.java:76) at redis.clients.jedis.JedisCluster.subscribe(JedisCluster.java:1696) at com.flo.recommend.init.RedisSubThread.run(RedisSubThread.java:28) at java.lang.Thread.run(Thread.java:748)

    "commons-pool-evictor-thread" #14 prio=5 os_prio=0 tid=0x1dc4e800 nid=0x5d56 runnable [0x1e3ad000] java.lang.Thread.State: RUNNABLE at java.lang.Throwable.fillInStackTrace(Native Method) at java.lang.Throwable.fillInStackTrace(Throwable.java:783) - locked <0xcf435f08> (a java.lang.IllegalStateException) at java.lang.Throwable.<init>(Throwable.java:265) at java.lang.Exception.<init>(Exception.java:66) at java.lang.RuntimeException.<init>(RuntimeException.java:62) at java.lang.IllegalStateException.<init>(IllegalStateException.java:55) at org.apache.commons.pool2.impl.BaseGenericObjectPool.assertOpen(BaseGenericObjectPool.java:713) at org.apache.commons.pool2.impl.GenericObjectPool.evict(GenericObjectPool.java:721) at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:1077) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

    "Thread-1" #11 prio=5 os_prio=0 tid=0x1eaee400 nid=0x5d54 sleeping[0x1e3fe000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at com.flo.recommend.reduce.PushUrlCache.run(PushUrlCache.java:55) at java.lang.Thread.run(Thread.java:748)

    "Thread-0" #10 prio=5 os_prio=0 tid=0x1e90d400 nid=0x5d53 runnable [0x1e652000] java.lang.Thread.State: RUNNABLE at com.flo.recommend.util.MyLevenshtein.levenshtein(MyLevenshtein.java:19) at com.flo.recommend.mahout.DistancesRedue.apprun(DistancesRedue.java:70) at com.flo.recommend.init.RedisInit.initAppInfo(RedisInit.java:129) at com.flo.recommend.init.RedisInit.run(RedisInit.java:37) at java.lang.Thread.run(Thread.java:748)

    "mysql-cj-abandoned-connection-cleanup" #9 daemon prio=5 os_prio=0 tid=0x1fc77800 nid=0x5d52 runnable [0x20d95000] java.lang.Thread.State: RUNNABLE at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findResource(URLClassLoader.java:566) at java.lang.ClassLoader.getResource(ClassLoader.java:1096) at java.lang.ClassLoader.getResource(ClassLoader.java:1091) at java.lang.ClassLoader.getResource(ClassLoader.java:1091) at com.mysql.jdbc.AbandonedConnectionCleanupThread.checkThreadContextClassLoader(AbandonedConnectionCleanupThread.java:112) at com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:79) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)

    "Service Thread" #8 daemon prio=9 os_prio=0 tid=0x216aac00 nid=0x5d50 runnable [0x00000000] java.lang.Thread.State: RUNNABLE

    "C1 CompilerThread2" #7 daemon prio=9 os_prio=0 tid=0x216a7800 nid=0x5d4f waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x216a6400 nid=0x5d4e waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x216a4400 nid=0x5d4d runnable [0x00000000] java.lang.Thread.State: RUNNABLE

    "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x216a2c00 nid=0x5d4c runnable [0x00000000] java.lang.Thread.State: RUNNABLE

    "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x21688800 nid=0x5d4b in Object.wait() [0x28a7d000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143) - locked <0x29ebc1b0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

    "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x21685c00 nid=0x5d4a in Object.wait() [0x2157d000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference.tryHandlePending(Reference.java:191) - locked <0x29ebc3b0> (a java.lang.ref.Reference$Lock) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

    "main" #1 prio=5 os_prio=0 tid=0xf7607000 nid=0x5d44 in Object.wait() [0xf7788000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x29ebc4b0> (a io.netty.channel.AbstractChannel$CloseFuture) at java.lang.Object.wait(Object.java:502) at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:231) - locked <0x29ebc4b0> (a io.netty.channel.AbstractChannel$CloseFuture) at io.netty.channel.DefaultChannelPromise.await(DefaultChannelPromise.java:131) at io.netty.channel.DefaultChannelPromise.await(DefaultChannelPromise.java:30) at io.netty.util.concurrent.DefaultPromise.sync(DefaultPromise.java:337) at io.netty.channel.DefaultChannelPromise.sync(DefaultChannelPromise.java:119) at io.netty.channel.DefaultChannelPromise.sync(DefaultChannelPromise.java:30) at com.flo.recommend.HttpServer.main(HttpServer.java:110) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48) at org.springframework.boot.loader.Launcher.launch(Launcher.java:87) at org.springframework.boot.loader.Launcher.launch(Launcher.java:50) at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)

    "VM Thread" os_prio=0 tid=0x21681000 nid=0x5d49 runnable

    "GC task thread#0 (ParallelGC)" os_prio=0 tid=0xf7610800 nid=0x5d45 runnable

    "GC task thread#1 (ParallelGC)" os_prio=0 tid=0xf7611c00 nid=0x5d46 runnable

    "GC task thread#2 (ParallelGC)" os_prio=0 tid=0xf7613000 nid=0x5d47 runnable

    "GC task thread#3 (ParallelGC)" os_prio=0 tid=0xf7614400 nid=0x5d48 runnable

    "VM Periodic Task Thread" os_prio=0 tid=0x216bf000 nid=0x5d51 waiting on condition

    JNI global references: 431

    20 条回复    2020-04-27 17:59:52 +08:00
    myz2018
        1
    myz2018  
       2020-04-09 10:40:43 +08:00   ❤️ 1
    cpu 看 top -H 。然后找到对应的线程号。转 16 机制。然后再用 jstack.看。ps.你怎么知道是 gc 导致的?
    araraloren
        2
    araraloren  
       2020-04-09 10:51:32 +08:00
    虚高是什么 意思。。?
    realpg
        3
    realpg  
       2020-04-09 10:53:16 +08:00
    我好奇你是怎么得出“虚” 这个结论的
    Aresxue
        4
    Aresxue  
       2020-04-09 10:55:34 +08:00
    gc 占用的 cpu 已经很少了,不知道你怎么先入为主得出这个结论的。顺便吐槽下 gc 日志不给,只给个线程的快照有什么用
    hcymk2
        5
    hcymk2  
       2020-04-09 10:58:10 +08:00   ❤️ 1
    securityCoding
        6
    securityCoding  
       2020-04-09 11:30:41 +08:00
    jvm 调优是门玄学,劝君慎入此道
    simonlu9
        7
    simonlu9  
    OP
       2020-04-09 11:53:24 +08:00
    @myz2018 我也是按照这个方法去定位 cpu 线程的,确实是回收线程导致 cpu 很高
    simonlu9
        8
    simonlu9  
    OP
       2020-04-09 11:53:46 +08:00
    @araraloren cpu 达到 300%
    huntcool001
        9
    huntcool001  
       2020-04-09 12:06:10 +08:00
    是 JDK8 吗? 改 G1,重启,问题解决. /doge
    tt0411
        10
    tt0411  
       2020-04-09 12:17:52 +08:00
    先确定下是不是 gc 问题: dump gc 日志, 看看 gc 时间和频率; cpu 300% 不太像是 gc 导致的
    icegreen
        11
    icegreen  
       2020-04-09 13:53:28 +08:00
    jstat -gcutil 1000
    ty916
        12
    ty916  
       2020-04-09 13:58:14 +08:00
    可以试试 jmap -dump:format=b,file=serviceDump.dat 进程号 导出 dump 文件,然后用 jprofile 分析下,那个类占用内存最大,定位到那个线程,再看线程的堆栈日志
    CFM880
        13
    CFM880  
       2020-04-09 14:05:24 +08:00
    难道不是先定位 cpu 占用过高的进程,后定位该进程下的哪个线程 CPU 占用过高的么
    top -m 10 -n 1 -s cpu
    top -t | grep "pid"
    gz911122
        14
    gz911122  
       2020-04-09 14:08:15 +08:00
    我好奇你是怎么得出“虚” 这个结论的
    simonlu9
        15
    simonlu9  
    OP
       2020-04-09 14:30:45 +08:00
    @huntcool001 是的,jdk8 环境,我试试
    rihkddd
        16
    rihkddd  
       2020-04-09 15:42:48 +08:00
    如果你已经确定了是 GC 导致的 CPU 占用,问题其实是要找内存泄漏点。可行的方法,使用 jcmd 统计堆内对象计数(先把问题复现),主要关注排名靠前的自己代码中 new 的对象,基本然后关注一下 new 这些对象的地方,基本上大概可以确定。还不是很确定可以用 idea 调试,关注上述代码逻辑,看是不是持续的,idea 调试的时候有窗口可以看到对象计数,占用内存大小。调试下来基本上都能找到问题点。
    bobuick
        17
    bobuick  
       2020-04-09 15:51:47 +08:00
    gc 日志一打,分分钟就知道是不是 gc 导致的 cpu 高了。
    simonlu9
        18
    simonlu9  
    OP
       2020-04-10 09:58:06 +08:00
    @rihkddd 是的,找到了 AtomicLong 类型占用内存最多,但是引用的类有很多,现在用排除法来试试那部分代码出的问题
    ElmerZhang
        19
    ElmerZhang  
       2020-04-10 10:17:57 +08:00
    很久没用 Java 了,是 mat 已经过时了么?为什么没人提到这个呢?我印象中分析内存泄漏的话 mat 比 jprofiler 方便。
    simonlu9
        20
    simonlu9  
    OP
       2020-04-27 17:59:52 +08:00
    @simonlu9 已解决
    关于   ·   帮助文档   ·   博客   ·   API   ·   FAQ   ·   实用小工具   ·   2488 人在线   最高记录 6679   ·     Select Language
    创意工作者们的社区
    World is powered by solitude
    VERSION: 3.9.8.5 · 28ms · UTC 15:42 · PVG 23:42 · LAX 08:42 · JFK 11:42
    Developed with CodeLauncher
    ♥ Do have faith in what you're doing.