V2EX = way to explore
V2EX 是一个关于分享和探索的地方
现在注册
已注册用户请  登录
TangMonk
V2EX  ›  Redis

"Redis aborting for OUT OF MEMORY" 怎么破。

  •  
  •   TangMonk · 2016-02-01 15:21:16 +08:00 · 8822 次点击
    这是一个创建于 3219 天前的主题,其中的信息可能已经有所发展或是发生改变。
    === REDIS BUG REPORT START: Cut & paste starting from here ===
    5235:M 01 Feb 15:19:43.180 # ------------------------------------------------
    5235:M 01 Feb 15:19:43.180 # !!! Software Failure. Press left mouse button to continue
    5235:M 01 Feb 15:19:43.180 # Guru Meditation: "Redis aborting for OUT OF MEMORY" #redis.c:3560
    5235:M 01 Feb 15:19:43.180 # (forcing SIGSEGV in order to print the stack trace)
    5235:M 01 Feb 15:19:43.180 # ------------------------------------------------
    5235:M 01 Feb 15:19:43.180 #     Redis 3.0.5 crashed by signal: 11
    5235:M 01 Feb 15:19:43.181 #     Failed assertion: <no assertion failed> (<no file>:0)
    5235:M 01 Feb 15:19:43.181 # --- STACK TRACE
    /usr/local/bin/redis-server 127.0.0.1:6379(logStackTrace+0x33)[0x450f03]
    /usr/local/bin/redis-server 127.0.0.1:6379(_redisPanic+0x7a)[0x44ffaa]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340)[0x7f5d27a9c340]
    /usr/local/bin/redis-server 127.0.0.1:6379(_redisPanic+0x7a)[0x44ffaa]
    /usr/local/bin/redis-server 127.0.0.1:6379(redisOutOfMemoryHandler+0x2c)[0x41e32c]
    /usr/local/bin/redis-server 127.0.0.1:6379(zrealloc+0xd1)[0x425b01]
    /usr/local/bin/redis-server 127.0.0.1:6379[0x4242c0]
    /usr/local/bin/redis-server 127.0.0.1:6379(sdscatlen+0x58)[0x4245a8]
    /usr/local/bin/redis-server 127.0.0.1:6379[0x42b7e5]
    /usr/local/bin/redis-server 127.0.0.1:6379(addReply+0xcf)[0x42b8ef]
    /usr/local/bin/redis-server 127.0.0.1:6379(addReplyBulk+0x1c)[0x42c5fc]
    /usr/local/bin/redis-server 127.0.0.1:6379(lrangeCommand+0x150)[0x43b3d0]
    /usr/local/bin/redis-server 127.0.0.1:6379(call+0x83)[0x4205d3]
    /usr/local/bin/redis-server 127.0.0.1:6379(processCommand+0x39f)[0x42331f]
    /usr/local/bin/redis-server 127.0.0.1:6379(processInputBuffer+0x4f)[0x42d01f]
    /usr/local/bin/redis-server 127.0.0.1:6379(readQueryFromClient+0xb2)[0x42d152]
    /usr/local/bin/redis-server 127.0.0.1:6379(aeProcessEvents+0x250)[0x41b360]
    /usr/local/bin/redis-server 127.0.0.1:6379(aeMain+0x2b)[0x41b59b]
    /usr/local/bin/redis-server 127.0.0.1:6379(main+0x322)[0x41a2d2]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f5d276e8ec5]
    /usr/local/bin/redis-server 127.0.0.1:6379[0x41a489]
    5235:M 01 Feb 15:19:43.340 # --- INFO OUTPUT
    5235:M 01 Feb 15:19:43.357 # # Server
    redis_version:3.0.5
    redis_git_sha1:00000000
    redis_git_dirty:0
    redis_build_id:4d5b5b8902c0dc5b
    redis_mode:standalone
    os:Linux 3.13.0-32-generic x86_64
    arch_bits:64
    multiplexing_api:epoll
    gcc_version:4.8.2
    process_id:5235
    run_id:cea02151ce0f56499edc695587c715646acfab89
    tcp_port:6379
    uptime_in_seconds:65
    uptime_in_days:0
    hz:10
    lru_clock:11470606
    config_file:/etc/redis/6379.conf
    
    # Clients
    connected_clients:23
    client_longest_output_list:271685
    client_biggest_input_buf:12749
    blocked_clients:13
    
    # Memory
    used_memory:6065698552
    used_memory_human:5.65G
    used_memory_rss:5629136896
    used_memory_peak:6065698552
    used_memory_peak_human:5.65G
    used_memory_lua:36864
    mem_fragmentation_ratio:0.93
    mem_allocator:jemalloc-3.6.0
    
    # Persistence
    loading:0
    rdb_changes_since_last_save:147
    rdb_bgsave_in_progress:0
    rdb_last_save_time:1454311117
    rdb_last_bgsave_status:ok
    rdb_last_bgsave_time_sec:-1
    rdb_current_bgsave_time_sec:-1
    aof_enabled:0
    aof_rewrite_in_progress:0
    aof_rewrite_scheduled:0
    aof_last_rewrite_time_sec:-1
    aof_current_rewrite_time_sec:-1
    aof_last_bgrewrite_status:ok
    aof_last_write_status:ok
    
    # Stats
    total_connections_received:24
    total_commands_processed:3957
    instantaneous_ops_per_sec:292
    total_net_input_bytes:230433
    total_net_output_bytes:7249088
    instantaneous_input_kbps:15.75
    instantaneous_output_kbps:308.47
    rejected_connections:0
    sync_full:0
    sync_partial_ok:0
    sync_partial_err:0
    expired_keys:0
    evicted_keys:0
    keyspace_hits:3191
    keyspace_misses:5
    pubsub_channels:1
    pubsub_patterns:0
    latest_fork_usec:0
    migrate_cached_sockets:0
    
    # Replication
    role:master
    connected_slaves:0
    master_repl_offset:0
    repl_backlog_active:0
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:0
    repl_backlog_histlen:0
    
    # CPU
    used_cpu_sys:1.84
    used_cpu_user:5.71
    used_cpu_sys_children:0.00
    used_cpu_user_children:0.00
    
    # Commandstats
    cmdstat_get:calls=10,usec=43,usec_per_call=4.30
    cmdstat_psetex:calls=5,usec=30,usec_per_call=6.00
    cmdstat_del:calls=17,usec=53,usec_per_call=3.12
    cmdstat_exists:calls=5,usec=20,usec_per_call=4.00
    cmdstat_lpush:calls=12,usec=70,usec_per_call=5.83
    cmdstat_rpop:calls=12,usec=29,usec_per_call=2.42
    cmdstat_brpop:calls=540,usec=2735,usec_per_call=5.06
    cmdstat_lrange:calls=3163,usec=7462970,usec_per_call=2359.46
    cmdstat_ltrim:calls=6,usec=33,usec_per_call=5.50
    cmdstat_sadd:calls=18,usec=82,usec_per_call=4.56
    cmdstat_zadd:calls=5,usec=51,usec_per_call=10.20
    cmdstat_zrem:calls=6,usec=37,usec_per_call=6.17
    cmdstat_zrangebyscore:calls=10,usec=160,usec_per_call=16.00
    cmdstat_hset:calls=4,usec=12,usec_per_call=3.00
    cmdstat_hmset:calls=12,usec=94,usec_per_call=7.83
    cmdstat_incrby:calls=48,usec=265,usec_per_call=5.52
    cmdstat_mset:calls=5,usec=28,usec_per_call=5.60
    cmdstat_expire:calls=12,usec=66,usec_per_call=5.50
    cmdstat_keys:calls=1,usec=84,usec_per_call=84.00
    cmdstat_multi:calls=21,usec=49,usec_per_call=2.33
    cmdstat_exec:calls=21,usec=235,usec_per_call=11.19
    cmdstat_info:calls=2,usec=100,usec_per_call=50.00
    cmdstat_subscribe:calls=1,usec=9,usec_per_call=9.00
    cmdstat_watch:calls=11,usec=56,usec_per_call=5.09
    cmdstat_unwatch:calls=10,usec=27,usec_per_call=2.70
    
    # Cluster
    cluster_enabled:0
    
    # Keyspace
    db0:keys=118,expires=3,avg_ttl=87872671954
    hash_init_value: 1453522225
    
    5235:M 01 Feb 15:19:43.357 # --- CLIENT LIST OUTPUT
    5235:M 01 Feb 15:19:43.358 # id=3 addr=127.0.0.1:51405 fd=6 name= age=65 idle=5 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=5 oll=0 omem=0 events=rw cmd=brpop
    id=4 addr=127.0.0.1:51406 fd=7 name= age=65 idle=5 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=5 oll=0 omem=0 events=rw cmd=brpop
    id=20 addr=127.0.0.1:51422 fd=23 name= age=64 idle=5 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=5 oll=0 omem=0 events=rw cmd=brpop
    id=21 addr=127.0.0.1:51423 fd=24 name= age=64 idle=5 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=5 oll=0 omem=0 events=rw cmd=brpop
    id=24 addr=127.0.0.1:51443 fd=26 name= age=35 idle=35 flags=N db=0 sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=subscribe
    id=25 addr=127.0.0.1:51444 fd=27 name= age=35 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=12749 qbuf-free=20021 obl=0 oll=271685 omem=6059850408 events=rw cmd=lrange
    id=10 addr=127.0.0.1:51412 fd=13 name= age=64 idle=2 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=11 addr=127.0.0.1:51413 fd=14 name= age=64 idle=3 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=12 addr=127.0.0.1:51414 fd=15 name= age=64 idle=3 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=18 addr=127.0.0.1:51420 fd=21 name= age=64 idle=3 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=19 addr=127.0.0.1:51421 fd=22 name= age=64 idle=3 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=22 addr=127.0.0.1:51424 fd=25 name= age=61 idle=3 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=13 addr=127.0.0.1:51415 fd=16 name= age=64 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=14 addr=127.0.0.1:51416 fd=17 name= age=64 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=15 addr=127.0.0.1:51417 fd=18 name= age=64 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=16 addr=127.0.0.1:51418 fd=19 name= age=64 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=5 oll=0 omem=0 events=rw cmd=watch
    id=17 addr=127.0.0.1:51419 fd=20 name= age=64 idle=4 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=5 addr=127.0.0.1:51407 fd=8 name= age=65 idle=4 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=6 addr=127.0.0.1:51408 fd=9 name= age=65 idle=4 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=7 addr=127.0.0.1:51409 fd=10 name= age=65 idle=4 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=2 addr=127.0.0.1:51404 fd=5 name= age=65 idle=4 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=8 addr=127.0.0.1:51410 fd=11 name= age=64 idle=4 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    id=9 addr=127.0.0.1:51411 fd=12 name= age=64 idle=5 flags=b db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=brpop
    
    5235:M 01 Feb 15:19:43.358 # --- CURRENT CLIENT INFO
    5235:M 01 Feb 15:19:43.359 # Out Of Memory allocating 115 bytes!
    5235:M 01 Feb 15:19:43.359 # ------------------------------------------------
    5235:M 01 Feb 15:19:43.359 # !!! Software Failure. Press left mouse button to continue
    5235:M 01 Feb 15:19:43.359 # Guru Meditation: "Redis aborting for OUT OF MEMORY" #redis.c:3560
    5235:M 01 Feb 15:19:43.359 # (forcing SIGSEGV in order to print the stack trace)
    5235:M 01 Feb 15:19:43.359 # --------------------
    
    10 条回复    2016-02-04 13:48:33 +08:00
    c4pt0r
        1
    c4pt0r  
       2016-02-01 16:16:57 +08:00   ❤️ 1
    OOM 了你能有什么办法..

    要么就设定 max-memory 然后指定 lru 策略,要么该架构往分布式上走
    要么就换个大内存的机器...
    msg7086
        2
    msg7086  
       2016-02-01 16:21:06 +08:00   ❤️ 1
    内存不足了还能干啥……要么加内存要么限制占用咯。
    要问解决方案的话你至少贴给 free 出来看看啊……
    msg7086
        3
    msg7086  
       2016-02-01 16:24:39 +08:00   ❤️ 1
    used_memory:6065698552
    used_memory_human:5.65G
    used_memory_rss:5629136896
    used_memory_peak:6065698552
    used_memory_peak_human:5.65G

    目测就是内存吃完了。你干了什么?
    ToughGuy
        4
    ToughGuy  
       2016-02-01 17:02:00 +08:00   ❤️ 1
    sysctl -w vm.overcommit_memory=1
    c4pt0r
        5
    c4pt0r  
       2016-02-02 11:04:07 +08:00
    @ToughGuy 这个是头痛医头脚疼医脚吧。。。
    ToughGuy
        6
    ToughGuy  
       2016-02-03 17:35:05 +08:00
    @c4pt0r
    官网有说哦

    http://redis.io/topics/faq

    Redis background saving schema relies on the copy-on-write semantic of fork in modern operating systems: Redis forks (creates a child process) that is an exact copy of the parent. The child process dumps the DB on disk and finally exits. In theory the child should use as much memory as the parent being a copy.

    简单说就是不开启 overcommit_memory 在 save 的时候 redis 内存用量理论上为原来的两倍。
    ToughGuy
        7
    ToughGuy  
       2016-02-03 17:40:56 +08:00
    @ToughGuy

    简单说就是不开启 overcommit_memory , 而在 save 的时候 redis 内存用量理论上为原来的两倍, 这个时候就会内存不足。

    不过我也知识提到一种可能性, 楼主的问题是不是这个我也不清楚。
    c4pt0r
        8
    c4pt0r  
       2016-02-04 12:21:50 +08:00
    @ToughGuy
    1. 首先你要理解为什么是两倍,什么情况下会两倍?
    2. 而且,开 overcommit_memory 就不会两倍吗?

    对于第一个问题, redis 利用操作系统的 cow 特性, fork 出来然后 dump 内存,当且仅当这段时间被改写的 key 会被操作系统 cow ,老的版本的 redis 的 dict 实现会频繁的 rehash ,很容易就会导致新的进程的 dict 和老进程的 dict 完全不一样,操作系统不得不拷贝。但在 redis 2.6 ?后有一个优化,就是在 bgsave 的过程中,会禁止 rehash ,这段时间内,只有被更新的 key 才会拷贝,一般来说除了你的业务比较变态,在 bgsave 的过程中不停的从头到尾修改所有的 kv pairs ,否则基本不会两倍。

    第二个问题, overcommit_memory 只能说让操作系统给你分配所有的物理内存, malloc 永远不给你报错,这在生产环境中显然是掩耳盗铃的策略,你物理内存就 1g ,你非存 10g 的数据,还指望操作系统能帮你,这样其实大大的增加了系统的不稳定性,仍然避免不了 OOM kill 。

    lz 这个问题,只有一个解,那就是规划好内存使用,设定 maxmemory 。
    c4pt0r
        9
    c4pt0r  
       2016-02-04 12:22:51 +08:00
    @ToughGuy 还有我觉得官网的 http://redis.io/topics/admin 相关段落是有点误人子弟的,尽信书不如无书,多看代码吧。
    ToughGuy
        10
    ToughGuy  
       2016-02-04 13:48:33 +08:00
    @c4pt0r

    你说的对。

    有 copy on write 特性不会出现所谓"两倍"内存.
    楼主问题也不是由于 bgsave 时出现 oom , 所以 vercommit_memory 在这里帮不了楼主

    结论, 我误人子弟。
    关于   ·   帮助文档   ·   博客   ·   API   ·   FAQ   ·   实用小工具   ·   1271 人在线   最高记录 6679   ·     Select Language
    创意工作者们的社区
    World is powered by solitude
    VERSION: 3.9.8.5 · 27ms · UTC 17:33 · PVG 01:33 · LAX 09:33 · JFK 12:33
    Developed with CodeLauncher
    ♥ Do have faith in what you're doing.