LittleJake's Blog
Jake Liu
© 2024 LittleJake's Blog.

CentOS production server: watchdog: BUG: soft lockup

Background

The server runs an NFS file system and programs that read and write files frequently
Memory: 256 GB
CPU: Intel(R) Xeon(R) Gold 5118 2.30GHz
Kernel version: 3.10.0_693

The problem

The system frequently reports watchdog errors:
soft lockup - CPU#X stuck for 22s! [java:PID]
A kernel problem occurred, but your kernel has been tainted.
BUG1
BUG2
Based on these messages, a kernel fault is suspected.
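To see which CPUs and processes are hit most often, the reports can be tallied. The following is a minimal sketch, assuming the log lines have the shape shown above and that process names contain no spaces; the function name `summarize_lockups` and the log path in the usage comment are illustrative, not taken from the original setup.

```shell
#!/usr/bin/env bash
# Tally soft-lockup reports read from stdin as "count CPU process".
# Assumes lines shaped like:
#   ... soft lockup - CPU#3 stuck for 22s! [java:1234]
summarize_lockups() {
  sed -n 's/.*soft lockup - CPU#\([0-9][0-9]*\) stuck for [0-9][0-9]*s! \[\([^]]*\):[0-9][0-9]*].*/CPU\1 \2/p' |
    sort | uniq -c | sort -rn | awk '{ print $1, $2, $3 }'
}

# Example usage (the log path is an assumption; adjust for your system):
#   grep 'soft lockup' /var/log/messages | summarize_lockups
```

A result dominated by one process, such as the java process named in the report, suggests the lockups follow a particular workload rather than a particular core.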

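Temporary workaround

Raise the kernel's watchdog threshold so that the monitoring window exceeds 22 seconds

Checking root's mail at /var/spool/mail/root showed that a large number of log messages and stack traces had been dumped there

sar -B shows memory paging activity
Reclaim buff/cache early, without waiting for free memory to drop to the min watermark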
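The kernel documentation describes the soft-lockup threshold as twice `kernel.watchdog_thresh` (default 10, so 20 seconds), which is consistent with reports of "stuck for 22s". A small sketch for inspecting and raising it follows; the helper name `soft_lockup_threshold` and the value 30 are illustrative assumptions, not the setting used on the original server.

```shell
#!/usr/bin/env bash
# Soft-lockup threshold in seconds, documented as 2 * kernel.watchdog_thresh.
soft_lockup_threshold() {
  echo $(( 2 * $1 ))
}

thresh_file=/proc/sys/kernel/watchdog_thresh
if [ -r "$thresh_file" ]; then
  current=$(cat "$thresh_file")
  echo "watchdog_thresh=${current}s, soft lockup reported after $(soft_lockup_threshold "$current")s"
fi

# To raise it until the next reboot (needs root; 30 is an illustrative value):
#   sysctl -w kernel.watchdog_thresh=30
```

Raising the threshold only reduces how often the warning fires; it does not remove whatever keeps the CPU busy in kernel mode.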
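The page-scan columns that `sar -B` reports (`pgscank/s`, `pgscand/s`) are, as far as I know, derived from counters in `/proc/vmstat`. The sketch below sums those counters directly; the function name `reclaim_scan_totals` is hypothetical, and the prefix matching is an assumption meant to cover both per-zone counter names (as on 3.10 kernels) and the newer aggregated ones.

```shell
#!/usr/bin/env bash
# Sum the kswapd and direct-reclaim page-scan counters from
# /proc/vmstat-style text on stdin.
reclaim_scan_totals() {
  awk '
    $1 == "pgscan_direct_throttle" { next }   # a throttle count, not pages scanned
    $1 ~ /^pgscan_kswapd/ { kswapd += $2 }
    $1 ~ /^pgscan_direct/ { direct += $2 }
    END { printf "pgscan_kswapd=%.0f pgscan_direct=%.0f\n", kswapd, direct }
  '
}

# Example usage:
#   sar -B 1 5                           # five one-second samples
#   reclaim_scan_totals < /proc/vmstat   # cumulative totals since boot
```

A fast-growing direct-reclaim total means application threads are reclaiming memory themselves instead of leaving it to kswapd, which is worth ruling out when threads appear stuck in the kernel.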

sync && echo 2 > /proc/sys/vm/drop_caches

Use with caution on database nodes
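Writing 2 to `drop_caches` asks the kernel to free reclaimable slab objects such as dentries and inodes (1 frees the page cache, 3 frees both). To check what the command actually released, the relevant `/proc/meminfo` fields can be compared before and after. This is a sketch; the function name `cache_snapshot` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the Cached and SReclaimable values (kB) from
# /proc/meminfo-style text on stdin.
cache_snapshot() {
  awk '$1 == "Cached:" || $1 == "SReclaimable:" { printf "%s %s\n", $1, $2 }'
}

# Example usage (the write to drop_caches needs root):
#   cache_snapshot < /proc/meminfo
#   sync && echo 2 > /proc/sys/vm/drop_caches
#   cache_snapshot < /proc/meminfo
```

Because dropped caches have to be refilled from disk or over NFS, latency-sensitive services such as databases tend to slow down noticeably afterwards.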
sysctl
vm.extra_free_kbytes
vm.vfs_cache_pressure
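Before changing either parameter it helps to record the current values. The sketch below reads them through `/proc/sys` and reports a key as unavailable if the running kernel does not expose it, which may be the case for `vm.extra_free_kbytes` on some kernels. The helper name `sysctl_path` is hypothetical.

```shell
#!/usr/bin/env bash
# Map a sysctl key such as vm.vfs_cache_pressure to its /proc/sys path.
sysctl_path() {
  echo "/proc/sys/$(echo "$1" | tr . /)"
}

for key in vm.extra_free_kbytes vm.vfs_cache_pressure; do
  path=$(sysctl_path "$key")
  if [ -r "$path" ]; then
    echo "$key = $(cat "$path")"
  else
    echo "$key is not available on this kernel"
  fi
done
```

A `vm.vfs_cache_pressure` value above the default of 100 makes the kernel reclaim dentry and inode caches more eagerly, which matches the slab caches this investigation looks at.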

slabtop shows slab caches sorted by memory usage
dentry
nfs_inode_cache (caused by frequent reads and writes of files on NFS)
radix_tree_node
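For a non-interactive view, `slabtop -o -s c` prints the list once, sorted by cache size. A similar ranking can be approximated from `/proc/slabinfo` as object count times object size. This is a rough sketch that ignores per-slab overhead; the function name `slab_usage_kb` is hypothetical.

```shell
#!/usr/bin/env bash
# Rank slab caches by approximate size in KB (num_objs * objsize / 1024),
# reading /proc/slabinfo-style text on stdin.
slab_usage_kb() {
  awk '
    $1 == "slabinfo" || $1 ~ /^#/ { next }    # skip the two header lines
    { printf "%s %.0f\n", $1, $3 * $4 / 1024 }
  ' | sort -k2,2nr
}

# Example usage (reading /proc/slabinfo usually needs root):
#   slab_usage_kb < /proc/slabinfo | head
#   slabtop -o -s c | head -n 20
```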

Further investigation pointed to an unknown kernel bug.
One temporary workaround is:

sysctl -w vm.zone_reclaim_mode=1

https://blog.csdn.net/fireroll/article/details/23668505
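`vm.zone_reclaim_mode` is a bitmask: per the kernel documentation, 1 turns zone reclaim on, 2 additionally lets it write out dirty pages, and 4 lets it swap pages. The sketch below decodes a value; the function name `describe_zone_reclaim` and the file name in the persistence comment are hypothetical.

```shell
#!/usr/bin/env bash
# Decode a vm.zone_reclaim_mode bitmask into its documented flags.
describe_zone_reclaim() {
  local mode=$1 parts=""
  if [ $(( mode & 1 )) -ne 0 ]; then parts="$parts reclaim"; fi
  if [ $(( mode & 2 )) -ne 0 ]; then parts="$parts write-dirty"; fi
  if [ $(( mode & 4 )) -ne 0 ]; then parts="$parts swap"; fi
  parts=${parts# }          # drop the leading space, if any
  echo "${parts:-off}"
}

mode_file=/proc/sys/vm/zone_reclaim_mode
if [ -r "$mode_file" ]; then
  echo "zone_reclaim_mode: $(describe_zone_reclaim "$(cat "$mode_file")")"
fi

# sysctl -w only lasts until reboot. To persist it (needs root; the file
# name is an illustrative choice):
#   echo 'vm.zone_reclaim_mode = 1' > /etc/sysctl.d/99-zone-reclaim.conf
```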


Title: CentOS production server: watchdog: BUG: soft lockup

Author: Jake Liu

Origin:

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) For any re-post you must give appropriate credit.


Tag:centos, watchdog, soft lockup, BUG
