Pitfalls of apt automatic updates: a field report
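A few days ago, while checking on my VPS, I found myself locked out. I tried everything from key pairs to logging in from a different IP, and nothing worked. Then I remembered that VNC rescue login was available. Once inside, I saw that the ssh service was down. Luckily there was VNC, otherwise reinstalling the system would have been the only option.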
Finding the cause
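This should not have happened, though. I had already run systemctl enable ssh so that the ssh service starts on boot, and with nothing else interfering there was no reason for it to go down. My first guess was that an attacker had broken in through ssh, obtained sudo-group privileges, and stopped the ssh service by hand, so I checked the ssh log: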
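    sudo journalctl -u ssh | grep "Accepted"
    Mar 08 23:33:10 sshd: Accepted publickey for ** from *.*.*.* port **

The most recent login was my own. I also checked the Docker containers and a few other files and found no sign of tampering, so an intrusion could be largely ruled out. But what else has enough privilege to stop the ssh service? Unless I tracked that down, I could end up locked out again.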
So I went back to the ssh log to find the moments at which ssh had been stopped:
    # See when the service was stopped
    sudo journalctl -u ssh | grep -i "stopping"
    Jan 24 18:09:28 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
    Jan 24 18:30:19 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
    Jan 24 18:42:04 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
    Jan 24 18:46:20 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
    Jan 25 17:31:02 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
    Jan 29 06:24:13 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
    Feb 04 06:34:40 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
    Mar 15 06:06:23 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...

The most recent stop was at Mar 15 06:06:23, so I started investigating around that point in time to see what was responsible.
    sudo journalctl --since "2026-03-15 06:00" --until "2026-03-15 06:10" | grep -i "ssh\|kill\|oom\|memory"
    Mar 15 06:06:23 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
    Mar 15 06:06:23 sshd[273849]: Received signal 15; terminating.
    Mar 15 06:06:23 systemd[1]: ssh.service: Deactivated successfully.
    Mar 15 06:06:23 systemd[1]: Stopped ssh.service - OpenBSD Secure Shell server.

sshd received signal 15 (SIGTERM) while systemd was stopping the unit, and there is no OOM or kill entry in the window. In other words, the system itself shut ssh down, which supports the earlier conclusion: this was not an attack.
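The "find the stop time, then look around it" step can be scripted. The following is only a minimal sketch: stop_times is a hypothetical helper name, and it assumes journal lines in the short format shown in the excerpts (month, day, time as the first three fields). The sample input is copied from the log above so the helper can be tried without access to the journal.

```shell
#!/bin/sh
# Hypothetical helper: read journal lines on stdin and print the timestamp
# (month, day, time) of every "Stopping <unit>" entry.
stop_times() {
  unit="$1"
  grep -F "Stopping ${unit}" | awk '{ print $1, $2, $3 }'
}

# Sample input copied from the log excerpt in this post.
sample='Mar 15 06:06:23 systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
Mar 15 06:06:23 sshd[273849]: Received signal 15; terminating.
Mar 15 06:06:23 systemd[1]: ssh.service: Deactivated successfully.'

printf '%s\n' "$sample" | stop_times ssh.service   # prints: Mar 15 06:06:23

# Against the real journal (needs systemd and suitable privileges):
#   sudo journalctl -u ssh | stop_times ssh.service | tail -n 1
```

The last timestamp it prints is the one to feed into journalctl --since / --until when narrowing down the window.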
Having established that the system terminated the ssh service, the next step was to find out which program had effectively run systemctl stop ssh.
Right around Mar 15 06:06:23, the daily apt upgrade service turned out to be running:
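    Mar 15 06:06:44 systemd[1]: apt-daily-upgrade.service: Consumed 41.064s CPU time, 415.0M memory peak, 0B memory swap peak.

So I strongly suspected that apt had stopped ssh in order to update it, and that ssh then, abnormally, was not started again afterwards. I checked apt's update log next: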
    Start-Date: 2026-03-15 06:06:23
    Commandline: /usr/bin/unattended-upgrade
    Upgrade: openssh-client:amd64 (1:9.6p1-3ubuntu13.14, 1:9.6p1-3ubuntu13.15), openssh-server:amd64 (1:9.6p1-3ubuntu13.14, 1:9.6p1-3ubuntu13.15), openssh-sftp-server:amd64 (1:9.6p1-3ubuntu13.14, 1:9.6p1-3ubuntu13.15)
    End-Date: 2026-03-15 06:06:26

Sure enough, between 06:06:23 and 06:06:26 it upgraded openssh-client, openssh-server, and openssh-sftp-server. Case closed: apt terminated ssh and locked the door behind it.
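Looking for the matching transaction by eye gets tedious on a long log. Below is a rough sketch of a filter for it; apt_transactions_for is a hypothetical helper name, and it assumes the usual layout of /var/log/apt/history.log, where each transaction is a block of lines separated from the next by a blank line. The sample input is the entry quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read an apt history log on stdin and print the
# Start-Date, Commandline and End-Date of every transaction that mentions
# the given package name.
apt_transactions_for() {
  pkg="$1"
  # RS='' switches awk to paragraph mode: one blank-line-separated block per record.
  awk -v RS='' -v pkg="$pkg" '
    index($0, pkg) {
      n = split($0, lines, "\n")
      for (i = 1; i <= n; i++)
        if (lines[i] ~ /^(Start-Date|Commandline|End-Date):/) print lines[i]
      print ""
    }'
}

# Sample input: the transaction quoted in this post.
history='Start-Date: 2026-03-15 06:06:23
Commandline: /usr/bin/unattended-upgrade
Upgrade: openssh-client:amd64 (1:9.6p1-3ubuntu13.14, 1:9.6p1-3ubuntu13.15), openssh-server:amd64 (1:9.6p1-3ubuntu13.14, 1:9.6p1-3ubuntu13.15), openssh-sftp-server:amd64 (1:9.6p1-3ubuntu13.14, 1:9.6p1-3ubuntu13.15)
End-Date: 2026-03-15 06:06:26'

printf '%s\n' "$history" | apt_transactions_for openssh-server

# Against the real log:
#   apt_transactions_for openssh-server < /var/log/apt/history.log
```

Comparing the Start-Date values it prints with the "Stopping ssh.service" timestamps from the journal shows quickly whether each stop lines up with an upgrade.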
Reconstructing what happened
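At 06:06:23, apt automatically updated ssh. It has enough privilege to stop ssh and then run the ssh upgrade. For reasons not yet clear, though, nothing triggered a restart after the stop, and ssh stayed down for good.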
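To dig further into why ssh was not restarted: on Ubuntu, updates are normally installed automatically by unattended-upgrade, which uses dpkg to do the actual installation. So I checked unattended-upgrades-dpkg.log: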
    Log started: 2026-03-15 06:06:22
    Preconfiguring packages ...
    Preconfiguring packages ...
    (Reading database ... 86180 files and directories currently installed.)
    Preparing to unpack .../openssh-sftp-server_1%3a9.6p1-3ubuntu13.15_amd64.deb ...
    Unpacking openssh-sftp-server (1:9.6p1-3ubuntu13.15) over (1:9.6p1-3ubuntu13.14) ...
    Preparing to unpack .../openssh-server_1%3a9.6p1-3ubuntu13.15_amd64.deb ...
    Unpacking openssh-server (1:9.6p1-3ubuntu13.15) over (1:9.6p1-3ubuntu13.14) ...
    Preparing to unpack .../openssh-client_1%3a9.6p1-3ubuntu13.15_amd64.deb ...
    Unpacking openssh-client (1:9.6p1-3ubuntu13.15) over (1:9.6p1-3ubuntu13.14) ...
    Setting up openssh-client (1:9.6p1-3ubuntu13.15) ...
    Setting up openssh-sftp-server (1:9.6p1-3ubuntu13.15) ...
    Setting up openssh-server (1:9.6p1-3ubuntu13.15) ...
    Processing triggers for man-db (2.12.0-4build2) ...
    Processing triggers for ufw (0.36.2-6) ...
    Running kernel seems to be up-to-date.
    Restarting services...
    Service restarts being deferred:
     systemctl restart apt-daily-upgrade.service
     /etc/needrestart/restart.d/dbus.service
     systemctl restart docker.service
     systemctl restart getty@tty1.service
     systemctl restart networkd-dispatcher.service
     systemctl restart serial-getty@ttyS0.service
     systemctl restart systemd-logind.service
     systemctl restart unattended-upgrades.service
    No containers need to be restarted.
    No user sessions are running outdated binaries.

Note the "Service restarts being deferred" part: the restarts of all the listed services are postponed, but in the end no restart ever happened. That is what ultimately left ssh dead.
Solution
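After unattended-upgrade has installed updates, needrestart checks which services or processes need to be restarted and reports them. So one fix is to change the needrestart configuration and put the ssh service on a whitelist, so that it is not restarted after an update.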
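Edit the configuration file /etc/needrestart/needrestart.conf and find

    $nrconf{override_rc} = {...}

Add qr(^ssh) => 0, inside it, that is:

    $nrconf{override_rc} = { qr(^ssh) => 0,};

The 0 here means that automatic restarts are forbidden for any service matching the pattern. With this in place, ssh is no longer restarted after an automatic update.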
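Instead of editing the main file, the same rule could probably live in a drop-in. The sketch below rests on an assumption worth checking on your own system: that the packaged needrestart.conf loads extra snippets from /etc/needrestart/conf.d/*.conf. The helper name write_ssh_override and the file name 50-ssh-no-restart.conf are made up for illustration, and the target directory is a parameter so the result can be inspected in a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: write a needrestart drop-in that marks ssh as
# "do not restart automatically" into the directory given as $1.
write_ssh_override() {
  conf_dir="$1"
  mkdir -p "$conf_dir" || return 1
  # Quoted heredoc delimiter: keep $nrconf literal instead of letting the shell expand it.
  cat > "$conf_dir/50-ssh-no-restart.conf" <<'EOF'
# Same rule as the override_rc entry in needrestart.conf: 0 = never restart automatically.
$nrconf{override_rc}{qr(^ssh)} = 0;
EOF
}

# Try it in a scratch directory and look at the result.
scratch="$(mktemp -d)"
write_ssh_override "$scratch"
cat "$scratch/50-ssh-no-restart.conf"

# If the conf.d assumption holds on your system (run as root):
#   write_ssh_override /etc/needrestart/conf.d
```

The drop-in adds a single key to the existing override_rc hash rather than replacing the whole hash, so the entries shipped in the main file stay untouched.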
For safety, though, it is still advisable to restart the ssh service manually after an update.
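That manual check can be wrapped in a few lines. This is a sketch only, assuming the unit is called ssh as in the logs above; ensure_running, the SYSTEMCTL override, and fake_systemctl are hypothetical names introduced here so the logic can be dry-run on a machine without systemd.

```shell
#!/bin/sh
# Hypothetical helper: restart a unit only if it is not currently active.
# SYSTEMCTL may be overridden to point at a stand-in for dry runs.
ensure_running() {
  unit="$1"
  ctl="${SYSTEMCTL:-systemctl}"
  if "$ctl" is-active --quiet "$unit"; then
    echo "$unit is active"
  else
    echo "$unit is not active, restarting"
    "$ctl" restart "$unit"
  fi
}

# Dry run: a stand-in that reports every unit as inactive and only echoes
# what the real systemctl would have been asked to do.
fake_systemctl() {
  case "$1" in
    is-active) return 3 ;;
    *) echo "would run: systemctl $*" ;;
  esac
}

SYSTEMCTL=fake_systemctl
ensure_running ssh

# Real use (as root, with SYSTEMCTL unset):
#   ensure_running ssh
```

Running it by hand after an upgrade, or from whatever scheduled job you already trust, is a design choice left open here; the point is only that the check and the restart are one step.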
忆枫の博客