Ubuntu Server 24.04 in Practice: Building a Highly Available DNS Load-Balancing System with dnsmasq and Nginx

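1. Environment Preparation and System Configuration

Before building the DNS load-balancing system, make sure the Ubuntu Server 24.04 base environment is ready. This release, code-named Noble Numbat, is the LTS version published in 2024. It normally runs a 6.8.x kernel, which brings noticeable networking improvements. I recommend starting from a fresh installation so that existing services do not cause port conflicts. First check the basic system information:

    cat /etc/os-release
    uname -a

For production I strongly recommend a static IP address. Ubuntu 24.04 configures networking through Netplan: edit /etc/netplan/00-installer-config.yaml (the file name can differ between installations, so check ls /etc/netplan/ first) and run sudo netplan apply to activate the change. Also close firewall ports you do not need and keep only those used later, such as 53 (DNS) and 853 (DNS over TLS).

Updating the system and installing the build dependencies matters as well:

    sudo apt update
    sudo apt upgrade -y
    sudo apt install -y build-essential libpcre3 libpcre3-dev zlib1g-dev libssl-dev

In real deployments I have found that tuning kernel parameters can noticeably improve DNS query performance. Add the following to /etc/sysctl.conf:

    net.core.rmem_max = 4194304
    net.core.wmem_max = 4194304
    net.ipv4.udp_mem = 4096 87380 4194304

Run sudo sysctl -p to apply the values immediately. They raise the UDP buffer limits to something better suited to a DNS service, which helps most under highly concurrent query load. Note that rmem_max and wmem_max are in bytes, while net.ipv4.udp_mem is counted in memory pages, so size that line against the RAM actually available.

2. Installing dnsmasq and Basic Configuration

dnsmasq is a lightweight DNS server and is simple to install:

    sudo apt install -y dnsmasq
    dnsmasq -v

The default configuration is nowhere near enough for production, so we customize it heavily. First back up the original configuration file:

    sudo mv /etc/dnsmasq.conf /etc/dnsmasq.conf.bak

I recommend organizing the new configuration in a modular way. The main file /etc/dnsmasq.conf should hold only the base parameters, and the actual resolution rules are pulled in from separate files. This is my recommended template:

    # Base parameters
    port=53
    no-hosts
    no-poll
    strict-order
    log-queries
    log-facility=/var/log/dnsmasq.log
    local-ttl=300
    cache-size=10000
    listen-address=127.0.0.1,192.168.50.18
    # Modular configuration
    conf-dir=/etc/dnsmasq.d/,*.conf
    addn-hosts=/etc/dnsmasq.d/hosts
    resolv-file=/etc/dnsmasq.d/resolv.upstream

Create the directory and files it refers to:

    sudo mkdir -p /etc/dnsmasq.d/
    sudo touch /etc/dnsmasq.d/hosts /etc/dnsmasq.d/resolv.upstream

The upstream file is deliberately not named resolv.conf: conf-dir=/etc/dnsmasq.d/,*.conf loads every *.conf file in that directory as dnsmasq configuration, and a file containing nameserver lines would fail to parse. On Debian and Ubuntu the packaged service may additionally load every file in /etc/dnsmasq.d through the CONFIG_DIR setting in /etc/default/dnsmasq. If the service refuses to start because it tries to parse the hosts file, comment that setting out or keep the data files in another directory.

When setting the upstream DNS servers in resolv.upstream, configure at least three different public resolvers and test the response time of each. With strict-order they are tried in the order listed:

    nameserver 223.6.6.6
    nameserver 114.114.114.114
    nameserver 8.8.4.4

Local domain records go into the hosts file, in the same format as /etc/hosts:

    192.168.50.18 app1.example.com
    192.168.50.19 app2.example.com

Always run a syntax check before starting the service:

    sudo dnsmasq --test
    sudo systemctl restart dnsmasq
    sudo systemctl enable dnsmasq

If dnsmasq fails to start because port 53 is already in use, the usual cause on Ubuntu is the systemd-resolved stub listener on 127.0.0.53. Section 6 shows the DNSStubListener=no setting that disables it.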
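As far as I know, dnsmasq --test checks the configuration syntax rather than the records inside the addn-hosts file, so a small pre-flight check can catch swapped columns or stray text before a restart. The following is a minimal sketch and not part of dnsmasq: check_hosts_file is a hypothetical helper, it assumes IPv4-only records with the address in the first column, and it only checks the shape of each line, not whether the address is valid.

```shell
#!/bin/bash
# Sketch: sanity-check a hosts-format file before restarting dnsmasq.
# check_hosts_file is a hypothetical helper name, not a dnsmasq tool.

# Prints (to stderr) every line that is neither blank, a comment,
# nor "<IPv4> <name> [<name>...] [# comment]".
# Returns 0 if all lines pass, 1 if any line is malformed, 2 if unreadable.
check_hosts_file() {
    local hosts_file="$1" bad_lines
    if [ ! -r "$hosts_file" ]; then
        echo "cannot read $hosts_file" >&2
        return 2
    fi
    # -v with two patterns selects lines that match neither of them;
    # -n prefixes each reported line with its line number.
    bad_lines=$(grep -nEv \
        -e '^[[:space:]]*(#.*)?$' \
        -e '^([0-9]{1,3}\.){3}[0-9]{1,3}([[:space:]]+[A-Za-z0-9._-]+)+[[:space:]]*(#.*)?$' \
        "$hosts_file")
    if [ -n "$bad_lines" ]; then
        printf '%s\n' "$bad_lines" >&2
        return 1
    fi
    return 0
}
```

A possible use is to chain it in front of the restart, for example check_hosts_file /etc/dnsmasq.d/hosts && sudo systemctl restart dnsmasq, so that a malformed file never reaches the running service.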
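3. Building a Highly Available dnsmasq Cluster

A single dnsmasq node cannot provide high availability, so we deploy at least two. The second server is configured much like the first, with these differences:

- The listen address is changed to that server's own IP.
- The local domain records must be kept in sync.
- A different combination of upstream DNS servers can be used.

I usually combine rsync with inotify-tools to synchronize the configuration files in real time. Install both packages on the two servers:

    sudo apt install -y rsync inotify-tools

Then create the sync script /usr/local/bin/sync_dnsmasq.sh on the primary node:

    #!/bin/bash
    # Watch /etc/dnsmasq.d/ and push every change to the backup node.
    inotifywait -mrq -e modify,create,delete /etc/dnsmasq.d/ | while read -r path action file
    do
        # Mirror the directory, then restart dnsmasq on the backup node.
        rsync -az --delete /etc/dnsmasq.d/ backup-server:/etc/dnsmasq.d/
        ssh backup-server sudo systemctl restart dnsmasq
    done

The script assumes key-based SSH access to backup-server and passwordless sudo for the restart there. Because it mirrors the whole directory, an upstream resolver file that should differ per node has to be excluded from rsync or kept outside /etc/dnsmasq.d/.

Make the script executable and start it at boot:

    sudo chmod +x /usr/local/bin/sync_dnsmasq.sh
    sudo crontab -e

and add this line to the crontab:

    @reboot /usr/local/bin/sync_dnsmasq.sh

To test, use dig to query both servers and verify that the answers match:

    dig @192.168.50.18 example.com
    dig @192.168.50.19 example.com

4. Installing and Tuning OpenResty/Nginx

We use OpenResty rather than stock Nginx. It bundles the Nginx core, including the stream module that provides the UDP and TCP load balancing used here, together with Lua scripting support. Building from source gives full control over the enabled modules:

    wget https://openresty.org/download/openresty-1.25.3.1.tar.gz
    tar zxvf openresty-1.25.3.1.tar.gz
    cd openresty-1.25.3.1
    ./configure --with-http_stub_status_module --with-stream --with-stream_ssl_module --with-stream_realip_module
    make -j$(nproc)
    sudo make install

Verify the version after installation:

    /usr/local/openresty/nginx/sbin/nginx -v

For easier management, create the systemd unit file /etc/systemd/system/openresty.service:

    [Unit]
    Description=OpenResty
    After=network.target

    [Service]
    Type=forking
    PIDFile=/usr/local/openresty/nginx/logs/nginx.pid
    ExecStart=/usr/local/openresty/nginx/sbin/nginx
    ExecReload=/usr/local/openresty/nginx/sbin/nginx -s reload
    ExecStop=/usr/local/openresty/nginx/sbin/nginx -s quit
    PrivateTmp=true

    [Install]
    WantedBy=multi-user.target

Then enable and start the service:

    sudo systemctl daemon-reload
    sudo systemctl enable --now openresty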
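Before relying on the stream module, it can be worth confirming that the installed binary really was built with it. The following is a minimal sketch that assumes the install path used above. has_configure_flag and check_stream_build are hypothetical helper names, and the check relies on nginx -V printing its configure arguments to stderr.

```shell
#!/bin/bash
# Sketch: confirm the installed binary was configured with the stream modules.

NGINX_BIN=/usr/local/openresty/nginx/sbin/nginx   # install path used in this article

# Return 0 if the given `nginx -V` output contains the exact configure flag.
has_configure_flag() {
    local build_info="$1" flag="$2"
    # Put every whitespace-separated token on its own line, then require an
    # exact, fixed-string match so that --with-stream does not match
    # --with-stream_ssl_module.
    printf '%s\n' "$build_info" | tr -s '[:space:]' '\n' | grep -qxF -e "$flag"
}

# Report which of the stream-related flags are present in the installed binary.
check_stream_build() {
    local info flag missing=0
    info=$("$NGINX_BIN" -V 2>&1)   # -V writes to stderr
    for flag in --with-stream --with-stream_ssl_module; do
        if has_configure_flag "$info" "$flag"; then
            echo "found   $flag"
        else
            echo "missing $flag"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_stream_build right after sudo make install gives a quick yes or no. A missing flag means the configure step has to be repeated before any stream block will load.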
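5. Nginx UDP/TCP Load-Balancing Configuration

The stream module is the core of the load balancer. Create a dedicated configuration file, /usr/local/openresty/nginx/conf/stream.conf:

    stream {
        upstream dns_udp {
            server 192.168.50.18:53;
            server 192.168.50.19:53;
        }

        upstream dns_tcp {
            server 192.168.50.18:53;
            server 192.168.50.19:53;
        }

        server {
            listen 53 udp reuseport;
            proxy_pass dns_udp;
            proxy_timeout 3s;
            proxy_responses 1;
            error_log /var/log/nginx/dns_udp.log;
        }

        server {
            listen 53;
            proxy_pass dns_tcp;
            proxy_timeout 3s;
            error_log /var/log/nginx/dns_tcp.log;
        }
    }

In nginx.conf, add the include statement at the top level, outside the http block:

    include /usr/local/openresty/nginx/conf/stream.conf;

A source build does not create /var/log/nginx, so create it first (sudo mkdir -p /var/log/nginx), otherwise the configuration test fails on the error_log paths.

The key parameters:

- reuseport creates a separate listening socket for each worker, which improves UDP performance.
- proxy_timeout controls the query timeout. Three seconds is usually enough for DNS.
- proxy_responses is set to 1 because a UDP DNS query expects exactly one response datagram.

Test the configuration and reload:

    sudo /usr/local/openresty/nginx/sbin/nginx -t
    sudo /usr/local/openresty/nginx/sbin/nginx -s reload

6. System Testing and Performance Tuning

Once the configuration is complete, test the whole system. From a client, run the following, where nginx-ip stands for the address of the load balancer:

    # UDP query test
    dig @nginx-ip example.com
    # TCP query test
    dig @nginx-ip example.com +tcp
    # Observe the load balancing behaviour
    for i in {1..10}; do dig @nginx-ip example.com +short; done

Watch the logs on the servers:

    tail -f /var/log/dnsmasq.log /var/log/nginx/dns_udp.log

For performance testing, use dnsperf:

    # Install
    sudo apt install -y dnsperf
    # Prepare the query file
    echo "example.com A" > queries.txt
    # Run the test for 30 seconds at no more than 100 queries per second
    dnsperf -s nginx-ip -d queries.txt -l 30 -Q 100

Depending on the results, you may need to adjust:

- dnsmasq's cache-size and local-ttl
- Nginx's worker_processes and worker_connections
- the kernel UDP buffer sizes

I also suggest using systemd-resolved as a local cache, which can take a good deal of load off dnsmasq. Edit /etc/systemd/resolved.conf:

    [Resolve]
    DNS=127.0.0.1
    Cache=yes
    DNSStubListener=no

Then restart the service:

    sudo systemctl restart systemd-resolved

With DNSStubListener=no, systemd-resolved no longer listens on 127.0.0.53, which frees port 53 for dnsmasq. Its cache is then used only by programs that resolve through systemd-resolved itself (for example via nss-resolve), so make sure /etc/resolv.conf does not still point at 127.0.0.53.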
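For the dnsperf step, a query file that repeats a single name mostly measures cache hits. One way to get a more varied file is to derive it from the local records. This is a minimal sketch, assuming a hosts-format file with the address in the first column and dnsperf's one "name type" pair per line input format. hosts_to_queries is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: build a dnsperf query file from a hosts-format file.
# hosts_to_queries is a hypothetical helper name.

# Emit one "<name> A" line for every hostname in the file.
# Column 1 is the address; the remaining columns are names.
hosts_to_queries() {
    local hosts_file="$1"
    awk '
        NF == 0 || $1 ~ /^#/ { next }        # skip blank lines and comments
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break         # ignore a trailing comment
                print $i, "A"
            }
        }
    ' "$hosts_file" | LC_ALL=C sort -u       # de-duplicate, stable ordering
}
```

For example, hosts_to_queries /etc/dnsmasq.d/hosts > queries.txt produces a file that can be passed to dnsperf with -d queries.txt. Names that only exist upstream would need to be appended by hand.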
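7. Monitoring and Maintenance

A production environment needs proper monitoring. I recommend the Prometheus + Grafana combination, with the dnsmasq exporter collecting the metrics:

    wget https://github.com/google/dnsmasq_exporter/releases/download/v0.2.3/dnsmasq_exporter-0.2.3.linux-amd64.tar.gz
    tar zxvf dnsmasq_exporter-*.tar.gz
    sudo mv dnsmasq_exporter /usr/local/bin/

Create a systemd unit file for it:

    [Unit]
    Description=Dnsmasq Exporter
    After=network.target

    [Service]
    ExecStart=/usr/local/bin/dnsmasq_exporter --dnsmasq.server=127.0.0.1:53
    Restart=always

    [Install]
    WantedBy=multi-user.target

Flag names can differ between exporter versions, so confirm them with dnsmasq_exporter --help before enabling the unit.

Nginx metrics are available through the stub_status module. Place this location inside a server block in the http context:

    location /nginx_status {
        stub_status;
        allow 127.0.0.1;
        deny all;
    }

Keep in mind that stub_status reports HTTP connection counters, so it reflects the health of the Nginx process rather than the DNS query volume passing through the stream block.

For routine maintenance, clearing the cache regularly is important. I restart the services weekly with a cron job. The line below uses the system crontab format with a user field, so it belongs in /etc/crontab or in a file under /etc/cron.d/:

    0 3 * * 1 root systemctl restart dnsmasq openresty

An example log rotation configuration, /etc/logrotate.d/dnsmasq:

    /var/log/dnsmasq.log {
        daily
        missingok
        rotate 7
        compress
        delaycompress
        notifempty
        postrotate
            # dnsmasq reopens its log file on SIGUSR2; SIGHUP (reload) does not do that
            systemctl kill -s USR2 dnsmasq.service
        endscript
    }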
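Since log-queries is enabled, the rotated query log is also a cheap source of insight between Prometheus scrapes. The following is a minimal sketch that lists the most frequently queried names. It assumes the usual dnsmasq log line shape of "query[TYPE] name from client", and top_queried_names is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: list the most frequently queried names in a dnsmasq query log.
# top_queried_names is a hypothetical helper name.

# Usage: top_queried_names <log file> [how many names, default 10]
# Prints "<count> <name>" lines, highest count first.
top_queried_names() {
    local log_file="$1" limit="${2:-10}"
    awk '
        {
            # Assumed line shape: ... query[TYPE] <name> from <client>
            for (i = 1; i < NF; i++) {
                if ($i ~ /^query\[/) { count[$(i + 1)]++; break }
            }
        }
        END { for (name in count) print count[name], name }
    ' "$log_file" | LC_ALL=C sort -k1,1nr -k2,2 | head -n "$limit"
}
```

For example, top_queried_names /var/log/dnsmasq.log 20 shows the twenty busiest names, which is a reasonable starting point when deciding whether cache-size or local-ttl needs adjusting.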