数据库死锁分析与解决实战

数据库死锁分析与解决实战 数据库死锁分析与解决实战一、死锁概述死锁是指两个或多个事务在执行过程中因互相等待对方持有的锁而陷入无限等待的状态。1.1 死锁条件条件说明互斥资源只能被一个事务占用请求与保持事务持有资源并请求新资源不可剥夺资源不能被强制剥夺循环等待事务形成循环等待链1.2 死锁示例事务A: BEGIN; UPDATE accounts SET balance balance - 100 WHERE id 1; 事务B: BEGIN; UPDATE accounts SET balance balance - 100 WHERE id 2; 事务A: UPDATE accounts SET balance balance 100 WHERE id 2; -- 等待事务B 事务B: UPDATE accounts SET balance balance 100 WHERE id 1; -- 等待事务A 结果死锁二、死锁检测2.1 MySQL死锁检测-- 查看当前死锁 SHOW ENGINE INNODB STATUS\G -- 查看锁等待 SELECT * FROM performance_schema.data_locks; SELECT * FROM performance_schema.data_lock_waits; -- 查看事务 SELECT * FROM information_schema.INNODB_TRX;2.2 死锁日志分析------------------------ LATEST DETECTED DEADLOCK ------------------------ 2024-01-15 10:00:00 0x7f1234567890 *** (1) TRANSACTION: TRANSACTION 123456, ACTIVE 10 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 2 lock struct(s), heap size 1136, 1 row lock(s) MySQL thread id 100, OS thread handle 0x7f1234567890, query id 500 localhost user updating UPDATE accounts SET balance balance 100 WHERE id 2 *** (2) TRANSACTION: TRANSACTION 123457, ACTIVE 10 sec starting index read mysql tables in use 1, locked 1 3 lock struct(s), heap size 1136, 1 row lock(s) MySQL thread id 101, OS thread handle 0x7f1234567891, query id 501 localhost user updating UPDATE accounts SET balance balance 100 WHERE id 1 *** WE ROLL BACK TRANSACTION (1)三、死锁预防3.1 统一访问顺序// 按主键顺序访问资源 public void transfer(Long fromId, Long toId, BigDecimal amount) { Long firstId Math.min(fromId, toId); Long secondId Math.max(fromId, toId); try (Connection conn getConnection()) { conn.setAutoCommit(false); // 先更新ID较小的记录 String sql1 UPDATE accounts SET balance balance - ? WHERE id ?; try (PreparedStatement pstmt conn.prepareStatement(sql1)) { pstmt.setBigDecimal(1, amount); pstmt.setLong(2, firstId); pstmt.executeUpdate(); } // 再更新ID较大的记录 String sql2 UPDATE accounts SET balance balance ? WHERE id ?; try (PreparedStatement pstmt conn.prepareStatement(sql2)) { pstmt.setBigDecimal(1, amount); pstmt.setLong(2, secondId); pstmt.executeUpdate(); } conn.commit(); } catch (SQLException e) { // 回滚处理 } }3.2 缩短事务长度Transactional public void processOrder(Order order) { // 1. 验证订单 validateOrder(order); // 2. 更新库存使用独立事务或延迟更新 updateInventoryAsync(order); // 3. 创建订单记录 orderRepository.save(order); // 4. 发送通知异步处理 notificationService.sendAsync(order); }3.3 使用乐观锁Transactional public boolean updateWithOptimisticLock(Long id, BigDecimal amount) { Account account accountRepository.findById(id).orElseThrow(); if (account.getVersion() ! expectedVersion) { throw new OptimisticLockingFailureException(Data has been modified); } account.setBalance(account.getBalance().add(amount)); account.setVersion(account.getVersion() 1); accountRepository.save(account); return true; }四、死锁解决4.1 超时机制public void executeWithTimeout(SupplierVoid operation, long timeoutMs) { ExecutorService executor Executors.newSingleThreadExecutor(); FutureVoid future executor.submit(() - { operation.get(); return null; }); try { future.get(timeoutMs, TimeUnit.MILLISECONDS); } catch (TimeoutException e) { future.cancel(true); throw new RuntimeException(Operation timed out); } catch (Exception e) { throw new RuntimeException(Operation failed, e); } finally { executor.shutdown(); } }4.2 重试机制Retryable(maxAttempts 3, backoff Backoff(delay 100, multiplier 2)) Transactional public void transferWithRetry(Long fromId, Long toId, BigDecimal amount) { // 转账逻辑 accountRepository.debit(fromId, amount); accountRepository.credit(toId, amount); }4.3 死锁检测与回滚public void executeWithDeadlockRetry(Runnable operation) { int attempts 0; int maxAttempts 3; long backoffMs 100; while (attempts maxAttempts) { try { operation.run(); return; } catch (SQLException e) { if (isDeadlockError(e) attempts maxAttempts - 1) { attempts; try { Thread.sleep(backoffMs * attempts); } catch (InterruptedException ie) { Thread.currentThread().interrupt(); throw new RuntimeException(ie); } } else { throw e; } } } }五、最佳实践5.1 索引优化-- 确保WHERE条件有索引 CREATE INDEX idx_accounts_id ON accounts(id); -- 复合索引优化 CREATE INDEX idx_orders_user_status ON orders(user_id, status); -- 避免全表扫描 EXPLAIN SELECT * FROM orders WHERE status PENDING;5.2 隔离级别选择// 使用合适的隔离级别 Transactional(isolation Isolation.READ_COMMITTED) public void processTransaction() { // 业务逻辑 }隔离级别说明死锁风险READ_UNCOMMITTED最低级别低READ_COMMITTED读已提交中REPEATABLE_READ可重复读高SERIALIZABLE串行化最高5.3 监控告警# Prometheus告警规则 groups: - name: database.rules rules: - alert: DeadlockDetected expr: rate(innodb_deadlocks_total[5m]) 0 for: 1m labels: severity: warning annotations: summary: Deadlock detected in database description: {{ $value }} deadlocks detected in the last 5 minutes六、总结死锁预防和解决策略统一访问顺序按固定顺序访问资源缩短事务减少事务持有锁的时间使用乐观锁避免悲观锁竞争超时重试自动处理死锁场景监控告警及时发现死锁问题通过合理的设计和优化可以有效减少数据库死锁的发生。