Envoy KillRequest Filter: Functionality and Implementation Analysis
1. Overview

Envoy's KillRequest filter is a debugging and testing tool that crashes the Envoy process when it receives a specially marked HTTP request. Development and test teams can use it to verify fault tolerance, crash-recovery mechanisms, and the related monitoring and alerting.

The filter is not built into the Envoy binary by default. To use it, enable it explicitly when building Envoy with --//source/extensions/filters/http/kill_request:enabled.

Main characteristics:

- Controlled crash scenarios: a crash is triggered only by a specific HTTP header.
- Flexible configuration: probability control, direction control, and a custom header name.
- Safety: the filter must be enabled at build time and can be tightly restricted through configuration.
- Low overhead: it runs as an ordinary HTTP filter, and per the code in section 8 the per-request work is a single header lookup.

With a suitable configuration, developers can run crash tests without modifying Envoy core code. The filter should be used with caution. In production in particular, restrict where it is enabled and who can reach it, to avoid unnecessary outages.

2. Problems Addressed

2.1 System Fault-Tolerance Testing

- Provides a controlled way to test how the system behaves when an Envoy process crashes.
- Verifies the correctness and efficiency of crash-recovery mechanisms.
- Confirms that the system keeps working when some components fail.

2.2 Debugging Tool

- Simulates crashes under specific conditions during development.
- Helps locate crash problems that are hard to reproduce.
- Provides a consistent crash-testing method.

2.3 Monitoring and Alerting Verification

- Verifies that monitoring correctly detects an Envoy process crash.
- Verifies that alerting sends notifications in time.
- Tests the automatic recovery flow after a crash.

2.4 Production Readiness

- Verifies crash handling before deploying to production.
- Confirms stability under high load or abnormal conditions.

3. Architecture Design

3.1 Core Component Architecture

3.2 Class Diagram

4. Configuration Examples

The listings below are fragments. http_filters and route_config belong inside the HTTP connection manager configuration, and static_resources is a top-level bootstrap section. Every http_filters list must end with the router filter; the shorter fragments omit it.

4.1 Basic Configuration

    http_filters:
    - name: envoy.filters.http.kill_request
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
        probability:
          numerator: 100          # denominator defaults to HUNDRED, so 100%
        kill_request_header: x-envoy-kill-request
        direction: REQUEST
    - name: envoy.filters.http.router
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
    route_config:
      name: local_route
      virtual_hosts:
      - name: backend
        domains: ["*"]
        routes:
        - match: { prefix: "/" }
          route: { cluster: backend_cluster }

    static_resources:
      clusters:
      - name: backend_cluster
        connect_timeout: 0.25s
        type: STRICT_DNS
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: backend_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: backend.example.com
                    port_value: 80

4.2 Route-Level Configuration

    route_config:
      name: local_route
      virtual_hosts:
      - name: backend
        domains: ["*"]
        routes:
        - match: { prefix: "/test/crash" }
          route: { cluster: test_cluster }
          typed_per_filter_config:
            envoy.filters.http.kill_request:
              "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
              probability:
                numerator: 50
                denominator: HUNDRED
              direction: REQUEST
        - match: { prefix: "/" }
          route: { cluster: backend_cluster }

    static_resources:
      clusters:
      - name: test_cluster
        connect_timeout: 0.25s
        type: STRICT_DNS
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: test_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: test.example.com
                    port_value: 80
      - name: backend_cluster
        connect_timeout: 0.25s
        type: STRICT_DNS
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: backend_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: backend.example.com
                    port_value: 80

4.3 Response-Direction Configuration

With direction: RESPONSE the header is looked up in the response headers (see section 8.5), so the upstream response must carry it.

    http_filters:
    - name: envoy.filters.http.kill_request
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
        probability:
          numerator: 20
        direction: RESPONSE
        kill_request_header: x-response-kill
    route_config:
      name: local_route
      virtual_hosts:
      - name: backend
        domains: ["*"]
        routes:
        - match: { prefix: "/" }
          route: { cluster: backend_cluster }

4.4 Combining with Other Filters

Filters run in list order on the request path, so placing JWT authentication first means only authenticated requests reach the KillRequest filter.

    http_filters:
    - name: envoy.filters.http.jwt_authn
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
        # JWT authentication settings go here
    - name: envoy.filters.http.set_metadata
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.set_metadata.v3.Config
        metadata_namespace: envoy.auth
        value:
          service: api_gateway
          version: v1
    - name: envoy.filters.http.kill_request
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
        probability:
          numerator: 10
        kill_request_header: x-admin-kill
    - name: envoy.filters.http.router
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

5. Workflow Analysis

5.1 Filter Execution Flow

5.2 Kill Request Identification Flow

6. Code Implementation ER Diagram

7. Best Practices

7.1 Test Environment Configuration

    http_filters:
    - name: envoy.filters.http.kill_request
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
        probability:
          numerator: 50
          denominator: HUNDRED
        kill_request_header: x-test-kill
        direction: REQUEST
    route_config:
      name: local_route
      virtual_hosts:
      - name: test_backend
        domains: ["*"]
        routes:
        - match: { prefix: "/test/" }
          route: { cluster: test_cluster }
          typed_per_filter_config:
            envoy.filters.http.kill_request:
              "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
              probability:
                numerator: 100
              # Per the code in section 8.4, only probability and direction are
              # taken from the route-level config; the header name comes from
              # the filter-level config.
              kill_request_header: x-force-kill

7.2 Production Readiness

Routes are matched in order, so the more specific /admin/ route must come before the catch-all route.

    http_filters:
    - name: envoy.filters.http.kill_request
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
        probability:
          numerator: 1
        kill_request_header: x-admin-kill
        direction: REQUEST
    route_config:
      name: local_route
      virtual_hosts:
      - name: backend
        domains: ["*"]
        routes:
        - match: { prefix: "/admin/" }
          route: { cluster: admin_cluster }
          typed_per_filter_config:
            envoy.filters.http.kill_request:
              "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
              probability:
                numerator: 0      # never crash on admin routes
        - match: { prefix: "/" }
          route: { cluster: backend_cluster }

7.3 Response-Direction Configuration

    http_filters:
    - name: envoy.filters.http.kill_request
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
        probability:
          numerator: 5
        direction: RESPONSE
        kill_request_header: x-response-kill

7.4 Combining with Request Validation

The Lua check must come before the KillRequest filter in the chain. If it came after, the KillRequest filter would see the header first and could crash Envoy before the check runs.

    http_filters:
    - name: envoy.filters.http.lua
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
        inline_code: |
          function envoy_on_request(request_handle)
            local header = request_handle:headers():get("x-admin-kill")
            if header and header:lower() == "true" then
              local user = request_handle:headers():get("x-user")
              if user ~= "admin" then
                -- Reject kill requests that do not come from the admin user
                request_handle:respond({[":status"] = "403"}, "Forbidden")
              end
            end
          end
    - name: envoy.filters.http.kill_request
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.http.kill_request.v3.KillRequest
        probability:
          numerator: 10
        kill_request_header: x-admin-kill
        direction: REQUEST

8. Code Implementation Details

8.1 Filter Initialization

    KillRequestFilter::KillRequestFilter(
        const envoy::extensions::filters::http::kill_request::v3::KillRequest& kill_request,
        Random::RandomGenerator& random_generator)
        : kill_request_(kill_request), random_generator_(random_generator) {}

8.2 Direction and Probability Check

This helper evaluates only the probability; the direction is checked by the callers in sections 8.4 and 8.5.

    bool KillRequestFilter::isKillRequestEnabled() {
      return ProtobufPercentHelper::evaluateFractionalPercent(kill_request_.probability(),
                                                              random_generator_.random());
    }

8.3 Kill Request Identification

    bool KillRequestFilter::isKillRequest(Http::HeaderMap& headers) {
      // Use the configured header name, or the default when none is configured.
      const Http::LowerCaseString kill_request_header_name =
          kill_request_.kill_request_header().empty()
              ? KillRequestHeaders::get().KillRequest
              : Http::LowerCaseString(kill_request_.kill_request_header());
      const auto kill_request_header = headers.get(kill_request_header_name);
      bool is_kill_request = false;
      // Only the first header value is used; it must parse as a boolean.
      if (kill_request_header.empty() ||
          !absl::SimpleAtob(kill_request_header[0]->value().getStringView(), &is_kill_request)) {
        return false;
      }
      return is_kill_request;
    }

8.4 Request Handling

    Http::FilterHeadersStatus KillRequestFilter::decodeHeaders(Http::RequestHeaderMap& headers,
                                                               bool) {
      bool is_correct_direction = kill_request_.direction() == KillRequest::REQUEST;
      const bool is_kill_request = isKillRequest(headers);
      if (!is_kill_request) {
        return Http::FilterHeadersStatus::Continue;
      }

      // Route-level configuration overrides filter-level configuration.
      const auto* per_route_kill_settings =
          Http::Utility::resolveMostSpecificPerFilterConfig<KillSettings>(decoder_callbacks_);
      if (per_route_kill_settings) {
        is_correct_direction = per_route_kill_settings->getDirection() == KillRequest::REQUEST;
        kill_request_.mutable_probability()->CopyFrom(per_route_kill_settings->getProbability());
      }

      if (is_kill_request && is_correct_direction && isKillRequestEnabled()) {
        // Crash Envoy.
        RELEASE_ASSERT(false, "KillRequestFilter is crashing Envoy!!!");
      }
      return Http::FilterHeadersStatus::Continue;
    }

8.5 Response Handling

    Http::FilterHeadersStatus KillRequestFilter::encodeHeaders(Http::ResponseHeaderMap& headers,
                                                               bool) {
      // Request-direction kills are handled in decodeHeaders.
      if (kill_request_.direction() == KillRequest::REQUEST) {
        return Http::FilterHeadersStatus::Continue;
      }
      if (isKillRequest(headers) && isKillRequestEnabled()) {
        // Crash Envoy.
        RELEASE_ASSERT(false, "KillRequestFilter is crashing Envoy!!!");
      }
      return Http::FilterHeadersStatus::Continue;
    }

8.6 Configuration Factory

    Http::FilterFactoryCb KillRequestFilterFactory::createFilterFactoryFromProtoTyped(
        const envoy::extensions::filters::http::kill_request::v3::KillRequest& proto_config,
        const std::string&, Server::Configuration::FactoryContext& context) {
      // Each new stream gets its own filter instance with a copy of the config.
      return [proto_config, &context](Http::FilterChainFactoryCallbacks& callbacks) -> void {
        callbacks.addStreamFilter(
            std::make_shared<KillRequestFilter>(proto_config, context.api().randomGenerator()));
      };
    }
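The decision made in sections 8.2 to 8.5 (header value, then direction, then probability) can be reproduced outside Envoy as a small standalone sketch. Everything in it is a hypothetical stand-in rather than Envoy's API: `KillSettings`, `parseBool`, `fractionHit`, and `shouldKill` are names invented for illustration. `parseBool` assumes the same accepted spellings as `absl::SimpleAtob`, and `fractionHit` assumes the probability check reduces the random value modulo the denominator and compares it with the numerator.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-ins for the filter's configuration (not Envoy types).
enum class Direction { Request, Response };

struct KillSettings {
  uint32_t numerator = 0;      // probability numerator
  uint32_t denominator = 100;  // corresponds to HUNDRED
  Direction direction = Direction::Request;
};

// Case-insensitive boolean parse; assumed to mirror absl::SimpleAtob.
std::optional<bool> parseBool(std::string value) {
  std::transform(value.begin(), value.end(), value.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  if (value == "true" || value == "t" || value == "yes" || value == "y" || value == "1") {
    return true;
  }
  if (value == "false" || value == "f" || value == "no" || value == "n" || value == "0") {
    return false;
  }
  return std::nullopt;  // not a recognizable boolean
}

// Assumed probability rule: hit when (random % denominator) < numerator.
bool fractionHit(const KillSettings& settings, uint64_t random_value) {
  assert(settings.denominator > 0);
  return (random_value % settings.denominator) < settings.numerator;
}

// Returns true when a request or response in the given phase would trigger a crash.
bool shouldKill(const std::optional<std::string>& header_value, const KillSettings& settings,
                Direction phase, uint64_t random_value) {
  if (!header_value) {
    return false;  // header absent: not a kill request
  }
  if (!parseBool(*header_value).value_or(false)) {
    return false;  // header present but not a true value
  }
  if (settings.direction != phase) {
    return false;  // configured for the other direction
  }
  return fractionHit(settings, random_value);
}
```

Passing the random value in as a parameter keeps the sketch deterministic, which is also why the real filter takes its random generator as a constructor argument: tests can substitute a fixed value and exercise both outcomes of the probability check.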