A TCP endpoint normally closes its side of the connection only when it receives a FIN segment. If the peer never sends a FIN, the connection stays open and keeps consuming resources. To guard against this, the receiving side usually terminates TCP connections that have gone quiet for a while: LVS, for example, by default drops a TCP connection after 90 seconds without data, and F5 after 5 minutes.

On top of that, because of TCP's slow start mechanism, a freshly opened TCP connection transfers data slowly at first. This is why connection pools are attractive: by keeping a set of idle connections around, we can take an already-warm TCP connection from the pool whenever data needs to be sent and use it at full speed immediately, which is a big win for performance.

RFC 1122 describes TCP keep-alives this way:

Implementors MAY include "keep-alives" in their TCP implementations, although this practice is not universally accepted. If keep-alives are included, the application MUST be able to turn them on or off for each TCP connection, and they MUST default to off.
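Returning to the pooling idea above, here is a minimal hypothetical sketch of it in Java, assuming nothing more than a bounded queue of open sockets (SocketPool, its capacity, and the host/port are all made up for illustration):

import java.net.Socket;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal pool sketch: hand out an already-warm TCP connection instead of
// paying the handshake and slow-start cost of a brand-new one every time.
public class SocketPool {
    private final BlockingQueue<Socket> idle = new ArrayBlockingQueue<>(10);
    private final String host;
    private final int port;

    public SocketPool(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Take an idle connection if one is available, otherwise open a new one.
    public Socket borrow() throws Exception {
        Socket s = idle.poll();
        return (s != null && !s.isClosed()) ? s : new Socket(host, port);
    }

    // Return a connection to the pool; close it if the pool is already full.
    public void giveBack(Socket s) {
        if (!idle.offer(s)) {
            try { s.close(); } catch (Exception ignored) { }
        }
    }
}

Real pools (database drivers, HTTP clients) add validation, timeouts, and eviction on top of this, but the performance argument is the same: a borrowed socket has already finished its handshake and grown its congestion window.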
TCP KeepAlive therefore defaults to off and takes effect only when the application explicitly enables SO_KEEPALIVE. In Java, java.net.SocketOptions defines the flag used to turn KeepAlive on for a TCP connection:
/**
* When the keepalive option is set for a TCP socket and no data
* has been exchanged across the socket in either direction for
* 2 hours (NOTE: the actual value is implementation dependent),
* TCP automatically sends a keepalive probe to the peer. This probe is a
* TCP segment to which the peer must respond.
* One of three responses is expected:
* 1. The peer responds with the expected ACK. The application is not
* notified (since everything is OK). TCP will send another probe
* following another 2 hours of inactivity.
* 2. The peer responds with an RST, which tells the local TCP that
* the peer host has crashed and rebooted. The socket is closed.
* 3. There is no response from the peer. The socket is closed.
*
* The purpose of this option is to detect if the peer host crashes.
*
* Valid only for TCP socket: SocketImpl
*
* @see Socket#setKeepAlive
* @see Socket#getKeepAlive
*/
@Native public final static int SO_KEEPALIVE = 0x0008;
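Enabling it on a concrete socket is a one-liner via Socket#setKeepAlive; a minimal sketch (the address is a placeholder):

import java.net.Socket;

public class TcpKeepAliveDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder address; replace with a real server.
        try (Socket socket = new Socket("192.168.1.10", 8080)) {
            socket.setKeepAlive(true);  // sets SO_KEEPALIVE on the underlying socket
            System.out.println("SO_KEEPALIVE enabled: " + socket.getKeepAlive());
            // ... exchange data; once idle long enough (about 2 hours by default,
            // OS dependent), the kernel starts probing the peer.
        }
    }
}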
The comment above explains how SO_KEEPALIVE works in some detail. If KeepAlive is enabled on a connection that has been idle for two hours (governed by the OS setting net.ipv4.tcp_keepalive_time), TCP automatically sends a KeepAlive probe to the peer, which must respond to it. If the peer is healthy, it replies with an ACK and the connection is kept, until another two idle hours trigger the next probe. If the peer is in an abnormal state, say it has rebooted, it should reply with an RST, which closes the connection. If the peer does not respond at all, TCP retries every 75 seconds (net.ipv4.tcp_keepalive_intvl) and, after 9 unanswered probes (net.ipv4.tcp_keepalive_probes), closes the connection. The Linux defaults:
chendw@chendw-PC:~$ sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
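These sysctls apply machine-wide. On JDK 11 or newer on Linux, the jdk.net.ExtendedSocketOptions class lets a single socket override all three; a sketch (the address is again a placeholder, and the 360-second idle time mirrors the 6-minute recommendation discussed below):

import java.net.Socket;
import jdk.net.ExtendedSocketOptions;

public class KeepAliveTuning {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("192.168.1.10", 8080)) {
            socket.setKeepAlive(true);
            // Per-socket overrides of the three kernel defaults:
            socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, 360);    // idle seconds before the first probe
            socket.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, 75); // seconds between probes
            socket.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, 9);     // unanswered probes before closing
        }
    }
}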
Note that when an application closes a connection it sends a FIN segment, so even with TCP KeepAlive enabled, the TCP connection is still closed in the normal way.

HTTP Keep-Alive is a different thing. Under HTTP/1.0, a browser can carry a Connection: Keep-Alive header to tell the server not to close the TCP connection; when the server has received the request and finished responding, it leaves the connection open, and the browser likewise keeps it open and reuses it to send the next HTTP request. For this to work the application server has to support the Connection: Keep-Alive header as well. HTTP/1.1 explicitly makes Keep-Alive the default, so a browser speaking HTTP/1.1 does not need to ask for it; the behavior is implied on every request, and a side that wants to close the connection signals this with a Connection: Close header.

The server can also return a header such as Keep-Alive: max=5, timeout=120 to control when the connection is closed. The header above says the TCP connection will be kept for another 120 seconds; max is the number of further requests allowed on the connection, but it is ignored on non-pipelined connections, and since virtually all connections are non-pipelined, max can be disregarded.
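To watch this on the wire, here is a rough sketch that sends two GET requests over a single TCP connection; example.com is only a stand-in host, and the reader below assumes the responses carry Content-Length rather than chunked encoding:

import java.io.DataInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class HttpKeepAliveDemo {

    // Read bytes until the blank line that ends the HTTP response headers.
    static String readHeaders(InputStream in) throws Exception {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
            if (sb.length() >= 4 && sb.substring(sb.length() - 4).equals("\r\n\r\n")) break;
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("example.com", 80)) {
            OutputStream out = socket.getOutputStream();
            DataInputStream in = new DataInputStream(socket.getInputStream());
            // Two requests over one TCP connection: HTTP/1.1 keep-alive is the default.
            for (int i = 0; i < 2; i++) {
                out.write("GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
                        .getBytes(StandardCharsets.US_ASCII));
                out.flush();
                String headers = readHeaders(in);
                System.out.println("--- response " + (i + 1) + " headers ---");
                System.out.println(headers.trim());
                // Consume the body so the connection can be reused for the next request.
                int len = 0;
                for (String line : headers.split("\r\n")) {
                    if (line.toLowerCase().startsWith("content-length:")) {
                        len = Integer.parseInt(line.split(":")[1].trim());
                    }
                }
                in.readFully(new byte[len]);
            }
        }
    }
}

Both responses come back on the same socket; with HTTP/1.0 and no Connection: Keep-Alive header, the server would instead close the connection after the first response.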
To summarize the difference: TCP KeepAlive sends keep-alive probes to prevent a TCP connection from being silently dropped by the peer, a firewall, or some other middlebox. It has nothing to do with the application layer; it only maintains the state of a single TCP connection, and the application on top may reuse that long-lived connection or close it. HTTP Keep-Alive works at the application layer: the server closes the TCP connection only after the timeout given in the Keep-Alive header expires, although this really depends on the application server, which may also close the connection right after a response according to its own settings, as long as it tells the other side by carrying Connection: Close in that response.

A side question: is the default net.ipv4.tcp_keepalive_time of 2 hours too long? It certainly feels that way; probing once every 2 hours means the damage is long done before anything is detected. The Nginx servers behind our company's F5 are configured with 30 minutes, which still seems too long: the F5 keeps idle connections for only 5 minutes, so shouldn't the probe interval be below that value? Google Cloud, for instance, says its firewall allows connections to stay idle for 10 minutes and therefore recommends setting net.ipv4.tcp_keepalive_time to 6 minutes.

Finally, note that the KeepAlive-related HTTP headers are hop-by-hop. HTTP headers fall into two groups, End-to-end headers and Hop-by-hop headers. End-to-end headers are forwarded unchanged by intermediaries: the Host header in a browser request, for example, is passed through the load balancer and reverse proxy all the way to Tomcat unless deliberately rewritten. Hop-by-hop headers are meaningful only on the current TCP connection. Most headers are End-to-end, but the KeepAlive-related ones are so obviously tied to the TCP connection that they are Hop-by-hop. As RFC 2616 puts it:
的* End-to-end headers which are transmitted to the ultimate recipient of a request or response. End-to-end headers in responses MUST be stored as part of a cache entry and MUST be transmitted in any response formed from a cache entry. * Hop-by-hop headers which are meaningful only for a single transport-level connection and are not stored by caches or forwarded by proxies.
So even when the browser sends Connection: Keep-Alive, that only makes the hop between the browser and the load balancer persistent; whether the load balancer to Nginx hop, and the Nginx to Tomcat hop, are persistent has to be analyzed separately. Nginx, for example, supports HTTP Keep-Alive on the connections it accepts, yet the HTTP requests Nginx itself initiates are not persistent by default.

Given this hop-by-hop nature, the timeout carried in an HTTP persistent connection is of dubious value. In practice application servers manage their TCP connections according to their own settings, which is why the Connection header travels with every request while the Keep-Alive header is rarely used.

A server may honor the Keep-Alive header when managing its TCP connections, or follow its own configuration; Nginx generally follows its own configuration. From the Nginx documentation:

Syntax: keepalive_requests number;
Default: keepalive_requests 100;
Context: http, server, location
This directive appeared in version 0.8.0. Sets the maximum number of requests that can be served through one keep-alive connection. After the maximum number of requests are made, the connection is closed. Closing connections periodically is necessary to free per-connection memory allocations. Therefore, using too high maximum number of requests could result in excessive memory usage and not recommended.

Syntax: keepalive_timeout timeout [header_timeout];
Default: keepalive_timeout 75s;
Context: http, server, location
The first parameter sets a timeout during which a keep-alive client connection will stay open on the server side. The zero value disables keep-alive client connections. The optional second parameter sets a value in the “Keep-Alive: timeout=time” response header field. Two parameters may differ.
http {
    keepalive_requests 100;
    keepalive_timeout 75s;

    upstream backend {
        server 192.167.61.1:8080;
    }
}
Persistent connections from Nginx to the upstream are turned on with the keepalive directive inside the upstream block; its number is the maximum number of idle keepalive connections each worker process keeps open. Nginx talks to upstreams with proxy_http_version 1.0 by default, so you also need to set proxy_http_version 1.1 and clear the Connection header; that way, even if the previous hop used a short-lived connection, the hop between Nginx and the upstream can still be persistent.

Inside upstream, keepalive_requests defaults to 100, just as in the http block, while keepalive_timeout defaults to 60 seconds, 15 seconds less than in the http block. That is reasonable for the inner hop, so the defaults can simply be kept.
upstream backend {
    server 192.167.61.1:8080;
    server 192.167.61.1:8082 backup;
    keepalive 100;
    # keepalive_requests 100;
    # keepalive_timeout 60s;
}

location /test {
    proxy_http_version 1.1;
    proxy_set_header Connection "";  # clear the hop-by-hop Connection header sent upstream
    proxy_pass http://backend;
}
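From the client side of such a chain, JDK 11's java.net.http.HttpClient pools and reuses persistent HTTP/1.1 connections automatically; a small sketch against the /test location above (host and path are placeholders):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KeepAliveClient {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://127.0.0.1/test"))
                .GET()
                .build();
        // Two sequential requests; the second one rides the pooled keep-alive connection.
        for (int i = 0; i < 2; i++) {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
        }
    }
}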
Nginx's listen directive also offers a so_keepalive option (for example, so_keepalive=on) that enables TCP keepalive on the sockets Nginx accepts, i.e. between the client and Nginx, but hardly anyone seems to use it. Does that mean no TCP keepalive is needed between the load balancer and Nginx, or between Nginx and Tomcat, because there is no network device in between? Otherwise, who is doing the liveness probing on those connections?
One last clarification: keepalive_requests, with its default of 100, is the maximum number of requests served over a single persistent connection, not the number of persistent connections Nginx can maintain. The number of TCP connections Nginx can maintain is the worker process count (worker_processes) times the maximum connections each worker may hold (worker_connections, default 512). To estimate the maximum number of requests Nginx can accept, add the OS listen-queue backlog (net.core.somaxconn, default 128) on top of that maximum TCP connection count. Bear in mind as well that with roughly 65000 usable ports and a TIME_WAIT of about two minutes, a single client address can open at most about 65000 / (2 * 60) ≈ 540 new connections per second, i.e. around 500.