
Solving the proxy timeouts problem

When you request a web URL through a proxy, you might hit a timeout and never see the result of your request. This usually happens when the requested resource takes a long time to answer, such as a cgi-bin script or a web service loading a huge set of data from a database. The problem is even more complicated when, for any reason, you must not run the same web service twice (known as the resubmission problem). Throughout this article, we will assume that we are not the admins of the proxy, since admins can configure proxies with longer timeouts than the default values.

Solution for Java sockets
The idea is to use the TCP Keep-Alive functionality as described in the STD 3 Internet Standard. If you want to understand what an Internet Standard is and how standardization works, please have a look at RFC 2026.
I assume that you already know what TCP is, so I will just remind you that once keep-alive is enabled on a socket, a packet is sent on a regular basis while the connection is idle, at the keep-alive interval. The standard (RFC 1122, part of STD 3) says that this interval must be configurable and must default to no less than 2 hours, but we can question this value, since networking and the Internet have changed quite a bit since the RFC was written (1989). Technically, we can set a value of one minute with no problem at all: it will not drive our client, server or firewall out of resources.

On our Linux client, we can change this keep-alive time by setting two system parameters:

# echo 60 > /proc/sys/net/ipv4/tcp_keepalive_time
# echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl

Check that the settings have been updated:

# cat /proc/sys/net/ipv4/tcp_keepalive_time
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl

From now on, every socket that enables keep-alive will send a keep-alive packet after each 60 seconds of idle time. With Java sockets, we enable it by calling the setKeepAlive(true) method. Of course, we must also make sure that the read timeout is long enough, for example 4 hours, by calling setSoTimeout(4 * 60 * 60 * 1000).
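The two calls just mentioned can be sketched as follows. This is a minimal illustration rather than the article's client: the class name KeepAliveClient and the throwaway local listener in main are assumptions made so that the example runs without any remote host.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class KeepAliveClient {

    // Four hours in milliseconds, the read timeout used in this article.
    static final int READ_TIMEOUT_MS = 4 * 60 * 60 * 1000;

    /** Opens a socket with TCP keep-alive enabled and a long read timeout. */
    static Socket open(String host, int port) throws IOException {
        Socket s = new Socket(host, port);
        try {
            s.setKeepAlive(true);            // SO_KEEPALIVE: probe timing comes from the kernel settings
            s.setSoTimeout(READ_TIMEOUT_MS); // a blocking read() waits at most this long
            return s;
        } catch (IOException e) {
            s.close();                       // do not leak the socket if an option fails
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical local listener, only there to have something to connect to.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket s = open(server.getInetAddress().getHostAddress(), server.getLocalPort())) {
            System.out.println("keep-alive=" + s.getKeepAlive() + ", soTimeout=" + s.getSoTimeout());
        }
    }
}
```

Note that setKeepAlive(true) only switches the feature on; how often probes are sent is still decided by the system parameters set earlier.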
Here is a complete network capture of an HTTP request going through an F5 proxy, showing these keep-alive packets (the zero-length ACKs from 12:00:32 to 12:04:32) sent every minute:


# tcpdump -n 'tcp dst port 9876 && (host 10.152.1.185)'
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
11:59:29.871573 IP 10.209.72.22.45265 > 10.152.1.185.sd: Flags [F.], seq 815506191, ack 1183280570, win 115, options [nop,nop,TS val 2018959767 ecr 420618075], length 0
11:59:32.369489 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [S], seq 220546869, win 14600, options [mss 1460,sackOK,TS val 2018962265 ecr 0,nop,wscale 7], length 0
11:59:32.370700 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [.], ack 1946371361, win 115, options [nop,nop,TS val 2018962267 ecr 420633871], length 0
11:59:32.373485 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [P.], seq 0:1, ack 1, win 115, options [nop,nop,TS val 2018962269 ecr 420633871], length 1
11:59:32.474834 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [P.], seq 1:34, ack 1, win 115, options [nop,nop,TS val 2018962371 ecr 420633975], length 33
12:00:32.575416 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [.], ack 1, win 115, options [nop,nop,TS val 2019022472 ecr 420634076], length 0
12:01:32.577540 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [.], ack 1, win 115, options [nop,nop,TS val 2019082473 ecr 420694077], length 0
12:02:32.578421 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [.], ack 1, win 115, options [nop,nop,TS val 2019142475 ecr 420754079], length 0
12:03:32.579419 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [.], ack 1, win 115, options [nop,nop,TS val 2019202476 ecr 420814080], length 0
12:04:32.580436 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [.], ack 1, win 115, options [nop,nop,TS val 2019262477 ecr 420874081], length 0
12:05:32.484362 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [.], ack 248, win 123, options [nop,nop,TS val 2019322380 ecr 420993984], length 0

12:05:32.489096 IP 10.209.72.22.45279 > 10.152.1.185.sd: Flags [F.], seq 34, ack 249, win 123, options [nop,nop,TS val 2019322385 ecr 420993985], length 0

The same request going through Apache configured as a proxy (either as a plain HTTP proxy or with AJP13), however, shows no keep-alive packets because Apache blocks them, which leads to the timeout problem:

# tcpdump -n 'tcp dst port 9876 && (host 10.209.72.23)'
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
13:44:12.201934 IP 10.209.72.23.41547 > 10.209.72.22.sd: Flags [S], seq 980108978, win 14600, options [mss 1460,sackOK,TS val 2025224246 ecr 0,nop,wscale 7], length 0
13:44:12.202054 IP 10.209.72.23.41547 > 10.209.72.22.sd: Flags [.], ack 2412067563, win 115, options [nop,nop,TS val 2025224246 ecr 2025242098], length 0
13:44:12.202219 IP 10.209.72.23.41547 > 10.209.72.22.sd: Flags [P.], seq 0:141, ack 1, win 115, options [nop,nop,TS val 2025224246 ecr 2025242098], length 141
13:47:47.481192 IP 10.209.72.23.41400 > 10.209.72.22.sd: Flags [F.], seq 1396112308, ack 1488283294, win 115, options [nop,nop,TS val 2025439525 ecr 2025157284], length 0
13:48:47.392360 IP 10.209.72.23.41400 > 10.209.72.22.sd: Flags [R], seq 1396112309, win 0, length 0
13:49:12.203630 IP 10.209.72.23.41547 > 10.209.72.22.sd: Flags [F.], seq 141, ack 1, win 115, options [nop,nop,TS val 2025524247 ecr 2025242098], length 0
13:50:12.206188 IP 10.209.72.23.41547 > 10.209.72.22.sd: Flags [R], seq 980109121, win 0, length 0

From this, we conclude that this solution helps in some cases but not in all of them. It is still good to know.
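One refinement that goes beyond what was tested above: recent JDKs (11 and later, as far as I know) expose the keep-alive timers per socket through jdk.net.ExtendedSocketOptions, which avoids touching the system-wide /proc values. The sketch below assumes such a JDK; the class name PerSocketKeepAlive and its configure helper are hypothetical, and the code falls back to the system settings when the platform does not offer these options.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketOption;
import java.util.Set;
import jdk.net.ExtendedSocketOptions;

class PerSocketKeepAlive {

    /**
     * Enables keep-alive and, where supported, asks for a first probe after
     * idleSeconds of silence and further probes every intervalSeconds.
     * Returns true if the per-socket timers were applied, false if the
     * socket will use the system-wide values instead.
     */
    static boolean configure(Socket s, int idleSeconds, int intervalSeconds) throws IOException {
        s.setKeepAlive(true);
        Set<SocketOption<?>> supported = s.supportedOptions();
        if (supported.contains(ExtendedSocketOptions.TCP_KEEPIDLE)
                && supported.contains(ExtendedSocketOptions.TCP_KEEPINTERVAL)) {
            s.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSeconds);
            s.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, intervalSeconds);
            return true;
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical local listener so that the sketch runs on its own.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket s = new Socket(server.getInetAddress(), server.getLocalPort())) {
            boolean perSocket = configure(s, 60, 60);
            System.out.println("per-socket timers applied: " + perSocket);
        }
    }
}
```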

Solution for Apache Axis1 web services
In the world of web services, Apache Axis1 is widely known and still in use today. Although it runs well on Java platforms, making it use keep-alive packets is a little more complicated because there is no direct access to the sockets.
The solution to this problem is to use the Axis pluggable API to replace the default SocketFactory with our own, as described in the Axis Integration Guide. Once we understand how to implement and register our factory, coding it becomes incredibly easy:

# cat KASocketFactory.java
import java.net.Socket;
import java.util.Hashtable;
import org.apache.axis.components.net.*;

// Axis1 socket factory that enables TCP keep-alive on every socket it creates.
public class KASocketFactory extends DefaultSocketFactory {

    public KASocketFactory(Hashtable h) {
        super(h);
    }

    public Socket create(String host, int port, StringBuffer otherHeaders, BooleanHolder useFullURL) throws Exception {
        Socket s = new Socket(host, port);
        s.setKeepAlive(true);               // send keep-alive packets while the connection is idle
        s.setSoTimeout(4 * 60 * 60 * 1000); // wait up to 4 hours for the response
        return s;
    }

}

Of course, we also need a long-waiting web service and a client to call it. You will find everything in the files attached to this article. Now, just launch our client and see what happens:

# time /usr/java/latest7/bin/java -classpath .:axis.jar:jaxrpc.jar:commons-discovery-0.2.jar:commons-logging-1.0.4.jar:wsdl4j-1.5.1.jar -Dorg.apache.axis.components.net.SocketFactory=KASocketFactory WSClient -lhttp://10.152.1.185:9876/test/waitws.jws
May 21, 2014 10:23:14 PM org.apache.axis.utils.JavaUtils isAttachmentSupported
WARNING: Unable to find required classes (javax.activation.DataHandler and javax.mail.internet.MimeMultipart). Attachment support is disabled.
OK

real 6m1.077s
user 0m1.499s
sys 0m0.103s

We see that our job runs to completion, keeping the connection open for 6 minutes although the proxy has a default timeout of 5 minutes. Again, the same difference between the F5 proxy and Apache applies. I have not included the tcpdump capture, since it would only show the keep-alive packets being sent every minute, as in the capture we saw earlier.
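The long-waiting service used for this test is part of the article's attachments and needs Axis. For quick local experiments without Axis, a stand-in can be sketched with the HTTP server bundled in the JDK. Everything named here is an assumption for illustration: the class SlowHttpServer, the /wait path and the 6 minute delay are not the article's service, only something slow to point a client or a proxy at.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

class SlowHttpServer {

    /**
     * Starts an HTTP server on the given port (0 picks any free port)
     * that answers "OK" on /wait after sleeping for delayMillis.
     */
    static HttpServer start(int port, long delayMillis) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/wait", exchange -> {
            try {
                Thread.sleep(delayMillis); // simulate the long-running job
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            byte[] body = "OK".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body); // closing the stream completes the exchange
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Six minutes: one minute more than the proxy's default timeout.
        HttpServer server = start(9876, 6 * 60 * 1000L);
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

With the default executor, requests are handled one at a time, which is enough for a single test client.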

Solution for Apache Axis2 web services
Apache Axis2 is a very different implementation of web services from its predecessor Axis1, and it doesn't come with a pluggable API. Instead, it suggests using its transport mechanisms (such as HTTP, TCP, JMS...), which already existed in the days of Axis1 but were unfinished and not easy to use.
With Axis2, the HTTP transport relies on Apache Commons HttpClient, which is now deprecated and replaced by Apache HttpComponents. Unfortunately, HttpClient doesn't support keep-alive configuration, as we can see in the HTTP connection parameters documentation.
We then have the choice between implementing a whole HTTP transport and extending HttpClient by adding TCP Keep-Alive support to it. For the educational purposes of this article, we just need to hard-code the Keep-Alive wherever HttpClient opens a socket. Now, let's see how to achieve this goal.

First, download the HttpClient source (commons-httpclient-3.1-src.zip) and edit the HttpConnection.java file to make this change:

$ diff /tmp/HttpConnection.java ./java/org/apache/commons/httpclient/HttpConnection.java
712a713,717
> /**
> * we force TCP Keep-Alive to be used on all sockets
> */
> socket.setKeepAlive(true);
>

After this, recompile the source into a jar file:
$ ant dist

and replace the commons-httpclient-3.1.jar that ships with AXIS2 (in AXIS2_HOME/lib) with your new jar file.
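The patch works because it forces keep-alive at the single place where HttpClient opens its sockets. For code that accepts a standard javax.net.SocketFactory, the same idea can be sketched as a wrapper around an existing factory. This is a plain JDK illustration under that assumption, with a hypothetical class name KeepAliveSocketFactory; HttpClient 3.1 and Axis2 will not pick it up by themselves.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.SocketFactory;

/** Wraps another SocketFactory and enables TCP keep-alive on every socket it returns. */
class KeepAliveSocketFactory extends SocketFactory {

    private final SocketFactory delegate;

    KeepAliveSocketFactory(SocketFactory delegate) {
        this.delegate = delegate;
    }

    private static Socket withKeepAlive(Socket s) throws IOException {
        s.setKeepAlive(true);
        return s;
    }

    @Override
    public Socket createSocket() throws IOException {
        return withKeepAlive(delegate.createSocket());
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        return withKeepAlive(delegate.createSocket(host, port));
    }

    @Override
    public Socket createSocket(String host, int port, InetAddress localHost, int localPort)
            throws IOException {
        return withKeepAlive(delegate.createSocket(host, port, localHost, localPort));
    }

    @Override
    public Socket createSocket(InetAddress host, int port) throws IOException {
        return withKeepAlive(delegate.createSocket(host, port));
    }

    @Override
    public Socket createSocket(InetAddress address, int port, InetAddress localAddress, int localPort)
            throws IOException {
        return withKeepAlive(delegate.createSocket(address, port, localAddress, localPort));
    }
}
```

A caller would build it with new KeepAliveSocketFactory(SocketFactory.getDefault()) and hand it to whatever library accepts a SocketFactory.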

Now, as with AXIS1, we need a web service and a client to test our library. Our web service will be a simple POJO:

$ cat axis2-1.6.2/samples/quickstart/WaitService.java
public class WaitService {

    // Simulates a long-running job: waits 75 seconds, then answers "OK".
    public String runService() {
        try {
            Thread.sleep(75 * 1000);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // keep the interrupt flag and answer anyway
        }
        return "OK";
    }
}

After compiling this with 'ant generate.service', copy the resulting WaitService.aar file into the webapps/axis2/WEB-INF/services/ directory of our Tomcat server.
Our client will be based on AXIOM, and you will find its source code in the file attached to this article. Now that everything is ready, let's run our client with 'ant run.client':

# time ant run.client
Buildfile: build.xml

compile:

run.client:
[java] log4j:WARN No appenders could be found for logger (org.apache.axis2.context.AbstractContext).
[java] log4j:WARN Please initialize the log4j system properly.
[java] Current price of WSO: OK

BUILD SUCCESSFUL
Total time: 1 minute 17 seconds

real 1m18.527s
user 0m4.324s
sys 0m0.225s

The following network capture shows that our AXIS2 client is now sending TCP Keep-Alive packets (the zero-length ACK at 11:34:55), thus avoiding the timeout under the very same circumstances that we saw before:

11:33:55.542108 IP 10.209.72.22.33817 > 10.152.1.185.9876: Flags [S], seq 4263214255, win 14600, options [mss 1460,sackOK,TS val 2967825438 ecr 0,nop,wscale 7], length 0
11:33:55.543345 IP 10.209.72.22.33817 > 10.152.1.185.9876: Flags [.], ack 1998153969, win 115, options [nop,nop,TS val 2967825439 ecr 1369497045], length 0
11:33:55.655406 IP 10.209.72.22.33817 > 10.152.1.185.9876: Flags [P.], seq 0:172, ack 1, win 115, options [nop,nop,TS val 2967825551 ecr 1369497045], length 172
11:33:55.656371 IP 10.209.72.22.33817 > 10.152.1.185.9876: Flags [P.], seq 172:449, ack 1, win 115, options [nop,nop,TS val 2967825552 ecr 1369497045], length 277
11:34:55.657433 IP 10.209.72.22.33817 > 10.152.1.185.9876: Flags [.], ack 1, win 115, options [nop,nop,TS val 2967885554 ecr 1369497159], length 0
11:35:10.666860 IP 10.209.72.22.33817 > 10.152.1.185.9876: Flags [.], ack 432, win 123, options [nop,nop,TS val 2967900563 ecr 1369572169], length 0
11:35:10.667111 IP 10.209.72.22.33817 > 10.152.1.185.9876: Flags [.], ack 437, win 123, options [nop,nop,TS val 2967900563 ecr 1369572169], length 0
11:35:10.724892 IP 10.209.72.22.33817 > 10.152.1.185.9876: Flags [R.], seq 449, ack 437, win 123, options [nop,nop,TS val 2967900621 ecr 1369572169], length 0

Conclusion
TCP Keep-Alive packets are an old but still relevant solution to timeout problems, especially when dealing with HTTP requests through proxies. Although we saw that Apache-based proxies don't support them today, they work with some other proxies, and maybe with Apache in a future release if someone wants to implement and publish this. On a large Linux platform with many web services behind load-balancing proxies such as F5, I bet it's a good thing to know.
We have seen that Java supports TCP Keep-Alive packets and that this functionality is quite easy to use with Java sockets. With AXIS web services, things are more complicated, and it takes some time to work out exactly how to achieve this goal. In the end, though, it still takes very few lines of code. Unfortunately, we have to conclude that the AXIS developers were maybe too lazy to implement this, or maybe they just didn't know how it could be used. As a rule of thumb, when implementing a library that will be widely used, I think it is better to expose all the functionality of the underlying API rather than just a subset. This way, you allow future uses of your product that seem weird or are unknown at the time of writing.

