Recently, I have been learning some socket programming. Thanks to @_qiujuer_, I finally got the hang of the mysterious sticky and lost package problem and found a way to address it. Here is a short summary of what I have discovered.
What Are Socket Sticky Packages and Lost Packages
You may have wondered: TCP, as a reliable protocol, should never have a sticky package problem, since ordering and integrity are guaranteed by the protocol itself.
And that is true. Sticky packages and lost packages are not transport-level problems but application-level ones. TCP delivers an ordered stream of bytes with no message boundaries, so it is up to our own code to cut the stream back into messages, and things go wrong if we don't handle that correctly.
I will illustrate this with some images.
Sticky Package
Ideally, data transfer would look like this, so that the receiver can always distinguish each independent package.
However, in real life, the client very often sends two or more packages back to back, which makes it possible for the server to treat two packages as a single one. This leads to the famous sticky package problem, as in the following image.
Lost Package
Basically, as the server receives data, it reads from the channel and copies the data into a user-space buffer. If the server receives a single biiiiig package that exceeds the buffer's capacity, only part of the data is extracted in one read, which results in problems like the following cases.
You can see that, together with sticky packages, your server could end up with a mess in which all the packages get mixed together.
General Solution
The simplest solution is to add a length header at the beginning of each message, so that the server knows exactly how many bytes to read to get the whole message and nothing but that message.
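To make the length-header idea concrete, here is a minimal sketch. It assumes a 4-byte big-endian header and UTF-8 bodies; the class name `LengthPrefixFraming` and its `encode`/`decode` methods are hypothetical and not part of the original project.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class LengthPrefixFraming {
    /** Prepends a 4-byte big-endian length header to the UTF-8 bytes of the message. */
    static byte[] encode(String message) {
        byte[] body = message.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array();
    }

    /** Splits a byte stream into messages; stops at the first incomplete frame. */
    static List<String> decode(byte[] stream) {
        List<String> messages = new ArrayList<>();
        ByteBuffer in = ByteBuffer.wrap(stream);
        while (in.remaining() >= 4) {
            in.mark();
            int length = in.getInt();
            if (length < 0 || in.remaining() < length) {
                in.reset(); // body not fully received yet, wait for more bytes
                break;
            }
            byte[] body = new byte[length];
            in.get(body);
            messages.add(new String(body, StandardCharsets.UTF_8));
        }
        return messages;
    }

    public static void main(String[] args) {
        byte[] a = encode("hello");
        byte[] b = encode("world!");
        // Simulate two packages arriving stuck together in one read.
        byte[] stuck = new byte[a.length + b.length];
        System.arraycopy(a, 0, stuck, 0, a.length);
        System.arraycopy(b, 0, stuck, a.length, b.length);
        System.out.println(decode(stuck)); // prints [hello, world!]
    }
}
```

Even though the two packages arrive as one blob, the header tells the receiver where each one ends. A frame whose body has not fully arrived is simply left for the next read.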
Design a box to wrap the complete message
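Here we have designed a packet box with an additional length property and a position property.

    import java.io.IOException;

    public class StringReceivePacket extends ReceivePacket {
        // Holds the complete message body; sized from the length header.
        private byte[] buffer;
        // Index in buffer where the next chunk will be written.
        private int position;

        public StringReceivePacket(int len) {
            this.buffer = new byte[len];
            // length is not declared here, so it comes from ReceivePacket.
            length = len;
        }

        // Appends the first count bytes of the given array to the buffer.
        @Override
        public void save(byte[] bytes, int count) {
            System.arraycopy(bytes, 0, buffer, position, count);
            position += count;
        }

        // Decodes the collected bytes using the platform default charset.
        public String string() {
            return new String(buffer);
        }

        @Override
        public void close() throws IOException {
        }
    }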
Read data in an asynchronous manner
Every time the socket channel becomes readable, the selector that the channel is registered with is notified, and the SelectionKey that corresponds to the socket channel is selected. Hence, we can define a callback Runnable and store it together with the SelectionKey in a map-like structure.
    abstract class HandleInputCallback implements Runnable {
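As a rough sketch of the "SelectionKey plus callback in a map" idea, the example below registers a readable channel with a selector and runs the stored callback when the key is selected. To stay self-contained it uses a `java.nio.channels.Pipe` instead of a real socket and runs the callback on the calling thread; the class name `ReadCallbackDemo` and the method `runOnce` are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class, for illustration only.
final class ReadCallbackDemo {
    /** Registers a callback for a readable channel and returns how many times it ran. */
    static int runOnce() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.source().configureBlocking(false);

                // Map each SelectionKey to the callback that handles its input.
                Map<SelectionKey, Runnable> callbacks = new HashMap<>();
                int[] fired = {0};
                SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
                callbacks.put(key, () -> fired[0]++);

                // Writing to the sink makes the source end readable.
                pipe.sink().write(ByteBuffer.wrap(new byte[] {1, 2, 3}));

                if (selector.select(1000) > 0) {
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey selected = it.next();
                        it.remove();
                        if (selected.isReadable()) {
                            callbacks.get(selected).run();
                        }
                    }
                }
                return fired[0];
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("callback runs: " + runOnce());
    }
}
```

In a real server the callback would typically be handed to a thread pool rather than run inside the select loop, so that slow handlers do not block other channels.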
In this callback handler, we need to define the strategy for receiving bytes from the socket channel. Obviously, we need an intermediate object that works as a transit medium and continuously carries data from the socket channel to StringReceivePacket.
Here is our transit medium, called IoArgs. It is just like a truck with a fixed-size container (the buffer) that can be used as an intermediate relay.
    public class IoArgs {
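Since the body of IoArgs is not shown here, the following is only a guess at what such a transit medium could look like: a thin wrapper around a fixed-size `ByteBuffer` that can cap the next read, read once from a channel, and hand the bytes over to a destination array. The class name `TransitBuffer` and all of its methods are hypothetical and are not the author's IoArgs API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical stand-in for IoArgs, for illustration only.
final class TransitBuffer {
    private final ByteBuffer buffer;

    TransitBuffer(int capacity) {
        buffer = ByteBuffer.allocate(capacity);
    }

    int capacity() {
        return buffer.capacity();
    }

    /** Empties the buffer and caps the next read at min(max, capacity) bytes. */
    void limit(int max) {
        buffer.clear();
        buffer.limit(Math.min(max, buffer.capacity()));
    }

    /** Performs one read from the channel; returns bytes read, or -1 at end of stream. */
    int readFrom(ReadableByteChannel channel) {
        try {
            return channel.read(buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Copies what was read into dest starting at offset; returns the number of bytes copied. */
    int writeTo(byte[] dest, int offset) {
        buffer.flip();
        int count = buffer.remaining();
        buffer.get(dest, offset, count);
        return count;
    }
}
```

The truck never carries more than its container allows: calling `limit(100)` on a 4-byte buffer still reads at most 4 bytes, and the rest stays in the channel for the next trip.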
First, we need to know the size of the arriving message, which we get by reading the first 4 bytes (an int takes 4 bytes). Once we know the size, we can create the StringReceivePacket, the final customizable container.
Then we read the remaining data until it fills the fixed buffer of IoArgs (the transit medium).
It is entirely possible that the remaining data is far larger than the fixed buffer of the transit medium, which makes it impossible to carry all the data from the socket channel in one go. But never fear: the rest of the data is still in the socket channel, and the transit medium will carry it the next time the callback Runnable is invoked. What's more important is that the transit medium must read exactly what StringReceivePacket is expecting, not a byte more, not a byte less!
    int receiveSize;
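One detail worth sketching is that even the 4-byte header may arrive in pieces, so it helps to accumulate it across readable events before creating the packet. The class below is a hypothetical helper (the name `LengthHeader` and its methods are assumptions, not the original code) that keeps its own 4-byte buffer and never reads past the header.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper, for illustration only.
final class LengthHeader {
    // Exactly 4 bytes, so a read can never take bytes that belong to the body.
    private final ByteBuffer header = ByteBuffer.allocate(4);

    /** Call on each readable event; returns true once all 4 header bytes have arrived. */
    boolean feed(ReadableByteChannel channel) {
        try {
            if (header.hasRemaining() && channel.read(header) < 0) {
                throw new EOFException("stream ended inside the length header");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return !header.hasRemaining();
    }

    /** The decoded big-endian length; only meaningful after feed() has returned true. */
    int length() {
        return header.getInt(0);
    }
}
```

Once `feed` returns true, `length()` gives the size to allocate for the final container, and the next bytes in the channel are the first bytes of the body.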
Fill the box like the Japanese
The Japanese are famous for their precision, and that is exactly what we need in socket transport: to be accurate down to the byte!
You may have noticed that the code uses some indicators like total, position, count... These are exactly what we depend on to achieve our precision objective.
When we first get the length of the expected message, we store it in a new variable called total. At the same time, we create a new variable called position with an initial value of 0.
Each time the transit medium tries to read from the socket channel, we set a limit of Math.min(total - position, args.capacity()) to make sure that it reads no more and no less than needed.
Afterwards, the transit medium gives us a count variable that tells us how many bytes have actually been read from the socket channel.
Then, when the transit medium transports the data to StringReceivePacket, we advance the position variable by count. Finally, by comparing position with total, we know whether the StringReceivePacket has received its complete expected message.
    // transfer data received from socket channel via transit medium IoArgs to our dispatcher buffer
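The bookkeeping with total, position and count can be sketched as follows. This is a simplified illustration, not the original dispatcher code: it uses a plain `ByteBuffer` as the transit medium and a blocking loop, whereas in the asynchronous design each loop iteration would correspond to one invocation of the read callback. The names `PreciseFill`, `readBody` and `transitCapacity` are hypothetical.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper, for illustration only.
final class PreciseFill {
    /** Reads exactly total bytes from the channel through a fixed-size transit buffer. */
    static byte[] readBody(ReadableByteChannel channel, int total, int transitCapacity) {
        if (transitCapacity <= 0) {
            throw new IllegalArgumentException("transitCapacity must be positive");
        }
        byte[] body = new byte[total];                 // the final container
        ByteBuffer transit = ByteBuffer.allocate(transitCapacity);
        int position = 0;                              // bytes stored so far
        try {
            while (position < total) {
                // Never ask for more than the message still needs or the truck can carry.
                transit.clear();
                transit.limit(Math.min(total - position, transit.capacity()));
                int count = channel.read(transit);     // bytes actually read this time
                if (count < 0) {
                    throw new EOFException("stream ended before the message was complete");
                }
                transit.flip();
                transit.get(body, position, count);
                position += count;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body;
    }
}
```

Because the limit is recomputed before every read, a second message sitting right behind the first one in the channel is left untouched until its own header has been processed.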
Conclusion
To sum up, we have a StringReceivePacket as the final container and an IoArgs as a transit medium that continuously relays data from the socket channel to the final container in an asynchronous manner. With some extra indicators (total, position, count ...), every participant knows the exact answers to the questions below.
- What is the size of the expected message?
- How many bytes have I read?
- How many bytes are left to read?
- Am I finished reading?
In this way, we are able to solve the cumbersome sticky and lost package problem.