In my current journey to master Java again, I decided to explore network programming as a way to practice. I'm used to writing network applications in plain old C, so in these posts I will be comparing Java and C so you can have some starting points for your transition in either direction.
Let's start with a basic TCP server. The classical echo server will work fine to introduce the API for both languages.
A basic TCP ECHO Server in C
Let's start with the C version, as this is arguably the simplest code we can write that interacts directly with the underlying operating system. Well, actually, at the OS level things can differ: on some Linux architectures (32-bit x86, for instance) the socket functions are implemented using the socketcall system call, which multiplexes all the functions we are going to use in this example, while other architectures provide one system call per function. Anyway, regardless of the actual kernel interface, this code describes what the OS offers us to build a simple TCP server.
This is the minimal code for a TCP server:
#include <stdio.h>
#include <string.h>     // memset
#include <stdlib.h>     // exit
#include <unistd.h>     // read/write/close
#include <sys/types.h>
#include <sys/socket.h> // socket/bind/listen/accept
#include <arpa/inet.h>  // htons/inet_ntoa

#define FAIL(s) do {perror (s); exit (EXIT_FAILURE);} while (0)

int main () {
  const int port = 1234;
  int s, s1;
  socklen_t sa_len;
  struct sockaddr_in server, client;

  if ((s = socket (AF_INET, SOCK_STREAM, 0)) < 0) FAIL ("socket:");

  memset (&server, 0, sizeof (server));
  server.sin_addr.s_addr = INADDR_ANY;
  server.sin_family = AF_INET;
  server.sin_port = htons (port);
  if ((bind (s, (struct sockaddr *) &server,
             sizeof (struct sockaddr_in))) < 0) FAIL ("bind:");
  if (listen (s, 1) < 0) FAIL ("listen:");
  while (1) {
    printf ("+ Accepting connections on port : %d\n", port);
    sa_len = sizeof (client); // accept overwrites it, so reset it every time
    if ((s1 = accept (s, (struct sockaddr*) &client, &sa_len)) < 0)
      FAIL ("accept:");
    printf ("+ Connection from : %s\n", inet_ntoa (client.sin_addr));
    unsigned char ibuf[1024];
    int n;
    if ((n = read (s1, ibuf, 1024)) > 0) {
      write (s1, "ECHO:\n", 6);
      write (s1, ibuf, n);
    }
    close (s1); // Close the client socket even when nothing was read
  }
  close (s);
  return 0;
}
The program shows the main steps required to set up an echo server:
- Create the socket
- Bind the socket to a port
- Set it to accept connections (listen) and configure the backlog (how many connections can be pending at a given time).
- Enter an infinite loop where connections are accepted and whatever is received (read) is written back (write).
NOTE: This is not a basic network programming tutorial, so I'm not going to explain every single line in the programs. I'm assuming you already know the basics.
In the code above I have used read and write instead of send and recv to make the code closer to the Java version that we are going to see in a sec. But we could have used those system calls instead.
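For comparison, here is a minimal sketch of the same echo step written with recv and send. The helper name echo_once is hypothetical (it is not part of the program above), and the sketch assumes a connected stream socket. With a flags argument of 0 these calls behave like read and write.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: echo one chunk of data on a connected socket using
 * recv/send instead of read/write. Returns the number of payload bytes
 * echoed, 0 if the peer closed the connection, or -1 on error. */
ssize_t echo_once (int fd) {
  unsigned char ibuf[1024];
  ssize_t n;

  assert (fd >= 0);
  n = recv (fd, ibuf, sizeof (ibuf), 0);  /* flags = 0: behaves like read */
  if (n <= 0) return n;
  if (send (fd, "ECHO:\n", 6, 0) != 6) return -1;
  if (send (fd, ibuf, (size_t) n, 0) != n) return -1;
  return n;
}
```

In the server loop this would replace the read and the two write calls, for instance as echo_once (s1). The flags argument is what send and recv add on top of read and write.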
Finally, note that the only calls that are server specific are listen and accept. The bind syscall binds the socket to a specific address and port, and it can also be used with client sockets to, for instance, force the use of a specific interface (or IP address, if you prefer) when several are available, or a specific source port.
The same applies to the server. In this case we used INADDR_ANY, which is actually 0.0.0.0. This means the server will listen on all available interfaces/addresses. If we had multiple interfaces and wanted our server to accept connections on just one of them, we could indicate that here. If you have ever configured a web server you have likely seen this.
NOTE: On Linux there is also the SO_BINDTODEVICE socket option, which can be set using setsockopt to specify the actual interface to bind to for some types of sockets.
To illustrate this, let's bind our server to our current IP. We could write code to read it and set it automatically, but just for this test, let's quickly modify the code above. Imagine the server is running on IP 192.168.1.16, or in hexadecimal C0.A8.01.10, so we can change the code above like this:
server.sin_addr.s_addr = htonl(0xC0A80110);
Now, if you try to connect to the server using netcat, you will get this:
$ nc localhost 1234 # This fails
$ nc 192.168.1.16 1234 # This works
The first netcat tries a local connection and therefore uses the loopback interface with IP 127.0.0.1, while the second one uses the current active interface (there is just one in my case).
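Converting the address to hexadecimal by hand is easy to get wrong. As an alternative, here is a minimal sketch that lets the standard inet_pton function parse the dotted notation. The helper name listen_on is hypothetical and the error handling is reduced to returning -1.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: create a TCP socket listening on one specific IPv4
 * address given in dotted notation. Returns the socket, or -1 on error. */
int listen_on (const char *ip, int port) {
  struct sockaddr_in addr;
  int s;

  assert (ip != NULL);
  memset (&addr, 0, sizeof (addr));
  addr.sin_family = AF_INET;
  addr.sin_port = htons ((unsigned short) port);
  /* inet_pton stores the address already in network byte order */
  if (inet_pton (AF_INET, ip, &addr.sin_addr) != 1) return -1;

  if ((s = socket (AF_INET, SOCK_STREAM, 0)) < 0) return -1;
  if (bind (s, (struct sockaddr *) &addr, sizeof (addr)) < 0 ||
      listen (s, 1) < 0) {
    close (s);
    return -1;
  }
  return s;
}
```

With this helper, the socket, bind and listen calls of the server collapse into something like s = listen_on ("192.168.1.16", 1234), and no manual byte-order conversion of the address is needed.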
A basic TCP ECHO Server in Java
From its first versions, Java has provided Socket objects to build basic TCP and UDP applications. There was also HTTP support even in the first versions. Later on, the NIO package was introduced, adding more possibilities... but we will look into that a bit later.
Let's see what a minimal TCP server looks like in Java:
import java.net.*;
import java.io.*;

class TCPServerSocket {
  public static final int port = 1234;

  public static void main (String[] args) {
    ServerSocket s;
    try {
      s = new ServerSocket (port);
      while (true) { // Will run forever
        System.out.println ("Accepting connection on port : " + port);
        Socket s1 = s.accept ();
        System.out.println ("+ Connection from : " + s1.getRemoteSocketAddress());
        OutputStream out = s1.getOutputStream ();
        InputStream in = s1.getInputStream ();
        byte[] buf = new byte[1024];
        int n = in.read (buf);   // Bytes actually read (-1 on end of stream)
        if (n > 0) {
          out.write ("ECHO:\n".getBytes());
          out.write (buf, 0, n); // Echo only what was received
        }
        s1.close ();
      }
      // No s.close () here: code after an infinite loop is unreachable
      // and javac rejects it
    } catch (Exception e) {
      e.printStackTrace ();
    }
  }
}
As we can see, the code is shorter. A closer look shows that the ServerSocket class, in addition to creating the socket, also encapsulates the bind and listen syscalls, which contributes to making the code shorter.
Also, the accept method hides the return of the client address, which saves some variable declarations.
The Socket class doesn't provide direct methods to read and write data to/from the socket. Instead, it integrates with the general Java class hierarchy and provides methods to obtain general streams to read and write data. Once those stream objects are acquired, we can just read and write byte arrays as we did in the C version.
In case you wonder whether it is possible to change the server socket backlog or specify an interface, the answer is yes. The ServerSocket class provides additional constructors to specify those values. It also provides the bind method, which can also be used to specify that information. This is just one way to specify the binding address:
s = new ServerSocket (port, 1, InetAddress.getByName("192.168.1.16"));
Text Oriented Protocols
The use of read and write with byte arrays is the lowest level interface provided by the operating system, but it is very inconvenient when working with text oriented protocols. There are many TCP applications that use text-based protocols: HTTP, SMTP, etc. So it is useful, in those cases, to have printf and scanf like functions.
Java already provides us with those functions thanks to the use of streams to interact with the socket. That lets us decorate those streams with other classes that will allow us to work easily with different types of protocols.
For text based protocols the PrintWriter and BufferedReader classes do the job. This is what the server looks like using these utility classes:
import java.net.*;
import java.io.*;

class TCPServerSocket {
  public static final int port = 1234;

  public static void main (String[] args) {
    int cnt = 0;
    ServerSocket s;
    PrintWriter out;
    BufferedReader in;
    try {
      s = new ServerSocket (port);
      while (true) { // Will run forever
        System.out.println ("+ Accepting connection on port : " + port);
        Socket s1 = s.accept ();
        System.out.println ("+ Connection from : " + s1.getRemoteSocketAddress());
        cnt ++;
        out = new PrintWriter (s1.getOutputStream (), true);
        in = new BufferedReader (new InputStreamReader (s1.getInputStream ()));
        String msg = in.readLine();
        out.println ("Connection " + cnt + " | You Said: " + msg);
        s1.close ();
      }
    } catch (Exception e) {
      e.printStackTrace ();
    }
  }
}
As you can see, now we can work with the socket as we usually do with System.out and System.in. Note that in the previous version, when writing the echo back to the client, we had to call the getBytes() method to convert the string into the byte[] expected by write. Now, with this modification, we can use strings directly.
Text Based protocols in C
In C there are no support classes to help us work with the socket as a text stream, so we need to write some code to get that functionality. When implementing this, there are two things to take into account:
- We can read more than one text line in a single read.
- We can get partial lines from our read operations.
This basically means that we need to buffer the network data. Our echo server is very simple so we will start with a very simple implementation. We will extend it in following posts.
But first, let's see the issues we have just mentioned live. Let's create a test file with, for instance, content like this:
00 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF12345
01 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
02 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
03 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
04 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
05 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
06 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
07 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
08 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
09 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
10 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
11 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
12 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
13 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
14 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
15 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQ
If we send this file to our first Java server, we get the following output:
$ nc 192.168.2.6 1234 < kk.txt
ECHO:
00 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF12345
01 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
02 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
03 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
04 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
05 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
06 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
07 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
08 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
09 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
10 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
11 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
12 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
13 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
14 $
We can clearly see how the line structure is ignored and the whole file is treated as a single block. Also note that line 14 is not returned completely, only the first two characters (the line number we added in our text file). That is because the buffer in our program is 1024 bytes and our file is 1169 bytes, so we cannot read it in a single shot.
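The arithmetic behind that cut can be checked with a small sketch. Each line of the test file is 73 bytes including the newline (assuming a single '\n' as line terminator), so a 1024-byte read holds 14 complete lines (1022 bytes) plus 2 bytes of the next one. The helper name count_lines is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the complete ('\n'-terminated) lines in a chunk
 * of n bytes and store in *tail the number of bytes left after the last
 * newline (that is, the size of the trailing incomplete line). */
size_t count_lines (const char *chunk, size_t n, size_t *tail) {
  size_t lines = 0, last = 0, i;

  assert (chunk != NULL && tail != NULL);
  for (i = 0; i < n; i++) {
    if (chunk[i] == '\n') {
      lines++;
      last = i + 1;   /* first byte after this newline */
    }
  }
  *tail = n - last;
  return lines;
}
```

Applied to the first 1024 bytes of a file made of 73-byte lines, this reports 14 complete lines and a tail of 2 bytes, which matches the output above: a single read can carry several lines and also end in the middle of one.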
If we do the same against the second version of our Java server:
$ nc 192.168.2.6 1234 < kk.txt
Connection 1 | You Said: 00 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF12345
$
In this case we return just the first line and the rest of the file is ignored. We can now extend the server to read all the lines sent in one connection before closing it. For the Java version, we just need to put the line reading and echoing in a loop.
(...)
String msg;
while ((msg = in.readLine()) != null) out.println ("Connection " + cnt + " | You Said: " + msg);
(...)
The C version
To implement the same thing in C we need to write our own BufferedReader class. In other words, whenever we ask for a line, this is what the code will do:
- Check if there is a complete line in the internal buffer. If so, return it and remove the line from the internal buffer
- If not, read data from the network until we find a new line
- Go back to step 1
So, first we will implement a buffer object to abstract these operations, with an interface similar to the one provided by Java. This gives us an idea of what is going on under the hood in that single-line loop we used in the previous section to extend the echo server.
#define ERROR(s) do {fprintf (stderr, s); exit (EXIT_FAILURE);} while (0)

typedef struct buf_T {
  int  s;     // Socket to read data from
  char *buf;  // Internal buffer (always kept NUL-terminated)
  int  size;  // Size of the internal buffer
  int  off;   // Offset to the empty part of the buffer
} BUF;

BUF *buf_new (int s, int size) {
  BUF *tmp;

  if ((tmp = malloc (sizeof(BUF))) == NULL) FAIL ("malloc:");
  // calloc zeroes the memory, so the buffer starts as an empty string
  if ((tmp->buf = calloc (1, size)) == NULL) FAIL ("calloc:");
  tmp->s = s;
  tmp->size = size;
  tmp->off = 0;
  return tmp;
}

void buf_free (BUF *b) {
  if (!b) return;
  if (b->buf) free (b->buf);
  free (b);
}

char *buf_readline (BUF *b) {
  int len, r;
  char *str, *aux;

  // While no line is found... try to read more data from network
  while ((aux = strchr (b->buf, '\n')) == NULL) {
    // A full buffer without a newline means the line is too long for us
    if (b->off >= b->size - 1) ERROR ("buffer overflow\n");
    r = read (b->s, b->buf + b->off, b->size - b->off - 1);
    if (r == 0) return NULL;    // This means connection closed by client
    if (r < 0) FAIL ("read:");  // On error we just abort
    b->off += r;                // Update internal offset
    b->buf[b->off] = 0;         // Keep the data NUL-terminated for strchr
  }
  // Duplicate the string
  len = aux - b->buf + 1;
  str = strndup (b->buf, len);
  // Compact buffer. Use memmove as memory regions may overlap
  memmove (b->buf, b->buf + len, b->off - len);
  b->off -= len;
  b->buf[b->off] = 0;
  return str;
}
The first two functions are the constructor and destructor for our object, and the third one is the readline method, which just implements the algorithm we described above.
With this new object, our main function will look like this:
int main () {
  const int port = 1234;
  int s, s1;
  socklen_t sa_len;
  struct sockaddr_in server, client;
  BUF *buf;

  if ((s = socket (AF_INET, SOCK_STREAM, 0)) < 0) FAIL ("socket:");

  memset (&server, 0, sizeof (server));
  server.sin_addr.s_addr = htonl(INADDR_ANY);
  server.sin_family = AF_INET;
  server.sin_port = htons (port);
  if ((bind (s, (struct sockaddr *) &server, sizeof (struct sockaddr_in))) < 0)
    FAIL ("bind:");
  if (listen (s, 1) < 0) FAIL ("listen:");
  while (1) {
    printf ("+ Accepting connections on port : %d\n", port);
    sa_len = sizeof (client); // accept overwrites it, so reset it every time
    if ((s1 = accept (s, (struct sockaddr*) &client, &sa_len)) < 0) FAIL ("accept:");
    printf ("+ Connection from : %s\n", inet_ntoa (client.sin_addr));
    buf = buf_new (s1, 1024);
    char *str;
    while ((str = buf_readline (buf)) != NULL)
      {
        write (s1, "ECHO:\n", 6);
        write (s1, str, strlen(str));
        free (str);
      }
    buf_free (buf);
    close (s1);
  }
  close (s);
  return 0;
}
Implementing a PrintWriter
The previous code solves the problem of reading multiple lines from the network, even when they are all sent together in a single packet, arrive all at once and, potentially, the last one is incomplete.
But we are still using the write syscall to send data, and we would like to use something more user friendly like the Java PrintWriter. The C equivalent to a PrintWriter is the dprintf function. Using this function the main loop will look like this:
while ((str = buf_readline (buf)) != NULL)
  {
    dprintf (s1, "ECHO:\n%s", str);
    free (str);
  }
However, it is useful to know how to implement this kind of function so we can customise what is output to the network. As an example, this is a function that just prepends a global counter to each message, effectively showing the number of lines received by the server since it was started.
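#include <stdarg.h> // va_list, va_start, va_end

int cnt = 0;

int net_printf (int s, const char *format,...) {
  va_list args;
  char buf[1024];
  int r, l, n;

  va_start(args, format);
  cnt ++;
  l = snprintf (buf, sizeof(buf), "[%03d] : ", cnt);
  n = vsnprintf (buf + l, sizeof(buf) - l, format, args);
  va_end(args);
  if (n < 0) return -1; // Formatting error
  // vsnprintf returns the length it wanted to write; clamp it if truncated
  if (n > (int) sizeof(buf) - 1 - l) n = (int) sizeof(buf) - 1 - l;
  r = write (s, buf, l + n);
  return r;
}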
First we add the counter to the string using snprintf. Then we just append everything else to the string using vsnprintf. In this example we could have used dprintf and obtained the exact same result.
A more realistic example could be a customised printf extended with some extra format specifier. For instance, imagine you want to add a '%H' to print MD5 hashes as a hex dump; then you need to write something like the function above.
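As a rough idea of the hex dump part only, here is a minimal sketch of a helper that writes a block of bytes (for instance a 16-byte MD5 digest) as hexadecimal text. It does not add a new conversion to printf; parsing a custom format string is left out. The name net_print_hex is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes of data to descriptor fd as lowercase
 * hexadecimal followed by a newline. Returns the number of bytes written,
 * or -1 on error (including input too large for the local buffer). */
int net_print_hex (int fd, const unsigned char *data, size_t len) {
  char buf[1024];
  size_t i, l = 0;

  assert (data != NULL);
  /* Two characters per byte, plus the newline and snprintf's NUL */
  if (2 * len + 2 > sizeof (buf)) return -1;
  for (i = 0; i < len; i++)
    l += (size_t) snprintf (buf + l, sizeof (buf) - l, "%02x", data[i]);
  buf[l++] = '\n';
  return (int) write (fd, buf, l);
}
```

A customised printf could call a helper like this whenever it finds its own conversion in the format string, and hand everything else over to vsnprintf as net_printf does.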
Conclusions
In this first instalment we have seen how a minimal TCP server can be implemented in C and in Java. We have also got a glimpse of what happens under the hood when we use decorator classes like PrintWriter or BufferedReader. Overall, the main conclusion of this first part is that network programming is largely independent of the programming language. Regardless of the verbosity of the specific language, the steps we have to take to code a server are the same, and they are, to some extent, imposed by the underlying OS interface.
■