Java vs C Network Programming. Java NIO Buffers
 
2021-05-24 | by David "DeMO" Martínez Oliveira

In the previous instalment of this series we built a TCP echo server twice: in Java, using the standard socket classes, and in C, using the operating system syscalls directly. Both programs looked very similar. In this instalment we are going to explore the Java NIO package, which offers some advantages over the classic socket classes.

The original Java socket interface was pretty limited compared to what the OS offers to a C programmer. The NIO package, although it is not only about sockets, brings some of those functions to Java.

Let's start by looking at the basic channel and buffer objects. On their own these two don't bring much, but they are the basic objects required to build more complex NIO applications.

Java Echo Server with NIO

Let's start by re-writing our basic TCP server using NIO channels and buffers, and then briefly describe these two entities.

import java.net.*;
import java.nio.channels.*;
import java.nio.ByteBuffer;

class TCPServerNIO {
  public static final int port = 1234;

  public static void main (String[] args) {
    ServerSocketChannel s;
      
    try {
      // Create server socket and bind
      s = ServerSocketChannel.open();
      s.bind(new InetSocketAddress(port));

      while(true){
        System.out.println ("+ Accepting connections");
        SocketChannel sc = s.accept();
        ByteBuffer    buf = ByteBuffer.allocate (1024);
        System.out.println ("+ Connection from " + sc.getRemoteAddress());
    
        // Echo server
        while (true) {
          int len = sc.read (buf);
          if (len <= 0) break;

          // The socket channel was writing into the buffer. Now we are going to read from it
          buf.flip ();

          // Convert message into string
          byte[] msg = new byte [len];
          buf.get(msg);
          String desc = new String (msg);
      
          if (len > 0) {
            // Build response
            buf.clear ();
            buf.put ("ECHO : ".getBytes());
            buf.put (msg);
            // We are done, ready to write back
            buf.flip();

            sc.write (buf);
          }
          buf.clear ();
        }
        sc.close ();    
      }
    } catch (Exception e) {
      e.printStackTrace ();
    }
  }
}

The ServerSocketChannel works like the ServerSocket we used in our previous server, and the SocketChannel returned by accept works like the Socket. The difference is that SocketChannel has a read method that gets data from the socket directly into a Buffer, and a write method that also works with Buffer objects. Other than that, the main difference is how those so-called Buffers work.

This is a quick and dirty summary:

  • A Buffer has a capacity (the size we used to allocate it), a limit (the index of the first element that must not be read or written, that is, how far we can read or write) and a position (the index in the buffer where the next read/write operation will work).
  • A Buffer starts in what we can call WRITE mode (Java does not store a mode, it is just a convenient way to think about it). In this mode the limit is set to the capacity (we can use the whole buffer to write) and the position is set to 0. Every time we write something in the buffer, the position is updated.
  • At any time the Buffer can be switched to READ mode by calling flip. That means that the limit is set to the current position (we can read as much data as we have written) and the position is reset to 0, so we start reading from the beginning of the buffer.

That is roughly how it works. There are additional methods to work with a Buffer, but the only other one we used in our example is clear, which resets the Buffer by setting the position to zero and the limit to the capacity, leaving it ready to start writing again. Note that clear does not erase the data, it only resets the indices.
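To make the bookkeeping concrete, here is a minimal sketch in C of the three indices and of what flip and clear do to them. The cursor_t type and the cursor_* functions are hypothetical names made up for this illustration, and the sketch only tracks the indices, not the data.

```c
#include <stddef.h>

/* Hypothetical model of the three Java Buffer indices (illustration only) */
typedef struct {
  size_t capacity;  /* Fixed size chosen at allocation time          */
  size_t limit;     /* First index that must not be read or written  */
  size_t position;  /* Index used by the next read or write          */
} cursor_t;

/* A freshly allocated buffer: the whole capacity can be written */
cursor_t cursor_new (size_t capacity)
{
  cursor_t c = { capacity, capacity, 0 };
  return c;
}

/* Move position forward by n, as a put or get of n bytes would do.
   Returns 0 on success, -1 if n goes past the limit. */
int cursor_advance (cursor_t *c, size_t n)
{
  if (n > c->limit - c->position) return -1;
  c->position += n;
  return 0;
}

/* flip: what has been written becomes the readable range */
void cursor_flip (cursor_t *c)
{
  c->limit = c->position;
  c->position = 0;
}

/* clear: back to the initial state (the data itself is not touched) */
void cursor_clear (cursor_t *c)
{
  c->limit = c->capacity;
  c->position = 0;
}
```

After writing 5 bytes and calling cursor_flip, the limit is 5 and the position is 0, so exactly those 5 bytes can be read back.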

Many of the Buffer methods just return the Buffer object, which allows us to chain invocations. Using this feature, the code in our server that writes the response:

    buf.clear ();
    buf.put ("ECHO : ".getBytes());
    buf.put (msg);
    buf.flip();

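Can be written like this (on Java 9 or later, where clear and flip on a ByteBuffer return a ByteBuffer instead of a plain Buffer):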

buf.clear ().put ("ECHO : ".getBytes()).put (msg).flip();

Now, let's try to re-write this program in C to better understand what happens under the hood.

Channels in C

Let's start by implementing a basic Channel object in C. C is not object oriented so I'm going to take some shortcuts, especially inheritance-wise. Our original Java code uses two channel classes: ServerSocketChannel and SocketChannel.

In Java, this hierarchy leaves out, for instance, the read, write and connect methods in ServerSocketChannel, where they do not make much sense, and the accept method in the normal socket channel. I'm not going to implement the whole interface here, just the parts that are relevant for the current example.

So, the Channel class in C will be very simple:

#define CHANNEL_SERVER -1
#define CHANNEL_CLIENT -2

typedef struct channel_t
{
  int                  s;
  int                  type;
} CHANNEL;

Just a field to store the OS socket and a flag to know if our channel is a server or a client. This is mostly used during initialisation. As we already know, the main difference between these two types (client and server) at initialisation is the call to listen or connect. We are not dealing with clients yet so, let's forget about connect.

You may have noticed that I used negative constants for the type... This is unorthodox, as negative values usually represent errors, but in this case it was convenient for me, as you will see in a second. It lets me take a few shortcuts and reduce the number of parameters and function calls, so the C and the Java versions end up with a similar size.

Channel Constructor and Destructor

Let's start with the class constructor (it is really just a function, but its job is to create an instance):

CHANNEL *channel_new (int type, char *host, int port)
{
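  int                  must_bind = 0;
  struct sockaddr_in   addr;
  socklen_t            len = sizeof(struct sockaddr_in);
  
  CHANNEL *aux = malloc (sizeof (struct channel_t));
  if (!aux) FAIL ("malloc:");
  memset (&addr, 0, sizeof (addr)); // Start from a clean address structure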
  if (type < 0) {
    if ((aux->s = socket (AF_INET, SOCK_STREAM, 0)) < 0) FAIL ("socket:");
    aux->type = type;
  } else {
    aux->s = type;
    aux->type = CHANNEL_CLIENT;
  }
  
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_ANY);
  addr.sin_port = htons (1234); // Default port
  // If host == NULL and port != -1 => Bind to INADDR_ANY and port
  // If host != NULL and port == -1 => Bind to host and the default port (1234)
  // If host != NULL and port != -1 => Bind to host and port
  // If host == NULL and port == -1 => Do not bind (normal client)
  if (host != NULL)
    {
      addr.sin_addr.s_addr = inet_addr(host);
      must_bind = 1;
    }
  if (port != -1)
    {
      addr.sin_port = htons (port);
      must_bind = 1;
    }
  if (must_bind)
    if ((bind (aux->s, (struct sockaddr *) &addr, len)) < 0)
      FAIL ("bind:");
  if (type == CHANNEL_SERVER)
    if ((listen (aux->s, 1)) < 0) FAIL("listen:");
  return aux;
}

Using this function we can mimic the Java ServerSocket class, which hides the socket creation as well as the bind and listen calls. Those explicit calls were one of the main reasons why our original C program was longer.

If you pay attention to the first lines of the function, I use a negative type value to indicate that I would like to create the socket. A non-negative value means that the socket has already been created externally and it is actually passed in the type parameter. As I said, this is very unorthodox and something you shouldn't do in real software, but for the current discussion it doesn't really make a difference (we are looking at how network apps work under the hood).
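If the overloaded parameter feels too obscure, the decoding it implies can be spelled out. The following is only a sketch: source_kind_t, channel_source_t and source_decode are hypothetical names that are not part of the code above, and treating any other negative value as invalid is my own assumption (the constructor above accepts any negative value).

```c
#define CHANNEL_SERVER -1
#define CHANNEL_CLIENT -2

/* Hypothetical explicit version of what the single int encodes */
typedef enum {
  SRC_NEW_SERVER,   /* Create a new socket and listen on it        */
  SRC_NEW_CLIENT,   /* Create a new socket for a client            */
  SRC_EXISTING_FD,  /* Wrap a descriptor that already exists       */
  SRC_INVALID       /* Assumption: other negative values are bogus */
} source_kind_t;

typedef struct {
  source_kind_t kind;
  int           fd;   /* Only meaningful for SRC_EXISTING_FD */
} channel_source_t;

/* Decode the "negative tag or non-negative descriptor" convention */
channel_source_t source_decode (int type)
{
  channel_source_t src = { SRC_INVALID, -1 };

  if (type >= 0) {
    src.kind = SRC_EXISTING_FD;
    src.fd   = type;
  }
  else if (type == CHANNEL_SERVER) src.kind = SRC_NEW_SERVER;
  else if (type == CHANNEL_CLIENT) src.kind = SRC_NEW_CLIENT;

  return src;
}
```

A constructor taking a channel_source_t would be longer than the one above, which is exactly the trade-off accepted here to keep the C version short.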

The destructor is even simpler:

void channel_free (CHANNEL *c)
{
  if (!c) return;
  close (c->s); // Close associated socket
  free (c);
}

Just close the socket and free memory.

Channel accept

Let's take a quick look again at the Java program. We just have the Channel open and bind calls and then the accept method. As mentioned before, I integrated the bind call into the constructor, kind of like what is done in the ServerSocket class (see previous instalment). However, the accept method now needs to return a new Channel instead of a Socket, so we can use the Buffer object with it.

Roughly this means that we also need to implement this method:

CHANNEL *channel_accept (CHANNEL *c) {
  CHANNEL              *aux;
  struct sockaddr_in   client;
  socklen_t            sa_len = sizeof (struct sockaddr_in);
  int                  s;
  
  if (!c) return NULL;
  if ((s = accept (c->s, (struct sockaddr*) &client, &sa_len)) < 0)
    FAIL ("accept:");
  
  printf ("+ Connection from : %s (fd:%d)\n", inet_ntoa (client.sin_addr), s);
  aux = channel_new (s, NULL, -1); // Do not bind
  return aux;
}

Here we just run accept on the socket stored in our channel object (we should check that the type is server here...). Then, whenever a connection is accepted, we create a new Channel object passing the socket returned by accept as the type, effectively initialising the new Channel object with that descriptor.
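The type check mentioned above could look something like this. It is a minimal sketch: channel_can_accept is a hypothetical helper that is not part of the original code, and the struct is repeated only to keep the example self-contained.

```c
#include <stddef.h>

#define CHANNEL_SERVER -1
#define CHANNEL_CLIENT -2

typedef struct channel_t
{
  int                  s;
  int                  type;
} CHANNEL;

/* Hypothetical helper: returns 1 if calling accept on this channel
   makes sense, 0 otherwise (NULL channel or client channel). */
int channel_can_accept (const CHANNEL *c)
{
  return c != NULL && c->type == CHANNEL_SERVER;
}
```

channel_accept could then start with something like if (!channel_can_accept (c)) return NULL; instead of only testing for a NULL pointer.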

Now we just need to implement the read and write methods working with Buffer... but first, let's implement the Buffer class.

The Buffer class

The information we need to store for each buffer object is summarised in the data structure below:

#define BUF_MODE_WRITE 1
#define BUF_MODE_READ  2

typedef struct buffer_t
{
  int            mode;  // READ/WRITE
  unsigned char *buf;   // Internal buffer
  int            size;  // Internal buffer size
  int            len;   // Available Space
  int            off;   // Read/Write Pointer
} BUFFER;

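I slightly changed the names of the fields compared to the Java documentation... just to keep your attention. Roughly, size = capacity, len = limit and off = position. Strictly speaking, while writing, len counts the space still available (the limit minus the position) rather than the limit itself.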

The constructor for this class is also very basic. It just allocates the required memory and initialises all the fields in the structure:

BUFFER *buffer_new (int size)
{
  BUFFER  *tmp;
  
  if ((tmp = malloc (sizeof(BUFFER))) == NULL) return NULL;
  if ((tmp->buf = malloc (size)) == NULL) { free (tmp); return NULL; }
  tmp->size = size;
  tmp->off= 0;
  tmp->len = size;
  tmp->mode = BUF_MODE_WRITE;
  
  return tmp;
}

The destructor just frees the memory:

void buffer_free (BUFFER *b)
{
  if (!b) return;
  
  if (b->buf) free (b->buf);
  free (b);
}

With the Buffer class in place, we can go back to the Channel.

From Channels to Buffers and back

Now we can implement the equivalent read and write methods we used with the Java Channels. Let's start with the read.

int channel_read (CHANNEL *c, BUFFER *b)
{
  if (!b) return -1;
  if (!c) return -1;
  if (b->mode != BUF_MODE_WRITE) return -1;
  
  int r = read (c->s, b->buf+b->off, b->len);
  if (r > 0) { // Leave the counters alone on EOF (0) or error (-1)
    b->len -= r;
    b->off += r;
  }
  
  return r;
}

Pretty straightforward. After some basic sanity checks, we just read data directly into the buffer. We try to read as much data as there is available space (b->len). Then we move the pointer in the buffer and update the available space according to how much data we read.
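One detail worth keeping in mind is that read can also return 0 (end of stream) or -1 (error), and in those cases the counters should stay as they are. Here is a small sketch of that idea working on a plain descriptor. fd_read_into is a hypothetical name, and the BUFFER struct is repeated (without any methods) only so the example stands on its own.

```c
#include <unistd.h>

#define BUF_MODE_WRITE 1
#define BUF_MODE_READ  2

typedef struct buffer_t
{
  int            mode;  /* READ/WRITE            */
  unsigned char *buf;   /* Internal buffer       */
  int            size;  /* Internal buffer size  */
  int            len;   /* Available space       */
  int            off;   /* Read/Write pointer    */
} BUFFER;

/* Hypothetical helper: fill b from descriptor fd. The counters are only
   updated when read() actually delivered data. Returns what read()
   returned (0 on end of stream, -1 on error), or -1 on bad arguments. */
int fd_read_into (int fd, BUFFER *b)
{
  if (!b) return -1;
  if (b->mode != BUF_MODE_WRITE) return -1;

  ssize_t r = read (fd, b->buf + b->off, (size_t) b->len);
  if (r > 0) {
    b->len -= (int) r;
    b->off += (int) r;
  }
  return (int) r;
}
```

The same function works with a socket, a pipe or a file, because for the OS they are all just descriptors.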

The write method is pretty similar:

int channel_write (CHANNEL *c, BUFFER *b)
{
  if (!c) return -1;
  if (!b) return -1;
  if (b->mode != BUF_MODE_READ) return -1;
  
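  // After the puts, b->off is the number of bytes stored in the buffer
  // (b->len is the space still free, not the amount of data)
  int r = write (c->s, b->buf, b->off);
  if (r > 0) {
    b->len += r;
    b->off -= r;
  }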
  
  return r;
}

In this case we write whatever is in the buffer, and then update the pointer (for further reads or gets) and the available space. Easy peasy.
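Keep in mind that write is allowed to send fewer bytes than requested, which the version above does not handle. A common way to deal with that is a loop like the following sketch. fd_write_all is a hypothetical helper that is not used by the code in this article.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until the n bytes at p have
   been sent. Returns n on success or -1 on error. */
int fd_write_all (int fd, const unsigned char *p, int n)
{
  int sent = 0;

  while (sent < n) {
    ssize_t r = write (fd, p + sent, (size_t) (n - sent));
    if (r < 0) {
      if (errno == EINTR) continue; /* Interrupted by a signal, try again */
      return -1;
    }
    sent += (int) r;
  }
  return sent;
}
```

For a small echo message on a blocking socket a single write will normally do, so this mostly matters with bigger buffers or non-blocking descriptors.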

Finishing the Buffer implementation

To finish the Buffer implementation, let's implement the rest of the methods we used in the Java version (i.e. put, get, clear and flip).

BUFFER* buffer_get (BUFFER *b, unsigned char *p, int size)
{
  if (!b) return NULL;
  if (!p) return NULL;
  if (b->mode != BUF_MODE_READ) return NULL;
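  // We cannot hand out more bytes than were stored (b->size - b->len)
  if (size < 0 || size > (b->size - b->len)) return NULL;
  
  // Simplification: we always copy from the start of the buffer
  memcpy (p, b->buf, size);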
  b->len += size;
  b->off -= size;
  
  return b;
}

The get method is basically the same thing as channel_write, but it copies the buffer content to some memory location instead of sending it through the network.

In the same way, the put method looks pretty much like channel_read, but working on memory:

BUFFER* buffer_put (BUFFER *b, unsigned char *p, int size)
{
  if (!b) return NULL;
  if (!p) return NULL;
  if (b->mode != BUF_MODE_WRITE) return NULL;
  if (size < 0 || size > b->len) return NULL;
  
  memcpy (b->buf + b->off, p, size);
  b->len -= size;
  b->off += size;
  
  return b;
}

The flip method just swaps the mode:

BUFFER* buffer_flip (BUFFER *b)
{
  if (!b) return NULL;
  if  (b->mode == BUF_MODE_WRITE) b->mode = BUF_MODE_READ;
  else b->mode = BUF_MODE_WRITE;

  return b;
}

And the clear method just resets everything:

BUFFER* buffer_clear (BUFFER *b)
{
  if (!b) return NULL;
  
  b->off = 0;
  b->len = b->size;
  b->mode = BUF_MODE_WRITE;
  memset (b->buf, 0, b->size);
  
  return b;
}
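The implementation above is enough for our echo server, where everything in the buffer is consumed with a single get. For comparison, this is a sketch of how put, flip and get could track a position and a limit the way the Java Buffer does, so that several gets in a row return consecutive data. The jbuf_t type and the jbuf_* functions are hypothetical names for this illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer that keeps position and limit like Java does */
typedef struct {
  unsigned char *buf;      /* Storage provided by the caller          */
  int            capacity; /* Size of the storage                     */
  int            limit;    /* First index we must not read or write   */
  int            position; /* Index for the next put or get           */
} jbuf_t;

/* Copy n bytes into the buffer at the current position */
jbuf_t *jbuf_put (jbuf_t *b, const void *p, int n)
{
  if (!b || !p) return NULL;
  if (n < 0 || n > b->limit - b->position) return NULL;

  memcpy (b->buf + b->position, p, (size_t) n);
  b->position += n;
  return b;
}

/* Copy n bytes out of the buffer, starting at the current position */
jbuf_t *jbuf_get (jbuf_t *b, void *p, int n)
{
  if (!b || !p) return NULL;
  if (n < 0 || n > b->limit - b->position) return NULL;

  memcpy (p, b->buf + b->position, (size_t) n);
  b->position += n;
  return b;
}

/* Switch from writing to reading: limit = position, position = 0 */
jbuf_t *jbuf_flip (jbuf_t *b)
{
  if (!b) return NULL;
  b->limit    = b->position;
  b->position = 0;
  return b;
}
```

With this layout no mode flag is needed: the same range check (limit minus position) protects both put and get.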

A new main

Now with this preliminary Buffer class we can re-implement our main function. It will look like this:

int main () {
  CHANNEL              *c, *c1; 
  BUFFER               *buf;

  if ((c = channel_new (CHANNEL_SERVER, NULL, 1234)) == NULL) FAIL("channel_new:");
  
  while (1) {
    printf ("+ Accepting connections\n");
    if ((c1 = channel_accept (c)) == NULL) FAIL ("channel_accept:");
    buf = buffer_new (1024);
    while (1)
      {
        int len = channel_read (c1, buf);
        if (len <=0) break;
      
        buffer_flip (buf);
        unsigned char *msg = malloc (len);
        buffer_get (buf, msg, len);
        // msg is not NUL-terminated, so print exactly len bytes
        printf ("RECV (%d) : %.*s", len, len, (char *) msg);
        if (len > 0)
          {
            buffer_clear (buf);
            buffer_put (buf, "ECHO : ", 7);
            buffer_put (buf, msg, len);
            buffer_flip (buf);
          
            channel_write (c1, buf);
          }
        free (msg);
        buffer_clear (buf);
      }
      buffer_free (buf);
      channel_free (c1);
      
    }
  channel_free (c);
  return 0;   
}

This version looks pretty much the same as the original Java version. The main differences are due to Java's syntactic sugar, such as the this parameter that is automatically added to methods (we have to add it explicitly in our C implementation), and to the garbage collector, which allows us to forget about freeing allocated memory.

You may have noticed that I have been returning the buffer from the different Buffer functions. Unfortunately, because C lacks the automatic this parameter, we cannot chain the calls as nicely as we did in the Java version. The best we can do is nest them:

buffer_flip (buffer_put (buffer_put (buffer_clear (buf), "ECHO : ",7), msg, len));

That works fine, but the fact that the functions have to be written in reverse order doesn't help much and, to be honest, this is harder to read than the original calls. Yes, this is the same thing we get with the function composition operator in maths.

A closer OO interface

We can do a bit better at chaining the buffer calls in our main program. For that we need to improve our Buffer class and make it more OO-friendly.

The first thing we need to do is to somehow include the methods inside the object. We can do that by adding function pointers to the data structure. Something like this:

typedef struct buffer_t
{
  int               mode;  // READ/WRITE
  unsigned char     *buf;   // Internal buffer
  int               size;  // Internal buffer size
  int               len;   // Available space
  int               off;   // Read/Write Pointer
  // Methods
  struct buffer_t * (*flip) (struct buffer_t *b);
  struct buffer_t * (*clear)(struct buffer_t *b);
  struct buffer_t * (*put)  (struct buffer_t *b, unsigned char*p, int l);
} BUFFER;

You may prefer to use your type alias (BUFFER) instead of the long struct tag. In that case you can just use a forward declaration before the struct.

typedef struct buffer_t BUFFER;

struct buffer_t
{
  int            mode;  // READ/WRITE
  unsigned char *buf;   // Internal buffer
  int            size;  // Internal buffer size
  int            len;   // Available space
  int            off;   // Read/Write Pointer
  // Methods
  BUFFER         *(*flip) (BUFFER *b);
  BUFFER         *(*clear)(BUFFER *b);
  BUFFER         *(*put)  (BUFFER *b, unsigned char *p, int l);
};

This is the equivalent of a class with virtual (actually abstract at this point) methods, so we need to provide an implementation. In OO languages this is done when defining the class, but C doesn't give us that option, so we need to initialise the methods in the constructor.

In a way, this is closer to prototype-based or object-based languages, where objects are created out of other objects instead of classes. Anyhow, the constructor will now look like this:

BUFFER *buffer_new (int size)
{
  BUFFER  *tmp;
  
  if ((tmp = malloc (sizeof(BUFFER))) == NULL) return NULL;
  if ((tmp->buf = malloc (size)) == NULL) { free (tmp); return NULL; }
  tmp->size = size;
  tmp->off  = 0;
  tmp->len  = size;
  tmp->mode = BUF_MODE_WRITE;
  
  tmp->flip  = buffer_flip;
  tmp->clear = buffer_clear;
  tmp->put   = buffer_put;
  
  return tmp;
}

And now, with this change, we can re-write the code to send the echo response back like this:

 buf->clear(buf)->put(buf, "ECHO : ", 7)->put(buf, msg,len)->flip(buf);

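We still need to pass the this object to each function, and this only works here because all the functions always return the same object, so we can just pass the same object over and over. Note that the functions return NULL when something goes wrong, and in that case the next call in the chain would dereference a NULL pointer.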
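To see both the chaining and its weak spot in isolation, here is a tiny self-contained sketch. The OBJ type and the obj_* functions are hypothetical and unrelated to the BUFFER code. They only show the function pointer mechanics and one way to stop at the first failing step.

```c
#include <stddef.h>
#include <string.h>

typedef struct obj_t OBJ;

/* Hypothetical object with two "methods" stored as function pointers */
struct obj_t
{
  char  data[16];
  int   used;
  OBJ  *(*reset)  (OBJ *self);
  OBJ  *(*append) (OBJ *self, const char *s);
};

static OBJ *obj_reset (OBJ *self)
{
  if (!self) return NULL;
  self->used = 0;
  return self;
}

static OBJ *obj_append (OBJ *self, const char *s)
{
  if (!self || !s) return NULL;
  size_t n = strlen (s);
  if (n > sizeof self->data - (size_t) self->used) return NULL; /* No room */
  memcpy (self->data + self->used, s, n);
  self->used += (int) n;
  return self;
}

/* "Constructor": wires the methods into the object */
void obj_init (OBJ *o)
{
  o->used   = 0;
  o->reset  = obj_reset;
  o->append = obj_append;
}

/* Same steps as a chain, but stopping at the first failure.
   Returns the number of bytes stored or -1 on error. */
int obj_build (OBJ *o, const char *a, const char *b)
{
  if (!o->reset (o))     return -1;
  if (!o->append (o, a)) return -1;
  if (!o->append (o, b)) return -1;
  return o->used;
}
```

A chain such as o.reset (&o)->append (&o, "ECHO : ")->append (&o, "hi") works as long as every step succeeds. obj_build does the same work but survives a failing append, at the price of losing the one-liner.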

Conclusion

We have got started with Java NIO and written a simplified C implementation to get an idea of what happens under the hood when using the classes from that package. We have just looked into the basics, Buffers and Channels, but the NIO package provides many more possibilities that we will explore in the next instalments.


 