Make your I/O - quick!

This is a quick post about doing I/O quickly with some quick help from the TrueZIP File* API.

OK, slow down - there’s no need to rush over this. Have you seen something like this lately or even written it yourself?

import java.io.*;

void method() throws IOException {
    InputStream in = ...
    try {
        OutputStream out = ...
        try {
            cat(in, out);
        } finally {
            out.close(); // be a good girl and close all streams in a finally-block!
        }
    } finally {
        in.close(); // you do it like that all the time, right?
    }
}

/** Con<em>cat</em>enates {@code in} to {@code out}. */
void cat(InputStream in, OutputStream out) throws IOException {
    byte[] buf = new byte[512];
    for (int n; 0 <= (n = in.read(buf));) // Yoda conditions I like!
        out.write(buf, 0, n);
}

Then you need a cure! The problem is the naive read-stop-write-stop-loop implementation of the cat(*) method. There are many issues with this:

  1. A new byte buffer is allocated on the heap upon each call to the method.
  2. The byte buffer is way too small for modern computers and networks.
  3. The loop reads some input, then stops reading while it writes the output and starts all over again. Reading and writing never overlap, which causes bad overall performance, especially when accessing a network.
  4. If an IOException occurs, you’ve got no clue if it happened when reading the input or writing the output.

Now here is your cure: Instead of using or even implementing a naive read-stop-write-stop-loop, use the method TFile.cat(InputStream, OutputStream) in the TrueZIP File* module:

import de.schlichtherle.truezip.file.TFile; // TFile lives in the "file" package
import java.io.*;

void method() throws IOException {
    InputStream in = ...
    try {
        OutputStream out = ...
        try {
            TFile.cat(in, out);
        } finally {
            out.close(); // be a good girl and close all streams in a finally-block!
        }
    } finally {
        in.close(); // you do it like this all the time, right?
    }
}

To make this compile, add the following dependency to your Maven Project Object Model (pom.xml):

<dependency>
    <groupId>de.schlichtherle.truezip</groupId>
    <artifactId>truezip-file</artifactId>
    <version>7.0</version>
</dependency>

The benefits of TFile.cat(InputStream, OutputStream) are:

  1. A pooled background thread is used to read the input while writing the output concurrently.
  2. Data is exchanged between the reader and the writer thread in a ring of pooled buffers which are large enough for smooth data streaming, even over jittery network connections.
  3. If an IOException occurs when reading the input, it’s wrapped in an InputException, which is a subclass of IOException. So you can tell a failure on the input side from a failure on the output side.
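
If you wonder how a concurrent reader and a ring of buffers can work in principle, here is a minimal sketch in plain Java. It is NOT TrueZIP's actual implementation: the class ConcurrentCat, the ReadFailedException (a made-up stand-in for the idea behind InputException), the buffer size and the ring size are all assumptions for illustration only. It also allocates fresh buffers instead of pooling them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustration only, NOT TrueZIP's implementation. All names are hypothetical. */
final class ConcurrentCat {

    /** Hypothetical marker for failures on the input side. */
    static final class ReadFailedException extends IOException {
        ReadFailedException(IOException cause) {
            super(cause);
        }
    }

    private static final int BUFFER_SIZE = 64 * 1024; // assumed buffer size
    private static final int RING_SIZE = 4;           // assumed number of queued buffers
    private static final byte[] EOF = new byte[0];    // sentinel, compared by identity

    // Pooled daemon threads, so an idle reader never keeps the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(task -> {
        Thread thread = new Thread(task, "cat-reader");
        thread.setDaemon(true);
        return thread;
    });

    static void cat(InputStream in, OutputStream out) throws IOException {
        BlockingQueue<byte[]> ring = new ArrayBlockingQueue<>(RING_SIZE);

        // Reader task: queues chunks until end of input or a read failure.
        Future<IOException> reader = POOL.submit(() -> {
            IOException failure = null;
            try {
                byte[] buf = new byte[BUFFER_SIZE];
                for (int n; 0 <= (n = in.read(buf)); )
                    if (0 < n) ring.put(Arrays.copyOf(buf, n)); // blocks while the ring is full
            } catch (IOException e) {
                failure = e; // remember it and report it to the writer
            }
            ring.put(EOF);
            return failure;
        });

        // Writer: the calling thread drains the ring while the reader keeps reading.
        try {
            for (byte[] chunk; (chunk = ring.take()) != EOF; )
                out.write(chunk);
            IOException failure = reader.get();
            if (failure != null)
                throw new ReadFailedException(failure); // tag input-side failures
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("interrupted while copying");
        } catch (ExecutionException e) {
            throw new IOException(e.getCause());
        } finally {
            reader.cancel(true); // no-op if the reader has already finished
        }
    }
}
```

You would call ConcurrentCat.cat(in, out) just like the cat method above. A write failure surfaces as a plain IOException, while a read failure arrives as a ReadFailedException. Note that cancelling the reader wakes it up when it waits on a full ring, but a read call which blocks inside the input stream may not react to the interrupt.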

The result is performance comparable to FileChannel.transferTo(*), but it works with any plain old InputStream and OutputStream.
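
For comparison, here is a small sketch of a copy based on FileChannel.transferTo(*) from the JDK. The class name ChannelCopy and its copy method are made up for this post. The point to notice is that the source has to be a FileChannel, so this approach is no help when all you have is some arbitrary InputStream.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper, only here to show the shape of a transferTo copy. */
final class ChannelCopy {

    static void copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            // transferTo may transfer fewer bytes than requested, so loop until done.
            for (long pos = 0; pos < size; )
                pos += in.transferTo(pos, size - pos, out);
        }
    }
}
```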

Enjoy TrueZIP!