Useful abstractions for I/O

Pretty much any program needs to do I/O to process some data. Sure, in Java we have InputStream and OutputStream (or their modern-day cousin, Channel) to do this with relative ease. However, this is not the end of the story. In everyday programming, I often need some higher-level abstractions. Actually, I need them so often that I wonder why the Java SE API offers nothing like the following.

So what are these abstractions? Let me gradually introduce them to you.

Sources and Sinks

The first abstraction is for reading and writing data:

public interface Source {
    InputStream input() throws IOException;
}

And its sister interface:

public interface Sink {
    OutputStream output() throws IOException;
}

Couldn’t be much simpler, could it? True, but this is a very powerful abstraction! Let’s see why:

  • The interfaces are pre-configured: A caller doesn’t have to provide any parameters (well, they couldn’t even if they wanted to).
  • The interfaces are reusable: Calling their methods multiple times creates new streams every time (welcome to the factory pattern).
  • The interfaces are immutable (look ma, no silly properties!): Multiple threads could use a Source to read from the same source at the same time (obviously this doesn’t work with a Sink).
  • The interfaces are storage media neutral: A Source could be reading data from a file, a preferences node, a resource on the class path or an array on the heap - whatever! Likewise for a Sink, obviously.
  • The interfaces are simple to implement: The fact that they only have a single method - but a very useful one - makes it more likely that you’ll come across lots of reusable implementations.
  • The interfaces are simple to use: This makes it more likely that they get used for many different purposes.
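
To make these points a bit more tangible, here’s a minimal sketch of two read-only implementations. Mind you, ByteArraySource, FileSource and SourceDemo are names made up for this illustration only, and the Source interface is restated just so that the snippet compiles on its own:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Restated from the article so that the sketch compiles on its own.
interface Source {
    InputStream input() throws IOException;
}

// Hypothetical implementation backed by an array on the heap.
final class ByteArraySource implements Source {
    private final byte[] data;

    ByteArraySource(byte[] data) {
        this.data = data.clone(); // defensive copy, so the source never changes
    }

    @Override public InputStream input() {
        return new ByteArrayInputStream(data); // a fresh, independent stream per call
    }
}

// Hypothetical implementation backed by a file.
final class FileSource implements Source {
    private final Path file;

    FileSource(Path file) { this.file = file; }

    @Override public InputStream input() throws IOException {
        return Files.newInputStream(file); // opens the file anew on every call
    }
}

final class SourceDemo {
    // Reads the first byte through two independent streams of the same source.
    static int[] firstBytes(Source source) throws IOException {
        try (InputStream a = source.input(); InputStream b = source.input()) {
            return new int[] { a.read(), b.read() };
        }
    }
}
```

Note that firstBytes neither knows nor cares where the bytes come from, and that both streams start reading at the beginning because every call to input() creates a new stream.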

Given an implementation of these (we’ll meet some later), let’s see what you could do with them. Here’s a poor man’s copy algorithm (assuming Java 7):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class Copy {
    public static void copy(Source source, Sink sink) throws IOException {
        try (InputStream in = source.input()) {
            try (OutputStream out = sink.output()) {
                final byte[] buffer = new byte[8192];
                int read;
                while (0 <= (read = in.read(buffer)))
                    out.write(buffer, 0, read);
            }
        }
    }

    private Copy() { }
}

This is a general-purpose copy algorithm. It’s a bit naive because it doesn’t utilize the system’s I/O channels to the fullest: A more sophisticated implementation would offload all reading to a pooled background thread and exchange the data with the foreground thread via a ring buffer. But showing the code for this would be well beyond the scope of this posting.

Transformations

Given the Source and Sink interfaces, let’s look at the Transformation interface:

public interface Transformation {
    Sink apply(Sink sink);
    Source unapply(Source source);
}

This seems a bit abstract, so let’s explain: A transformation is simply a function which gets applied when writing data and unapplied when reading data - that’s it! The simplest implementation is the identity transformation:

public class IdentityTransformation implements Transformation {
    @Override public Sink apply(Sink sink) { return sink; }
    @Override public Source unapply(Source source) { return source; }
}

More useful implementations could compress data, encrypt data, or whatever you can imagine. Here’s an example for compressing data (assuming Java 7):

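import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class Compression implements Transformation {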

    @Override public Sink apply(final Sink sink) {
        return new Sink() {
            @Override public OutputStream output() throws IOException {
                OutputStream out = sink.output();
                try {
                    return new GZIPOutputStream(out, 8192);
                } catch (Throwable ex) {
                    // Don't leak the underlying stream if wrapping it fails.
                    try { out.close(); }
                    catch (Throwable ex2) { ex.addSuppressed(ex2); }
                    throw ex;
                }
            }
        };
    }

    @Override public Source unapply(final Source source) {
        return new Source() {
            @Override public InputStream input() throws IOException {
                InputStream in = source.input();
                try {
                    return new GZIPInputStream(in, 8192);
                } catch (Throwable ex) {
                    // Don't leak the underlying stream if wrapping it fails.
                    try { in.close(); }
                    catch (Throwable ex2) { ex.addSuppressed(ex2); }
                    throw ex;
                }
            }
        };
    }
}
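
A quick way to convince yourself that apply and unapply really undo each other is a round trip through memory. The following is only a minimal sketch under a few assumptions: MemoryBuffer, Gzip and RoundTrip are hypothetical names, Gzip is a condensed stand-in for the Compression class above which skips the clean-up on failure to stay short, and the interfaces are restated so that the snippet compiles on its own:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Interfaces restated from the article so that the sketch compiles on its own.
interface Source { InputStream input() throws IOException; }
interface Sink { OutputStream output() throws IOException; }
interface Transformation { Sink apply(Sink sink); Source unapply(Source source); }

// Hypothetical in-memory buffer that acts as both a Source and a Sink.
final class MemoryBuffer implements Source, Sink {
    private volatile byte[] data = new byte[0];

    @Override public InputStream input() {
        return new ByteArrayInputStream(data);
    }

    @Override public OutputStream output() {
        return new ByteArrayOutputStream() {
            @Override public void close() {
                data = toByteArray(); // publish the bytes once the stream gets closed
            }
        };
    }

    byte[] bytes() { return data.clone(); }
}

// Condensed stand-in for Compression (assumption: clean-up on failure omitted).
final class Gzip implements Transformation {
    @Override public Sink apply(final Sink sink) {
        return new Sink() {
            @Override public OutputStream output() throws IOException {
                return new GZIPOutputStream(sink.output());
            }
        };
    }

    @Override public Source unapply(final Source source) {
        return new Source() {
            @Override public InputStream input() throws IOException {
                return new GZIPInputStream(source.input());
            }
        };
    }
}

final class RoundTrip {
    // Same loop as the poor man's copy algorithm.
    static void copy(Source source, Sink sink) throws IOException {
        try (InputStream in = source.input(); OutputStream out = sink.output()) {
            byte[] buffer = new byte[8192];
            int read;
            while (0 <= (read = in.read(buffer)))
                out.write(buffer, 0, read);
        }
    }

    // Compresses the given bytes into one buffer, then decompresses them into another.
    static byte[] roundTrip(byte[] plain) throws IOException {
        MemoryBuffer original = new MemoryBuffer();
        try (OutputStream out = original.output()) {
            out.write(plain);
        }
        MemoryBuffer compressed = new MemoryBuffer();
        MemoryBuffer restored = new MemoryBuffer();
        Transformation gzip = new Gzip();
        copy(original, gzip.apply(compressed));   // compress while writing
        copy(gzip.unapply(compressed), restored); // decompress while reading
        return restored.bytes();
    }
}
```

If the two methods of the transformation are faithful inverses, roundTrip returns a copy of its argument.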

The Transformation interface shares its design principles with Source and Sink, which makes it extremely powerful. For example, suppose you had the following setup:

Source source = ... // some source
Sink sink = ... // some sink
Transformation compression = new Compression();
Transformation encryption = new PbeEncryption(password); // assume you had this

Now you could make a copy of the data in the source and compress and encrypt it when writing to the sink like this:

Copy.copy(source, compression.apply(encryption.apply(sink)));

Simple and elegant thanks to its functional design, isn’t it? To reverse the transformation on a compressed and encrypted source, you would do this:

Copy.copy(compression.unapply(encryption.unapply(source)), sink);
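
Since the nesting order has to be the same for apply and unapply, you might want to capture it in a transformation of its own. Here’s a minimal sketch under some assumptions: Chain and ChainDemo are hypothetical names, Xor is a toy stand-in for the encryption (it is NOT secure), and Gzip is a condensed stand-in for the Compression class without its clean-up on failure:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Interfaces restated from the article so that the sketch compiles on its own.
interface Source { InputStream input() throws IOException; }
interface Sink { OutputStream output() throws IOException; }
interface Transformation { Sink apply(Sink sink); Source unapply(Source source); }

// Hypothetical helper: written data passes through `first`, then `second`.
final class Chain implements Transformation {
    private final Transformation first, second;
    Chain(Transformation first, Transformation second) { this.first = first; this.second = second; }
    @Override public Sink apply(Sink sink) { return first.apply(second.apply(sink)); }
    @Override public Source unapply(Source source) { return first.unapply(second.unapply(source)); }
}

// Toy stand-in for encryption (NOT secure): XORs every byte with a fixed key.
final class Xor implements Transformation {
    private final int key;
    Xor(int key) { this.key = key & 0xff; }

    @Override public Sink apply(final Sink sink) {
        return new Sink() {
            @Override public OutputStream output() throws IOException {
                // FilterOutputStream routes array writes through write(int).
                return new FilterOutputStream(sink.output()) {
                    @Override public void write(int b) throws IOException { out.write(b ^ key); }
                };
            }
        };
    }

    @Override public Source unapply(final Source source) {
        return new Source() {
            @Override public InputStream input() throws IOException {
                return new FilterInputStream(source.input()) {
                    @Override public int read() throws IOException {
                        int b = in.read();
                        return b < 0 ? b : b ^ key;
                    }
                    @Override public int read(byte[] buf, int off, int len) throws IOException {
                        int n = in.read(buf, off, len);
                        for (int i = 0; i < n; i++) buf[off + i] ^= key;
                        return n;
                    }
                };
            }
        };
    }
}

// Condensed stand-in for Compression (assumption: clean-up on failure omitted).
final class Gzip implements Transformation {
    @Override public Sink apply(final Sink sink) {
        return new Sink() {
            @Override public OutputStream output() throws IOException {
                return new GZIPOutputStream(sink.output());
            }
        };
    }
    @Override public Source unapply(final Source source) {
        return new Source() {
            @Override public InputStream input() throws IOException {
                return new GZIPInputStream(source.input());
            }
        };
    }
}

final class ChainDemo {
    // Writes the bytes through the chain into memory and reads them back.
    static byte[] roundTrip(Transformation chain, byte[] plain) throws IOException {
        // Shortcut for the demo: one shared buffer, meant to be written only once.
        final ByteArrayOutputStream stored = new ByteArrayOutputStream();
        Sink sink = new Sink() {
            @Override public OutputStream output() { return stored; }
        };
        Source source = new Source() {
            @Override public InputStream input() {
                return new ByteArrayInputStream(stored.toByteArray());
            }
        };
        try (OutputStream out = chain.apply(sink).output()) {
            out.write(plain);
        }
        try (InputStream in = chain.unapply(source).input()) {
            ByteArrayOutputStream restored = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while (0 <= (read = in.read(buffer)))
                restored.write(buffer, 0, read);
            return restored.toByteArray();
        }
    }
}
```

With this, new Chain(new Gzip(), new Xor(0x5a)) behaves like the nested calls above: the data gets compressed first and scrambled second when writing, and the two steps get undone in reverse order when reading.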

That’s it for now. I’ll leave you imagining some useful transformations to use in your code. Next time, I’ll look into some caveats, provide some useful implementations of these interfaces and provide another useful interface, the Store.

Enjoy!