Downloading files via HDFS and the Java API

The last post covered uploading files, so I thought a quick download client would be useful as well. Again, we are using DFSClient together with BufferedInputStream and BufferedOutputStream to do the work. I read the file in 1024-byte chunks through a byte array; for larger files you may want to increase that buffer size.

Enough jabbering, to the code!

// Imports needed by the enclosing class
import java.io.*;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSClient;

public void downloadFile() {
		Configuration conf = new Configuration();
		conf.set("fs.defaultFS", this.hdfsUrl);
		// try-with-resources (Java 7+) closes the client even if the copy fails
		try (DFSClient client = new DFSClient(new URI(this.hdfsUrl), conf)) {
			if (!client.exists(sourceFilename)) {
				System.out.println("File does not exist!");
				return;
			}
			// Both streams are closed automatically, output first, then input
			try (InputStream in = new BufferedInputStream(client.open(sourceFilename));
					OutputStream out = new BufferedOutputStream(new FileOutputStream(
							destinationFilename, false))) {

				byte[] buffer = new byte[1024];

				int len;
				// read() returns -1 once the end of the file is reached
				while ((len = in.read(buffer)) != -1) {
					out.write(buffer, 0, len);
				}
			}
		} catch (Exception e) {
			e.printStackTrace();
		}
	}

I use simple getters and setters to set the source and destination filenames, and hdfsUrl is set to my NameNode URI, including the port the NameNode listens on.
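
If you want to experiment with the chunk size mentioned at the top, one option is to pull the copy loop out into a small helper that takes the buffer size as a parameter. This is only a rough sketch using plain java.io, not part of the client above: the names ChunkedCopy, copy and bufferSize are made up for illustration, and the 64 KB used in main is an arbitrary starting point rather than a tuned value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copies a stream in chunks of a caller-chosen size.
final class ChunkedCopy {

    /**
     * Copies everything from in to out using a buffer of bufferSize bytes.
     * Returns the number of bytes copied. Neither stream is closed here;
     * that stays the caller's job.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int len;
        // read() returns -1 at end of stream and may fill only part of the buffer
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
            total += len;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the HDFS input and the local file
        byte[] data = new byte[5000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink, 64 * 1024);
        System.out.println("Copied " + copied + " bytes");
    }
}
```

In the download method, the stream returned by client.open(sourceFilename) and the FileOutputStream would take the place of the in-memory streams, and you could try a few buffer sizes against your own files to see whether it makes a difference.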
