hdfs - Find the first block of a file in Hadoop
I am storing a 500 MB or larger video file in HDFS. Because of the large block size, the file is distributed across multiple blocks. I have to collect, or work first on, the first block of data (here, of the video file), since it contains the sequence header. How can I find the first block of a file in Hadoop?
If you want to read the first block, you can open an InputStream from the FileSystem and read bytes until you reach a predetermined amount (for example, a block size of 64 MB, i.e. 64 * 1024 * 1024 bytes). Here's an illustration (though 64 MB is a lot of data; if you think the info you need comes before the 64 MB mark, reduce bytesLeft):
import java.io.EOFException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class TestReaderFirstBlock {

    private static final String uri = "hdfs://localhost:9000/path/to/file";

    // Read at most one 64 MB block's worth of data
    private static int bytesLeft = 64 * 1024 * 1024;
    private static final byte[] buffer = new byte[4096];

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);

        InputStream is = fs.open(new Path(uri));
        OutputStream out = System.out;

        while (bytesLeft > 0) {
            int read = is.read(buffer, 0, Math.min(bytesLeft, buffer.length));
            if (read == -1) {
                throw new EOFException("Unexpected end of data");
            }
            out.write(buffer, 0, read);
            bytesLeft -= read;
        }

        IOUtils.closeStream(is);
    }
}
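If what you need is information about the first block itself (its actual size and which datanodes hold it), rather than just its bytes, the FileSystem API exposes that through getFileStatus and getFileBlockLocations. Here is a minimal sketch, assuming the same example URI as above:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FirstBlockInfo {

    public static void main(String[] args) throws Exception {
        // Same example path as in the snippet above
        String uri = "hdfs://localhost:9000/path/to/file";
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);

        FileStatus status = fs.getFileStatus(new Path(uri));

        // The block size this file was written with (may differ from 64 MB)
        System.out.println("Block size: " + status.getBlockSize());

        // Ask for the locations covering byte range [0, 1), i.e. the first block
        BlockLocation[] locations = fs.getFileBlockLocations(status, 0, 1);
        if (locations.length > 0) {
            BlockLocation first = locations[0];
            System.out.println("First block offset: " + first.getOffset());
            System.out.println("First block length: " + first.getLength());
            for (String host : first.getHosts()) {
                System.out.println("Replica on host: " + host);
            }
        }
    }
}

Using status.getBlockSize() as the limit instead of a hardcoded 64 MB also keeps the read loop above correct when the cluster uses a different block size (128 MB is the default on newer Hadoop versions).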
Tags: hadoop, hdfs