hdfs - Find first block of a file in Hadoop



I am storing video files of 500 MB or larger in HDFS. Since each file is larger than the block size, it is split into blocks and distributed. I have to collect or work on the first block of the data (here, a video file) first, since it contains the sequence header. How can I find the first block of a file in Hadoop?

If you want to read the first block, you can get an InputStream from the FileSystem and read bytes until it reaches a predetermined amount (for example, with a block size of 64 MB that is 64 * 1024 * 1024 bytes). Here's an illustration (though 64 MB is a lot of data; if you think the information you need comes well before 64 MB, just alter bytesLeft):

import java.io.EOFException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class TestReaderFirstBlock {
    private static final String uri = "hdfs://localhost:9000/path/to/file";
    // Number of bytes still to read; starts at one 64 MB block.
    private static int bytesLeft = 64 * 1024 * 1024;
    private static final byte[] buffer = new byte[4096];

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        InputStream is = null;
        try {
            is = fs.open(new Path(uri));
            OutputStream out = System.out;
            while (bytesLeft > 0) {
                // Never ask for more than the bytes that remain.
                int read = is.read(buffer, 0, Math.min(bytesLeft, buffer.length));
                if (read == -1) {
                    throw new EOFException("Unexpected end of data");
                }
                out.write(buffer, 0, read);
                bytesLeft -= read;
            }
            out.flush();
        } finally {
            // Close the stream even if the read fails.
            IOUtils.closeStream(is);
        }
    }
}
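The bounded-read loop itself does not depend on HDFS, so it can be tried out against any InputStream. Here is a minimal sketch using only java.io. The class name PrefixReader and the helper readPrefix are hypothetical names made up for this illustration, not part of Hadoop. Unlike the code above, this version returns a shorter result when the stream ends early instead of throwing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class PrefixReader {
    // Hypothetical helper: returns at most `limit` bytes from the start of `in`.
    // A stream shorter than `limit` yields a shorter array instead of an exception.
    static byte[] readPrefix(InputStream in, int limit) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int bytesLeft = limit;
        while (bytesLeft > 0) {
            // Never ask for more than the bytes that remain.
            int read = in.read(buffer, 0, Math.min(bytesLeft, buffer.length));
            if (read == -1) {
                break; // stream ended before the limit was reached
            }
            out.write(buffer, 0, read);
            bytesLeft -= read;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file: 10,000 bytes held in memory.
        byte[] data = new byte[10_000];
        try (InputStream in = new ByteArrayInputStream(data)) {
            byte[] prefix = readPrefix(in, 6_000);
            System.out.println(prefix.length); // prints 6000
        }
    }
}
```

Since the stream returned by fs.open(...) is also an InputStream, the same helper could presumably be handed that stream. Keep in mind that it buffers the whole prefix in memory, so for a full 64 MB block it may be better to write to an OutputStream as the code above does.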

hadoop hdfs
