Channel: Tech Tutorials

How to Compress MapReduce Job Output in Hadoop

You can compress the output of a MapReduce job in Hadoop. Compression can be configured for all the jobs in a cluster, or you can set the properties for specific jobs only.

Configuration parameters for compressing MapReduce job output

  • mapreduce.output.fileoutputformat.compress - Set this property to true to compress the MapReduce job output. The default value is false.
  • mapreduce.output.fileoutputformat.compress.type - Applies only when the MapReduce job output is a sequence file. In that case you can specify one of these values for compression: NONE, RECORD or BLOCK. The default is RECORD.
  • mapreduce.output.fileoutputformat.compress.codec - The codec to use for compression. The default is org.apache.hadoop.io.compress.DefaultCodec.
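
For sequence file output, the record and block compression types differ in what gets compressed together: record compression handles each record on its own, while block compression batches many records and compresses them as one unit. The following is a rough sketch of why that matters, using only the JDK's Deflater. It is an analogy, not Hadoop's SequenceFile writer, and the class name, method names and sample records are all made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

// Illustration only: plain JDK DEFLATE, not Hadoop's SequenceFile implementation.
class CompressionTypeDemo {

    // Compress one byte array with DEFLATE and return the compressed size in bytes.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Record-like: every record is compressed independently.
    static int recordStyleSize(List<String> records) {
        int total = 0;
        for (String record : records) {
            total += deflatedSize(record.getBytes(StandardCharsets.UTF_8));
        }
        return total;
    }

    // Block-like: records are concatenated and compressed together.
    static int blockStyleSize(List<String> records) {
        StringBuilder block = new StringBuilder();
        for (String record : records) {
            block.append(record);
        }
        return deflatedSize(block.toString().getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical word-count style records such as "word3<TAB>13".
    static List<String> sampleRecords(int count) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            records.add("word" + (i % 10) + "\t" + i + "\n");
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> records = sampleRecords(1000);
        System.out.println("record-style bytes: " + recordStyleSize(records));
        System.out.println("block-style bytes:  " + blockStyleSize(records));
    }
}
```

With small, repetitive records like these, the block-style total should come out smaller, because a single compression stream can reuse patterns across records and pays the per-stream overhead only once.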

Configuring at the cluster level

If you want to compress the output of all MapReduce jobs running on the cluster, configure these parameters in mapred-site.xml.
For example, to compress the output of MapReduce jobs using the Gzip compression format:


<property>
<name>mapreduce.output.fileoutputformat.compress</name>
<value>true</value>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress.type</name>
<value>RECORD</value>
</property>
<property>
<name>mapreduce.output.fileoutputformat.compress.codec</name>
<value>org.apache.hadoop.io.compress.GzipCodec</value>
</property>
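
With GzipCodec and plain text output, the part files written by the job are ordinary gzip files with a .gz extension (for example part-r-00000.gz), so a copy fetched from HDFS can be read with the JDK alone. Below is a minimal sketch under that assumption. The class and method names are hypothetical, and writeSample only creates a local stand-in file so the example can run without a cluster.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustration only: reads a local gzip text file, no Hadoop classes involved.
class GzipPartFileReader {

    // Read every line of a gzip-compressed text file.
    static List<String> readLines(Path gzFile) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Stand-in for job output: writes the given lines to a temporary .gz file.
    static Path writeSample(List<String> lines) {
        try {
            Path file = Files.createTempFile("part-r-00000", ".gz");
            try (Writer writer = new OutputStreamWriter(
                    new GZIPOutputStream(Files.newOutputStream(file)), StandardCharsets.UTF_8)) {
                for (String line : lines) {
                    writer.write(line + "\n");
                }
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path sample = writeSample(Arrays.asList("hadoop\t3", "mapreduce\t5"));
        for (String line : readLines(sample)) {
            System.out.println(line);
        }
    }
}
```

To try it on real job output, pass the path of a downloaded part file to readLines instead of the sample file.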

Configuring on a per-job basis

If you want to compress the output of a specific MapReduce job, add the following calls to that job's driver code, where job is the job being configured.


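// FileOutputFormat: org.apache.hadoop.mapreduce.lib.output, GzipCodec: org.apache.hadoop.io.compress
FileOutputFormat.setCompressOutput(job, true);
FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);

If the output is a sequence file, you can also set the compression type (CompressionType is the enum nested in org.apache.hadoop.io.SequenceFile).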

SequenceFileOutputFormat.setOutputCompressionType(job, CompressionType.BLOCK);

That's all for this topic, How to Compress MapReduce Job Output in Hadoop. If you have any doubts or suggestions, please drop a comment. Thanks!


Related Topics

  1. How to Compress Intermediate Map Output in Hadoop
  2. Data Compression in Hadoop
  3. Compressing File in snappy Format in Hadoop - Java Program
  4. Word Count MapReduce Program in Hadoop
  5. How MapReduce Works in Hadoop

