Hadoop Admin Interview Questions

  1. What is Big Data, and what are its characteristics?
  2. How can you skip the bad records in Hadoop?
  3. What is the difference between a regular file system and HDFS?
  4. What is Avro Serialization in Hadoop?
  5. How does an RDBMS differ from Hadoop?
  6. What is Hadoop and list its components?
  7. What is YARN and explain its components?
  8. What are the Hadoop daemons and explain their roles in a Hadoop cluster?

Hadoop HDFS Interview Questions 

  1. Does Hadoop require RAID?
  2. What is the command used for printing the topology?
  3. What are the features of HDFS?
  4. How will you resolve the NameNode failure issue?
  5. What is the main purpose of the Hadoop fsck command?
  6. What is the default replication factor?
  7. How does HDFS (Hadoop Distributed File System) compare with NAS (Network Attached Storage)?
  8. What is HDFS Federation?
  9. What is HDFS, and what are its components?
  10. What is RAID?
  11. What are the various HDFS commands?
  12. What are the limitations of Hadoop 1.0?
  13. What is the HDFS block size?
  14. What is the main functionality of NameNode?
  15. How do you keep an HDFS cluster balanced?
  16. What is the difference between active and passive NameNodes?
  17. Which command is used to format the NameNode?
  18. How do you commission (add) nodes in a Hadoop cluster?
  19. What is a Checkpoint Node in Hadoop?
  20. What is a rack-aware replica placement policy?
  21. What is DistCp?
  22. How do you decommission (remove) nodes in a Hadoop cluster?
  23. What is HDFS High Availability?
  24. What are the various site-specific configuration files available in Hadoop?
  25. What is the purpose of the admin tool?
  26. List the different types of Hadoop schedulers.
  27. How does Hadoop 1.x compare with Hadoop 2.x?
  28. What is the purpose of a DataNode block scanner?
  29. How does a client application interact with the NameNode?

Hadoop MapReduce Interview Questions

  1. What is Identity Mapper?
  2. What are the phases of MapReduce Reducer?
  3. How does the MapReduce framework view its input internally?
  4. What are the different modes in which Hadoop can run?
  5. What is the purpose of Distributed Cache in a MapReduce Framework?
  6. Why can't aggregation be performed on the mapper side?
  7. What are the basic parameters of Mapper?
  8. How will you write a custom partitioner for a Hadoop MapReduce job?
  9. What is the use of SequenceFileInputFormat in Hadoop?
  10. What are the features of MapReduce?
  11. What does the MapReduce framework consist of?
  12. What is “speculative execution” in Hadoop?
  13. What are Writables, and why are they important in Hadoop?
  14. What are the main configuration parameters for a MapReduce application?
  15. What are the methods used for restarting the NameNode in Hadoop?
  16. What are the two main components of ResourceManager?
  17. What is the purpose of MapReduce Partitioner in Hadoop?
  18. What is a Combiner?
  19. What is the importance of “RecordReader” in Hadoop?
  20. What is the difference between an “HDFS Block” and “MapReduce Input Split”?
  21. What is MapReduce, and what are its features?
  22. How do reducers communicate with each other in Hadoop?
  23. Why is comparison of types important for MapReduce?
  24. What are the steps involved in submitting a Hadoop job?
  25. What is a Hadoop counter?

Apache Pig Interview Questions

  1. What is Apache Pig?
  2. What are the Hadoop Pig data types?
  3. What are the various relational operators used in “Pig Latin”?
  4. What are the benefits of Apache Pig over MapReduce?

Apache Hive Interview Questions

  1. What is a SerDe?
  2. Where does Hive store table data in HDFS?
  3. What are the differences between Hive and RDBMS?
  4. Can the default “Hive Metastore” be used by multiple users (processes) at the same time?
  5. What is Apache Hive?

Apache HBase Interview Questions

  1. What are the differences between a relational database and HBase?
  2. What is Apache HBase?
  3. What is WAL in HBase?
  4. What are the various components of Apache HBase?

Apache Spark Interview Questions

  1. Can we build “Spark” with any particular Hadoop version?
  2. What is Apache Spark?
  3. What is RDD?

Apache Oozie and ZooKeeper Interview Questions

  1. How can you configure the “Oozie” job in Hadoop?
  2. What is Apache Oozie?
  3. What is Apache ZooKeeper?

Apache Flume Interview Questions

  1. What are the features of Apache Flume?
  2. What is Apache Flume?

Apache Sqoop Interview Questions

  1. Where are Hadoop Sqoop scripts stored?
  2. What is the use of Apache Sqoop in Hadoop?
