Latest Practice Test of HDPCD

Hortonworks HDPCD Questions & Answers

Full Version: 108 Q&A

Latest HDPCD Exam Questions and Practice Tests 2025 - Killexams.com

Hortonworks Data Platform Certified Developer

https://killexams.com/pass4sure/exam-detail/HDPCD

QUESTION: 97

You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses TextInputFormat: the mapper applies a regular expression over input values and emits key-value pairs with the key consisting of the matching text, and the value containing the filename and byte offset. Determine the difference between setting the number of reducers to one and setting the number of reducers to zero.

A. There is no difference in output between the two settings.
B. With zero reducers, no reducer runs and the job throws an exception. With one reducer, instances of matching patterns are stored in a single file on HDFS.
C. With zero reducers, all instances of matching patterns are gathered together in one file on HDFS. With one reducer, instances of matching patterns are stored in multiple files on HDFS.
D. With zero reducers, instances of matching patterns are stored in multiple files on HDFS. With one reducer, all instances of matching patterns are gathered together in one file on HDFS.

Answer: D

Explanation:

It is legal to set the number of reduce-tasks to zero if no reduction is desired.
In this case the outputs of the map-tasks go directly to the FileSystem, into the output path set by setOutputPath(Path). The framework does not sort the map-outputs before writing them out to the FileSystem.
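Often, you may want to process input data using a map function only. To do this, simply set mapreduce.job.reduces to zero. The MapReduce framework will not create any reducer tasks. Rather, the outputs of the mapper tasks will be the final output of the job. Each map task then writes its own output file, so a job with many map tasks leaves many part files in the output directory, while a job with a single reducer leaves one.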
Note: Reduce
In this phase the reduce(WritableComparable, Iterator, OutputCollector, Reporter) method is called for each <key, (list of values)> pair in the grouped inputs.
The output of the reduce task is typically written to the FileSystem via OutputCollector.collect(WritableComparable, Writable).
Applications can use the Reporter to report progress, set application-level status messages and update Counters, or just indicate that they are alive.
The output of the Reducer is not sorted.
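The difference in output file count described in the explanation can be sketched in plain Java, without any Hadoop dependency. This is only a model under stated assumptions: the class and method names are hypothetical, the part-m-NNNNN and part-r-NNNNN names assume the default file output naming of the newer MapReduce API, and the figure of 100 map tasks assumes one input split per file.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the part files a job leaves in its output directory. */
final class OutputFileSketch {

    /**
     * Returns the assumed part-file names for a job.
     * With zero reducers every map task writes its own file;
     * otherwise every reducer writes one file.
     */
    static List<String> partFiles(int mapTasks, int reduceTasks) {
        boolean mapOnly = reduceTasks == 0;
        int writers = mapOnly ? mapTasks : reduceTasks;
        String kind = mapOnly ? "m" : "r";
        List<String> names = new ArrayList<>();
        for (int i = 0; i < writers; i++) {
            // Zero-padded, five-digit task index, e.g. part-m-00042.
            names.add(String.format("part-%s-%05d", kind, i));
        }
        return names;
    }
}
```

Under these assumptions, partFiles(100, 0) yields 100 map-side file names, while partFiles(100, 1) yields the single name part-r-00000, which matches answer D.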

QUESTION: 98
Identify the utility that allows you to create and run MapReduce jobs with any executable or script as the mapper and/or the reducer.
Answer: D

Explanation:
Hadoop streaming is a utility that comes with the Hadoop distribution. The utility allows you to create and run Map/Reduce jobs with any executable or script as the mapper and/or the reducer.

Reference:
http://hadoop.apache.org/common/docs/r0.20.1/streaming.html (Hadoop Streaming, second sentence)
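To make "any executable or script" concrete, here is a minimal sketch of the logic a streaming mapper might carry out: read text lines and emit one tab-separated key/value line per word. The class and method names are hypothetical, and the word-count behaviour is an illustrative assumption, not something taken from the question. A real streaming mapper would read from standard input and print to standard output; the sketch takes a Reader and returns the lines so that it can be exercised without a cluster.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical word-emitting mapper in the line-oriented style streaming expects. */
final class StreamingMapperSketch {

    /** Reads input lines and returns one "word<TAB>1" output line per word. */
    static List<String> mapLines(Reader input) {
        List<String> out = new ArrayList<>();
        new BufferedReader(input).lines().forEach(line -> {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    // Key and value are separated by a tab character.
                    out.add(word + "\t1");
                }
            }
        });
        return out;
    }
}
```

Wrapping mapLines in a main method that passes an InputStreamReader over System.in and prints each returned line would turn the sketch into a standalone program that could be supplied as the mapper executable.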

QUESTION: 99
Which one of the following statements is true about a Hive-managed table?
A. Records can only be added to the table using the Hive INSERT command.
B. When the table is dropped, the underlying folder in HDFS is deleted.
C. Hive dynamically defines the schema of the table based on the FROM clause of a SELECT query.
D. Hive dynamically defines the schema of the table based on the format of the underlying data.
Answer: B

QUESTION: 100
You need to move a file titled “weblogs” into HDFS. When you try to copy the file, you can’t. You know you have ample space on your DataNodes. Which action should you take to relieve this situation and store more files in HDFS?
A. Increase the block size on all current files in HDFS.
B. Increase the block size on your remaining files.
C. Decrease the block size on your remaining files.
D. Increase the amount of memory for the NameNode.
E. Increase the number of disks (or size) for the NameNode.
F. Decrease the block size on all current files in HDFS.
Answer: C

QUESTION: 101
Which process describes the lifecycle of a Mapper?
A. The JobTracker calls the TaskTracker's configure() method, then its map() method and finally its close() method.
B. The TaskTracker spawns a new Mapper to process all records in a single input split.
C. The TaskTracker spawns a new Mapper to process each key-value pair.
D. The JobTracker spawns a new Mapper to process all records in a single file.
Answer: B

Explanation:
For each map instance that runs, the TaskTracker creates a new instance of your mapper.
Note:
The Mapper is responsible for processing Key/Value pairs obtained from the InputFormat. The mapper may perform a number of Extraction and Transformation functions on the Key/Value pair before ultimately outputting none, one or many Key/Value pairs of the same, or different Key/Value type.
With the new Hadoop API, mappers extend the org.apache.hadoop.mapreduce.Mapper class. This class defines an 'Identity' map function by default - every input Key/Value pair obtained from the InputFormat is written out.
Examining the run() method, we can see the lifecycle of the mapper:
    /**
     * Expert users can override this method for more complete control over the
     * execution of the Mapper.
     * @param context
     * @throws IOException
     */
    public void run(Context context) throws IOException, InterruptedException {
        setup(context);
        while (context.nextKeyValue()) {
            map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
        cleanup(context);
    }

setup(Context) - Perform any setup for the mapper. The default implementation is a no-op method.

map(Key, Value, Context) - Perform a map operation on the given key/value pair. The default implementation calls Context.write(Key, Value).

cleanup(Context) - Perform any cleanup for the mapper. The default implementation is a no-op method.

Reference:

Hadoop/MapReduce/Mapper
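The call order in run() can be traced with a small stand-in class that needs no Hadoop libraries. This is a sketch only: MapperLifecycleSketch and its simplified method signatures are hypothetical, and a list of string pairs stands in for the records of one input split.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in that records the order of lifecycle calls. */
final class MapperLifecycleSketch {

    /** Names of the lifecycle methods, in the order they were invoked. */
    final List<String> calls = new ArrayList<>();

    void setup() {
        calls.add("setup");
    }

    void map(String key, String value) {
        calls.add("map(" + key + ")");
    }

    void cleanup() {
        calls.add("cleanup");
    }

    /** Mirrors run(): setup once, map once per record, cleanup once. */
    void run(List<String[]> split) {
        setup();
        for (String[] record : split) {
            map(record[0], record[1]);
        }
        cleanup();
    }
}
```

Running one instance over a split of two records logs setup, two map calls and then cleanup. That is the pattern behind answer B: one Mapper instance handles every record of a single input split.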

QUESTION: 102

Which one of the following files is required in every Oozie Workflow application?

A. job.properties
B. config-default.xml
C. workflow.xml
D. oozie.xml

Answer: C

QUESTION: 103

Which one of the following statements is FALSE regarding the communication between DataNodes and a federation of NameNodes in Hadoop 2.2?

A. Each DataNode receives commands from one designated master NameNode.
B. DataNodes send periodic heartbeats to all the NameNodes.
C. Each DataNode registers with all the NameNodes.
D. DataNodes send periodic block reports to all the NameNodes.

Answer: A

QUESTION: 104

In a MapReduce job with 500 map tasks, how many map task attempts will there be?

A. It depends on the number of reduces in the job.
B. Between 500 and 1000.
C. At most 500.
D. At least 500.
E. Exactly 500.

Answer: D

Explanation:

From Cloudera Training Course:

A task attempt is a particular instance of an attempt to execute a task.

There will be at least as many task attempts as there are tasks
If a task attempt fails, another will be started by the JobTracker
Speculative execution can also result in more task attempts than completed tasks
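
The counting argument above can be written as a small arithmetic sketch. It assumes the job eventually completes, so every task has exactly one successful attempt, and that every failed or speculative attempt adds one more. The class name and parameters are hypothetical.

```java
/** Hypothetical helper that counts task attempts for a completed job. */
final class TaskAttemptSketch {

    /**
     * Total attempts = one successful attempt per task
     * + one extra attempt per failure
     * + one extra attempt per speculative duplicate.
     */
    static int totalAttempts(int tasks, int failedAttempts, int speculativeAttempts) {
        if (tasks < 0 || failedAttempts < 0 || speculativeAttempts < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return tasks + failedAttempts + speculativeAttempts;
    }
}
```

With 500 map tasks and no failures or speculation the total is exactly 500, and any failure or speculative duplicate only raises it, which is why "at least 500" is the answer.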