Spark Multiple Choice Questions and Answers:
1) Point out the incorrect statement in the context of Cassandra:
A) Cassandra is a centralized key-value store
B) Cassandra was originally designed at Facebook
C) Cassandra is designed to handle a large amount of data across many commodity servers, providing high availability with no single point of failure.
D) Cassandra uses a ring-based DHT (Distributed Hash Table) but without finger tables or routing
Ans: A (Cassandra is a decentralized, distributed key-value store, not a centralized one)
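The DHT in option D of question 1 can be illustrated with a small consistent-hashing sketch. This is a toy under stated assumptions, not Cassandra's implementation: the `Ring` class, the `token` helper, the node names, and the use of MD5 are all hypothetical choices made for illustration. The point it shows is that when every participant holds the full sorted list of tokens, the owner of a key is found with one local lookup, so no finger tables or multi-hop routing are needed.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy ring: the full sorted token list is known locally, so lookups need no routing."""

    def __init__(self, nodes):
        # Sort nodes by token so the ring can be searched with bisect.
        self._entries = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, key: str) -> str:
        # First node clockwise from the key's token; wrap to index 0 past the end.
        i = bisect.bisect_left(self._tokens, token(key)) % len(self._tokens)
        return self._entries[i][1]


ring = Ring(["node-a", "node-b", "node-c"])
placement = {k: ring.owner(k) for k in ["user:1", "user:2", "user:3", "user:4"]}
print(placement)
```

The placement depends only on the hash values, so building the ring from the same nodes in a different order yields the same owner for every key.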
2) Which of the following are the simplest NoSQL databases in a Big Data environment?
A) Document B) Key-Value Pair
C) Wide-Column D) All of the above
Ans: Key-Value Pair
3) Which of the following is not a NoSQL database?
A) Cassandra B) MongoDB
C) SQL Server D) HBase
Ans: SQL Server
4) Which of the following is a distributed graph processing framework on top of Spark?
A) Spark Streaming B) MLlib
C) GraphX D) All of the above
Ans: GraphX
5) Which of the following leverages Spark Core's fast scheduling capability to perform streaming analytics?
A) Spark Streaming B) MLlib
C) GraphX D) RDDs
Ans: Spark Streaming
6) On which of the following is the primary Machine Learning API for Spark based?
A) RDD B) Dataset
C) DataFrame D) All of the above
Ans: DataFrame
7) The Spark optimizer is built on the functional programming constructs of which language?
A) Python B) R
C) Java D) Scala
Ans: Scala, a functional programming language
8) Which of the following is a basic abstraction of Spark Streaming?
A) Shared variable B) RDD
C) DStream D) All of the above
Ans: DStream
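A DStream, one of the options in question 8, is represented in Spark Streaming as a continuous series of RDDs, each holding the data received during one batch interval. The micro-batching idea can be sketched in plain Python without Spark. The `micro_batches` helper and the sample events are hypothetical, and the sketch leaves out receivers, partitioning, and fault tolerance.

```python
from collections import Counter, defaultdict


def micro_batches(events, batch_seconds):
    """Group (timestamp, value) pairs into consecutive fixed-length intervals.

    Intervals that received no events are skipped in this toy version.
    """
    buckets = defaultdict(list)
    for timestamp, value in events:
        buckets[int(timestamp // batch_seconds)].append(value)
    return [buckets[i] for i in sorted(buckets)]


# Hypothetical input: (arrival time in seconds, word)
events = [(0.2, "spark"), (0.9, "rdd"), (1.1, "spark"), (2.5, "dstream"), (2.7, "spark")]

# The same operation (a word count) is applied to every batch independently.
counts_per_batch = [dict(Counter(batch)) for batch in micro_batches(events, batch_seconds=1)]
print(counts_per_batch)
```

Each inner list plays the role of one small batch, and the stream-level computation is the same batch operation repeated once per interval.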
9) Which of the following cluster managers does Spark support?
A) Mesos B) YARN
C) Standalone Cluster manager D) Pseudo Cluster manager
E) All of the above
Ans: All of the above
10) Which of the following is the reason Spark executes faster than MapReduce?
A) It supports different programming languages like Scala, Python, R, and Java.
C) DAG execution engine and in-memory computation (RAM based)
D) All of the above
Ans: DAG execution engine and in-memory computation (RAM based)
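The answer to question 10 can be made concrete with a toy model of lazy, lineage-based execution plus in-memory caching. This is a minimal sketch, not Spark's engine: `LazyDataset` and its methods are hypothetical names that only loosely mirror RDD operations, and the sketch ignores partitions, stages, and cluster scheduling. It shows that transformations only record a plan, that work happens when an action (`collect`) is called, and that a cached intermediate result is computed once and then reused from memory.

```python
class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, not executed."""

    def __init__(self, parent=None, op=None, data=None):
        self.parent, self.op, self.data = parent, op, data
        self.keep_in_memory = False
        self._materialized = None
        self.computations = 0  # how many times this node was actually computed

    def map(self, fn):
        # Records a new node in the lineage; nothing runs yet.
        return LazyDataset(parent=self, op=lambda rows: [fn(r) for r in rows])

    def filter(self, pred):
        return LazyDataset(parent=self, op=lambda rows: [r for r in rows if pred(r)])

    def cache(self):
        # Ask for the result to be kept in memory after it is first computed.
        self.keep_in_memory = True
        return self

    def collect(self):
        # Action: walk the lineage back to the source and compute.
        if self._materialized is not None:
            return self._materialized
        self.computations += 1
        rows = list(self.data) if self.parent is None else self.op(self.parent.collect())
        if self.keep_in_memory:
            self._materialized = rows
        return rows


base = LazyDataset(data=range(10))
squares = base.map(lambda x: x * x).cache()
evens = squares.filter(lambda x: x % 2 == 0)
odds = squares.filter(lambda x: x % 2 == 1)

print(evens.collect())       # [0, 4, 16, 36, 64]
print(odds.collect())        # [1, 9, 25, 49, 81]
print(squares.computations)  # 1
```

Without the `cache()` call, `squares` would be recomputed for each of the two actions. A chain of MapReduce jobs, by contrast, typically writes intermediate results to disk between jobs.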